The goal of this file is to document how to begin writing Sifter code. It
is specifically directed towards understanding the high-level 'TSLang
'
interface, which is used to write the analogy-making rules (mapper.py
) as
well as the application demos (examples/...
).
At the highest level, TSLang
programs are programs that define and operate on
a particular type of data structure, a triplet structure. Most TSLang
programs are separated into two stages:
- An initial triplet structure is defined with the data of interest, then
- The initial structure is then iteratively modified by appling pre-defined update rules.
TSLang
provides syntactic sugar to help:
- Define triplet structures (
ts_lib
), - Define rules operating on triplet structures (
ts_lib
,ts_utils
), and - Write tactics determining which rules to apply in what order (
runtime
andtactic_utils
).
We will address each of these in turn.
A triplet structure is a particular type of data structure suited to representing knowledge about the world.
At its core, a triplet structure consists of exactly two things:
- A list of nodes and
- A list of triplet facts, which are 3-tuples of nodes.
To give some meaning to the structure, we adopt the convention that a triplet
fact (A, B, C)
should be interpreted like so:
A
is represents a particular logical fact,B
represents an instance of something,C
represents the type, or roleB
plays in the fact.
For example, we might want to assert that Homer
is the Father
and Marge
is the Mother
of Lisa
with the facts:
(FamilyFact1, Homer, Father),
(FamilyFact1, Marge, Mother),
(FamilyFact1, Lisa, Daughter)
We think of FamilyFact1
as the unified 'thought' or 'fact node' representing
the fact that Homer, Marge, and Lisa together make an instance of the "Family"
type, with the given roles.
For more discussion of triplet structures and how to represent logical information with them, please see our Onward paper.
A triplet structure is represented by an instance of the TripletStructure
class.
Most programs will only have one instance of TripletStructure
, which we will
conventionally name ts
:
from ts_lib import TripletStructure
ts = TripletStructure()
Nodes in TSLang
each have a unique, string name. With few exceptions, all
names should start with /:
. Nodes are referenced by indexing notation, and
are automatically created upon reference if they do not yet exist:
# Creates nodes '/:Homer' and '/:Marge'
ts["/:Homer"], ts["/:Marge"]
To add a fact (A, B, C)
we use the notation ts[A].map({ts[B]: ts[C]})
, like
so:
# Adds fact ('/:FamilyFact1', '/:Homer', '/:Father')
ts["/:FamilyFact1"].map({ts["/:Homer"]: ts["/:Father"]})
Multiple facts with the same fact node can be expressed naturally as well:
# Adds 3 facts.
ts["/:FamilyFact1"].map({
ts["/:Homer"]: ts["/:Father"],
ts["/:Marge"]: ts["/:Mother"],
ts["/:Lisa"]: ts["/:Daughter"],
})
To prevent name collisions and to better enable the usage of macros to
automatically manipulate the structure, we have support for scopes. A scope
is simply a prefix of a node name, up to (but not including) a :
. Intuitively
we can think of node names as paths, with :
delimiting directory boundaries,
and scopes playing the role of directories. For example, we might want to place
Father
, Mother
, and Daughter
all in the /:Family
scope, and Homer
,
Marge
, and Lisa
in the Simpsons
scope:
ts["/:FamilyFact1"].map({
ts["/:Simpsons:Homer"]: ts["/:Family:Father"],
ts["/:Simpsons:Marge"]: ts["/:Family:Mother"],
ts["/:Simpsons:Lisa"]: ts["/:Family:Daughter"],
})
In fact, TSLang
has first-class support for scopes. By default, ts[...]
always indexes relative to its current scope, which can be changed using
with ts.scope(...):
. The above is equivalent, for example, to:
equivalent to:
# family[":..."] is equivalent to ts["/:Family:..."]
family = ts.scope("/:Family")
with ts.scope("/:Simpsons"):
# In this block, ts[":..."] is automatically prepended with "/:Simpsons"
# while ts["/:..."] is taken as an absolute path.
ts["/:FamilyFact1"].map({
ts[":Homer"]: family[":Father"],
ts[":Marge"]: family[":Mother"],
ts[":Lisa"]: family[":Daughter"],
})
# Outside the with block, ts[":..."] again refers to ts["/:..."].
Scopes are generally used to group logically-related nodes together, especially nodes which should be treated in a particular way by some macro (as we will see later in this document).
An update rule is a program that searches for a pattern in the triplet
structure and then makes some modification to the structure according to that
pattern. With TSLang
, we define update rules within the structure itself.
We accomplish this by adding the pattern to search for as part of the
structure, and adding extra facts which annotate which parts of the pattern
need to be searched for or inserted/removed once an assignment is found.
The below example demonstrates this with a rule expressing transitivity of the 'greater than' relation:
with ts.scope(":TransitivityRule"):
# First we describe the pattern we want to search for:
ts[":AGreaterThanB"].map({
ts[":A"]: ts["/:GreaterPair:Greater"],
ts[":B"]: ts["/:GreaterPair:Lesser"],
})
ts[":BGreaterThanC"].map({
ts[":B"]: ts["/:GreaterPair:Greater"],
ts[":C"]: ts["/:GreaterPair:Lesser"],
})
# Then what we want to insert if that pattern is found:
ts[":AGreaterThanC"].map({
ts[":A"]: ts["/:GreaterPair:Greater"],
ts[":C"]: ts["/:GreaterPair:Lesser"],
})
# Finally, we encode the details of the rule:
ts[":RuleFact"].map({
# /:TransitivityRule:_ will represent the rule as a whole.
ts[":_"]: ts["/RULE"],
# The facts implying A>B, B>C should already exist in the structure
# before we apply this rule, hence we 'must map' them.
ts[":AGreaterThanB"]: ts["/MUST_MAP"],
ts[":BGreaterThanC"]: ts["/MUST_MAP"],
ts[":A"]: ts["/MUST_MAP"],
ts[":B"]: ts["/MUST_MAP"],
ts[":C"]: ts["/MUST_MAP"],
# If they are found, we then 'insert' the A>C node and associated
# facts.
ts[":AGreaterThanC"]: ts["/INSERT"],
})
Note that the nodes /RULE
, /MUST_MAP
, and /INSERT
do not start with /:
.
This indicates they are special nodes, which will be explicitly interpreted by
the runtime. Also note that the nodes /:GreaterPair:Greater
and
/:GreaterPair:Lesser
are not mentioned in the :RuleFact
, which means they
will be treated as constants in the corresponding pattern.
Note also that, in addition to /MUST_MAP
, other quantifications are possible:
NO_MAP(#)
and TRY_MAP
. In general, potential rule applications are
discovered in three passes:
- First, we search for assignments to the constraints involving only the
/MUST_MAP
nodes. - For each of those assignments, we check to ensure that they canNOT be
extended to a satisfying assignment to the
NO_MAP1
,NO_MAP2
, ... constraints. We throw away any assignment which can be extended to also satisfy theNO_MAP
constraints. - For any remaining assignments, we attempt to extend the assignment to also
satisfy constraints involving the
TRY_MAP
nodes, if possible (otherwise the original assignment is used). Similarly, in addition to/INSERT
there are also other actions possible: /REMOVE
removes the node and all associated facts./SUBTRACT
removes only those facts explicitly mentioned by the rule (the node is removed if there are then no remaining facts).
Finally, note that by default we assume all of the /MUST_MAP
nodes must have
unique assignments. This can be explicitly weakened if desired to allow
pattern matches to assign the same node to two different /MUST_MAP
variables.
The above rule-declaration syntax can become tedious and error-prone if done by
hand. To assist in this, two macros are provided in ts_utils.py
which
significantly improve the experience.
The first, RegisterRule
, is the most flexible. It works by putting nodes with
the same role in the rule (e.g., /INSERT
or /MUST_MAP
) in the same scope.
The rule from the previous example could be rewritten:
with ts.scope(":TransitivityRule"):
with ts.scope(":MustMap") as existing:
ts[":AGreaterThanB"].map({
ts[":A"]: ts["/:GreaterPair:Greater"],
ts[":B"]: ts["/:GreaterPair:Lesser"],
})
ts[":BGreaterThanC"].map({
ts[":B"]: ts["/:GreaterPair:Greater"],
ts[":C"]: ts["/:GreaterPair:Lesser"],
})
with ts.scope(":Insert"):
ts[":AGreaterThanC"].map({
existing[":A"]: ts["/:GreaterPair:Greater"],
existing[":C"]: ts["/:GreaterPair:Lesser"],
})
RegisterRule(ts)
Notice that we need to explicitly refer to existing[":A"]
in the second
scope, as they are no longer all in the same scope.
The second useful macro, RegisterPrototype
, is useful when both:
- You only need
/MUST_MAP
and/INSERT
nodes, and - You want to define multiple rules which involve the same patterns.
An equivalent rule to the above is shown below:
with ts.scope(":TransitivityRule"):
ts[":AGreaterThanB"].map({
ts[":A"]: ts["/:GreaterPair:Greater"],
ts[":B"]: ts["/:GreaterPair:Lesser"],
})
ts[":BGreaterThanC"].map({
ts[":B"]: ts["/:GreaterPair:Greater"],
ts[":C"]: ts["/:GreaterPair:Lesser"],
})
ts[":AGreaterThanC"].map({
existing[":A"]: ts["/:GreaterPair:Greater"],
existing[":C"]: ts["/:GreaterPair:Lesser"],
})
RegisterPrototype(ts, dict({
":_": {ts["/INSERT"]: [ts[":AGreaterThanC"]]},
}))
In general, the second argument can define arbitrarily many rules using the
exact same set of nodes. Each rule lists some subset of nodes which should be
inserted (alternatively, mapped). The remaining nodes in the scope are assumed
to be mapped (alternatively, inserted). This feature is used extensively in
mapper.py
, as all of the mapper rules are essentially just different
'shadings' of the same core pattern (see our Onward paper for more details).
While our rule-declaration syntax is usually quite natural, there is one
particular case where care needs to be taken. Namely, when the rule needs to
operate on individual facts about existing nodes. For example, suppose you
want to search for nodes A
, B
with a fact (A, A, B)
, and insert a new
fact (B, B, A)
. you might try to do the following:
with ts.scope(":BadRule"):
with ts.scope(":MustMap") as exist:
ts[":A"].map({ts[":A"]: ts[":B"]})
with ts.scope(":Insert"):
exist[":B"].map({exist[":B"]: exist[":A"]})
RegisterRule(ts)
The problem is that the /:BadRule:Insert
scope is actually empty --- the line
there only refers to nodes in the /:BadRule:MustMap
scope. Thus the
underlying rule created will only have /MUST_MAP
nodes, not /INSERT
, and so
the rule is effectively a no-op. To get around this, we need to ensure that at
least one of the nodes of each fact we want to insert actually belongs to the
:Insert
scope:
with ts.scope(":BadRule"):
with ts.scope(":MustMap") as exist:
ts[":A"].map({ts[":A"]: ts[":B"]})
with ts.scope(":Insert"):
ts[":B"].map({ts[":B"]: exist[":A"]})
RegisterRule(ts)
But now it will create an entirely new node, effectively B2
, and fact
(B2, B, A)
! To resolve this, we need to explicitly tell the system that
:BadRule:Insert:B
refers to the same node as :BadRule:MustMap:B
. This can
be done using the AssertNodesEqual
macro in ts_utils.py
:
with ts.scope(":BadRule"):
with ts.scope(":MustMap") as exist:
ts[":A"].map({ts[":A"]: ts[":B"]})
with ts.scope(":Insert") as insert:
ts[":B"].map({ts[":B"]: exist[":A"]})
RegisterRule(ts)
AssertNodesEqual(ts, [exist[":B"], insert[":B"]], "/:BadRule")
In fact, if the nodes in question end in the same name (after ignoring the
:MustMap
or :Insert
scopes), as :MustMap:B
and :Insert:B
do, then
RegisterRule
can automatically assert their equality using the
auto_assert_equal
option:
with ts.scope(":BadRule"):
with ts.scope(":MustMap") as exist:
ts[":A"].map({ts[":A"]: ts[":B"]})
with ts.scope(":Insert"):
ts[":B"].map({ts[":B"]: exist[":A"]})
RegisterRule(ts, auto_assert_equal=True)
After declaring the TripletStructure
with some initial facts and update rules, we
will initialize a new TSRuntime
instance. The runtime will extract the rules
we declared earlier and provide an interface to actually modify the structure
using those rules. We initialize a TSRuntime
like so:
from runtime.runtime import TSRuntime
# ... initializing tc ...
rt = TSRuntime(ts)
rt
exposes a somewhat lower-level API for applying rules:
rt.get_rule(rule_name)
returns a representation of the desired update rule.
Given an update rule, rt.propose(rule)
will yield possible assignments to the
rule of the form (assignment, delta)
. assignment
describes the nodes
satisfying the rule's pattern while delta
describes the modification to the
structure which should occur according to the rule.
tactic_utils.py
defines a number of helpful functions, particularly
Fixedpoint
, for repeatedly applying a rule to the structure.
To assist with backtracking search, there are a number of tools available to
checkpoint and rollback the state of a structure, as well as record changes.
These are described in docstrings in ts_lib.py
.