summaryrefslogtreecommitdiff
path: root/TSLang.md
diff options
context:
space:
mode:
authorMatthew Sotoudeh <masotoudeh@ucdavis.edu>2020-11-10 14:16:20 -0800
committerGitHub <noreply@github.com>2020-11-10 14:16:20 -0800
commitde88829eccd369aa1872413d6068d5179468c47e (patch)
tree1947f69081b4f647c643b0f9d5ce8c246a9b097c /TSLang.md
parentdb99b3af36fa4687c734e1c74d83157d2f10c9ed (diff)
parentbe64046354a0451869d475e7f0d35f4eb2344c93 (diff)
Initial Code ReleaseHEADmaster
Contains code for reproducing the demos from our Onward '20 paper about Sifter.
Diffstat (limited to 'TSLang.md')
-rw-r--r--TSLang.md326
1 files changed, 326 insertions, 0 deletions
diff --git a/TSLang.md b/TSLang.md
new file mode 100644
index 0000000..e1c2113
--- /dev/null
+++ b/TSLang.md
@@ -0,0 +1,326 @@
+# TSLang
+The goal of this file is to document how to begin writing Sifter code. It
+is specifically directed towards understanding the high-level '`TSLang`'
+interface, which is used to write the analogy-making rules (`mapper.py`) as
+well as the application demos (`examples/...`).
+
+### High-Level View
+At the highest level, `TSLang` programs are programs that define and operate on
+a particular type of data structure, a _triplet structure._ Most `TSLang`
+programs are separated into two stages:
+1. An initial triplet structure is defined with the data of interest, then
+2. The initial structure is then iteratively modified by appling pre-defined
+ _update rules._
+
+`TSLang` provides syntactic sugar to help:
+1. Define triplet structures (`ts_lib`),
+2. Define rules operating on triplet structures (`ts_lib`, `ts_utils`), and
+3. Write _tactics_ determining which rules to apply in what order (`runtime`
+ and `tactic_utils`).
+
+We will address each of these in turn.
+
+### (0) What are Triplet Structures?
+A triplet structure is a particular type of data structure suited to
+representing knowledge about the world.
+
+At its core, a triplet structure consists of exactly two things:
+1. A list of _nodes_ and
+2. A list of _triplet facts_, which are 3-tuples of nodes.
+
+To give some meaning to the structure, we adopt the convention that a triplet
+fact `(A, B, C)` should be interpreted like so:
+1. `A` is represents a particular logical fact,
+2. `B` represents an instance of something,
+3. `C` represents the _type_, or role `B` plays in the fact.
+
+For example, we might want to assert that `Homer` is the `Father` and `Marge`
+is the `Mother` of `Lisa` with the facts:
+```
+(FamilyFact1, Homer, Father),
+(FamilyFact1, Marge, Mother),
+(FamilyFact1, Lisa, Daughter)
+```
+We think of `FamilyFact1` as the unified 'thought' or 'fact node' representing
+the fact that Homer, Marge, and Lisa together make an instance of the "Family"
+type, with the given roles.
+
+For more discussion of triplet structures and how to represent logical
+information with them, please see our Onward paper.
+
+### (1) Representing Triplet Structures with `TSLang`
+A triplet structure is represented by an instance of the `TripletStructure` class.
+Most programs will only have one instance of `TripletStructure`, which we will
+conventionally name `ts`:
+```python
+from ts_lib import TripletStructure
+ts = TripletStructure()
+```
+Nodes in `TSLang` each have a unique, string name. With few exceptions, all
+names should start with `/:`. Nodes are referenced by indexing notation, and
+are automatically created upon reference if they do not yet exist:
+```python
+# Creates nodes '/:Homer' and '/:Marge'
+ts["/:Homer"], ts["/:Marge"]
+```
+To add a fact `(A, B, C)` we use the notation `ts[A].map({ts[B]: ts[C]})`, like
+so:
+```python
+# Adds fact ('/:FamilyFact1', '/:Homer', '/:Father')
+ts["/:FamilyFact1"].map({ts["/:Homer"]: ts["/:Father"]})
+```
+Multiple facts with the same fact node can be expressed naturally as well:
+```python
+# Adds 3 facts.
+ts["/:FamilyFact1"].map({
+ ts["/:Homer"]: ts["/:Father"],
+ ts["/:Marge"]: ts["/:Mother"],
+ ts["/:Lisa"]: ts["/:Daughter"],
+})
+```
+To prevent name collisions and to better enable the usage of macros to
+automatically manipulate the structure, we have support for _scopes_. A scope
+is simply a prefix of a node name, up to (but not including) a `:`. Intuitively
+we can think of node names as paths, with `:` delimiting directory boundaries,
+and scopes playing the role of directories. For example, we might want to place
+`Father`, `Mother`, and `Daughter` all in the `/:Family` scope, and `Homer`,
+`Marge`, and `Lisa` in the `Simpsons` scope:
+```python
+ts["/:FamilyFact1"].map({
+ ts["/:Simpsons:Homer"]: ts["/:Family:Father"],
+ ts["/:Simpsons:Marge"]: ts["/:Family:Mother"],
+ ts["/:Simpsons:Lisa"]: ts["/:Family:Daughter"],
+})
+```
+In fact, `TSLang` has first-class support for scopes. By default, `ts[...]`
+always indexes relative to its _current scope_, which can be changed using
+`with ts.scope(...):`. The above is equivalent, for example, to:
+equivalent to:
+```python
+# family[":..."] is equivalent to ts["/:Family:..."]
+family = ts.scope("/:Family")
+with ts.scope("/:Simpsons"):
+ # In this block, ts[":..."] is automatically prepended with "/:Simpsons"
+ # while ts["/:..."] is taken as an absolute path.
+ ts["/:FamilyFact1"].map({
+ ts[":Homer"]: family[":Father"],
+ ts[":Marge"]: family[":Mother"],
+ ts[":Lisa"]: family[":Daughter"],
+ })
+# Outside the with block, ts[":..."] again refers to ts["/:..."].
+```
+Scopes are generally used to group logically-related nodes together, especially
+nodes which should be treated in a particular way by some macro (as we will see
+later in this document).
+
+### (2) Writing Update Rules to Operate on Triplet Structures
+An _update rule_ is a program that searches for a pattern in the triplet
+structure and then makes some modification to the structure according to that
+pattern. With `TSLang`, we _define update rules within the structure itself._
+We accomplish this by adding the pattern to search for as part of the
+structure, and adding extra facts which annotate which parts of the pattern
+need to be searched for or inserted/removed once an assignment is found.
+
+The below example demonstrates this with a rule expressing transitivity of the
+'greater than' relation:
+```python
+with ts.scope(":TransitivityRule"):
+ # First we describe the pattern we want to search for:
+ ts[":AGreaterThanB"].map({
+ ts[":A"]: ts["/:GreaterPair:Greater"],
+ ts[":B"]: ts["/:GreaterPair:Lesser"],
+ })
+ ts[":BGreaterThanC"].map({
+ ts[":B"]: ts["/:GreaterPair:Greater"],
+ ts[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ # Then what we want to insert if that pattern is found:
+ ts[":AGreaterThanC"].map({
+ ts[":A"]: ts["/:GreaterPair:Greater"],
+ ts[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ # Finally, we encode the details of the rule:
+ ts[":RuleFact"].map({
+ # /:TransitivityRule:_ will represent the rule as a whole.
+ ts[":_"]: ts["/RULE"],
+ # The facts implying A>B, B>C should already exist in the structure
+ # before we apply this rule, hence we 'must map' them.
+ ts[":AGreaterThanB"]: ts["/MUST_MAP"],
+ ts[":BGreaterThanC"]: ts["/MUST_MAP"],
+ ts[":A"]: ts["/MUST_MAP"],
+ ts[":B"]: ts["/MUST_MAP"],
+ ts[":C"]: ts["/MUST_MAP"],
+ # If they are found, we then 'insert' the A>C node and associated
+ # facts.
+ ts[":AGreaterThanC"]: ts["/INSERT"],
+ })
+```
+Note that the nodes `/RULE`, `/MUST_MAP`, and `/INSERT` do not start with `/:`.
+This indicates they are special nodes, which will be explicitly interpreted by
+the runtime. Also note that the nodes `/:GreaterPair:Greater` and
+`/:GreaterPair:Lesser` are not mentioned in the `:RuleFact`, which means they
+will be treated as constants in the corresponding pattern.
+
+Note also that, in addition to `/MUST_MAP`, other quantifications are possible:
+`NO_MAP(#)` and `TRY_MAP`. In general, potential rule applications are
+discovered in three passes:
+1. First, we search for assignments to the constraints involving only the
+ `/MUST_MAP` nodes.
+2. For each of those assignments, we check to ensure that they can*NOT* be
+ extended to a satisfying assignment to the `NO_MAP1`, `NO_MAP2`, ...
+ constraints. We throw away any assignment which can be extended to also
+ satisfy the `NO_MAP` constraints.
+3. For any remaining assignments, we attempt to extend the assignment to also
+ satisfy constraints involving the `TRY_MAP` nodes, if possible (otherwise
+ the original assignment is used).
+Similarly, in addition to `/INSERT` there are also other actions possible:
+1. `/REMOVE` removes the node and _all_ associated facts.
+2. `/SUBTRACT` removes only those facts explicitly mentioned by the rule (the
+ node is removed if there are then no remaining facts).
+
+Finally, note that by default we assume all of the `/MUST_MAP` nodes must have
+_unique_ assignments. This can be explicitly weakened if desired to allow
+pattern matches to assign the same node to two different `/MUST_MAP` variables.
+
+### (3) Macros for Writing Update Rules
+The above rule-declaration syntax can become tedious and error-prone if done by
+hand. To assist in this, two macros are provided in `ts_utils.py` which
+significantly improve the experience.
+
+#### `RegisterRule`
+The first, `RegisterRule`, is the most flexible. It works by putting nodes with
+the same role in the rule (e.g., `/INSERT` or `/MUST_MAP`) in the same scope.
+The rule from the previous example could be rewritten:
+```python
+with ts.scope(":TransitivityRule"):
+ with ts.scope(":MustMap") as existing:
+ ts[":AGreaterThanB"].map({
+ ts[":A"]: ts["/:GreaterPair:Greater"],
+ ts[":B"]: ts["/:GreaterPair:Lesser"],
+ })
+ ts[":BGreaterThanC"].map({
+ ts[":B"]: ts["/:GreaterPair:Greater"],
+ ts[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ with ts.scope(":Insert"):
+ ts[":AGreaterThanC"].map({
+ existing[":A"]: ts["/:GreaterPair:Greater"],
+ existing[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ RegisterRule(ts)
+```
+Notice that we need to explicitly refer to `existing[":A"]` in the second
+scope, as they are no longer all in the same scope.
+
+#### `RegisterPrototype`
+The second useful macro, `RegisterPrototype`, is useful when both:
+1. You only need `/MUST_MAP` and `/INSERT` nodes, and
+2. You want to define multiple rules which involve the same patterns.
+
+An equivalent rule to the above is shown below:
+```python
+with ts.scope(":TransitivityRule"):
+ ts[":AGreaterThanB"].map({
+ ts[":A"]: ts["/:GreaterPair:Greater"],
+ ts[":B"]: ts["/:GreaterPair:Lesser"],
+ })
+ ts[":BGreaterThanC"].map({
+ ts[":B"]: ts["/:GreaterPair:Greater"],
+ ts[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ ts[":AGreaterThanC"].map({
+ existing[":A"]: ts["/:GreaterPair:Greater"],
+ existing[":C"]: ts["/:GreaterPair:Lesser"],
+ })
+ RegisterPrototype(ts, dict({
+ ":_": {ts["/INSERT"]: [ts[":AGreaterThanC"]]},
+ }))
+```
+In general, the second argument can define arbitrarily many rules using the
+exact same set of nodes. Each rule lists some subset of nodes which should be
+inserted (alternatively, mapped). The remaining nodes in the scope are assumed
+to be mapped (alternatively, inserted). This feature is used extensively in
+`mapper.py`, as all of the mapper rules are essentially just different
+'shadings' of the same core pattern (see our Onward paper for more details).
+
+### (2,3b) Update Rule Gotchas: Inserting Only a Triplet Fact
+While our rule-declaration syntax is usually quite natural, there is one
+particular case where care needs to be taken. Namely, when the rule needs to
+operate on individual _facts_ about _existing nodes_. For example, suppose you
+want to search for nodes `A`, `B` with a fact `(A, A, B)`, and insert a new
+fact `(B, B, A)`. you might try to do the following:
+```python
+with ts.scope(":BadRule"):
+ with ts.scope(":MustMap") as exist:
+ ts[":A"].map({ts[":A"]: ts[":B"]})
+ with ts.scope(":Insert"):
+ exist[":B"].map({exist[":B"]: exist[":A"]})
+ RegisterRule(ts)
+```
+The problem is that the `/:BadRule:Insert` scope is actually empty --- the line
+there only refers to nodes in the `/:BadRule:MustMap` scope. Thus the
+underlying rule created will only have `/MUST_MAP` nodes, not `/INSERT`, and so
+the rule is effectively a no-op. To get around this, we need to ensure that at
+least one of the nodes of each fact we want to insert actually belongs to the
+`:Insert` scope:
+```python
+with ts.scope(":BadRule"):
+ with ts.scope(":MustMap") as exist:
+ ts[":A"].map({ts[":A"]: ts[":B"]})
+ with ts.scope(":Insert"):
+ ts[":B"].map({ts[":B"]: exist[":A"]})
+ RegisterRule(ts)
+```
+But now it will create an entirely new node, effectively `B2`, and fact
+`(B2, B, A)`! To resolve this, we need to explicitly tell the system that
+`:BadRule:Insert:B` refers to the same node as `:BadRule:MustMap:B`. This can
+be done using the `AssertNodesEqual` macro in `ts_utils.py`:
+```python
+with ts.scope(":BadRule"):
+ with ts.scope(":MustMap") as exist:
+ ts[":A"].map({ts[":A"]: ts[":B"]})
+ with ts.scope(":Insert") as insert:
+ ts[":B"].map({ts[":B"]: exist[":A"]})
+ RegisterRule(ts)
+ AssertNodesEqual(ts, [exist[":B"], insert[":B"]], "/:BadRule")
+```
+In fact, if the nodes in question end in the same name (after ignoring the
+`:MustMap` or `:Insert` scopes), as `:MustMap:B` and `:Insert:B` do, then
+`RegisterRule` can automatically assert their equality using the
+`auto_assert_equal` option:
+```python
+with ts.scope(":BadRule"):
+ with ts.scope(":MustMap") as exist:
+ ts[":A"].map({ts[":A"]: ts[":B"]})
+ with ts.scope(":Insert"):
+ ts[":B"].map({ts[":B"]: exist[":A"]})
+ RegisterRule(ts, auto_assert_equal=True)
+```
+
+### (4) Applying Update Rules
+After declaring the `TripletStructure` with some initial facts and update rules, we
+will initialize a new `TSRuntime` instance. The runtime will extract the rules
+we declared earlier and provide an interface to actually modify the structure
+using those rules. We initialize a `TSRuntime` like so:
+```python
+from runtime.runtime import TSRuntime
+
+# ... initializing tc ...
+rt = TSRuntime(ts)
+```
+
+`rt` exposes a somewhat lower-level API for applying rules:
+`rt.get_rule(rule_name)` returns a representation of the desired update rule.
+Given an update rule, `rt.propose(rule)` will yield possible assignments to the
+rule of the form `(assignment, delta)`. `assignment` describes the nodes
+satisfying the rule's pattern while `delta` describes the modification to the
+structure which should occur according to the rule.
+
+### (5) Tactics for Applying Update Rules
+`tactic_utils.py` defines a number of helpful functions, particularly
+`Fixedpoint`, for repeatedly applying a rule to the structure.
+
+### (6) Recording Changes to the Structure
+To assist with backtracking search, there are a number of tools available to
+checkpoint and rollback the state of a structure, as well as record changes.
+These are described in docstrings in `ts_lib.py`.
generated by cgit on debian on lair
contact matthew@masot.net with questions or feedback