summaryrefslogtreecommitdiff
path: root/TSLang.md
blob: e1c2113595b75d4eaaaa7117a1475f04aaff477f (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
# TSLang
The goal of this file is to document how to begin writing Sifter code. It
is specifically directed towards understanding the high-level '`TSLang`'
interface, which is used to write the analogy-making rules (`mapper.py`) as
well as the application demos (`examples/...`).

### High-Level View
At the highest level, `TSLang` programs are programs that define and operate on
a particular type of data structure, a _triplet structure._ Most `TSLang`
programs are separated into two stages:
1. An initial triplet structure is defined with the data of interest, then
2. The initial structure is then iteratively modified by appling pre-defined
   _update rules._

`TSLang` provides syntactic sugar to help:
1. Define triplet structures (`ts_lib`),
2. Define rules operating on triplet structures (`ts_lib`, `ts_utils`), and
3. Write _tactics_ determining which rules to apply in what order (`runtime`
   and `tactic_utils`).

We will address each of these in turn.

### (0) What are Triplet Structures?
A triplet structure is a particular type of data structure suited to
representing knowledge about the world.

At its core, a triplet structure consists of exactly two things:
1. A list of _nodes_ and
2. A list of _triplet facts_, which are 3-tuples of nodes.

To give some meaning to the structure, we adopt the convention that a triplet
fact `(A, B, C)` should be interpreted like so:
1. `A` is represents a particular logical fact,
2. `B` represents an instance of something,
3. `C` represents the _type_, or role `B` plays in the fact.

For example, we might want to assert that `Homer` is the `Father` and `Marge`
is the `Mother` of `Lisa` with the facts:
```
(FamilyFact1, Homer, Father),
(FamilyFact1, Marge, Mother),
(FamilyFact1, Lisa, Daughter)
```
We think of `FamilyFact1` as the unified 'thought' or 'fact node' representing
the fact that Homer, Marge, and Lisa together make an instance of the "Family"
type, with the given roles.

For more discussion of triplet structures and how to represent logical
information with them, please see our Onward paper.

### (1) Representing Triplet Structures with `TSLang`
A triplet structure is represented by an instance of the `TripletStructure` class.
Most programs will only have one instance of `TripletStructure`, which we will
conventionally name `ts`:
```python
from ts_lib import TripletStructure
ts = TripletStructure()
```
Nodes in `TSLang` each have a unique, string name. With few exceptions, all
names should start with `/:`.  Nodes are referenced by indexing notation, and
are automatically created upon reference if they do not yet exist:
```python
# Creates nodes '/:Homer' and '/:Marge'
ts["/:Homer"], ts["/:Marge"]
```
To add a fact `(A, B, C)` we use the notation `ts[A].map({ts[B]: ts[C]})`, like
so:
```python
# Adds fact ('/:FamilyFact1', '/:Homer', '/:Father')
ts["/:FamilyFact1"].map({ts["/:Homer"]: ts["/:Father"]})
```
Multiple facts with the same fact node can be expressed naturally as well:
```python
# Adds 3 facts.
ts["/:FamilyFact1"].map({
    ts["/:Homer"]: ts["/:Father"],
    ts["/:Marge"]: ts["/:Mother"],
    ts["/:Lisa"]: ts["/:Daughter"],
})
```
To prevent name collisions and to better enable the usage of macros to
automatically manipulate the structure, we have support for _scopes_. A scope
is simply a prefix of a node name, up to (but not including) a `:`. Intuitively
we can think of node names as paths, with `:` delimiting directory boundaries,
and scopes playing the role of directories. For example, we might want to place
`Father`, `Mother`, and `Daughter` all in the `/:Family` scope, and `Homer`,
`Marge`, and `Lisa` in the `Simpsons` scope:
```python
ts["/:FamilyFact1"].map({
    ts["/:Simpsons:Homer"]: ts["/:Family:Father"],
    ts["/:Simpsons:Marge"]: ts["/:Family:Mother"],
    ts["/:Simpsons:Lisa"]: ts["/:Family:Daughter"],
})
```
In fact, `TSLang` has first-class support for scopes. By default, `ts[...]`
always indexes relative to its _current scope_, which can be changed using
`with ts.scope(...):`. The above is equivalent, for example, to:
equivalent to:
```python
# family[":..."] is equivalent to ts["/:Family:..."]
family = ts.scope("/:Family")
with ts.scope("/:Simpsons"):
    # In this block, ts[":..."] is automatically prepended with "/:Simpsons"
    # while ts["/:..."] is taken as an absolute path.
    ts["/:FamilyFact1"].map({
        ts[":Homer"]: family[":Father"],
        ts[":Marge"]: family[":Mother"],
        ts[":Lisa"]: family[":Daughter"],
    })
# Outside the with block, ts[":..."] again refers to ts["/:..."].
```
Scopes are generally used to group logically-related nodes together, especially
nodes which should be treated in a particular way by some macro (as we will see
later in this document).

### (2) Writing Update Rules to Operate on Triplet Structures
An _update rule_ is a program that searches for a pattern in the triplet
structure and then makes some modification to the structure according to that
pattern. With `TSLang`, we _define update rules within the structure itself._
We accomplish this by adding the pattern to search for as part of the
structure, and adding extra facts which annotate which parts of the pattern
need to be searched for or inserted/removed once an assignment is found.

The below example demonstrates this with a rule expressing transitivity of the
'greater than' relation:
```python
with ts.scope(":TransitivityRule"):
    # First we describe the pattern we want to search for:
    ts[":AGreaterThanB"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":B"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":BGreaterThanC"].map({
        ts[":B"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    # Then what we want to insert if that pattern is found:
    ts[":AGreaterThanC"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    # Finally, we encode the details of the rule:
    ts[":RuleFact"].map({
        # /:TransitivityRule:_ will represent the rule as a whole.
        ts[":_"]: ts["/RULE"],
        # The facts implying A>B, B>C should already exist in the structure
        # before we apply this rule, hence we 'must map' them.
        ts[":AGreaterThanB"]: ts["/MUST_MAP"],
        ts[":BGreaterThanC"]: ts["/MUST_MAP"],
        ts[":A"]: ts["/MUST_MAP"],
        ts[":B"]: ts["/MUST_MAP"],
        ts[":C"]: ts["/MUST_MAP"],
        # If they are found, we then 'insert' the A>C node and associated
        # facts.
        ts[":AGreaterThanC"]: ts["/INSERT"],
    })
```
Note that the nodes `/RULE`, `/MUST_MAP`, and `/INSERT` do not start with `/:`.
This indicates they are special nodes, which will be explicitly interpreted by
the runtime. Also note that the nodes `/:GreaterPair:Greater` and
`/:GreaterPair:Lesser` are not mentioned in the `:RuleFact`, which means they
will be treated as constants in the corresponding pattern.

Note also that, in addition to `/MUST_MAP`, other quantifications are possible:
`NO_MAP(#)` and `TRY_MAP`. In general, potential rule applications are
discovered in three passes:
1. First, we search for assignments to the constraints involving only the
   `/MUST_MAP` nodes.
2. For each of those assignments, we check to ensure that they can*NOT* be
   extended to a satisfying assignment to the `NO_MAP1`, `NO_MAP2`, ...
   constraints. We throw away any assignment which can be extended to also
   satisfy the `NO_MAP` constraints.
3. For any remaining assignments, we attempt to extend the assignment to also
   satisfy constraints involving the `TRY_MAP` nodes, if possible (otherwise
   the original assignment is used).
Similarly, in addition to `/INSERT` there are also other actions possible:
1. `/REMOVE` removes the node and _all_ associated facts.
2. `/SUBTRACT` removes only those facts explicitly mentioned by the rule (the
   node is removed if there are then no remaining facts).

Finally, note that by default we assume all of the `/MUST_MAP` nodes must have
_unique_ assignments. This can be explicitly weakened if desired to allow
pattern matches to assign the same node to two different `/MUST_MAP` variables.

### (3) Macros for Writing Update Rules
The above rule-declaration syntax can become tedious and error-prone if done by
hand. To assist in this, two macros are provided in `ts_utils.py` which
significantly improve the experience.

#### `RegisterRule`
The first, `RegisterRule`, is the most flexible. It works by putting nodes with
the same role in the rule (e.g., `/INSERT` or `/MUST_MAP`) in the same scope.
The rule from the previous example could be rewritten:
```python
with ts.scope(":TransitivityRule"):
    with ts.scope(":MustMap") as existing:
        ts[":AGreaterThanB"].map({
            ts[":A"]: ts["/:GreaterPair:Greater"],
            ts[":B"]: ts["/:GreaterPair:Lesser"],
        })
        ts[":BGreaterThanC"].map({
            ts[":B"]: ts["/:GreaterPair:Greater"],
            ts[":C"]: ts["/:GreaterPair:Lesser"],
        })
    with ts.scope(":Insert"):
        ts[":AGreaterThanC"].map({
            existing[":A"]: ts["/:GreaterPair:Greater"],
            existing[":C"]: ts["/:GreaterPair:Lesser"],
        })
    RegisterRule(ts)
```
Notice that we need to explicitly refer to `existing[":A"]` in the second
scope, as they are no longer all in the same scope.

#### `RegisterPrototype`
The second useful macro, `RegisterPrototype`, is useful when both:
1. You only need `/MUST_MAP` and `/INSERT` nodes, and
2. You want to define multiple rules which involve the same patterns.

An equivalent rule to the above is shown below:
```python
with ts.scope(":TransitivityRule"):
    ts[":AGreaterThanB"].map({
        ts[":A"]: ts["/:GreaterPair:Greater"],
        ts[":B"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":BGreaterThanC"].map({
        ts[":B"]: ts["/:GreaterPair:Greater"],
        ts[":C"]: ts["/:GreaterPair:Lesser"],
    })
    ts[":AGreaterThanC"].map({
        existing[":A"]: ts["/:GreaterPair:Greater"],
        existing[":C"]: ts["/:GreaterPair:Lesser"],
    })
    RegisterPrototype(ts, dict({
        ":_": {ts["/INSERT"]: [ts[":AGreaterThanC"]]},
    }))
```
In general, the second argument can define arbitrarily many rules using the
exact same set of nodes. Each rule lists some subset of nodes which should be
inserted (alternatively, mapped).  The remaining nodes in the scope are assumed
to be mapped (alternatively, inserted). This feature is used extensively in
`mapper.py`, as all of the mapper rules are essentially just different
'shadings' of the same core pattern (see our Onward paper for more details).

### (2,3b) Update Rule Gotchas: Inserting Only a Triplet Fact
While our rule-declaration syntax is usually quite natural, there is one
particular case where care needs to be taken. Namely, when the rule needs to
operate on individual _facts_ about _existing nodes_. For example, suppose you
want to search for nodes `A`, `B` with a fact `(A, A, B)`, and insert a new
fact `(B, B, A)`. you might try to do the following:
```python
with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        exist[":B"].map({exist[":B"]: exist[":A"]})
    RegisterRule(ts)
```
The problem is that the `/:BadRule:Insert` scope is actually empty --- the line
there only refers to nodes in the `/:BadRule:MustMap` scope. Thus the
underlying rule created will only have `/MUST_MAP` nodes, not `/INSERT`, and so
the rule is effectively a no-op. To get around this, we need to ensure that at
least one of the nodes of each fact we want to insert actually belongs to the
`:Insert` scope:
```python
with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts)
```
But now it will create an entirely new node, effectively `B2`, and fact
`(B2, B, A)`! To resolve this, we need to explicitly tell the system that
`:BadRule:Insert:B` refers to the same node as `:BadRule:MustMap:B`. This can
be done using the `AssertNodesEqual` macro in `ts_utils.py`:
```python
with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert") as insert:
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts)
    AssertNodesEqual(ts, [exist[":B"], insert[":B"]], "/:BadRule")
```
In fact, if the nodes in question end in the same name (after ignoring the
`:MustMap` or `:Insert` scopes), as `:MustMap:B` and `:Insert:B` do, then
`RegisterRule` can automatically assert their equality using the
`auto_assert_equal` option:
```python
with ts.scope(":BadRule"):
    with ts.scope(":MustMap") as exist:
        ts[":A"].map({ts[":A"]: ts[":B"]})
    with ts.scope(":Insert"):
        ts[":B"].map({ts[":B"]: exist[":A"]})
    RegisterRule(ts, auto_assert_equal=True)
```

### (4) Applying Update Rules
After declaring the `TripletStructure` with some initial facts and update rules, we
will initialize a new `TSRuntime` instance. The runtime will extract the rules
we declared earlier and provide an interface to actually modify the structure
using those rules. We initialize a `TSRuntime` like so:
```python
from runtime.runtime import TSRuntime

# ... initializing tc ...
rt = TSRuntime(ts)
```

`rt` exposes a somewhat lower-level API for applying rules:
`rt.get_rule(rule_name)` returns a representation of the desired update rule.
Given an update rule, `rt.propose(rule)` will yield possible assignments to the
rule of the form `(assignment, delta)`. `assignment` describes the nodes
satisfying the rule's pattern while `delta` describes the modification to the
structure which should occur according to the rule.

### (5) Tactics for Applying Update Rules
`tactic_utils.py` defines a number of helpful functions, particularly
`Fixedpoint`, for repeatedly applying a rule to the structure.

### (6) Recording Changes to the Structure
To assist with backtracking search, there are a number of tools available to
checkpoint and rollback the state of a structure, as well as record changes.
These are described in docstrings in `ts_lib.py`.
generated by cgit on debian on lair
contact matthew@masot.net with questions or feedback