{
"metadata": {
"language_info": {
"codemirror_mode": {
"name": "text/plain"
},
"file_extension": ".txt",
"mimetype": "text/plain",
"name": "cozo",
"nbconvert_exporter": "text",
"pygments_lexer": "text",
"version": "es2017"
},
"kernelspec": {
"name": "cozo",
"display_name": "CozoScript (localhost)",
"language": "text"
}
},
"nbformat_minor": 4,
"nbformat": 4,
"cells": [
{
"cell_type": "markdown",
"source": "# The pilgrim to Mount Acid",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "## Stored relations",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "An obvious shortcoming of our previous acrobatics is that we have to carry around our love triangles network and enter it anew for every query, which leads to rapid deterioration of the `CTRL`, `C` and `V` keys. So let's fix that:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[] <- [['alice', 'eve'],\n ['bob', 'alice'],\n ['eve', 'alice'],\n ['eve', 'bob'],\n ['eve', 'charlie'],\n ['charlie', 'eve'],\n ['david', 'george'],\n ['george', 'george']]\n \n:relation create triangles",
"metadata": {
"trusted": true
},
"execution_count": 1,
"outputs": [
{
"execution_count": 1,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 9ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Here the _query directive_ `:relation create` accompanies a normal query. The results are then stored on your disk under the name `triangles` instead of being returned to you.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "You will receive an error if you try to run this script twice, since `create` insists on a new stored relation. If that happens, don't worry and continue.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Stored relations are safe from restarts and power failures. Let's query the one we just created:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[a, b] := :triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 2,
"outputs": [
{
"execution_count": 2,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td><td style=\"font-weight: bold\">b</td></tr></thead><tbody><tr><td>alice</td><td>eve</td></tr><tr><td>bob</td><td>alice</td></tr><tr><td>charlie</td><td>eve</td></tr><tr><td>david</td><td>george</td></tr><tr><td>eve</td><td>alice</td></tr><tr><td>eve</td><td>bob</td></tr><tr><td>eve</td><td>charlie</td></tr><tr><td>george</td><td>george</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "The colon `:` in front of the name tells the database that we want a _stored_ relation instead of a relation defined within the query itself.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Now, Fred finally comes to the party and Fred loves Alice and Eve. We add these facts in the following way:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[] <- [['fred', 'alice'],\n ['fred', 'eve']]\n\n:relation put triangles",
"metadata": {
"trusted": true
},
"execution_count": 3,
"outputs": [
{
"execution_count": 3,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[a, b] := :triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 4,
"outputs": [
{
"execution_count": 4,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td><td style=\"font-weight: bold\">b</td></tr></thead><tbody><tr><td>alice</td><td>eve</td></tr><tr><td>bob</td><td>alice</td></tr><tr><td>charlie</td><td>eve</td></tr><tr><td>david</td><td>george</td></tr><tr><td>eve</td><td>alice</td></tr><tr><td>eve</td><td>bob</td></tr><tr><td>eve</td><td>charlie</td></tr><tr><td>fred</td><td>alice</td></tr><tr><td>fred</td><td>eve</td></tr><tr><td>george</td><td>george</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Notice that we used `:relation put` instead of `:relation create`. In fact, you can use `:relation put` without ever calling `:relation create` first. The `create` op just ensures that the insertion is into a new stored relation.",
"metadata": {}
},
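{
"cell_type": "markdown",
"source": "As a minimal sketch of that claim, the script below uses `put` on a relation that has never been created. The relation name `acquaintances` and its contents are made up for illustration, and the sketch assumes `put` creates a missing relation as described above:\n\n```\n?[] <- [['alice', 'bob']]\n\n:relation put acquaintances\n```\n\nNote that running it adds an extra stored relation to your database.",
"metadata": {}
},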
{
"cell_type": "markdown",
"source": "Now Eve no longer loves Alice or Charlie! Let's reflect this fact by using `retract`:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[] <- [['eve', 'charlie'],\n ['eve', 'alice']]\n\n:relation retract triangles",
"metadata": {
"trusted": true
},
"execution_count": 5,
"outputs": [
{
"execution_count": 5,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[a, b] := :triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 6,
"outputs": [
{
"execution_count": 6,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td><td style=\"font-weight: bold\">b</td></tr></thead><tbody><tr><td>alice</td><td>eve</td></tr><tr><td>bob</td><td>alice</td></tr><tr><td>charlie</td><td>eve</td></tr><tr><td>david</td><td>george</td></tr><tr><td>eve</td><td>bob</td></tr><tr><td>fred</td><td>alice</td></tr><tr><td>fred</td><td>eve</td></tr><tr><td>george</td><td>george</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "It is OK to retract non-existent facts, in which case the operation does nothing.",
"metadata": {}
},
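{
"cell_type": "markdown",
"source": "As a minimal sketch of this no-op behaviour, the script below retracts a tuple that was never stored. The names `zed` and `yan` are made up for illustration:\n\n```\n?[] <- [['zed', 'yan']]\n\n:relation retract triangles\n```\n\nAssuming it behaves as described, querying `?[a, b] := :triangles[a, b]` afterwards should return the same rows as before.",
"metadata": {}
},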
{
"cell_type": "markdown",
"source": "You can also reset the whole relation with `rederive`:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[] <- [['eve', 'charlie'],\n ['eve', 'alice']]\n\n:relation rederive triangles",
"metadata": {
"trusted": true
},
"execution_count": 7,
"outputs": [
{
"execution_count": 7,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[a, b] := :triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 8,
"outputs": [
{
"execution_count": 8,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td><td style=\"font-weight: bold\">b</td></tr></thead><tbody><tr><td>eve</td><td>alice</td></tr><tr><td>eve</td><td>charlie</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Only the tuples supplied to `rederive` remain.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "You can see what stored relations you currently have in your database by running the following _system directive_:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db relations",
"metadata": {
"trusted": true
},
"execution_count": 9,
"outputs": [
{
"execution_count": 9,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">arity</td></tr></thead><tbody><tr><td>triangles</td><td><span style=\"color: #307fc1;\">2</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Relations can be renamed:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db rename relation triangles love_triangles",
"metadata": {
"trusted": true
},
"execution_count": 10,
"outputs": [
{
"execution_count": 10,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": ":db relations",
"metadata": {
"trusted": true
},
"execution_count": 11,
"outputs": [
{
"execution_count": 11,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">arity</td></tr></thead><tbody><tr><td>love_triangles</td><td><span style=\"color: #307fc1;\">2</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[a, b] := :love_triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 12,
"outputs": [
{
"execution_count": 12,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td><td style=\"font-weight: bold\">b</td></tr></thead><tbody><tr><td>eve</td><td>alice</td></tr><tr><td>eve</td><td>charlie</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Now this triangles business is becoming tiring. Let's get rid of it:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db remove relation love_triangles",
"metadata": {
"trusted": true
},
"execution_count": 13,
"outputs": [
{
"execution_count": 13,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Since we do not have any queries to run when nuking relations, we use a system directive instead of a query directive. Now you can no longer query the triangles:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[a, b] := :love_triangles[a, b]",
"metadata": {
"trusted": true
},
"execution_count": 14,
"outputs": [
{
"execution_count": 14,
"output_type": "execute_result",
"data": {
"text/html": "<pre style=\"font-size: small\"><span style='color:#a00'>query::relation_not_found</span>\n\n <span style='color:#a00'>×</span> Cannot find requested stored relation &#39;love_triangles&#39;\n</pre>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "This completes all the operations on stored relations: `create`, `put`, `retract`, `rederive`, plus renaming and removal. The syntax for renaming and `remove` differs from the rest because they are system directives with no query attached.\n\nAll these operations are _atomic_: for all the tuples they affect, either all are affected at the same time, or the operation fails completely. There is no in-between, corrupted state.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "## A schema for data",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "The stored relation operations introduced above are simple, fast, and very raw. They can be used in exactly the same way as rules defined inline with the query. The way you use them is also not very different from how you would use tables in a traditional SQL database.\n\nStored relations are suitable for data that has a well-defined structure at the outset, and which is loaded and updated in bulk. For example, you may have obtained from domain experts an [ontology](https://www.wikiwand.com/en/Ontology_\\(information_science\\)) in the form of a network of metadata. The ontology comes in nice tables with clear, detailed documentation. You store this ontology as a group of stored relations, and use them to extract insights from your business data. The ontology is updated periodically, and when an update comes you just use the `rederive` operation to replace the old version. Very simple and efficient.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "But your _business_ data is most likely not as simple as that. In this age of BigDataⒸ, you must have one billion active users on your platforms carrying out all sorts of activities, concurrently of course. You don't want these activities to step on each other. You don't want to store the wrong thing into your users' accounts. You _especially_ don't want any money in transit to disappear in midair. To make things worse, hundreds of new activities pop up each day.\n\nStoring any of these in a stored relation is infeasible. With a traditional RDBMS, [data migrations](https://en.wikipedia.org/wiki/Data_migration) would have already killed you. And with Cozo, stored relations don't even try to support schema change (in fact, the only 'schema' for a stored relation is its arity).\n\nTo store such data and meet its query and mutation requirements, a database needs:\n\n* high concurrency;\n* fine-grained transactions;\n* checks for data integrity;\n* ability to rapidly adapt to new data shapes and requirements.\n\nBut in turn, we have to give up something. So with Cozo, we are willing to pay the following prices:\n\n* we demand that most transactions only apply _local changes_ that only touch on a tiny fraction of the data (otherwise the database cannot satisfy the high concurrency requirements);\n* we tolerate indirections (since \"all problems in computer science can be solved by another level of indirection\").",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "The solution is the [triple store](https://en.wikipedia.org/wiki/Triplestore).",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "A _triple_ is a sentence consisting of a subject, a verb, and an object. In the Cozo flavour, the subject is always an opaque identity, such as _entity42_, so it is actually an _entity-attribute-value_ triple. Examples:\n\n* _entity42_ has first name `'Alice'`.\n* _entity42_ has last name `'Liddell'`.\n* _entity42_ loves _entity81_.\n* _entity81_ is `20` years old.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "We schematize triples by schematizing the verbs (attributes). In our example, the schemas for first name and last name should have type string, the schema for age should have type integer, and the schema for the \"loves\" relationship should have a type that refers to other entities. Here the types refer to the objects in the triple, since the subject is always an entity.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "So let's put this into code:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":schema\n\nput person {\n first_name: string index,\n nick_name: string many index,\n loves: ref many,\n age: int\n}",
"metadata": {
"trusted": true
},
"execution_count": 15,
"outputs": [
{
"execution_count": 15,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">op</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000001</span></td><td>assert</td></tr><tr><td><span style=\"color: #307fc1;\">10000002</span></td><td>assert</td></tr><tr><td><span style=\"color: #307fc1;\">10000003</span></td><td>assert</td></tr><tr><td><span style=\"color: #307fc1;\">10000004</span></td><td>assert</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 1ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "The `:schema` at the top indicates that we want to manage the schema instead of running normal queries. We then `put` a _group_ of related attributes. Even though they are declared together, similarly to a table definition in SQL, we need to stress that this actually defines four separate, independent attributes named `person.first_name`, `person.nick_name`, `person.loves` and `person.age`. An entity can have any attributes associated with it, even those with different prefixes.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "The allowed types for attributes are:\n\n* `ref`\n* `bool`\n* `int`\n* `float`\n* `string`\n* `bytes`\n* `list`\n\nThe list type is heterogeneous in its elements. There is no concept of a nullable type and you can't put `null` into values of triples (other than wrapping them in lists first). To indicate missing values, you simply omit the attribute.\n\nThe `ref` type has the special meaning of referring to other entities.\n\nAfter the type come zero or more _modifiers_. The `many` modifier indicates that `loves` is a to-many relationship. If we omit it, any person can love at most one other person, which is not very realistic.\n\nThe modifier `index` indicates that we want values of this attribute to be _indexed_. Only indexed attributes support efficient value lookups and range scans. `ref` types are always implicitly indexed since the database wants to be able to traverse the graph in both directions.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Instead of `index`, we can mark attributes with the modifier `unique`, indicating that there cannot be two entities with the same value for the attribute. The value then acts as a _unique identifier_ for the entity, which can be convenient when retrieving entities, since the entity ID is assigned by the database automatically and you cannot choose how it is assigned. So let's add an explicit `person.id` attribute, this time using the non-grouped syntax:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":schema\n\nput person.id: string unique;",
"metadata": {
"trusted": true
},
"execution_count": 16,
"outputs": [
{
"execution_count": 16,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">op</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000005</span></td><td>assert</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "We can see what schema is currently in the database by running a system directive:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db schema",
"metadata": {
"trusted": true
},
"execution_count": 17,
"outputs": [
{
"execution_count": 17,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">type</td><td style=\"font-weight: bold\">cardinality</td><td style=\"font-weight: bold\">index</td><td style=\"font-weight: bold\">history</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000001</span></td><td>person.first_name</td><td>string</td><td>one</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000002</span></td><td>person.nick_name</td><td>string</td><td>many</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000003</span></td><td>person.loves</td><td>ref</td><td>many</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000004</span></td><td>person.age</td><td>int</td><td>one</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000005</span></td><td>person.id</td><td>string</td><td>one</td><td>unique</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "We can rename the attribute:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db rename attr person.id person.pid",
"metadata": {
"trusted": true
},
"execution_count": 18,
"outputs": [
{
"execution_count": 18,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": ":db schema",
"metadata": {
"trusted": true
},
"execution_count": 19,
"outputs": [
{
"execution_count": 19,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">type</td><td style=\"font-weight: bold\">cardinality</td><td style=\"font-weight: bold\">index</td><td style=\"font-weight: bold\">history</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000001</span></td><td>person.first_name</td><td>string</td><td>one</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000002</span></td><td>person.nick_name</td><td>string</td><td>many</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000003</span></td><td>person.loves</td><td>ref</td><td>many</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000004</span></td><td>person.age</td><td>int</td><td>one</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000005</span></td><td>person.pid</td><td>string</td><td>one</td><td>unique</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "We can also get rid of it (this removes all the data associated with the attribute as well):",
"metadata": {}
},
{
"cell_type": "code",
"source": ":db remove attr person.pid",
"metadata": {
"trusted": true
},
"execution_count": 20,
"outputs": [
{
"execution_count": 20,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">status</td></tr></thead><tbody><tr><td>OK</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": ":db schema",
"metadata": {
"trusted": true
},
"execution_count": 21,
"outputs": [
{
"execution_count": 21,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">type</td><td style=\"font-weight: bold\">cardinality</td><td style=\"font-weight: bold\">index</td><td style=\"font-weight: bold\">history</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000001</span></td><td>person.first_name</td><td>string</td><td>one</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000002</span></td><td>person.nick_name</td><td>string</td><td>many</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000003</span></td><td>person.loves</td><td>ref</td><td>many</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000004</span></td><td>person.age</td><td>int</td><td>one</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 1ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "But that's about it. Except for its name, an attribute is _immutable_: you cannot change a `string` attribute to a `ref` attribute, nor can you decide that your `one` attribute should really be `many`.\n\nSo what did we mean when we said that this kind of structure can deal with new requirements? Say you initially made the `person.loves` attribute one-to-one and made `person.last_name` a unique index, and now you need to change them. But you need to change them not because the requirements have changed. You need to change them because you made _mistakes_ at the beginning. These mistakes are fixed by, for example, first renaming the offending attributes, then creating a new attribute with the old name, next copying the data from the old attribute to the new attribute, and finally deleting the old, wrong attribute. Fixing mistakes should be explicit, and this procedure is very explicit.",
"metadata": {}
},
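{
"cell_type": "markdown",
"source": "This procedure can be sketched with the directives introduced so far, for the hypothetical case where `person.loves` was mistakenly declared one-to-one. The name `person.loves_old` is made up for illustration, and each block is meant to be run as a separate script. First rename the offending attribute:\n\n```\n:db rename attr person.loves person.loves_old\n```\n\nThen create a new attribute under the old name with the intended cardinality:\n\n```\n:schema\n\nput person.loves: ref many;\n```\n\nNext copy the data from `person.loves_old` to `person.loves` (not shown here, since it needs transaction and query syntax that has not been introduced yet). Finally remove the old attribute, which also removes its data:\n\n```\n:db remove attr person.loves_old\n```",
"metadata": {}
},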
{
"cell_type": "markdown",
"source": "New requirements are not mistakes, and they do not invalidate your old data or schema. Examples of changing requirements: you now need to record the passport number and the parent-child relationships of the people in your graph. Very easy:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":schema\n\nput person.passport_no: string many index;\nput person.parent_of: ref many;",
"metadata": {
"trusted": true
},
"execution_count": 22,
"outputs": [
{
"execution_count": 22,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">op</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000006</span></td><td>assert</td></tr><tr><td><span style=\"color: #307fc1;\">10000007</span></td><td>assert</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 1ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": ":db schema",
"metadata": {
"trusted": true
},
"execution_count": 23,
"outputs": [
{
"execution_count": 23,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">name</td><td style=\"font-weight: bold\">type</td><td style=\"font-weight: bold\">cardinality</td><td style=\"font-weight: bold\">index</td><td style=\"font-weight: bold\">history</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000001</span></td><td>person.first_name</td><td>string</td><td>one</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000002</span></td><td>person.nick_name</td><td>string</td><td>many</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000003</span></td><td>person.loves</td><td>ref</td><td>many</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000004</span></td><td>person.age</td><td>int</td><td>one</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000006</span></td><td>person.passport_no</td><td>string</td><td>many</td><td>index</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr><tr><td><span style=\"color: #307fc1;\">10000007</span></td><td>person.parent_of</td><td>ref</td><td>many</td><td>none</td><td><span style=\"color: #bf5b3d;\">false</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "## Data with schema",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Let's reinstate the `person.id` attribute first:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":schema\n\nput person.id: string one unique;",
"metadata": {
"trusted": true
},
"execution_count": 24,
"outputs": [
{
"execution_count": 24,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">attr_id</td><td style=\"font-weight: bold\">op</td></tr></thead><tbody><tr><td><span style=\"color: #307fc1;\">10000008</span></td><td>assert</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Now we add data to our database. First we add a person called Peter. Apart from the `:tx` at the top, which indicates that we want to execute a transaction, the script is just a map:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":tx\n\n{ person.first_name: 'Peter', person.nick_name: 'Pan', person.id: 'p' }",
"metadata": {
"trusted": true
},
"execution_count": 25,
"outputs": [
{
"execution_count": 25,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">entity_id</td><td style=\"font-weight: bold\">asserts</td><td style=\"font-weight: bold\">retracts</td></tr></thead><tbody><tr><td>550dad48-3501-11ed-8dc9-9aefd164fdd2</td><td><span style=\"color: #307fc1;\">3</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "You can insert multiple 'rows' at the same time, and the maps also allow some stylistic variations:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":tx\n\n{\"person.first_name\": \"Quin\", \"*person.nick_name\": [\"Q\", \"The Quick\"], \"person.id\": \"q\"}\n{\"person.first_name\": \"Rich\", \"person.id\": \"r\"}",
"metadata": {
"trusted": true
},
"execution_count": 26,
"outputs": [
{
"execution_count": 26,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">entity_id</td><td style=\"font-weight: bold\">asserts</td><td style=\"font-weight: bold\">retracts</td></tr></thead><tbody><tr><td>567c963a-3501-11ed-9c95-82ccbc12f696</td><td><span style=\"color: #307fc1;\">4</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr><tr><td>567c9806-3501-11ed-8e7b-d4c62ea1da21</td><td><span style=\"color: #307fc1;\">2</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 4ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Every entity is free to have any combination of attributes suitable for it. Note how we specified several nicknames for Quin at the same time, and Rich does not have a nickname.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "To query the triples, use _triple rules_: these look like a list of three items, except that there are no commas inside. The first slot contains the _entity id_ assigned by the system, the middle symbol is the attribute name and must be explicit (it cannot be a variable), and the last slot contains the value for the attribute:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[eid, first_name, nick_name] := [eid person.nick_name nick_name], [eid person.first_name first_name]",
"metadata": {
"trusted": true
},
"execution_count": 27,
"outputs": [
{
"execution_count": 27,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">eid</td><td style=\"font-weight: bold\">first_name</td><td style=\"font-weight: bold\">nick_name</td></tr></thead><tbody><tr><td>550dad48-3501-11ed-8dc9-9aefd164fdd2</td><td>Peter</td><td>Pan</td></tr><tr><td>567c963a-3501-11ed-9c95-82ccbc12f696</td><td>Quin</td><td>Q</td></tr><tr><td>567c963a-3501-11ed-9c95-82ccbc12f696</td><td>Quin</td><td>The Quick</td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Besides the above _explicit querying_, there is another way to get attributes associated with an entity: you may specify a _pull directive_, which expands a value (interpreted as an entity ID) into a map containing the specified attributes. Observe:",
"metadata": {}
},
{
"cell_type": "code",
"source": "?[pid, eid] := [eid person.id pid]\n\n:pull eid {person.first_name, person.nick_name, person.age}",
"metadata": {
"trusted": true
},
"execution_count": 28,
"outputs": [
{
"execution_count": 28,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">pid</td><td style=\"font-weight: bold\">eid</td></tr></thead><tbody><tr><td>p</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;550dad48-3501-11ed-8dc9-9aefd164fdd2&quot;,&quot;person.age&quot;:null,&quot;person.first_name&quot;:&quot;Peter&quot;,&quot;person.nick_name&quot;:[&quot;Pan&quot;]}</span></td></tr><tr><td>q</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;567c963a-3501-11ed-9c95-82ccbc12f696&quot;,&quot;person.age&quot;:null,&quot;person.first_name&quot;:&quot;Quin&quot;,&quot;person.nick_name&quot;:[&quot;Q&quot;,&quot;The Quick&quot;]}</span></td></tr><tr><td>r</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;567c9806-3501-11ed-8e7b-d4c62ea1da21&quot;,&quot;person.age&quot;:null,&quot;person.first_name&quot;:&quot;Rich&quot;,&quot;person.nick_name&quot;:[]}</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "If you have several entry bindings that are entities, you can specify several `:pull` directives one after another, but each output binding can have at most one pull directive associated with it.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Another notable thing is that pulls always return a map, even if some of the requested attributes are missing for the entity (they are filled with `null` instead). In contrast, observe that the query that did not use a pull directive did not return Rich, but returned Quin twice. As can be seen above, the pull also deals with to-many relationships automatically.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Pulls can have nested directives (see the manual for details) and can traverse `ref` triples in the reverse direction. Otherwise, pull directives are kept deliberately simple: they are intended only for output processing. If you want recursion, non-trivial filters and the like, do it in the Datalog query instead.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Insertions in the triple store actually amount to _assertions_ of facts. If two conflicting facts are asserted, the last one wins:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":tx\n\n{_key: ['person.id', 'p'], person.first_name: \"Pete\"}",
"metadata": {
"trusted": true
},
"execution_count": 29,
"outputs": [
{
"execution_count": 29,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">entity_id</td><td style=\"font-weight: bold\">asserts</td><td style=\"font-weight: bold\">retracts</td></tr></thead><tbody><tr><td>550dad48-3501-11ed-8dc9-9aefd164fdd2</td><td><span style=\"color: #307fc1;\">1</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[pid, eid] := [eid person.id pid], pid == 'p'\n\n:pull eid {person.first_name, person.nick_name}",
"metadata": {
"trusted": true
},
"execution_count": 40,
"outputs": [
{
"execution_count": 40,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">pid</td><td style=\"font-weight: bold\">eid</td></tr></thead><tbody><tr><td>p</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;550dad48-3501-11ed-8dc9-9aefd164fdd2&quot;,&quot;person.first_name&quot;:&quot;Pete&quot;,&quot;person.nick_name&quot;:[&quot;Pan&quot;,&quot;Ping&quot;]}</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Here we specified an existing entity by providing `_key` with an attribute name and a unique value for the attribute. You can only refer to entities this way if the attribute is uniquely indexed. You can also specify an entity by providing its `_id`, but if you have a unique key to use, it is often much clearer.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "The next transaction is superficially similar to the last one. But in this case, `person.nick_name` has cardinality `many` instead of `one`:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":tx\n\n{_key: ['person.id', 'p'], person.nick_name: \"Ping\"}",
"metadata": {
"trusted": true
},
"execution_count": 33,
"outputs": [
{
"execution_count": 33,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">entity_id</td><td style=\"font-weight: bold\">asserts</td><td style=\"font-weight: bold\">retracts</td></tr></thead><tbody><tr><td>550dad48-3501-11ed-8dc9-9aefd164fdd2</td><td><span style=\"color: #307fc1;\">1</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[pid, eid] := [eid person.id pid], pid == 'p'\n\n:pull eid {person.first_name, person.nick_name}",
"metadata": {
"trusted": true
},
"execution_count": 41,
"outputs": [
{
"execution_count": 41,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">pid</td><td style=\"font-weight: bold\">eid</td></tr></thead><tbody><tr><td>p</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;550dad48-3501-11ed-8dc9-9aefd164fdd2&quot;,&quot;person.first_name&quot;:&quot;Pete&quot;,&quot;person.nick_name&quot;:[&quot;Pan&quot;,&quot;Ping&quot;]}</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "Now the new nickname is simply recorded alongside the existing one. Note that if you try to add the same nickname for the same person again, you still get only one copy instead of two:",
"metadata": {}
},
{
"cell_type": "code",
"source": ":tx\n\n{_key: ['person.id', 'p'], person.nick_name: \"Ping\"}",
"metadata": {
"trusted": true
},
"execution_count": 37,
"outputs": [
{
"execution_count": 37,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">entity_id</td><td style=\"font-weight: bold\">asserts</td><td style=\"font-weight: bold\">retracts</td></tr></thead><tbody><tr><td>550dad48-3501-11ed-8dc9-9aefd164fdd2</td><td><span style=\"color: #307fc1;\">1</span></td><td><span style=\"color: #307fc1;\">0</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 2ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "?[pid, eid] := [eid person.id pid]\n\n:pull eid {person.first_name, person.nick_name}",
"metadata": {
"trusted": true
},
"execution_count": 38,
"outputs": [
{
"execution_count": 38,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">pid</td><td style=\"font-weight: bold\">eid</td></tr></thead><tbody><tr><td>p</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;550dad48-3501-11ed-8dc9-9aefd164fdd2&quot;,&quot;person.first_name&quot;:&quot;Pete&quot;,&quot;person.nick_name&quot;:[&quot;Pan&quot;,&quot;Ping&quot;]}</span></td></tr><tr><td>q</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;567c963a-3501-11ed-9c95-82ccbc12f696&quot;,&quot;person.first_name&quot;:&quot;Quin&quot;,&quot;person.nick_name&quot;:[&quot;Q&quot;,&quot;The Quick&quot;]}</span></td></tr><tr><td>r</td><td><span style=\"color: #bf5b3d;\">{&quot;_id&quot;:&quot;567c9806-3501-11ed-8e7b-d4c62ea1da21&quot;,&quot;person.first_name&quot;:&quot;Rich&quot;,&quot;person.nick_name&quot;:[]}</span></td></tr></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 3ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": "As we have seen, triples also follow set semantics rather than bag semantics. If you really want duplicates, you need to disambiguate them at the level of values, for example by wrapping them in lists.",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "## The time machine",
"metadata": {}
},
{
"cell_type": "code",
"source": "a[] <- [[1]]\n?[a] := a[a], a > 100\n\n:assert none",
"metadata": {
"trusted": true
},
"execution_count": 2,
"outputs": [
{
"execution_count": 2,
"output_type": "execute_result",
"data": {
"text/html": "<div style=\"display: flex; align-items: end; flex-direction: row;\"><table><thead><tr><td style=\"font-weight: bold\">a</td></tr></thead><tbody></tbody></table><span style=\"color: darkgrey; font-size: xx-small; margin: 13px;\">Took 0ms</span></div>"
},
"metadata": {}
}
]
},
{
"cell_type": "code",
"source": "",
"metadata": {},
"execution_count": null,
"outputs": []
}
]
}