{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Air-data acrobatics" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%reload_ext pycozo.ipyext_direct\n", "%cozo_auth tutorial *******" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Hello, world!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start exploring the Cozo database by following the \"hello world\" tradition:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 abc
0helloworldCozo!
\n" ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b, c] <- [['hello', 'world', 'Cozo!']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's break that down. This query consists of two parts, the part before `<-` is called its _head_, and the part after is called its _body_. The symbol `<-` itself denotes that this is a _constant rule_, or a declaration of _facts_.\n", "\n", "The head has the special name `?`, indicating the _entry_ of the query, which has three _arguments_ `a`, `b`, and `c`.\n", "\n", "The body consists of a list of lists (in this case a list of a single inner list). Each inner list represents a _tuple_, which is similar to a row in a relational database. The length of the inner list must match the number of arguments of the head, and each argument is then _bound_ to the corresponding value in the inner list by position.\n", "\n", "Of course more than one inner list is allowed:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 abc
0helloworldCozo!
1helloworlddatabase!
\n" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b, c] <- [['hello', 'world', 'Cozo!'],\n", " ['hello', 'world', 'database!']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try the following:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 a
0Cozo!
1hello
2world
\n" ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a] <- [['hello'], ['world'], ['Cozo!']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have three inner lists of length 1 each. The returned result is also _sorted_: all relations in Cozo are sorted lexicographically by position.\n", "\n", "Cozo operates on _set semantics_ instead of _bag semantics_: observe" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 a
0Cozo!
1Cozo.
2hello
3world
\n" ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a] <- [['hello'], ['world'], ['Cozo!'], ['hello'], ['world'], ['Cozo.']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`'hello'` and `'world'` both appear only once in the result, even though they appear twice each in the input. Set semantics automatically de-duplicates based on the whole tuple." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Values and expressions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The list of lists in the body of the rules certainly look familiar to anyone who have used languages such as JavaScript or Python. In fact, with the exception of the map `{}`, valid JSON values represent valid Cozo values.\n", "\n", "As sorting is important in Cozo, study the following example, which demonstrates how different values are sorted:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 a
0None
1False
2True
3-0.000000
41.000000
53.141590
61234567
7A
8Apple juice
9apple
10['apple', 1, [2, 3]]
\n" ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a] <- [[true],\n", " [false], \n", " [null],\n", " [\"A\"], \n", " ['apple'], # single or double quotes are both OK \n", " [\"Apple juice\"], \n", " [['apple', 1, [2, 3]]], # this row consists of a list consisting of heterogeneous items!\n", " [1.0], \n", " [1_234_567], # you can separate digits with underscores for clarity\n", " [3.14159], \n", " [-8e-99]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice how comments are entered, just like in JavaScript. `/* ... */` also works.\n", "\n", "Even though the kind of rule we have been using is called the _constant rule_, you can in fact compute in them:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ia
013
1212
230.833333
341096.633158
45NUMBER 10
563.141593
\n" ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[i, a] <- [[1, 1 + 2], \n", " [2, 3 * 4], \n", " [3, 5 / 6], \n", " [4, exp(7)], \n", " [5, uppercase('number ') ++ to_string(10)], # string concatenation\n", " [6, to_float('PI')]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For clarity we have used the index `i` to force the result to show in this order.\n", "\n", "For the full list of functions you can use in expressions, consult the Manual." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is one thing we need to make clear at this point. In CozoScript, only `true` is true, and only `false` is false. This is not a tautology: every other value, including `null`, produces an error when put in a position requiring a boolean. In this sense, `null` in CozoScript is only a _marker_. It has no inherent logical semantics associated with it, unlike `NULL` in SQL, `null` and `undefined` in JavaScript, and `None` in Python. An example:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[31meval::throw\u001b[0m\n", "\n", " \u001b[31m×\u001b[0m Evaluation of expression failed\n", " ╭────\n", " \u001b[2m1\u001b[0m │ ?[a] <- [[!null]]\n", " · \u001b[35;1m ─────\u001b[0m\n", " ╰────\n", "\u001b[36m help: \u001b[0m'negate' requires booleans\n" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a] <- [[!null]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case you really need to write" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 a
0False
\n" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a] <- [[!is_null(null)]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This may seem a nuisance in trivial cases, but will save you a lot of hair in hairy situations. Believe me." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Horn-clause rules" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Usually constant rules are used to define ad-hoc facts useful for subsequent queries:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 lovingloved
0aliceeve
1bobalice
2charlieeve
3davidgeorge
4evealice
5evebob
6evecharlie
7georgegeorge
\n" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[loving, loved] := loves[loving, loved] # Yes, this is the 'subsequent query'. In a logical sense. \n", " # The order of rules has no significance whatsoever.\n", "\n", "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The constant rule is now named `loves`, denoting a rather complicated relationship network (aren't 'relationship' and 'network' synonyms?). It reads like \"Alice loves Eve, Bob loves Alice\", \"nobody loves David, David loves George, but George only loves himself\", and so on. Note that for constant rules we can actually omit the arguments (but if explicitly given, the arity must match the actual data).\n", "\n", "The entry `?` is now a _Horn-clause rule_, signified by the symbol `:=`. Its body has a single _application_ of the rule we have just defined, with _bindings_ `loving` and `loved` for the arguments. These bindings are then carried to the output via the arguments of the entry rule.\n", "\n", "Here both bindings to the rule application of `loves` are initially _unbound_, in which case all tuples of `loves` are returned. To _bind_ an argument simply pass a constant in:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved_by_eve
0alice
1bob
2charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved_by_eve] := loves['e' ++ 'v' ++ 'e', loved_by_eve] # Eve loves dramatic entrance" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Every argument position can be bound:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loves_eve
0alice
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loves_eve] := loves[loves_eve, 'eve']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Multiple clauses can appear in the body, in which case an implicit conjunction is implied, meaning that all clauses\n", "must bind for a result to return:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved_by_b_e
0alice
\n" ], "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved_by_b_e] := loves['eve', loved_by_b_e], loves['bob', loved_by_b_e]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that Alice is loved by both Bob and Eve. The variable `loved_by_b_e` appears in both clauses, so the two occurrences are _unified_, meaning that they must bind to the _same_ value for a tuple to return." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Disjunction, meaning that _any_ clause with a successful binding can contribute to the results, must be specified explicitly:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved_by_b_e
0alice
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved_by_b_e] := loves['eve', loved_by_b_e] or loves['bob', loved_by_b_e], \n", " loved_by_b_e != 'bob', \n", " loved_by_b_e != 'eve'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see, disjunctive clauses are connected by `or`. It binds more strongly than the implicit conjunction `,`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Horn clause rules (and Horn clause rules only) may have multiple definitions _having equivalent heads_. The above query is identical in every way to the following:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved_by_b_e
0alice
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved_by_b_e] := loves['eve', loved_by_b_e], loved_by_b_e != 'bob', loved_by_b_e != 'eve'\n", "?[loved_by_b_e] := loves['bob', loved_by_b_e], loved_by_b_e != 'bob', loved_by_b_e != 'eve'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If a Horn clause rule is not the entry, even the _names_ given to the arguments can differ. The bodies are not required to be of the same form, as long as they produce compatible outputs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Besides rule applications, _filters_ can also appear in the body:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 personloved
0bobalice
1davidgeorge
\n" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[person, loved] := loves[person, loved], !ends_with(person, 'e')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case only people with name not ending in `'e'` are considered for the loving position.\n", "\n", "By the way, if you are not interested in who the person in the loving position is, you can just omit it in the arguments to the entry:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved
0alice
1george
\n" ], "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved] := loves[person, loved], !ends_with(person, 'e')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... but every argument in the head of any Horn-clause rule must appear in the body, of course:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[31meval::unbound_symb_in_head\u001b[0m\n", "\n", " \u001b[31m×\u001b[0m Symbol 'the_alien' in rule head is unbound\n", " ╭─[9:1]\n", " \u001b[2m 9\u001b[0m │ \n", " \u001b[2m10\u001b[0m │ ?[the_alien, loved] := loves[person, loved], !ends_with(person, 'e')\n", " · \u001b[35;1m ─────────\u001b[0m\n", " ╰────\n", "\u001b[36m help: \u001b[0mNote that symbols occurring only in negated positions are not considered bound\n" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[the_alien, loved] := loves[person, loved], !ends_with(person, 'e')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Negation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next query finds those who are loved by Eve, but not by Bob:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loved_by_e_not_b
0bob
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loved_by_e_not_b] := loves['eve', loved_by_e_not_b], not loves['bob', loved_by_e_not_b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we are using the `not` keyword to _negate_ the rule application `loves`. This negation is at the level of Horn-clauses, which is not the same as the level of expressions. In fact, there are two sets of related but inequivalent operators:\n", "\n", "* For Horn clauses: `,` (conjunction), `or` (disjunction), `not` (negation)\n", "* For boolean expressions: `&&` (conjunction), `||` (disjunction), `!` (negation)\n", "\n", "Hopefully you are already familiar with the boolean set of operators. If you use them in the wrong way, the query compiler will yell at you. And you will comply." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Negation has to abide by the _safety rule_. 
Let's violate it:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[31meval::unbound_symb_in_head\u001b[0m\n", "\n", " \u001b[31m×\u001b[0m Symbol 'not_loved_by_b' in rule head is unbound\n", " ╭─[9:1]\n", " \u001b[2m 9\u001b[0m │ \n", " \u001b[2m10\u001b[0m │ ?[not_loved_by_b] := not loves['bob', not_loved_by_b]\n", " · \u001b[35;1m ──────────────\u001b[0m\n", " ╰────\n", "\u001b[36m help: \u001b[0mNote that symbols occurring only in negated positions are not considered bound\n" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[not_loved_by_b] := not loves['bob', not_loved_by_b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oh no! The query compiler rejects our perfectly reasonable query trying to determine those poor souls not loved by Bob!\n", "\n", "But is our query really reasonable? For example, should the query return a tuple containing 'gold', since according to facts at hand, Bob clearly has no interest in 'gold'? So should our query return every possible string except a select few? Do you want your computer to handle such a query?\n", "\n", "Now you understand what the help message above is trying to tell you." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make our query really reasonable, we have to explicitly give our query a _closed world_ in which to operate the negation:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 not_loved_by_b
0bob
1charlie
2david
3eve
4george
\n" ], "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", " \n", "the_population[p] := loves[p, _a]\n", "the_population[p] := loves[_a, p]\n", "\n", "?[not_loved_by_b] := the_population[not_loved_by_b], not loves['bob', not_loved_by_b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the query understands that we are asking our question _within_ the people in the love network. It then proceeds without complaints.\n", "\n", "Let's state the **safety rule for negation**: _at least one_ argument of the rule application must be bound elsewhere (otherwise the clause will produce an infinity of candidate tuples), and _all arguments_ to negated clauses are _not_ considered bound, _unless_ they also appear elsewhere in a positive context.\n", "\n", "If you can't wrap your head around the rule yet, don't worry. Just write your query. Return here and reread this section when you encounter some error messages similar to the above." ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Unification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have seen that variables with repeated appearance in rule applications and predicates are implicitly unified. You can also _explicitly_ unify a variable with the unify operator `=`:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loves_eve
0alice
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loves_eve] := eve = 'eve', loves[loves_eve, eve]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By the way, the _order_ a clause appears in a Horn-clause rule can never affect the result in any way (provided your queries do not contain random functions):" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 loves_eve
0alice
1charlie
\n" ], "text/plain": [ "" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "?[loves_eve] := loves[loves_eve, eve], eve = 'eve'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... but the performance might vary, sometimes greatly. This is an advanced topic that we will come back to in a later session. For trivial examples like ours it doesn't matter. In your own explorations, just try to put more 'restrictive' rules first (meaning that they filter out a greater number of tuples), and you will be fine most of the time." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also the spread-unify operator `in`, which unifies the left hand side with values in a list one at a time:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 u
0a
1b
2c
\n" ], "text/plain": [ "" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[u] := u in ['a', 'b', 'c']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another example: this is the \"Cartesian product\"" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 uv
0ax
1ay
2bx
3by
4cx
5cy
\n" ], "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[u, v] := u in ['a', 'b', 'c'], v in ['x', 'y']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may notice that paired with functions extracting elements from lists, we don't actually need constant rules anymore. But constant rules are more explicit when you really have _facts_ as inputs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recursion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we come to the \"poster boy\" query of classical Datalog: let's find out all the people loved by Alice, or loved by someone loved by Alice, or loved by someone loved by someone loved by Alice, _ad infinitum_:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 chained
0alice
1bob
2charlie
3eve
\n" ], "text/plain": [ "" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "alice_love_chain[person] := loves['alice', person]\n", "alice_love_chain[person] := alice_love_chain[in_person], loves[in_person, person]\n", "\n", "?[chained] := alice_love_chain[chained]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Someone \"chained\" is either loved by Alice directly, or loved by someone already in the chain. The query as written reads very naturally. This is why this \"transitive closure\" type of query is the poster-boy query of classical Datalog. \n", "\n", "Writing the same thing in SQL requires recursive CTEs, and those escalate in complexity pretty quickly. On the other hand, if well written, Datalog queries can weather very demanding situations and remain readable.\n", "\n", "Recursive queries are essential for working with graphs (networks). So they had better be easy to write _and_ read in a database claiming to be optimized for graphs." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We've talked about the safety rule for negation above. You may suspect that something similar is at play here. Let's retry the above query, but omit the starting condition `alice_love_chain[person] := loves['alice', person]`:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 chained
\n" ], "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loves[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", "\n", "alice_love_chain[person] := alice_love_chain[in_person], loves[in_person, person]\n", "\n", "?[chained] := alice_love_chain[chained]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Are you surprised that the compiler did not complain? Are you surprised that it returned no results? This is the _closed-world assumption_ hinted at above, at play again. If there is no way to _deduce_ a fact from the given facts, _then_ the fact itself is false.\n", "\n", "This so-called \"least fixed point\" semantics is the semantics of Datalog queries. It is actually subtly different from that of SQL, due to the existence of `UNKNOWN` in SQL, usually manifesting as `NULL`. In other words, SQL operates on [ternary logic](https://en.wikipedia.org/wiki/Three-valued_logic) whereas Datalog stays boolean all the way (under the protection of the closed-world assumption)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Still, there are _rules_ with respect to recursion. 
[Bertrand Russell](https://en.wikipedia.org/wiki/Russell%27s_paradox) would rush to write:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[31meval::unstratifiable\u001b[0m\n", "\n", " \u001b[31m×\u001b[0m Query is unstratifiable\n", "\u001b[36m help: \u001b[0mThe rule 'q' is in the strongly connected component [\"p\", \"q\"],\n", " and is involved in at least one forbidden dependency\n", " (negation, non-meet aggregation, or algorithm-application).\n" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "world[a] := a in [1, 2]\n", "\n", "p[a] := world[a], not q[a]\n", "q[a] := world[a], not p[a]\n", "\n", "?[a] := p[a]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above query does not violate the safety rule of negation (because he put a `world` in front of each negation), but the compiler still rejects it. Don't worry about the unworldly incantation the error makes. Instead, think for a moment what the result _could_ be." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can verify that the result could be the single tuple `[1]` with the assignment `p[a] <- [[1]]` and `q[a] <- [[2]]`, _or_ the single tuple `[2]` with the assignment `p[a] <- [[2]]` and `q[a] <- [[1]]`. The problem is, these answers contradict each other, and neither can be deduced _constructively_. So under the least fixed point semantics, this program has no _meaning_, and the compiler rejects it.\n", "\n", "Again, don't worry if you can't exactly follow what is going on. Just trust that the compiler is trying to prevent your computer from imploding. Real applications don't tend to produce these kinds of contrived, paradoxical queries anyway." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Stored relations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An obvious shortcoming of our previous acrobatics is that we have to carry around our love triangles network and enter it anew for every query, which leads to rapid deterioration of the `CTRL`, `C` and `V` keys. So let's fix that:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[] <- [['alice', 'eve'],\n", " ['bob', 'alice'],\n", " ['eve', 'alice'],\n", " ['eve', 'bob'],\n", " ['eve', 'charlie'],\n", " ['charlie', 'eve'],\n", " ['david', 'george'],\n", " ['george', 'george']]\n", " \n", ":relation create triangles" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have the _query directive_ `:relation create` together with a normal query. The results will then be stored on your disk with the name `triangles` instead of returned to you." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will receive an error if you try to run this script twice. In which case don't worry and continue." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Stored relations are safe from restarts and power failures. Let's query against it:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ab
0aliceeve
1bobalice
2charlieeve
3davidgeorge
4evealice
5evebob
6evecharlie
7georgegeorge
\n" ], "text/plain": [ "" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The colon `:` in front of the name tells the database that we want a _stored_ relation instead of a relation defined within the query itself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, Fred finally comes to the party and Fred loves Alice and Eve. We add these facts in the following way:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[] <- [['fred', 'alice'],\n", " ['fred', 'eve']]\n", "\n", ":relation put triangles" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ab
0aliceeve
1bobalice
2charlieeve
3davidgeorge
4evealice
5evebob
6evecharlie
7fredalice
8fredeve
9georgegeorge
\n" ], "text/plain": [ "" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we used `:relation put` instead of `:relation create`. In fact, you can use `:relation put` even before any call to `:relation create`. The `create` op just ensures that the insertion is into a new stored relation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now Eve no longer loves Alice or Charlie! Let's reflect this fact by using `retract`:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[] <- [['eve', 'charlie'],\n", " ['eve', 'alice']]\n", "\n", ":relation retract triangles" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ab
0aliceeve
1bobalice
2charlieeve
3davidgeorge
4evebob
5fredalice
6fredeve
7georgegeorge
\n" ], "text/plain": [ "" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is OK to retract non-existent facts, in which case the operation does nothing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also reset the whole relation with `rederive`:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[] <- [['eve', 'charlie'],\n", " ['eve', 'alice']]\n", "\n", ":relation rederive triangles" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ab
0evealice
1evecharlie
\n" ], "text/plain": [ "" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Only the `rederive`ed tuples remain." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see what stored relations you currently have in your database by running the following _system directive_:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 namearity
0triangles2
\n" ], "text/plain": [ "" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ ":db relations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Relations can be renamed:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ ":db rename relation triangles love_triangles" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 namearity
0love_triangles2
\n" ], "text/plain": [ "" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ ":db relations" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 ab
0evealice
1evecharlie
\n" ], "text/plain": [ "" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :love_triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now this triangles business is becoming tiring. Let's get rid of it:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 status
0OK
\n" ], "text/plain": [ "" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ ":db remove relation love_triangles" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we do not have any queries to run when nuking relations, we use a system directive instead of a query directive. Now you can no longer query the triangles:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\u001b[31mquery::relation_not_found\u001b[0m\n", "\n", " \u001b[31m×\u001b[0m Cannot find requested stored relation 'love_triangles'\n" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "?[a, b] := :love_triangles[a, b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This completes our tour of the operations on stored relations: `create`, `put`, `retract`, `rederive`, and `remove`. The syntax for `remove` is different from the rest for technical reasons.\n", "\n", "All these operations are _atomic_, meaning that for all the tuples they affect, either all are affected at the same time, or the operation completely fails. There is no in-between, corrupted state.\n", "\n", "Stored relations are simple, fast, and very raw. They can be used in exactly the same way as rules defined inline with the query. The way you use them is also not very different from how you would use tables in a traditional SQL database.\n", "\n", "Stored relations are suitable for data that has a well-defined structure at the outset, and which is loaded and updated in bulk. For example, you may have obtained from domain experts an [ontology](https://www.wikiwand.com/en/Ontology_\\(information_science\\)) in the form of a network of metadata. The ontology comes in nice tables with clear, detailed documentation. You store this ontology as a group of stored relations, and use them to extract insights from your business data. 
The ontology is updated periodically, and when an update comes you just use the `rederive` operation to replace the old version. Very simple and efficient. But if you need more guarantees for your data, or your data shapes change rapidly, use the triple store instead: see a later tutorial for how to use it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it! You have learned the basics of Datalog in the dialect CozoScript!\n", "\n", "If you want to play more without going further for the moment, it is recommended that you skim through the list of functions in the Manual. Those functions allow you to do much more acrobatics with pure Datalog." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }