editing the tutorial

main
Ziyang Hu 2 years ago
parent d2c3dbd243
commit b2d9e4cde3

@ -86,7 +86,7 @@
"metadata": {},
"source": [
"To run the server, you need to specify a directory to store persistent data on your file system. \n",
"In the following we will use a directory called `tutorial-data` in the same directory as the binary executable.\n",
"In the following, we will use a directory called `tutorial-data` in the same directory as the binary executable.\n",
"In the terminal, run"
]
},
@ -105,7 +105,7 @@
"id": "41820cf5-a800-4c81-8a17-227e9692aae2",
"metadata": {},
"source": [
"The same command should work in powershell as well.\n",
"The same command should work in PowerShell as well.\n",
"\n",
"If you see something like `Database web API running at ...` displayed in your terminal, \n",
"then the server is successfully started. \n",
@ -113,7 +113,7 @@
"When you are done, `CTRL-C` in the terminal will stop the server.\n",
"You can restart the server again by running the command again.\n",
"\n",
"More options when starting the server is available. Run"
"More options when starting the server are available. Run"
]
},
{
@ -143,7 +143,7 @@
"\n",
"Cozo exposes an HTTP API, so theoretically you can follow along using tools like `curl`. \n",
"If you are interested, consult the [manual](https://cozodb.github.io/current/manual/setup.html#the-query-api) for the request format the API expects.\n",
"For better user experience, though, we suggest following one of the following two subsections instead."
"For a better user experience, we suggest following one of the following two subsections instead."
]
},
{
@ -155,9 +155,9 @@
"\n",
"This option provides the best user experience but also requires you to install quite a lot of things, \n",
"though you may already have them installed on your computer if you use the python data science stack.\n",
"First you will need python installed. \n",
"First, you will need python installed. \n",
"Then install JupyterLab by following the instruction at https://jupyter.org/install.\n",
"Next, run the following to install a Jupyter extension to help querying Cozo:"
"Next, run the following to install a Jupyter extension to help query Cozo:"
]
},
{
@ -178,7 +178,7 @@
"While you are at it, go to the source of this tutorial at https://github.com/cozodb/cozo/blob/main/tutorial/tutorial.ipynb, \n",
"right-click on `Raw` and save the tutorial document to your disk.\n",
"\n",
"Then run jupyter lab, open a the saved tutorial document, and follow along."
"Then run Jupyter Lab, open the saved tutorial document, and follow along."
]
},
{
@ -209,14 +209,14 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 159,
"id": "f3dfb8a1-35f0-4dc2-b8d7-e81fa3d45b75",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<p style=\"font-size: 75%\">Completed in 1ms</p>"
"<p style=\"font-size: 75%\">Completed in 0ms</p>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
@ -229,34 +229,34 @@
"data": {
"text/html": [
"<style type=\"text/css\">\n",
"#T_50cdf_row0_col0, #T_50cdf_row0_col1, #T_50cdf_row0_col2 {\n",
"#T_154d8_row0_col0, #T_154d8_row0_col1, #T_154d8_row0_col2 {\n",
" color: black;\n",
"}\n",
"</style>\n",
"<table id=\"T_50cdf\">\n",
"<table id=\"T_154d8\">\n",
" <thead>\n",
" <tr>\n",
" <th class=\"blank level0\" >&nbsp;</th>\n",
" <th id=\"T_50cdf_level0_col0\" class=\"col_heading level0 col0\" >0</th>\n",
" <th id=\"T_50cdf_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
" <th id=\"T_50cdf_level0_col2\" class=\"col_heading level0 col2\" >2</th>\n",
" <th id=\"T_154d8_level0_col0\" class=\"col_heading level0 col0\" >0</th>\n",
" <th id=\"T_154d8_level0_col1\" class=\"col_heading level0 col1\" >1</th>\n",
" <th id=\"T_154d8_level0_col2\" class=\"col_heading level0 col2\" >2</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th id=\"T_50cdf_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
" <td id=\"T_50cdf_row0_col0\" class=\"data row0 col0\" >hello</td>\n",
" <td id=\"T_50cdf_row0_col1\" class=\"data row0 col1\" >world</td>\n",
" <td id=\"T_50cdf_row0_col2\" class=\"data row0 col2\" >Cozo!</td>\n",
" <th id=\"T_154d8_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
" <td id=\"T_154d8_row0_col0\" class=\"data row0 col0\" >hello</td>\n",
" <td id=\"T_154d8_row0_col1\" class=\"data row0 col1\" >world</td>\n",
" <td id=\"T_154d8_row0_col2\" class=\"data row0 col2\" >Cozo!</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x107f234f0>"
"<pandas.io.formats.style.Styler at 0x11889d7b0>"
]
},
"execution_count": 2,
"execution_count": 159,
"metadata": {},
"output_type": "execute_result"
}
@ -271,8 +271,8 @@
"metadata": {},
"source": [
"If you get the same words back formatted in a table, congratulations! \n",
"You can skip to the next section where we starts learning CozoScript proper.\n",
"If you want to know more about what the `pycozo` extension did, and more tricks that you can do with the extension, read the [manual](https://cozodb.github.io/current/manual/setup.html#jupyterlab)."
"You can skip to the next section where we start learning CozoScript proper.\n",
"If you want to know more about what the `pycozo` extension did and more tricks that you can do with the extension, read the [manual](https://cozodb.github.io/current/manual/setup.html#jupyterlab)."
]
},
{
@ -298,17 +298,17 @@
"id": "1147466d-d925-4c32-8b6c-28bdabc656fb",
"metadata": {},
"source": [
"Your local machine at least have a modern web browser, like a recent version of Firefox, Chrome, or Edge, right?\n",
"Your local machine at least has a modern web browser, like a recent version of Firefox, Chrome, or Edge, right?\n",
"Good. \n",
"\n",
"Use your browser to navigate to http://127.0.0.1:9070 (the address shown in your terminal when you run `cozoserver`).\n",
"You should be greeted be a page saying that the server is running.\n",
"Now open the developer tools of your browser by right clicking the page and select \"Inspect\" from the menu \n",
"You should be greeted by a page saying that the server is running.\n",
"Now open the developer tools of your browser by right-clicking the page and selecting \"Inspect\" from the menu \n",
"(if you cannot find it, you may need to fiddle with your browser settings to enable the developer tools).\n",
"Switch to the \"Console\" tab of the developer tools if it is not already open. \n",
"\n",
"If you see some messages where \n",
"the \"Cozo Makeshift Javascript Console\" welcomes you, you are ready. Run the \"hello world\" query by typing the following into the console and press enter:"
"the \"Cozo Makeshift Javascript Console\" welcomes you, you are ready. Run the \"hello world\" query by typing the following into the console and pressing enter:"
]
},
{
@ -580,7 +580,7 @@
"metadata": {},
"source": [
"The literal representations are similar to those in JavaScript. \n",
"In particular, strings in double quotes are guaranteed to be interpreted in exactly the same way as in JSON."
"In particular, strings in double quotes are guaranteed to be interpreted in the same way as in JSON."
]
},
{
@ -658,7 +658,7 @@
"metadata": {},
"source": [
"We say that relations in Cozo follow _set semantics_ where de-duplication is automatic. \n",
"By constrast, SQL usually follows _bag semantics_ (some databases does this by secretly having a unique internal key for every row, in Cozo you must do this explicitly if you need to simulate duplicate rows).\n",
"By contrast, SQL usually follows _bag semantics_ (some databases do this by secretly having a unique internal key for every row, in Cozo you must do this explicitly if you need to simulate duplicate rows).\n",
"\n",
"Why does Cozo break tradition and go with set semantics?\n",
"Set semantics is much more convenient when you have recursions between relations involved,\n",
@ -869,7 +869,7 @@
"id": "ba32d27c-0787-48d3-94fb-79e77a8cce0c",
"metadata": {},
"source": [
"If you give bindings, the number of bindings must match the actual data, otherwise you will get an error:"
"If you give bindings, the number of bindings must match the actual data, otherwise, you will get an error:"
]
},
{
@ -905,7 +905,7 @@
"id": "c71fb449-6a9c-4a7e-bed9-44230a41d5dd",
"metadata": {},
"source": [
"Now let's defines rules that use other rules:"
"Now let's define rules that use other rules:"
]
},
{
@ -1124,7 +1124,7 @@
"source": [
"Here the second atom is an _expression_ `is_num(a)`. \n",
"Only rows for which the expression evaluates to `true` are returned, so expression atoms act as filters. \n",
"By the way, we see that the order in which the rules are given are immaterial."
"By the way, we see that the order in which the rules are given is immaterial."
]
},
{
@ -1418,7 +1418,7 @@
"id": "bf4be53c-118f-46b8-9c42-cb259aed6e87",
"metadata": {},
"source": [
"The explicit unification `=` unifies with a single value. There is another kind of unification that unifies with values within a list. Observe:"
"The explicit unification `=` unifies with a single value. There is another kind of unification that unifies values within a list. Observe:"
]
},
{
@ -1569,7 +1569,7 @@
"id": "79357544-8eeb-4b33-b49c-bd6cb4de57cb",
"metadata": {},
"source": [
"There is really no ceremony required at all when you want to store data in Cozo:"
"There is no ceremony required at all when you want to store data in Cozo:"
]
},
{
@ -1637,7 +1637,7 @@
"id": "6e911e87-a80e-4b1b-a878-613b61e339bc",
"metadata": {},
"source": [
"The query itself is identical to one which we have run before, except we have added the `:create` query option, instructing the system to store the result in a stored relation named `stored`, containing the columns `l1` and `l2`.\n",
"The query itself is identical to the one which we have run before, except we have added the `:create` query option, instructing the system to store the result in a stored relation named `stored`, containing the columns `l1` and `l2`.\n",
"\n",
"By the way, if you just want to create the relation without adding any data, you can omit the queries. No need to have an empty `?` query."
]
@ -1805,7 +1805,7 @@
"id": "79d320aa-bf41-4676-a778-7ff00b704ced",
"metadata": {},
"source": [
"Stored relations can be used in a similar way to relations defined via inline rules or fixed rules. The only difference is that you prefix the relation name by a colon:"
"Stored relations can be used in a similar way to relations defined via inline rules or fixed rules. The only difference is that you prefix the relation name with a colon:"
]
},
{
@ -1869,7 +1869,7 @@
"id": "821c1a1f-ef11-4429-ba99-c2599bc18857",
"metadata": {},
"source": [
"Unlike relations defined inline, the columns of stored relations have fixed names. You can use this to your advantage by selectively refering to columns by name.\n",
"Unlike relations defined inline, the columns of stored relations have fixed names. You can use this to your advantage by selectively referring to columns by name.\n",
"This is especially useful if you have a lot of columns:"
]
},
@ -2895,7 +2895,7 @@
"id": "59d559f0-3325-4bee-b209-3dbe26c3b90c",
"metadata": {},
"source": [
"With the graph available, we have investigate competing interests:"
"With the graph available, we can investigate competing interests:"
]
},
{
@ -2957,7 +2957,7 @@
"id": "48fa9636-5727-4984-af6e-f318c3e98b85",
"metadata": {},
"source": [
"So far we have only seen body consisting of _conjunction_ of atoms. Disjunction is also available, by using the `or` keyword:"
"So far we have only seen bodies consisting of _conjunction_ of atoms. Disjunction is also available, by using the `or` keyword:"
]
},
{
@ -3096,7 +3096,7 @@
"id": "0b859b50-a748-4e20-9900-7ee053078a7f",
"metadata": {},
"source": [
"In fact, the first way of writing the query (using `or`) is just a syntax sugar for the second way. When you have multiple definitions of the same inline rule, the rule heads must be compatible. Fixed rules cannot have multiple definitions."
"The first way of writing the query (using `or`) is just syntax sugar for the second way. When you have multiple definitions of the same inline rule, the rule heads must be compatible. Fixed rules cannot have multiple definitions."
]
},
{
@ -3294,10 +3294,10 @@
"metadata": {},
"source": [
"Why is this query not allowed? Well, what can it possibly return?\n",
"For example, should the query return 'gold', since according to facts at hand, \n",
"Bob clearly has no interest in 'gold'? \n",
"For example, should the query return 'gold', since according to the facts at hand, \n",
"Bob has no interest in 'gold'? \n",
"So should our query return every possible string except a select few? \n",
"That's clearly not reasonable."
"That's not reasonable."
]
},
{
@ -3305,7 +3305,7 @@
"id": "204fbb34-293e-4c63-a1c4-14e5a48491d7",
"metadata": {},
"source": [
"To make our query really reasonable, we have to explicitly give our query a _closed world_ in which to operate the negation:"
"To make our query reasonable, we have to explicitly give our query a _closed world_ in which to operate the negation:"
]
},
{
@ -3471,7 +3471,7 @@
"id": "00db7e33-13e8-4b5f-9438-b14ec8b37bf5",
"metadata": {},
"source": [
"Someone \"chained\" is either loved by Alice directly, or loved by someone already in the chain. The query as written reads very naturally."
"Someone \"chained\" is either loved by Alice directly or loved by someone already in the chain. The query as written reads very naturally."
]
},
{
@ -3479,7 +3479,7 @@
"id": "1666bf8f-4e41-4428-bdc8-c1d05f6783df",
"metadata": {},
"source": [
"You may object that you only need to be able to refer to other rules by applying them to have recursion, and multiple definitions are not really required. Technically, true, but the resulting queries are not useful. Observe:"
"You may object that you only need to be able to refer to other rules by applying them to have recursion, and multiple definitions are not required. Technically, true, but the resulting queries are not useful. Observe:"
]
},
{
@ -3536,7 +3536,7 @@
"id": "ac4a9d13-3360-487d-81c8-8da33f781bae",
"metadata": {},
"source": [
"This is the _closed-world assumption_. If there is no way to _deduce_ a fact from the given facts, _then_ the fact itself is false. You really need multiple definitions to \"bootstrap\" the query."
"This is the _closed-world assumption_. If there is no way to _deduce_ a fact from the given facts, _then_ the fact itself is false. You need multiple definitions to \"bootstrap\" the query."
]
},
{
@ -3544,7 +3544,7 @@
"id": "18f93e8f-d0d7-4238-b057-682a2e4daa70",
"metadata": {},
"source": [
"You can do crazy things with recursion and negation. Fortunately, Cozo will try to stop you when you want to run something really unreasonable:"
"You can do crazy things with recursion and negation. Fortunately, Cozo will try to stop you when you want to run something unreasonable:"
]
},
{
@ -3583,7 +3583,7 @@
"id": "51c7dd9e-51f2-4bb0-9e9b-c912b3f87036",
"metadata": {},
"source": [
"Nevermind the error message. If you consider the query as an equation to be solved, then `p[a] <- [[1]]` and `q[a] <- [[2]]` is a solution. But there is no way to _deduce_ this solution constructively. Furthermore, `q[a] <- [[1]]` and `p[a] <- [[2]]` is also a solution which is incompatible with the first."
"Never mind the error message. If you consider the query as an equation to be solved, then `p[a] <- [[1]]` and `q[a] <- [[2]]` is a solution. But there is no way to _deduce_ this solution constructively. Furthermore, `q[a] <- [[1]]` and `p[a] <- [[2]]` is also a solution which is incompatible with the first."
]
},
{
@ -3686,7 +3686,7 @@
"id": "19f47c1e-ffb6-4531-b181-0995bfeb63df",
"metadata": {},
"source": [
"The usual `sum`, `mean`, etc. are all available. Having aggregations apply in the head of the rule instead of in the body is really powerful, as we will see later in the extended examples.\n",
"The usual `sum`, `mean`, etc. are all available. Having aggregations apply in the head of the rule instead of in the body is powerful, as we will see later in the extended examples.\n",
"\n",
"Here is the [full list](https://cozodb.github.io/current/manual/aggregations.html) of aggregations."
]
@ -3994,9 +3994,9 @@
"Well, the logic that defines how the output relation is computed is given _inline_, as a series of atoms.\n",
"\n",
"By contrast, rules defined using `<-` are called _constant_ rules, which are special cases of _fixed rules_:\n",
"rules whose logic are defined in fixed implementations hidden from the user.\n",
"rules whose logic is defined in fixed implementations hidden from the user.\n",
"\n",
"The `<-` syntax is actually syntax sugar. The full syntax is:"
"The `<-` syntax is syntax sugar. The full syntax is:"
]
},
{
@ -4064,7 +4064,7 @@
"source": [
"Here we are using the fixed rule `Constant`, which takes one _option_ named `data`. Note the curly tail of the arrow.\n",
"\n",
"Fixed rules take in a number of input relations, and by applying custom logic, produce its output relation. The `Constant` fixed rule take in zero input relations.\n",
"Fixed rules take in some input relations, and by applying custom logic, produce their output relation. The `Constant` fixed rule take in zero input relations.\n",
"\n",
"As an example of a less trivial fixed rule, let's say we want to find out who is most popular in the `love` graph. How do we define popularity? \n",
"One way is to say that the higher [PageRank](https://en.wikipedia.org/wiki/PageRank) a person has, the more popular. Calculating PageRank using inline rules\n",
@ -4236,7 +4236,7 @@
"source": [
"Now you have a basic understanding of using the various constructs of Cozo, let's deal with a less trivial dataset.\n",
"\n",
"The data we are going to use, and many examples that we will present, are adapted from the book [Practical Gremlin](https://kelvinlawrence.net/book/Gremlin-Graph-Guide.html), which teaches the Gremlin graph query language, a very different, imperative take on graphs (Datalog, by constrast, is declarative)."
"The data we are going to use, and many examples that we will present, are adapted from the book [Practical Gremlin](https://kelvinlawrence.net/book/Gremlin-Graph-Guide.html), which teaches the Gremlin graph query language, a very different, imperative take on graphs (Datalog, by contrast, is declarative)."
]
},
{
@ -4244,7 +4244,7 @@
"id": "54e3ec1d-c139-48ab-8205-49b71aba4c67",
"metadata": {},
"source": [
"First, let's import the data into our database. We will use fixed rules to do that. First we define the `airport` relation:"
"First, let's import the data into our database. We will use fixed rules to do that. First, we define the `airport` relation:"
]
},
{
@ -4334,7 +4334,7 @@
"\n",
"If your Internet connection is slow, it might help if you download the CSV file manually to your disk and load the local file. \n",
"The line commented out shows how to do it. The relative path is relative to the directory in which you run the `cozoserver` executable.\n",
"As the same file will be downloaded multiple times below, you may also want to download it just once to local disk if your connection is metered."
"As the same file will be downloaded multiple times below, you may also want to download it just once to the local disk if your connection is metered."
]
},
{
@ -4562,7 +4562,7 @@
"id": "af261df6-ce89-47a0-b7d9-2efe794767fb",
"metadata": {},
"source": [
"The `contain` relation contains information of geographical inclusion of entities:"
"The `contain` relation contains information on the geographical inclusion of entities:"
]
},
{
@ -4636,7 +4636,7 @@
"id": "ddc41b91-021a-4ab4-b92f-63013dd655bf",
"metadata": {},
"source": [
"Finally, the `route`s between the airports. This relation is much larger than the rest and contain about 60k rows, which may take a few seconds to download and process:"
"Finally, the `route`s between the airports. This relation is much larger than the rest and contains about 60k rows, which may take a few seconds to download and process:"
]
},
{
@ -5177,7 +5177,7 @@
"id": "db45f1d4-e89a-449e-beb8-a9301acfa86c",
"metadata": {},
"source": [
"How many airports in total?"
"How many airports are there in total?"
]
},
{
@ -5433,7 +5433,7 @@
"id": "34283880-46bb-4759-965e-5e3de5128c15",
"metadata": {},
"source": [
"More useful is the statistics of runways:"
"More useful are the statistics of runways:"
]
},
{
@ -5711,7 +5711,7 @@
"id": "9f037a30-6f67-4fd9-ba3a-968ebfe811ff",
"metadata": {},
"source": [
"It just records the starting and ending aiports of each route, together with the distance. This relation only becomes useful when used as a graph."
"It just records the starting and ending airports of each route, together with the distance. This relation only becomes useful when used as a graph."
]
},
{
@ -6298,7 +6298,7 @@
"id": "4be5b2d5-8aa9-4109-94f4-ccb75e109935",
"metadata": {},
"source": [
"What are the cities directly reachable from LGW, but furtherest away?"
"What are the cities directly reachable from LGW, but furthermost away?"
]
},
{
@ -6664,7 +6664,7 @@
"\n",
"Let's say we want to find the distance of the _shortest route_ between two airports. One way to calculate is to enumerate all the routes between the two airports, and then apply `min` aggregation to the results. This cannot be implemented as stated, since the routes may contain cycles and hence there can be an infinite number of routes between two airports.\n",
"\n",
"Instead, let's think recursively. If we already have all the shortest routes between all nodes, can we derive an _equation_ satisfied by the shortest route? Yes, A shortest route between `a` and `b` is either the distance of a direct route, or the sum of the shortest distance from `a` to `c` and the distance of a direct route from `c` to `d`. We apply our `min` aggregation to this recursive set instead. \n",
"Instead, let's think recursively. If we already have all the shortest routes between all nodes, can we derive an _equation_ satisfied by the shortest route? Yes, the shortest route between `a` and `b` is either the distance of a direct route or the sum of the shortest distance from `a` to `c` and the distance of a direct route from `c` to `d`. We apply our `min` aggregation to this recursive set instead. \n",
"\n",
"Let's write it out and try to find the shortest route between the airports `LHR` and `YPO`:"
]
@ -6736,7 +6736,7 @@
"id": "73d28120-ee91-4893-a796-6cddec3426ad",
"metadata": {},
"source": [
"It works. Since path-finding is such a common operation on graphs, Cozo has seveal fixed rules for that:"
"It works. Since path-finding is such a common operation on graphs, Cozo has several fixed rules for that:"
]
},
{
@ -6812,7 +6812,7 @@
"id": "af89a4d7-2260-466c-a119-bb0ae0a84ebf",
"metadata": {},
"source": [
"Not only is it more efficient, we also get a path for the shortest route.\n",
"Not only is it more efficient, but we also get a path for the shortest route.\n",
"\n",
"Not content with the shortest path, the following calculates the shortest ten paths:"
]
@ -6953,7 +6953,7 @@
"id": "5b3a398d-95ee-4e4f-91a7-5c95f82c8bd3",
"metadata": {},
"source": [
"On the other hand, if efficiency is really important for you, you can use the A* algorithm with a really good heuristic function:"
"On the other hand, if efficiency is really important to you, you can use the A* algorithm with a really good heuristic function:"
]
},
{
@ -7167,7 +7167,7 @@
"id": "c2be4f97-5173-456e-a0d9-b7cf99bf9786",
"metadata": {},
"source": [
"The following example takes a long time to run, since it calculates the betweenness centrality.\n",
"The following example takes a long time to run since it calculates the betweenness centrality.\n",
"Algorithms for calculating the betweenness centrality have high complexity."
]
},
