finish the tutorial

2 years ago · d2c3dbd243
parent 618e0f2bf6
commit d2c3dbd243
9 changed files with 6766 additions and 5777 deletions
--- a/IN_PROG.md
+++ b/IN_PROG.md
@ -71,6 +71,16 @@ This is where Cozo comes in. We want to prove, through a real database implement
 * Query rules are compiled into trees of relations (the relational algebra) before execution. Each rule is executed deterministically (no query planner).
 * The execution of the whole query follows the least fixed point semantics of stratified Datalog with negation and aggregation and is done by the bottom-up semi-naive algorithm (instead of the query/subquery top-down algorithms used by many recent datalog implementations, especially in the Clojure world). To prevent calculating unnecessary results that are only thrown away at the last stage, the magic-set rewriting technique is employed as a pre-processing step before compiling the query. This step is completely deterministic.

+## Status and contributions
+
+Cozo is currently a personal project and there is no plan for commercialization.
+We plan to keep it free forever under copyleft licenses.
+
+We intend to evolve Cozo slowly but steadily, 
+over timescale of tens of years.
+We can do this because we are not under any pressure, 
+financial or managerial, to rush things up.
+
 ## Licensing

 The original contents of this project are licensed under AGPL-3.0 or later, with the following exceptions:
--- a/docs/source/stored.rst
+++ b/docs/source/stored.rst
@ -121,7 +121,6 @@ For ``:put``, ``:remove``, ``:ensure`` and ``:ensure_not``,
 you do not need to specify all existing columns in the spec if the omitted columns have a default generator,
 in which case the generator will be used to generate a value,
 or the type of the column is nullable, in which case the value is ``null``.
-Also, the order of the columns does not matter, and neither does whether a column occurs in the key or value position.
 The spec specified when the relation was created will be consulted to know how to store data correctly.
 Specifying default values does not have any effect and will not replace existing ones.

--- a/docs/tutorial/0-getting-started.ipynb
+++ b/docs/tutorial/0-getting-started.ipynb
--- a/docs/tutorial/1-acrobatics.ipynb
+++ b/docs/tutorial/1-acrobatics.ipynb
--- a/docs/tutorial/2-storage.ipynb
+++ b/docs/tutorial/2-storage.ipynb
--- a/docs/tutorial/3-distillation.ipynb
+++ b/docs/tutorial/3-distillation.ipynb
--- a/docs/tutorial/4-harmony.ipynb
+++ b/docs/tutorial/4-harmony.ipynb
@ -1,247 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Transaction and index"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "%reload_ext pycozo.ipyext_direct\n",
-    "%cozo_auth tutorial *******"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "It is true that stored relations and triples look like diametrically opposed concepts in Cozo. With triples you have to set up elaborate schema before use. With stored relations you just store your results, without even needing to pre-declare anything. Yet they are actually designed to work together. The intended way that they work together is through transactions."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We have already met transactions when we learned how to insert and mutate data in the triple store. But transactions can in fact contain many more things than `put` and `retract` requests. Let's first have the following schema in place:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<style type=\"text/css\">\n",
-       "#T_5021a_row0_col0, #T_5021a_row1_col0, #T_5021a_row2_col0, #T_5021a_row3_col0 {\n",
-       "  color: #307fc1;\n",
-       "}\n",
-       "#T_5021a_row0_col1, #T_5021a_row1_col1, #T_5021a_row2_col1, #T_5021a_row3_col1 {\n",
-       "  color: black;\n",
-       "}\n",
-       "</style>\n",
-       "<table id=\"T_5021a\">\n",
-       "  <thead>\n",
-       "    <tr>\n",
-       "      <th class=\"blank level0\" >&nbsp;</th>\n",
-       "      <th id=\"T_5021a_level0_col0\" class=\"col_heading level0 col0\" >attr_id</th>\n",
-       "      <th id=\"T_5021a_level0_col1\" class=\"col_heading level0 col1\" >op</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "    <tr>\n",
-       "      <th id=\"T_5021a_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
-       "      <td id=\"T_5021a_row0_col0\" class=\"data row0 col0\" >10000030</td>\n",
-       "      <td id=\"T_5021a_row0_col1\" class=\"data row0 col1\" >assert</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th id=\"T_5021a_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
-       "      <td id=\"T_5021a_row1_col0\" class=\"data row1 col0\" >10000031</td>\n",
-       "      <td id=\"T_5021a_row1_col1\" class=\"data row1 col1\" >assert</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th id=\"T_5021a_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
-       "      <td id=\"T_5021a_row2_col0\" class=\"data row2 col0\" >10000032</td>\n",
-       "      <td id=\"T_5021a_row2_col1\" class=\"data row2 col1\" >assert</td>\n",
-       "    </tr>\n",
-       "    <tr>\n",
-       "      <th id=\"T_5021a_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
-       "      <td id=\"T_5021a_row3_col0\" class=\"data row3 col0\" >10000033</td>\n",
-       "      <td id=\"T_5021a_row3_col1\" class=\"data row3 col1\" >assert</td>\n",
-       "    </tr>\n",
-       "  </tbody>\n",
-       "</table>\n"
-      ],
-      "text/plain": [
-       "<pandas.io.formats.style.Styler at 0x1171c8580>"
-      ]
-     },
-     "execution_count": 2,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    ":schema\n",
-    "\n",
-    ":put ap {\n",
-    "    code: string unique\n",
-    "}\n",
-    "\n",
-    ":put rt {\n",
-    "    src: ref,\n",
-    "    dst: ref,\n",
-    "    distance: float\n",
-    "}"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Now let's try to add some data, but with a check:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "\u001b[31meval::assert_none_failure\u001b[0m\n",
-       "\n",
-       "  \u001b[31m×\u001b[0m Triple store transaction failed as a post-condition failed\n",
-       "\u001b[31m  ╰─▶ \u001b[0mThe query is asserted to return no result, but a tuple [-100.0] is found\n",
-       "   ╭─[6:1]\n",
-       " \u001b[2m6\u001b[0m │     ?[dist] := [r rt.distance dist], dist < 0\n",
-       " \u001b[2m7\u001b[0m │     :assert none\n",
-       "   · \u001b[35;1m    ────────────\u001b[0m\n",
-       " \u001b[2m8\u001b[0m │ }\n",
-       "   ╰────\n"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    ":tx\n",
-    "\n",
-    "{rt.src: {ap.code: 'A'}, rt.dst: {ap.code: 'B'}, rt.distance: -100}\n",
-    "\n",
-    ":after {\n",
-    "    ?[dist] := [r rt.distance dist], dist < 0\n",
-    "    :assert none\n",
-    "}"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We have used the transaction directive `:after` signifying we want the query inside its braces to be checked _after_ the main transaction is done. In the `:after` block itself, we used the directive `:assert none` to indicate the above query should return no results at all. The error message also shows us the offending tuple."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We can check that in this case, no triple is actually inserted, since the transaction failed:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/html": [
-       "<style type=\"text/css\">\n",
-       "</style>\n",
-       "<table id=\"T_e96b0\">\n",
-       "  <thead>\n",
-       "    <tr>\n",
-       "      <th class=\"blank level0\" >&nbsp;</th>\n",
-       "      <th id=\"T_e96b0_level0_col0\" class=\"col_heading level0 col0\" >r</th>\n",
-       "      <th id=\"T_e96b0_level0_col1\" class=\"col_heading level0 col1\" >dist</th>\n",
-       "    </tr>\n",
-       "  </thead>\n",
-       "  <tbody>\n",
-       "  </tbody>\n",
-       "</table>\n"
-      ],
-      "text/plain": [
-       "<pandas.io.formats.style.Styler at 0x1057ce7a0>"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "?[r, dist] := [r rt.distance dist]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Besides `:after`, there is also the `:before` directive, running before any triple transactions. In the block itself, the function `assert()` can also be very useful in the body of rules. See the manual for more information."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You may have any number of `:put`, `:retract`, `:before` or `:after`. They can be interleaved. When a transaction is run, all `:before` blocks are run in the order they are defined, then all `:put` and `:retract`, and finally all `:after` blocks are run."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The real power of the transaction is that you can put anything in your `:before` and `:after` blocks, including puts or retracts to stored relations!"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.6"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
--- a/docs/tutorial/tutorial.ipynb
+++ b/docs/tutorial/tutorial.ipynb
--- a/src/parse/query.rs
+++ b/src/parse/query.rs
@ -14,9 +14,9 @@ use crate::algo::AlgoHandle;
 use crate::data::aggr::{parse_aggr, Aggregation};
 use crate::data::expr::Expr;
 use crate::data::program::{
-    AlgoApply, AlgoRuleArg, InputAtom, InputNamedFieldRelationApplyAtom, InputProgram,
-    InputRelationApplyAtom, InputInlineRule, InputRuleApplyAtom, InputInlineRulesOrAlgo, QueryAssertion,
-    QueryOutOptions, RelationOp, SortDir, Unification,
+    AlgoApply, AlgoRuleArg, InputAtom, InputInlineRule, InputInlineRulesOrAlgo,
+    InputNamedFieldRelationApplyAtom, InputProgram, InputRelationApplyAtom, InputRuleApplyAtom,
+    QueryAssertion, QueryOutOptions, RelationOp, SortDir, Unification,
 };
 use crate::data::relation::{ColType, ColumnDef, NullableColType, StoredRelationMetadata};
 use crate::data::symb::{Symbol, PROG_ENTRY};
@ -193,6 +193,11 @@ pub(crate) fn parse_query(
                let algo_impl = handle.get_impl()?;
                algo_impl.process_options(&mut options, span)?;
                let arity = algo_impl.arity(&options, &head, span)?;
+
+                ensure!(
+                    arity == 0 || head.len() == 0 || arity == head.len(),
+                    FixedRuleHeadArityMismatch(arity, head.len(), span)
+                );
                progs.insert(
                    name,
                    InputInlineRulesOrAlgo::Algo {
@ -726,19 +731,13 @@ fn parse_algo_rule(

    let algo = AlgoHandle::new(algo_name, name_pair.extract_span());

-    #[derive(Debug, Error, Diagnostic)]
-    #[error("Algorithm rule head arity mismatch")]
-    #[diagnostic(code(parser::algo_rule_head_arity_mismatch))]
-    #[diagnostic(help("Expected arity: {0}, number of arguments given: {1}"))]
-    struct AlgoRuleHeadArityMismatch(usize, usize, #[label] SourceSpan);
-
    let algo_impl = algo.get_impl()?;
    algo_impl.process_options(&mut options, args_list_span)?;
    let arity = algo_impl.arity(&options, &head, name_pair.extract_span())?;

    ensure!(
        head.is_empty() || arity == head.len(),
-        AlgoRuleHeadArityMismatch(arity, head.len(), args_list_span)
+        FixedRuleHeadArityMismatch(arity, head.len(), args_list_span)
    );

    Ok((
@ -755,6 +754,12 @@ fn parse_algo_rule(
    ))
 }

+#[derive(Debug, Error, Diagnostic)]
+#[error("Fixed rule head arity mismatch")]
+#[diagnostic(code(parser::fixed_rule_head_arity_mismatch))]
+#[diagnostic(help("Expected arity: {0}, number of arguments given: {1}"))]
+struct FixedRuleHeadArityMismatch(usize, usize, #[label] SourceSpan);
+
 fn make_empty_const_rule(prog: &mut InputProgram, bindings: &[Symbol]) {
    let entry_symbol = Symbol::new(PROG_ENTRY, Default::default());
    let mut options = BTreeMap::new();
@ -769,9 +774,7 @@ fn make_empty_const_rule(prog: &mut InputProgram, bindings: &[Symbol]) {
        entry_symbol.clone(),
        InputInlineRulesOrAlgo::Algo {
            algo: AlgoApply {
-                algo: AlgoHandle {
-                    name: entry_symbol,
-                },
+                algo: AlgoHandle { name: entry_symbol },
                rule_args: vec![],
                options,
                head: bindings.to_vec(),