====================================
Stored relations and transactions
====================================

In Cozo, data are stored in *stored relations* on disk.

---------------------------
Stored relations
---------------------------

To query stored relations,
use the ``*relation[...]`` or ``*relation{...}`` atoms in inline or fixed rules,
as explained in the last chapter.
To manipulate stored relations, use one of the following query options:

.. module:: QueryOp
    :noindex:

.. function:: :create <NAME> <SPEC>

    Create a stored relation with the given name and spec.
    No stored relation with the same name can exist beforehand.
    If a query is specified, data from the resulting relation is put into the newly created stored relation.
    This is the only stored relation-related query option in which a query may be omitted.

.. function:: :replace <NAME> <SPEC>

    Similar to ``:create``, except that if the named stored relation exists beforehand,
    it is completely replaced. The schema of the replaced relation need not match the new one.
    You cannot omit the query for ``:replace``.
    If there are any triggers associated, they will be preserved. Note that this may lead to errors if ``:replace``
    leads to schema change.

.. function:: :put <NAME> <SPEC>

    Put rows from the resulting relation into the named stored relation.
    If keys from the data exist beforehand, the corresponding rows are replaced with new ones.

.. function:: :ensure <NAME> <SPEC>

    Ensure that rows specified by the output relation and spec exist in the database,
    and that no other process has written to these rows when the enclosing transaction commits.
    Useful for ensuring read-write consistency.

.. function:: :rm <NAME> <SPEC>

    Remove rows from the named stored relation. Only keys should be specified in ``<SPEC>``.
    Removing a non-existent key is not an error and does nothing.

.. function:: :ensure_not <NAME> <SPEC>

    Ensure that rows specified by the output relation and spec do not exist in the database
    and that no other process has written to these rows when the enclosing transaction commits.
    Useful for ensuring read-write consistency.

You can rename and remove stored relations with the system ops ``::relation rename`` and ``::relation remove``,
described in the system op chapter.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Create and replace
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The format of ``<SPEC>`` is identical for all of the ops above, but the semantics differ slightly.
We first describe the format and semantics for ``:create`` and ``:replace``.

A spec, or a specification for columns, is a comma-separated list of columns enclosed in curly braces ``{}``::

    ?[address, company_name, department_name, head_count] <- $input_data

    :create dept_info {
        company_name: String,
        department_name: String,
        =>
        head_count: Int,
        address: String,
    }

Columns before the symbol ``=>`` form the *keys* (actually a composite key) for the stored relation,
and those after it form the *values*.
If all columns are keys, the symbol ``=>`` may be omitted.
The order of columns matters.
Rows are stored in lexicographically sorted order in trees according to their keys.

In the above example, we explicitly specified the types for all columns.
In case of type mismatch,
the system will first try to coerce the values given, and if that fails, the query is aborted with an error.
You can omit types for columns, in which case their types default to ``Any?``,
i.e. all values are acceptable.
For example, the above query with all types omitted is::

    ?[address, company_name, department_name, head_count] <- $input_data

    :create dept_info { company_name, department_name => head_count, address }

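Typed and untyped columns can presumably be mixed, and a single column can be made nullable.
The following is a minimal sketch, assuming that the ``?`` suffix seen in ``Any?`` marks any type as nullable;
making ``address`` nullable is an illustrative choice, not part of the original example::

    ?[address, company_name, department_name, head_count] <- $input_data

    # assumption: `String?` accepts either a string or null
    :create dept_info {
        company_name: String,
        department_name: String,
        =>
        head_count: Int,
        address: String?,
    }
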
In the example, the bindings for the output match the columns exactly (though not in the same order).
You can also explicitly specify the correspondence::

    ?[a, b, count(c)] <- $input_data

    :create dept_info {
        company_name = a,
        department_name = b,
        =>
        head_count = count(c),
        address: String = b
    }

You *must* use explicit correspondence if the entry head contains aggregation,
since names such as ``count(c)`` are not valid column names.
The ``address`` field above shows how to specify both a type and a correspondence.

Instead of specifying bindings, you can specify an expression that generates default values by using ``default``::

    ?[a, b] <- $input_data

    :create dept_info {
        company_name = a,
        department_name = b,
        =>
        head_count default 0,
        address default ''
    }

The expression is evaluated anew for each row, so if you specify a UUID-generating function,
you will get a different UUID for each row.

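As an illustration of per-row evaluation, the sketch below gives each row its own identifier.
It assumes the ``Uuid`` column type and the ``rand_uuid_v4()`` function are available,
and the column name ``dept_id`` is hypothetical::

    ?[a, b] <- $input_data

    :create dept_info {
        company_name = a,
        department_name = b,
        =>
        # hypothetical column; the default expression runs once per row
        dept_id: Uuid default rand_uuid_v4(),
    }
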
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Put, remove, ensure and ensure-not
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For ``:put``, ``:rm``, ``:ensure`` and ``:ensure_not``,
you do not need to specify all existing columns in the spec if the omitted columns have a default generator,
or if the type of the column is nullable, in which case the value defaults to ``null``.
For these operations, specifying default values in the spec has no effect and does not replace the existing defaults.

For ``:put`` and ``:ensure``, the spec needs to contain enough bindings to generate all keys and values.
For ``:rm`` and ``:ensure_not``, it only needs to generate all keys.

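The two cases can be sketched as follows, assuming ``dept_info`` was created with the defaults shown earlier
(``head_count default 0`` and ``address default ''``); the literal rows are made up for illustration.
A ``:put`` that omits ``address`` relies on that column's default generator::

    ?[company_name, department_name, head_count] <- [['Acme', 'Sales', 12]]

    # `address` is omitted and filled in by its default generator
    :put dept_info {company_name, department_name => head_count}

A ``:rm`` on the same relation, sent as a separate script, only needs the key columns::

    ?[company_name, department_name] <- [['Acme', 'Sales']]

    # only keys are given; value columns are not needed for removal
    :rm dept_info {company_name, department_name}
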
------------------------------------------------------
Chaining queries
------------------------------------------------------

Each script you send to Cozo is executed in its own transaction.
To ensure consistency of multiple operations on data,
you can define multiple queries in a single script
by wrapping each query in curly braces ``{}``.
Each query can have its own query options.
The queries are executed serially, and execution aborts at the first error encountered.
The returned relation is that of the last query.

The ``:assert (some|none)``, ``:ensure`` and ``:ensure_not`` query options allow you to express complicated constraints
that must be satisfied for your transaction to commit.

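A minimal sketch of an assertion used as a constraint is shown below.
It assumes that ``:assert none`` raises an error, and therefore aborts the transaction,
when its query returns any row; the relation ``rel`` and the rule that values must not be empty strings
are illustrative::

    {
        ?[a, b] <- [[1, 'one'], [3, 'three']]
        :put rel {a => b}
    }
    {
        # the whole script fails if any row has an empty value
        ?[a] := *rel[a, b], b == ''
        :assert none
    }
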
This example uses three queries to put and remove rows atomically
(either all succeed or all fail), and ensure that at the end of the transaction
an untouched row exists::

    {
        ?[a, b] <- [[1, 'one'], [3, 'three']]
        :put rel {a => b}
    }
    {
        ?[a] <- [[2]]
        :rm rel {a}
    }
    {
        ?[a, b] <- [[4, 'four']]
        :ensure rel {a => b}
    }

When a transaction starts, a snapshot is used,
so that only already committed data,
or data written within the same transaction, are visible to queries.
At the end of the transaction, changes are only committed if there are no conflicts
and no errors are raised.
If any mutation activates triggers, those triggers execute in the same transaction.

------------------------------------------------------
Triggers and indices
------------------------------------------------------

Cozo does not have traditional indices on stored relations.
Instead, you define regular stored relations that are used as indices.
At query time, you explicitly query the index instead of the original stored relation.

You synchronize your indices and the original by ensuring that any mutations you do on the database
write the correct data to the "canonical" relation and its indices in the same transaction.
As doing this by hand for every mutation is repetitive
and error-prone,
Cozo supports *triggers* to do it automatically for you.

You attach triggers to a stored relation by running the system op ``::set_triggers``::

    ::set_triggers <REL_NAME>

    on put { <QUERY> }
    on rm { <QUERY> }
    on replace { <QUERY> }
    on put { <QUERY> } # you can specify as many triggers as you need

``<QUERY>`` can be any valid query.

The ``on put`` triggers will run when new data is inserted or upserted,
which can be activated by ``:put``, ``:create`` and ``:replace`` query options.
The implicitly defined rules ``_new[]`` and ``_old[]`` can be used in the triggers, and
contain the added rows and the replaced rows respectively.

The ``on rm`` triggers will run when data is deleted, which can be activated by a ``:rm`` query option.
The implicitly defined rules ``_new[]`` and ``_old[]`` can be used in the triggers:
``_new[]`` contains the keys given for deletion (even if no row with such a key actually exists),
and ``_old[]`` contains the rows actually deleted (with both keys and non-keys).

The ``on replace`` triggers will be activated by a ``:replace`` query option.
They are run before any ``on put`` triggers.

All triggers for a relation must be specified together, in the same ``::set_triggers`` system op.
If used again, all the triggers associated with the stored relation are replaced.
To remove all triggers from a stored relation, use ``::set_triggers <REL_NAME>`` followed by nothing.

As an example of using triggers to maintain an index, suppose we have the following relation::

    :create rel {a => b}

We often want to query ``*rel[a, b]`` with ``b`` bound but ``a`` unbound. This will cause a full scan,
which can be expensive. So we need an index::

    :create rel.rev {b, a}

In the general case, we cannot assume a functional dependency ``b => a``, so in the index both fields appear as keys.

To manage the index automatically::

    ::set_triggers rel

    on put {
        ?[a, b] := _new[a, b]

        :put rel.rev{ b, a }
    }
    on rm {
        ?[a, b] := _old[a, b]

        :rm rel.rev{ b, a }
    }

With the index set up, you can use ``*rel.rev{..}`` in place of ``*rel{..}`` in your queries.

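For instance, a lookup by ``b`` could be written against the index as in the sketch below.
The value ``'one'`` is illustrative, and the sketch assumes that a constant may be given directly
in a named binding::

    # `b` is the leading key of rel.rev, so this lookup need not scan all of rel
    ?[a] := *rel.rev{b: 'one', a}
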
Indices in Cozo are manual, but extremely flexible, since you need not conform to any predetermined patterns
in your use of ``_old[]`` and ``_new[]``.
For simple queries, the need to explicitly elect to use an index can seem cumbersome,
but for complex ones, the deterministic evaluation entailed can be a huge blessing.

Triggers can be creatively used for other purposes as well.

.. WARNING::

    Loops in your triggers can cause non-termination.
    A loop occurs when a relation has triggers which affect other relations,
    which in turn have other triggers that ultimately affect the starting relation.