
====================================
Stored relations and transactions
====================================

Persistent databases store data on disk. As Cozo is a relational database,
data are stored in *stored relations* on disk, which are analogous to tables in SQL databases.

---------------------------
Using stored relations
---------------------------

We already know how to query stored relations: 
use the ``:relation[...]`` or ``:relation{...}`` atoms in inline or fixed rules.
To manipulate stored relations, use one of the following query options:

.. module:: QueryOp
    :noindex:

.. function:: :create <NAME> <SPEC>

    Creates a stored relation with the given name and the given spec. 
    The named stored relation must not already exist.
    If a query is specified, data from the resulting relation is put into the created stored relation.
    This is the only stored relation-related query option in which a query may be omitted.

.. function:: :replace <NAME> <SPEC>

    This is similar to ``:create``, except that if the named stored relation exists beforehand, 
    it is completely replaced. The schema of the replaced relation need not match the new one.
    You cannot omit the query for ``:replace``.

.. function:: :put <NAME> <SPEC>

    Puts the rows of the resulting relation into the named stored relation.
    If a row's keys already exist in the stored relation, the existing row is replaced by the new one.

.. function:: :ensure <NAME> <SPEC>

    Ensures that rows specified by the output relation and spec already exist in the database,
    and that no other process has written to these rows between the start of the transaction and its commit.
    Useful for ensuring read-write consistency.

.. function:: :rm <NAME> <SPEC>

    Removes rows from the named stored relation, using the rows of the resulting relation to identify them.
    Only the key columns are used for matching.
    If a row from the resulting relation does not match any keys, nothing happens for that row,
    and no error is raised.

.. function:: :ensure_not <NAME> <SPEC>

    Ensures that rows specified by the output relation and spec do not exist in the database,
    and that no other process has written to these rows between the start of the transaction and its commit.
    Useful for ensuring read-write consistency.

You can rename and remove stored relations with the system ops ``::relation rename`` and ``::relation remove``,
described in the system op chapter.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Create and replace
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The format of ``<SPEC>`` is identical for all of these ops, although the semantics differ slightly.

We first describe the format and semantics for ``:create`` and ``:replace``.
A spec is a specification for columns, enclosed in curly braces ``{}`` and separated by commas::

    ?[address, company_name, department_name, head_count] <- $input_data

    :create dept_info {
        company_name: String,
        department_name: String,
        =>
        head_count: Int,
        address: String,
    }

Columns before the symbol ``=>`` form the *keys* (actually, a composite key) for the stored relation,
and those after it form the *values*.
If all columns are keys, the symbol ``=>`` may be omitted altogether.
The order of columns matters in the specification,
especially for keys, as data is stored in lexicographically sorted order in trees,
which has implications for data access in queries.
Each key corresponds to a single value.

In the above example, we explicitly specified the types for all columns.
Type specification is described in its own chapter.
If the types of the rows do not match the specified types,
the system will first try to coerce the values, and if that fails, the query is aborted.
You can selectively omit types for columns, and columns with types omitted will have the type ``Any?``,
which is valid for any value.
As an example, if you do not care about type validation, the above query can be written as::

    ?[address, company_name, department_name, head_count] <- $input_data

    :create dept_info { company_name, department_name => head_count, address }

In the example, the bindings for the output match the columns exactly (though not in the same order).
You can also explicitly specify the correspondence::

    ?[a, b, count(c)] <- $input_data

    :create dept_info { company_name = a, department_name = b, => head_count = count(c), address = b }

You *must* use explicit correspondence if the entry head contains aggregation.

Instead of specifying bindings, you can use the ``default`` keyword with an expression that generates the values::

    ?[a, b] <- $input_data

    :create dept_info { company_name = a, department_name = b, => head_count default 0, address default '' }

The expression is evaluated once for each row, so if you specify, for example, one of the UUID-generating functions,
each row gets a different UUID.
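
As a rough sketch of this per-row evaluation, the following creates a hypothetical relation ``dept_log``
whose ``entry_id`` column is filled in by a generator.
The relation name, the column name, and the availability of ``rand_uuid_v4()`` in your version
are assumptions made for illustration::

    ?[a, b] <- $input_data

    :create dept_log { company_name = a, department_name = b, => entry_id default rand_uuid_v4() }

Under these assumptions, every row taken from ``$input_data`` would be stored with its own freshly generated ``entry_id``.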

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Put, remove, ensure and ensure-not
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For ``:put``, ``:rm``, ``:ensure`` and ``:ensure_not``,
you do not need to specify all existing columns in the spec if the omitted columns have a default generator,
in which case the generator will be used to generate a value,
or the type of the column is nullable, in which case the value is ``null``.
Also, the order of the columns does not matter, and neither does whether a column occurs in the key or value position.
The spec given when the relation was created is consulted to determine how to store the data correctly.
Specifying default values here has no effect and does not replace the defaults already defined for the relation.

For ``:put`` and ``:ensure``, the spec needs to contain enough bindings to generate all keys and values.
For ``:rm`` and ``:ensure_not``, it only needs to generate all keys.
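
To illustrate the difference, here is a minimal sketch against the ``dept_info`` relation created earlier;
the literal row values are made up for the example.
A ``:put`` supplies both keys and values::

    ?[company_name, department_name, head_count, address] <- [['Acme', 'Sales', 12, '1 Main St']]

    :put dept_info { company_name, department_name => head_count, address }

A ``:rm``, by contrast, only needs the keys::

    ?[company_name, department_name] <- [['Acme', 'Sales']]

    :rm dept_info { company_name, department_name }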

------------------------------------------------------
Chaining queries into a single transaction
------------------------------------------------------

You can execute multiple queries in one go,
by wrapping each query in curly braces ``{}``. Each query can have its own query options.
Execution proceeds for each query serially, and aborts at the first error encountered.
The returned relation is that of the last query.
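
A minimal sketch of such a chain, reusing ``dept_info`` with made-up row values.
The second block queries the relation with the ``:relation{...}`` atom form mentioned at the start of this chapter,
assuming that the named-field form binds each column to a variable of the same name;
its result is what gets returned::

    {
        ?[company_name, department_name, head_count, address] <- [['Acme', 'Sales', 12, '1 Main St']]
        :put dept_info { company_name, department_name => head_count, address }
    }
    {
        ?[department_name, head_count] := :dept_info{department_name, head_count}
    }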

Multiple queries passed in one go are executed in a single transaction. Within the transaction,
execution of queries adheres to multi-version concurrency control: only data that are already committed,
or written within the same transaction, are read,
and at the end of the transaction, any changes to stored relations are only committed if there are no conflicts
and no errors are raised.
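
One way these guarantees might be used is an optimistic update:
check that a row still holds the values you read earlier, then overwrite it.
This sketch reuses ``dept_info`` with made-up values;
if the ``:ensure`` check fails, or a conflicting write is detected at commit, the ``:put`` should not take effect::

    {
        ?[company_name, department_name, head_count, address] <- [['Acme', 'Sales', 12, '1 Main St']]
        :ensure dept_info { company_name, department_name => head_count, address }
    }
    {
        ?[company_name, department_name, head_count, address] <- [['Acme', 'Sales', 13, '1 Main St']]
        :put dept_info { company_name, department_name => head_count, address }
    }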


------------------------------------------------------
Triggers and ad-hoc indices
------------------------------------------------------