A sample space of 40 provides enough randomness so that the
likelihood of two tokens being the same is almost entirely
ruled out. And even if they are, it shouldn't be a problem
since they correspond to different accounts, although the
possibility itself is as good as impossible due to the
possible number of permutations being 8.560685103E+94, so
we can safely ignore this "risk".
40 is a sensible balance between what clients have to send
and the level of security we're expecting to provide, than
64 bytes.
Yeah, signal handling is not one of the best things to do, and it
definitely is one of the messiest things to do. This commit
simplifies the way we handle "termination signals" which can be
SIGTERM or SIGINT on POSIX-compliant systems or can be Ctrl+C
or Ctrl+Break on Windows systems; whenever we receive any of these,
we'll attempt to terminate the database server.
Now, instead of waiting for multiple futures to complete, we create
a different `TerminationSignal` type that is a `Future`, which on
polling checks whether either of the signals have been received
or closed. We return the poll state for either.
The `upgrade` subcommand from `skyd` which was removed in 0.7, but was
erroneously accepted in the CLI parameters has been removed. This
was silently ignored.
This might make one think that we are being outrageously strict,
but at the end of the day, it can help investigate crashes
or inspect logs without artwork all over the place.
Following some discussions, the `user` mode was renamed to `dev`
mode.
This commit also upgrades some deps, other than clap which has
deprecated yaml support (we will continue to use 2.x).
Finally, the CHANGELOG was updated.
* Add debian package generation
* Install cargo-deb on `make deb`
* Reload systemd daemon on postinst
* Add auto upload for Debian packages
* Consider using runner.os for simplicity
All deps except for `clap` has been upgraded. Due to the removal of
the `args` field in `ArgMatches` in v3, and our dependence on the
field, we cannot upgrade to the latest version.
A PR has been created and once it is merged or a workaround
suggested, we can upgrade. (see clap-rs/clap#3265)
* Move macros into module
* Add the `whereami` action to identify the current entity
* Show entity group in the skysh prompt
* Add tests and actiondoc for `whereami`
* Add changelog entry
* Upgrade deps
The pipeline impl had a bug which caused a parse error; this happened
because we directly wrote the length as an integer (with the tsymbol)
when we were supposed to only write the integer in its string form
to the stream. This was fixed.
Also, some preliminary tests were added for pipelines.
The `ParsedConfig` struct was renamed to `ConfigurationSet` because it
is more clear in contexts as parsing can be an ambiguous term in several
places.
Signed-off-by: Sayan Nandan <nandansayan@outlook.com>
43bef62a incorrectly dismissed the check for host/port config in the
case of a non-TLS setup. This commit fixes that.
Signed-off-by: Sayan Nandan <nandansayan@outlook.com>
The previous configuration handling was rather messed up,
which however is something that this commit attempts to
simplify.
The check for configuration conflict was resolved with a far
more feasible approach and the handling of CLI/config file
configuration was also simplified greatly.
Signed-off-by: Sayan Nandan <nandansayan@outlook.com>
These iterators (if you'd call them) provide safe abstractions over raw
pointers, enabling us to easily use them in the storage engine, reducing
verbosity and redundancy.
This has the ability to introduce UB because advancing the cursor might
invalidate the source pointers obtained by `protocol::Parser::parse()`
That's why we only forward the cursor after the query has been
executed. This is absolutely fine because even if another query
packet has been buffered, we will only forward by the amount we read
and not anything more.
This commit does an extreme amount of `unsafe` work to reduce the amount
of data that is copied around. Previously, we:
1. Copied in from the buffer
2. Copied again when we did a ptr::read
This changes that to:
1. Zero-copies for operations that can operate on borrowed values
(i.e values by ref like GET, KEYLEN, ..)
2. One copy for operations that need owned values
Overall, this may not immediately seem beneficial for small queries
(afterall, computer memory is very fast), but for large keys -- this can
make a very significant difference. However, a big difference can be noticed
in memory usage under heavy load (with as much as a tenfold drop in some
cases).
But to make this possible, we have to break free from Rust's borrowing
rules as we can't normally pass around refs easily. So, we just turn
everything into raw pointers and operate on them directly. This is why
we introduce a significant amount of unsafe code.
Where actions were supposed to report an action error, they silently ran
their significant part, ignoring the rest. This was fixed in:
- `DROP TABLE`
- `INSPECT KEYSPACES`
- `INSPECT KEYSPACE <ksid>`
- `INSPECT TABLE <entity>`
- `USE <entity>`
Tests for the same were added
When a new instance is created, we need to:
1. Create the tree
This ONLY creates the directories
2. Create the PRELOAD
This is critical because this is our check for a new instance
3. Flush the tables
This is important because we have never flushed the tables/ks before.
If we don't do this -- the server would fail to start with a
`directory not found` error.
Checking all encoding errors beforehand simplifies a lot of things for
us. At the same time, this saves a lot of bandwidth as we don't have to
write encoding errors for each element -- instead we just write one.
* Add forceful dropping of keyspaces
This commit also improves the reliability of `drop keyspace` in general
* Add changelog
* Add tests for `force_drop_keyspace`
* Upgrade deps
This commit adds a storage_type segment to the PARTMAP disk file. This
contains information about the storage type of the table.
Is it volatile? Is it persistent? 8-bits were added for future
improvements.
* Use our own lock instead of parking_lot::Mutex
* Account for spurious failures in cmpxchg weak
* Ignore send error because parent may have panicked
The parent thread may have already panicked, dropping the rx.
* Enable maximum connections to be configured
* Add arbiter for handling server startup
* Add handling of maxcon for command-line args
* Add changelog entry
* Remove the need for TableLockStateGuard
The htable impl uses locks under the hood making external locks
redundant.
* Use atomics instead of rwlock for poisoned state
* Simplify snapshot locking
* Remove the pid file if runtime errors occur
* Clean up error handling and fix pid file creation
The pid file was being created before evaluating the args, now it may
happen that incorrect args or --help was passed: in that event, the pid
file remains created. This was also fixed, besides some refactoring.
* Deter other processes from using the same data dir
For more information, see #167
* Don't lock `pid_file`
Windows has mandatory locking so second instance won't be able to read
the PID of the other process. We'll just keep the file descriptor/handle
open
This is very useful because it removes the need for user intervention in
the event save on termination fails. Say the save operation fails due to
'some bad daemon' changing the directory's perms. Now skyd reports this
error while trying to save upon termination. Our sysadmin now fixes the
perms issue. The previous design would force the sysadmin to _somehow_
foreground skyd and hit enter. That is silly. The new design just
attempts to do a save operation every 10 seconds. So in case the issue
is fixed, the save operation will recover on its own.
Why not exponential backoff?
That's because the issue can be fixed some long time later and we may
have reached a large backoff value so the save that could have succeeded
would have to wait for a long duration before it can do anything
meaningful.
This also fixes a bug that caused BGSAVE errors to be reported as info
class log entries.
* Explicitly fsync and relax CPU on snap busy-loop
This commit also switches to using global `VERSION` and `URL` statics
than defining it per-crate.
* Add changelog entry and bump up version
* Optimize `dbtest` macro and rm redundant allocs
* Upgrade deps
This is a silly optimization but can be significantly faster for larger
action names. Also, since there are (should be) no UTF-8 characters in
the first argument, this is absolutely fine and sensible.
* Stop accepting writes if snapshotting fails
This is an important consideration: if BGSAVE fails and poisons the
database, snapshotting can and should too. But this is debatable in some
parts. For example, users may configure snapshots to be on a network
file system (symlinked maybe) and this can fail.
Now in some cases, this failure 'may be acceptable'. This commit adds a
way to customize this behavior through the `failsafe` key in the
snapshots section of the cfg file and through the --stop-write-on-fail
option passed to `skyd` on startup. However, BGSAVE remains unchanged:
it will always poison the database if it fails. If the user doesn't want
this, they can simply disable BGSAVE.
* Add changelog
* Create a new file on writing to flock-ed file
This fix is a very important one in two ways. Say we have an user A.
They go ahead and launch skyd. skyd creates a data.bin file. Now A just
deletes the data.bin file for fun. Funny enough, this never causes flock
to error!
Why? Well because the descriptor/handle is still valid and was just
unlinked from the current directory. But this might seem silly since
the user exits with a 'successfully saved notice' only to find that the
file never existed and all of their data was lost. That's bad.
There's a hidden problem in our current approach too, apart from this.
Our writing process begins by truncating the old file and then writing
to it by placing the cursor at 0. Nice, but what if this operation just
crashes. So we lost the current data AND the old data. Not good.
This commit does a better thing: it creates a new temporary file, locks
it before writing and then flushes the current data to the temporary
file. Once that succeeds, it replaces the old data.bin file with the
newly created file.
This solves both the problems mentioned here for us:
1. No more of the silly error
2. If BGSAVE crashes in between, we can be sure that at least the last
data.bin file is in proper shape and not half truncated or so.
This commit further moves the background services into their
own module(s) for easy management.
* Fix CI scripts
Fixes:
1. Our custom runner (drone/.ci.yml) was modified to kill the skyd
process once done since this pipeline is not ephemeral.
2. GHA for some reason ignores any error in the test step and proceeds
to kill the skyd process without erroring. Since GHA runners are
ephemeral, we don't need to do this manually.