What we did in the old implementation was pure over-engineering.
We relied on CoreDB's `Drop` impl to terminate the background services.
Now this is absolutely unreliable due to the nature of async functions.
We also relied on the bgsave scheduler to release the lock upon exit
which is also unreliable because we left the service to the mercy of the
runtime. We spawned the task and didn't hold as much as a `JoinHandle`
to it. That's bad because the runtime can just abort these tasks which
may result in the lock never being released. Even though it is designed
to release the lock on Drop, the destructor may however not be called at
all.
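The "destructor may not be called" problem can be reproduced in a few lines. This is only a sketch: `LockGuard` and the release counter are hypothetical names, and `std::mem::forget` merely stands in for any situation where a value is discarded without being dropped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical guard that "releases the lock" in its destructor.
struct LockGuard<'a> {
    released: &'a AtomicUsize,
}

impl<'a> Drop for LockGuard<'a> {
    fn drop(&mut self) {
        // Count every release so the caller can observe whether it happened.
        self.released.fetch_add(1, Ordering::SeqCst);
    }
}

/// The guard goes out of scope normally, so the destructor runs.
fn release_count_when_dropped() -> usize {
    let released = AtomicUsize::new(0);
    {
        let _guard = LockGuard { released: &released };
    }
    released.load(Ordering::SeqCst)
}

/// The guard is leaked, so the destructor never runs and the
/// "lock" is never released.
fn release_count_when_leaked() -> usize {
    let released = AtomicUsize::new(0);
    let guard = LockGuard { released: &released };
    std::mem::forget(guard);
    released.load(Ordering::SeqCst)
}
```

Relying on `Drop` alone for cleanup that must happen is therefore fragile; an explicit shutdown path avoids the problem.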
This commit fixes all those issues by simplifying the entire impl to
use Terminator. Now the background save and snapshot services run
independently, in their own tasks. Whenever the user passes a SIGINT,
we tell everyone to quit. The listeners understand that this is the
last query they'll process and the background save tasks exit almost
immediately. But what if some data was modified by this last query...?
No worries, that is completely handled by main(). The lock that BGSAVE
leaves is immediately (almost) returned to main and main will attempt
to flush the data almost immediately. That's how we maintain reliability.
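A rough sketch of this shutdown flow follows. It makes several assumptions: it uses OS threads and an `AtomicBool` instead of the async runtime, the `Terminator` here is a hypothetical stand-in rather than the real type, and "flushing" is reduced to copying the data out.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the shared termination signal.
#[derive(Clone)]
struct Terminator {
    stop: Arc<AtomicBool>,
}

impl Terminator {
    fn new() -> Self {
        Terminator { stop: Arc::new(AtomicBool::new(false)) }
    }
    fn signal(&self) {
        self.stop.store(true, Ordering::SeqCst);
    }
    fn should_quit(&self) -> bool {
        self.stop.load(Ordering::SeqCst)
    }
}

/// A background service (BGSAVE or snapshot) that exits once signalled.
fn spawn_service(term: Terminator) -> JoinHandle<()> {
    thread::spawn(move || {
        while !term.should_quit() {
            thread::sleep(Duration::from_millis(1));
        }
    })
}

/// A listener that still processes one last query before exiting.
fn spawn_listener(term: Terminator, data: Arc<Mutex<Vec<String>>>) -> JoinHandle<()> {
    thread::spawn(move || {
        while !term.should_quit() {
            thread::sleep(Duration::from_millis(1));
        }
        data.lock().unwrap().push(String::from("last write"));
    })
}

/// Starts everything, signals termination, waits for every task and
/// only then performs the final flush. We hold every `JoinHandle`.
fn run_and_shutdown() -> Vec<String> {
    let term = Terminator::new();
    let data = Arc::new(Mutex::new(Vec::new()));
    let handles = vec![
        spawn_service(term.clone()),
        spawn_service(term.clone()),
        spawn_listener(term.clone(), Arc::clone(&data)),
    ];
    term.signal(); // stands in for receiving SIGINT
    for handle in handles {
        handle.join().expect("a worker panicked");
    }
    // Every worker has exited, so the final flush sees the last write.
    let flushed = data.lock().unwrap().clone();
    flushed
}
```

The point of the design is that the final flush happens in one place, after every task is known to have stopped.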
This commit ensures that BGSAVE is optimistic in doing what it is doing:
If BGSAVE fails once, it will immediately poison the table. Now let's
say that some amazing sysadmin managed to SSH into the server and was
able to fix the storage issue; BGSAVE would be able to succeed.
The current implementation was flawed: firstly it prevented that and
secondly even if it succeeded in running BGSAVE, the server would refuse
to accept writes. This commit fixes this behavior.
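The intended behavior can be sketched as a flag that follows the outcome of the most recent BGSAVE attempt. The `Table` type, its method names and the error string below are hypothetical and only illustrate the idea.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical table state: `poisoned` is true while the most recent
/// BGSAVE attempt has failed.
struct Table {
    poisoned: AtomicBool,
}

impl Table {
    fn new() -> Self {
        Table { poisoned: AtomicBool::new(false) }
    }

    /// Records the outcome of one BGSAVE attempt. A failure poisons the
    /// table; a later success clears the poison again.
    fn record_bgsave(&self, outcome: Result<(), String>) {
        self.poisoned.store(outcome.is_err(), Ordering::SeqCst);
    }

    /// Writes are refused only while the table is poisoned.
    fn try_write(&self) -> Result<(), &'static str> {
        if self.poisoned.load(Ordering::SeqCst) {
            Err("table is poisoned")
        } else {
            Ok(())
        }
    }
}
```

BGSAVE keeps retrying on its schedule, so once the storage issue is fixed the next successful run re-enables writes.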
The size part of the metaline is absolutely redundant as we're doing
double the work while reading the size and then the real thing.
Since sizes won't have escape codes, we can freely read up to the LF.
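Reading a size "up to the LF" can be sketched as below. The function name `read_size` and its return shape are assumptions made for illustration, not the actual parser API.

```rust
/// Reads an unsigned decimal size terminated by LF.
/// Returns the size and the bytes that follow the LF, or `None` if the
/// LF is missing, the size is empty, or a non-digit is found.
fn read_size(buf: &[u8]) -> Option<(usize, &[u8])> {
    // Sizes contain no escape codes, so the first LF ends the size.
    let lf = buf.iter().position(|&b| b == b'\n')?;
    let digits = &buf[..lf];
    if digits.is_empty() {
        return None;
    }
    let mut size = 0usize;
    for &b in digits {
        if !b.is_ascii_digit() {
            return None;
        }
        // Checked arithmetic rejects sizes that would overflow.
        size = size.checked_mul(10)?.checked_add((b - b'0') as usize)?;
    }
    Some((size, &buf[lf + 1..]))
}
```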
If Parser::will_cursor_give_char is told not to error when nothing is
ahead, it returns Ok(bool) whether a char matches or the next line is
empty. If this_if_nothing_ahead is set to false, it returns a NotEnough
error when no more chars are available.
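One possible reading of this behavior is sketched here. The struct layout, the error enum and, in particular, returning `Ok(true)` when nothing is ahead are assumptions; the real parser may differ.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    NotEnough,
}

/// Hypothetical minimal parser: a buffer and a cursor into it.
struct Parser<'a> {
    buffer: &'a [u8],
    cursor: usize,
}

impl<'a> Parser<'a> {
    /// Checks whether the byte at the cursor equals `ch`.
    /// If nothing is ahead, the result depends on `this_if_nothing_ahead`:
    /// `true` yields `Ok(true)` (assumed), `false` yields `NotEnough`.
    fn will_cursor_give_char(
        &self,
        ch: u8,
        this_if_nothing_ahead: bool,
    ) -> Result<bool, ParseError> {
        match self.buffer.get(self.cursor) {
            Some(&byte) => Ok(byte == ch),
            None if this_if_nothing_ahead => Ok(true),
            None => Err(ParseError::NotEnough),
        }
    }
}
```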
The newly added test explains why.
It is likely that we'll change the HashMap implementation in the future,
hence it's best to hide away the HashMap to make sure we can easily
replace it.
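Hiding the map could look roughly like this. `KvStore` and its methods are hypothetical names; the only point is that callers never see `HashMap` in a signature.

```rust
use std::collections::HashMap;

/// Hypothetical wrapper: the `HashMap` is a private field, so it can be
/// swapped for another map without touching any caller.
pub struct KvStore {
    inner: HashMap<String, String>,
}

impl KvStore {
    pub fn new() -> Self {
        KvStore { inner: HashMap::new() }
    }

    /// Inserts or overwrites a value; returns true if the key was new.
    pub fn set(&mut self, key: &str, value: &str) -> bool {
        self.inner.insert(key.to_owned(), value.to_owned()).is_none()
    }

    /// Looks up a value without exposing the map's own types.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.inner.get(key).map(|v| v.as_str())
    }

    pub fn len(&self) -> usize {
        self.inner.len()
    }
}
```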
The previous logic was heavily flawed; it only had to check whether the
path was a dir and wasn't the remote snapshot directory.
Similarly, the file name parsing should only kick in if the item is a
file.
This commit adds changes so that the main process almost immediately
acquires a lock on the data file when runtime is dropped. This is just
an added precaution to try and ensure that no other process does
something silly with the data file.
The descriptor is cloned for this using `FileLock::try_clone`.
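A sketch of what cloning the descriptor means, built on the standard library's `File::try_clone`. The `FileLock` struct here is a simplified assumption that holds only the file and performs no actual locking; the temp file name is made up for the example.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Simplified stand-in for the real lock type: it only owns the file.
struct FileLock {
    file: File,
}

impl FileLock {
    fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(true)
            .open(path)?;
        Ok(FileLock { file })
    }

    /// Duplicates the underlying descriptor; both values refer to the
    /// same open file.
    fn try_clone(&self) -> io::Result<Self> {
        Ok(FileLock { file: self.file.try_clone()? })
    }
}

/// Writes through the clone and reads the bytes back through the
/// original, showing that both handles point at the same file.
fn write_via_clone_read_via_original() -> io::Result<String> {
    let name = format!("filelock-sketch-{}.bin", std::process::id());
    let path = std::env::temp_dir().join(name);
    let mut original = FileLock::open(&path)?;
    let mut clone = original.try_clone()?;
    clone.file.write_all(b"data")?;
    clone.file.flush()?;
    original.file.seek(SeekFrom::Start(0))?;
    let mut contents = String::new();
    original.file.read_to_string(&mut contents)?;
    drop(clone);
    drop(original);
    fs::remove_file(&path)?;
    Ok(contents)
}
```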
8e46e62 added a block_on_process_exit function that kept on sending
`notify_one()`s in a loop until the services terminated. This was
pointless as the `Drop` impl would do it for us anyway.
(What was I thinking?)
So, in main(), we're spawning an async task that lets the DB run as long
as we don't pass a ctrl_c (or some bad panic occurs). Once the ctrl_c
is received, we start terminating all workers. `block_on` returns DB
which should be the only one holding an atomic reference to the shared
field. We assert this right after dropping `runtime`.
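The "only one atomic reference left" check can be sketched with `Arc::strong_count`. Threads stand in for the async workers here, and the `CoreDB` and `Shared` definitions are reduced placeholders, not the real types.

```rust
use std::sync::Arc;
use std::thread;

/// Placeholder for the shared state behind the handle.
struct Shared;

/// Reduced placeholder for the database handle.
#[derive(Clone)]
struct CoreDB {
    shared: Arc<Shared>,
}

/// Hands a clone of the handle to each worker, waits for all of them
/// and reports how many references to `shared` remain.
fn references_after_shutdown() -> usize {
    let db = CoreDB { shared: Arc::new(Shared) };
    let workers: Vec<_> = (0..3)
        .map(|_| {
            let handle = db.clone();
            thread::spawn(move || {
                // The worker owns its clone and drops it on exit.
                let _db = handle;
            })
        })
        .collect();
    for worker in workers {
        worker.join().expect("worker panicked");
    }
    // Once every worker is gone, main holds the only reference.
    Arc::strong_count(&db.shared)
}
```

If the count is anything other than 1 at this point, some task is still alive and holding the data, which is exactly what the assertion is meant to catch.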
Finally, the ECONNRESET suppression match was fixed to remove an
unreachable branch by adding conditional compilation.
This commit ensures that the workers exit before attempting a flush_db
operation. Only after block_on_process_exit finishes do we return `db`.
Now we run a simple flush_db operation knowing that the lock has been
released.
To block on process termination, we introduce a new function
block_on_process_exit that does the same thing as CoreDB's Drop
implementation.
Windows is the most ingenious OS in the world where filenames can
conflict with shell commands. That's right, con is an I/O device on
Windows and cannot be used for a filename! This is why we were having
checkout errors on Windows!
Vive la POSIX!
We have introduced a trait `BufferedSocketStream` that is a 'dummy'
trait and is implemented for both `SslStream<TcpStream>` and
`TcpStream`. So, the generic `Connection` object accepts any type that
implements the `BufferedSocketStream` trait (and hence should also
implement `AsyncWrite`).
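The marker-trait pattern can be sketched with synchronous std I/O. Note the assumptions: `std::io::Read + Write` replace the async traits, `Cursor<Vec<u8>>` stands in for `SslStream<TcpStream>`, and `write_response` is a hypothetical method.

```rust
use std::io::{self, Cursor, Read, Write};
use std::net::TcpStream;

/// Marker ("dummy") trait: it has no methods of its own and only names
/// the set of stream types a connection may wrap.
trait BufferedSocketStream: Read + Write {}

impl BufferedSocketStream for TcpStream {}
// Stand-in for the TLS stream so the sketch needs no external crates.
impl BufferedSocketStream for Cursor<Vec<u8>> {}

/// One connection type for every supported stream.
struct Connection<T: BufferedSocketStream> {
    stream: T,
}

impl<T: BufferedSocketStream> Connection<T> {
    fn new(stream: T) -> Self {
        Connection { stream }
    }

    /// Written once, shared by all stream types.
    fn write_response(&mut self, bytes: &[u8]) -> io::Result<()> {
        self.stream.write_all(bytes)?;
        self.stream.flush()
    }

    fn into_inner(self) -> T {
        self.stream
    }
}
```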
This commit does a LOT! It migrates the `queryengine::execute_simple`,
`CoreDB::execute_query` and the kvengine functions to use generic
connections.
The object dbnet::Con was removed because it isn't needed anymore.
The listeners were also upgraded to use the generic connection handler.
The traits `Con` and `ConOps` were renamed to `ProtocolConnectionExt`
and `ProtocolConnection`.
This naming scheme clearly explains that the Ext version 'augments' the
non-Ext impl. This is the very case here: ProtocolConnection provides
the basic functions needed for interfacing with net I/O while the Ext
trait enables high-level interaction with the protocol and ultimately
queries.
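The base/Ext split with a blanket implementation might look like the sketch below. The method names, the `MockConnection` type and the length-prefixed wire format are all made up for illustration; only the two trait names come from the commit.

```rust
/// Base trait: the low-level operation an implementor must provide.
trait ProtocolConnection {
    fn write_raw(&mut self, bytes: &[u8]);
}

/// Ext trait: high-level helpers built purely on the base trait.
trait ProtocolConnectionExt: ProtocolConnection {
    /// Hypothetical framing: length, LF, then the body.
    fn write_response(&mut self, body: &str) {
        self.write_raw(body.len().to_string().as_bytes());
        self.write_raw(b"\n");
        self.write_raw(body.as_bytes());
    }
}

/// Blanket impl: every `ProtocolConnection` gets the Ext methods free.
impl<T: ProtocolConnection> ProtocolConnectionExt for T {}

/// In-memory connection used only to exercise the traits.
struct MockConnection {
    out: Vec<u8>,
}

impl ProtocolConnection for MockConnection {
    fn write_raw(&mut self, bytes: &[u8]) {
        self.out.extend_from_slice(bytes);
    }
}
```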
A generic `ConnectionHandler` object was added that will replace the
SSL and non-SSL handler objects, again reducing redundancy.
Dummy execute functions were added to CoreDB and queryengine.
This commit defines two traits: `Con` and `ConOps`. Implementors of
`ConOps` get a free implementation for `Con`. `Con` is the ultimate
object that can be used in place of the current SSL/non-SSL connection
objects. If you look at the implementations of the current connection
objects, they have a lot of repetition as they do almost the same thing
except for the fact that they have a different underlying stream.
This is exactly what we're trying to eliminate. We will also define a
generic connection handler object to reduce redundancy.
Several changes were made to accommodate this, including the addition
of the write_to_disk function that should be used by fns which don't
have a FileLock to pass for flushing data to the disk.
BGSAVE now takes ownership of a FileLock object which it uses for
running BGSAVE.
It seems that on Windows, unlocking fails if the file has already been
unlocked. To fix this, we've added a platform-specific field that records
whether the FileLock object has already been used to unlock the file.
This is the unlocked field. In the Drop impl for Windows, we check the
unlocked flag to determine if we need to unlock the file.
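The flag check can be sketched without any OS calls. Here a shared counter stands in for the real platform unlock call, and the struct layout and function names are assumptions.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Mock lock: `sys_unlock_calls` counts calls to the (pretend) OS
/// unlock routine, `unlocked` remembers whether it has been called.
struct FileLock {
    sys_unlock_calls: Rc<Cell<u32>>,
    unlocked: bool,
}

impl FileLock {
    fn new(counter: Rc<Cell<u32>>) -> Self {
        FileLock { sys_unlock_calls: counter, unlocked: false }
    }

    /// Stand-in for the platform-specific unlock call.
    fn sys_unlock(&self) {
        self.sys_unlock_calls.set(self.sys_unlock_calls.get() + 1);
    }

    /// Explicit unlock: performs the call once and records it.
    fn unlock(&mut self) {
        if !self.unlocked {
            self.sys_unlock();
            self.unlocked = true;
        }
    }
}

impl Drop for FileLock {
    fn drop(&mut self) {
        // Skip the OS call if the file has already been unlocked.
        if !self.unlocked {
            self.sys_unlock();
        }
    }
}

fn unlock_calls_with_explicit_unlock() -> u32 {
    let counter = Rc::new(Cell::new(0));
    let mut lock = FileLock::new(Rc::clone(&counter));
    lock.unlock();
    drop(lock);
    counter.get()
}

fn unlock_calls_with_drop_only() -> u32 {
    let counter = Rc::new(Cell::new(0));
    let lock = FileLock::new(Rc::clone(&counter));
    drop(lock);
    counter.get()
}
```

Either way the unlock routine runs exactly once, which is what avoids the double-unlock error.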
This commit implements file locks for Unix-based systems and Windows.
This is done by using platform-specific `__sys` modules for
locking, trying to lock and unlocking files.
A build script was added for Unix systems that makes use of the
flock-posix.c file.
This commit removes the fscposix.c file and begins implementing native
file locking mechanisms for each supported platform.
BSD-style `flock`s were added.
This commit adds a basic implementation of POSIX advisory record locking
which sets a lock on the `data.bin` file when the database server starts
and releases the lock when it terminates. This is just done for
compliance to let other processes know that we don't want them to use
the file.
However, the result depends entirely on the process that wants to do
'something' with the file. It is the responsibility of the process to
ensure that it respects the file lock.
Also, exclusive locks aren't perfect on Linux, so we can't rely on them.
See discussion #123 for more information.