# Automatic clustering
This document describes various ways to dynamically form rqlite clusters, which is particularly useful for automating your deployment of rqlite.
> :warning: **This functionality was introduced in version 7.x. It does not exist in earlier releases.**
## Contents
* [Quickstart](#quickstart)
* [Automatic Bootstrapping](#automatic-bootstrapping)
* [Using DNS for Bootstrapping](#using-dns-for-bootstrapping)
* [DNS SRV](#dns-srv)
* [Kubernetes](#kubernetes)
* [Consul](#consul)
* [etcd](#etcd)
* [Next steps](#next-steps)
* [Customizing your configuration](#customizing-your-configuration)
* [Running multiple different clusters](#running-multiple-different-clusters)
* [Design](#design)
## Quickstart
### Automatic Bootstrapping
While [manually creating a cluster](https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGMT.md) is simple, it does suffer one drawback -- you must start one node first and with different options, so it can become the Leader. _Automatic Bootstrapping_, in contrast, allows you to start all the nodes at once, and in a very similar manner. You just need to know the network addresses of the nodes ahead of time.
For simplicity, let's assume you want to run a 3-node rqlite cluster. To bootstrap the cluster, use the `-bootstrap-expect` option like so:
Node 1:
```bash
rqlited -node-id $ID1 -http-addr=$IP1:4001 -raft-addr=$IP1:4002 \
-bootstrap-expect 3 -join http://$IP1:4001,http://$IP2:4001,http://$IP3:4001 data
```
Node 2:
```bash
rqlited -node-id $ID2 -http-addr=$IP2:4001 -raft-addr=$IP2:4002 \
-bootstrap-expect 3 -join http://$IP1:4001,http://$IP2:4001,http://$IP3:4001 data
```
Node 3:
```bash
rqlited -node-id $ID3 -http-addr=$IP3:4001 -raft-addr=$IP3:4002 \
-bootstrap-expect 3 -join http://$IP1:4001,http://$IP2:4001,http://$IP3:4001 data
```
`-bootstrap-expect` should be set to the number of nodes that must be available before the bootstrapping process will commence, in this case 3. You also set `-join` to the HTTP URLs of all 3 nodes in the cluster. **Each launch command must use the same values for `-bootstrap-expect` and `-join`.**
After the cluster has formed, you can launch more nodes with the same options. A node will always attempt to first perform a normal cluster-join using the given join addresses, before trying the bootstrap approach.
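Because every node must receive an identical `-join` list, it can help to generate that list once rather than typing it per node. The following is a minimal sketch, not part of rqlite: the helper `build_join_list`, the node IDs `node1` to `node3`, and the `10.0.0.x` addresses are all hypothetical, and it assumes every node serves HTTP on port 4001 and Raft on port 4002 as in the examples above. It only prints the launch commands; it does not run them.

```bash
# build_join_list ADDR...
# Prints a comma-separated list of HTTP URLs, one per address.
# Assumption: every node listens for HTTP on port 4001.
build_join_list() {
    list=""
    for addr in "$@"; do
        list="${list:+$list,}http://${addr}:4001"
    done
    printf '%s\n' "$list"
}

# Hypothetical addresses; substitute your own.
NODES="10.0.0.1 10.0.0.2 10.0.0.3"
JOIN=$(build_join_list $NODES)
EXPECT=$(set -- $NODES; echo $#)   # number of addresses in NODES

# Print (rather than run) the launch command for each node.
i=1
for ip in $NODES; do
    echo "rqlited -node-id node$i -http-addr=$ip:4001 -raft-addr=$ip:4002 -bootstrap-expect $EXPECT -join $JOIN data"
    i=$((i + 1))
done
```

Deriving both `-bootstrap-expect` and `-join` from one list keeps the two values consistent across all nodes.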
#### Docker
With Docker you can launch every node identically:
```bash
docker run rqlite/rqlite -bootstrap-expect 3 -join http://$IP1:4001,http://$IP2:4001,http://$IP3:4001
```
where `$IP[1-3]` are the expected network addresses of the containers.
__________________________
### Using DNS for Bootstrapping
You can also use the Domain Name System (DNS) to bootstrap a cluster. This is similar to Automatic Bootstrapping, but doesn't require you to specify the network addresses at the command line. Instead you create a DNS record for a hostname (`rqlite.local` in the example below), with an [A Record](https://www.cloudflare.com/learning/dns/dns-records/dns-a-record/) for each rqlite node's HTTP IP address.
To launch a node using DNS bootstrap, execute the following (example) command:
```bash
rqlited -node-id $ID1 -http-addr=$IP1:4001 -raft-addr=$IP1:4002 \
-disco-mode=dns -disco-config='{"name": "rqlite.local"}' -bootstrap-expect 3 data
```
You would launch other nodes similarly.
#### DNS SRV
Using [DNS SRV](https://www.cloudflare.com/learning/dns/dns-records/dns-srv-record/) gives you more control over the rqlite node address details returned by DNS, including the HTTP port each node is listening on. This means that unlike using just simple DNS records, each rqlite node can be listening on a different HTTP port. Simple DNS records are probably good enough for most situations, however.
To launch a node using DNS SRV bootstrap, execute the following (example) command:
```bash
rqlited -node-id $ID1 -http-addr=$IP1:4001 -raft-addr=$IP1:4002 \
-disco-mode=dns-srv -disco-config='{"name": "rqlite.local", "service": "rqlite-svc"}' -bootstrap-expect 3 data
```
You would launch other nodes similarly.
#### Kubernetes
DNS-based approaches can be quite useful for many deployment scenarios, in particular systems like Kubernetes and Consul. A [Kubernetes _Headless Service_](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services), for example, creates the right DNS configuration automatically, allowing you to bootstrap an rqlite cluster with it. With the following, very simple, Headless Service definition, the hostname `rqlite` resolves to the IP addresses of all rqlite Pods that are part of the service.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: rqlite
spec:
  clusterIP: None
  selector:
    app: rqlite
  ports:
    - protocol: TCP
      port: 4001
      targetPort: 4001
```
This is just an example. A real Kubernetes deployment will also require [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) and (most likely) a [StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/).
__________________________
### Consul
Another approach uses [Consul](https://www.consul.io/) to coordinate clustering. The advantage of this approach is that you do not need to know the network addresses of the nodes ahead of time.
Let's assume your Consul cluster is running at `http://example.com:8500`. Let's also assume that you are going to run 3 rqlite nodes, each node on a different machine. Launch your rqlite nodes as follows:
Node 1:
```bash
rqlited -node-id $ID1 -http-addr=$IP1:4001 -raft-addr=$IP1:4002 \
-disco-mode consul-kv -disco-config '{"address": "example.com:8500"}' data
```
Node 2:
```bash
rqlited -node-id $ID2 -http-addr=$IP2:4001 -raft-addr=$IP2:4002 \
-disco-mode consul-kv -disco-config '{"address": "example.com:8500"}' data
```
Node 3:
```bash
rqlited -node-id $ID3 -http-addr=$IP3:4001 -raft-addr=$IP3:4002 \
-disco-mode consul-kv -disco-config '{"address": "example.com:8500"}' data
```
These three nodes will automatically find each other and form a cluster. You can start the nodes in any order and at any time. Furthermore, the cluster Leader will continually update Consul with its address. This means other nodes can be launched later and automatically join the cluster, even if the Leader changes.
#### Docker
It's even easier with Docker, as you can launch every node almost identically:
```bash
docker run rqlite/rqlite -disco-mode=consul-kv -disco-config '{"address": "example.com:8500"}'
```
__________________________
### etcd
A third approach uses [etcd](https://www.etcd.io/) to coordinate clustering. Autoclustering with etcd is very similar to Consul. As with Consul, the advantage of this approach is that you do not need to know the network addresses of the nodes ahead of time.
Let's assume etcd is available at `example.com:2379`.
Node 1:
```bash
rqlited -node-id $ID1 -http-addr=$IP1:4001 -raft-addr=$IP1:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints": ["example.com:2379"]}' data
```
Node 2:
```bash
rqlited -node-id $ID2 -http-addr=$IP2:4001 -raft-addr=$IP2:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints": ["example.com:2379"]}' data
```
Node 3:
```bash
rqlited -node-id $ID3 -http-addr=$IP3:4001 -raft-addr=$IP3:4002 \
-disco-mode etcd-kv -disco-config '{"endpoints": ["example.com:2379"]}' data
```
As with Consul autoclustering, the cluster Leader will continually report its address to etcd.
#### Docker
```bash
docker run rqlite/rqlite -disco-mode=etcd-kv -disco-config '{"endpoints": ["example.com:2379"]}'
```
## Next Steps
### Customizing your configuration
For detailed control over Discovery configuration, `-disco-config` can be either an actual JSON string or a path to a file containing a JSON-formatted configuration. The former option may be more convenient if the configuration you need to supply is very short, as in the examples above.
The examples above demonstrate simple configurations, and most real deployments may require more detailed configuration. For example, your Consul system might be reachable over HTTPS. To more fully configure rqlite for Discovery, consult the relevant configuration specification below. You must create a JSON-formatted configuration which matches that described in the source code.
- [Full Consul configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/consul/config.go)
- [Full etcd configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/etcd/config.go)
- [Full DNS configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/dns/config.go)
- [Full DNS SRV configuration description](https://github.com/rqlite/rqlite-disco-clients/blob/main/dnssrv/config.go)
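As an illustration of the file-based form of `-disco-config`, the sketch below writes a Consul configuration to a file and prints a launch command that passes the file path. The file location is an arbitrary choice, and the only key shown is `address`, taken from the Consul examples above; any further keys must come from the configuration descriptions linked above. The command is printed, not executed.

```bash
# Write the discovery configuration to a file instead of passing inline JSON.
# The path is arbitrary; "address" is the only key used in the examples above.
CONFIG_FILE="${TMPDIR:-/tmp}/rqlite-consul.json"
cat > "$CONFIG_FILE" <<'EOF'
{
    "address": "example.com:8500"
}
EOF

# Print (rather than run) a launch command that passes the file path.
echo "rqlited -node-id \$ID1 -http-addr=\$IP1:4001 -raft-addr=\$IP1:4002 -disco-mode consul-kv -disco-config $CONFIG_FILE data"
```

Keeping the configuration in a file avoids shell-quoting problems once the JSON grows beyond a single key.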
### Running multiple different clusters
If you wish a single Consul or etcd key-value system to support multiple rqlite clusters, then set the `-disco-key` command line argument to a different value for each cluster. To run multiple rqlite clusters with DNS, use a different domain name per cluster.
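For example, two clusters sharing one Consul system could be launched along the following lines. This is a sketch under assumptions: the helper `launch_cmd`, the cluster names `orders` and `billing`, and the node IDs and addresses are hypothetical. Only `-disco-key` and the other flags already shown in this document are used, and the commands are printed rather than run.

```bash
# launch_cmd CLUSTER NODE_ID IP
# Prints (does not run) a launch command whose -disco-key is the cluster name.
launch_cmd() {
    printf 'rqlited -node-id %s -http-addr=%s:4001 -raft-addr=%s:4002 -disco-mode consul-kv -disco-key %s -disco-config %s data\n' \
        "$2" "$3" "$3" "$1" "'{\"address\": \"example.com:8500\"}'"
}

# Hypothetical cluster names, node IDs, and addresses.
launch_cmd orders  orders-1  10.0.0.1
launch_cmd billing billing-1 10.0.1.1
```

Every node within one cluster uses the same `-disco-key` value, while nodes in the other cluster use a different one.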
## Design
When using Automatic Bootstrapping, each node notifies all other nodes of its existence. The first node to have a record of enough nodes (set by `-bootstrap-expect`) forms the cluster. Only one node can ever form a cluster. Any node that attempts to do so later will fail, and will instead become a Follower in the new cluster.
When using either Consul or etcd for automatic clustering, rqlite uses the key-value store of each system. In each case the Leader atomically sets its HTTP URL, allowing other nodes to discover it. To prevent multiple nodes updating the Leader key at once, nodes use a check-and-set operation, only updating the Leader key if its value has not changed since it was last read by the node. See [this blog post](https://www.philipotoole.com/rqlite-7-0-designing-node-discovery-and-automatic-clustering/) for more details on the design.
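The check-and-set idea can be modelled with a toy, single-process example. This is not rqlite's code and does not talk to Consul or etcd: the store is a local temporary file, the version counter is an assumption standing in for whatever change marker the real key-value system provides, and the atomicity that Consul and etcd guarantee is simply taken for granted here. All function names and addresses are hypothetical.

```bash
# Toy key-value store: one line of the form "VERSION VALUE".
STORE=$(mktemp)
printf '0 none\n' > "$STORE"

# kv_version prints the current version of the Leader key.
kv_version() { cut -d' ' -f1 "$STORE"; }

# kv_cas EXPECTED_VERSION NEW_VALUE
# Writes NEW_VALUE only if the version still equals EXPECTED_VERSION.
kv_cas() {
    current=$(kv_version)
    [ "$current" = "$1" ] || return 1
    printf '%s %s\n' "$((current + 1))" "$2" > "$STORE"
}

# Two nodes read the same version, then both try to set the Leader key.
seen=$(kv_version)
kv_cas "$seen" "http://10.0.0.1:4001" && echo "node1 set the Leader key"
kv_cas "$seen" "http://10.0.0.2:4001" || echo "node2 lost the race and must re-read"
cat "$STORE"
```

The second write is rejected because the key changed after it was read, which is the property that stops two nodes from both recording themselves as Leader.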
For DNS-based discovery, the rqlite nodes simply resolve the hostname, and use the returned network addresses, once the number of returned addresses is at least as great as the `-bootstrap-expect` value. Clustering then proceeds as though the network addresses were passed at the command line via `-join`.
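The threshold described above can be sketched as a small shell check. This illustrates the rule only and is not how rqlite implements it: the function `ready_to_bootstrap` is hypothetical, and it expects one resolved address per line on standard input, which is the shape of output produced by a command such as `dig +short rqlite.local A`.

```bash
# ready_to_bootstrap EXPECT
# Reads one address per line on stdin and succeeds once at least EXPECT
# addresses are present.
ready_to_bootstrap() {
    expect=$1
    count=$(grep -c . || true)   # number of non-empty lines on stdin
    [ "$count" -ge "$expect" ]
}

# Two addresses are not enough when three nodes are expected.
printf '10.0.0.1\n10.0.0.2\n' | ready_to_bootstrap 3 \
    || echo "waiting: fewer than 3 addresses"

# Three addresses meet the threshold.
printf '10.0.0.1\n10.0.0.2\n10.0.0.3\n' | ready_to_bootstrap 3 \
    && echo "ready: bootstrap can proceed"
```

Against a live resolver you might run `dig +short rqlite.local A | ready_to_bootstrap 3` to see whether enough records are published yet.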