
Merge pull request #994 from rqlite/k-docs

Add doc
master
Philip O'Toole 3 years ago committed by GitHub
commit 5364eda8e9
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

@@ -126,7 +126,7 @@ _This section borrows heavily from the Consul documentation._
In the event that multiple rqlite nodes are lost, causing a loss of quorum and a complete outage, partial recovery is possible using data on the remaining nodes in the cluster. There may be data loss in this situation because multiple servers were lost, so information about what's committed could be incomplete. The recovery process implicitly commits all outstanding Raft log entries, so it's also possible to commit data -- and therefore change the SQLite database -- that was uncommitted before the failure.
-**You must also follow the recovery process if a cluster simply restarts, but all nodes (or a quorum of nodes) come up with different Raft IP addresses. This can happen in certain deployment configurations.**
+**You may also need to follow the recovery process if a cluster simply restarts, but all nodes (or a quorum of nodes) come up with different network identifiers. This can happen in certain deployment configurations.**
To begin, stop all remaining nodes. You can attempt a graceful node-removal, but it will not work in most cases. Do not worry if the remove operation results in an error. The cluster is in an unhealthy state, so this is expected.
@@ -152,7 +152,7 @@ The next step is to go to the _data_ directory of each rqlite node. Inside that
]
```
-`id` specifies the node ID of the server, which must not be changed from its previous value. The ID for a given node can be found in the logs when the node starts up if it was auto-generated. `address` specifies the desired Raft IP and port for the node, which does not need to be the same as previously. `non_voter` controls whether the server is a read-only node. If omitted, it will default to false, which is typical for most rqlite nodes.
+`id` specifies the node ID of the server, which must not be changed from its previous value. The ID for a given node can be found in the logs when the node starts up if it was auto-generated. `address` specifies the desired Raft IP and port for the node, which does not need to be the same as previously. You can use hostnames instead of IP addresses if you prefer. `non_voter` controls whether the server is a read-only node. If omitted, it will default to false, which is typical for most rqlite nodes.
Next simply create entries for all nodes. You must confirm that nodes you do not include here have indeed failed and will not later rejoin the cluster. Ensure that this file is the same across all remaining rqlite nodes. At this point, you can restart your rqlite cluster.
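Because the file must be identical on every remaining node, it can help to generate it once and copy it out. The following is a minimal sketch, not part of rqlite: it assumes the file is a JSON array of objects with the `id`, `address`, and `non_voter` fields described above. The helper name `build_peers`, the node IDs, and the addresses (including port 4002) are placeholders for illustration.

```python
import json


def build_peers(nodes):
    """Return the recovery file contents as a JSON string.

    `nodes` is a list of (node_id, raft_address, non_voter) tuples.
    """
    entries = []
    for node_id, address, non_voter in nodes:
        # Every address needs both a host (IP or hostname) and a port.
        if ":" not in address:
            raise ValueError(f"address {address!r} needs a host and a port")
        entries.append({"id": node_id, "address": address, "non_voter": non_voter})

    # A duplicated ID almost certainly means a copy-and-paste mistake.
    ids = [entry["id"] for entry in entries]
    if len(ids) != len(set(ids)):
        raise ValueError("node IDs must be unique")

    return json.dumps(entries, indent=2)


# Hypothetical surviving nodes; replace with your real IDs and addresses.
peers = build_peers([
    ("node-1", "rqlite-0.rqlite-svc:4002", False),
    ("node-2", "rqlite-1.rqlite-svc:4002", False),
])
print(peers)
```

Write the resulting text to the recovery file on each remaining node before restarting, so that every node sees the same set of entries.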

@@ -0,0 +1,74 @@
# Running rqlite on Kubernetes
This document provides an example of how to run rqlite as a Kubernetes [StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/).
## Creating a cluster
### Create a Headless Service
The first thing to do is to create a [Kubernetes _Headless Service_](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services). The Headless Service creates the required DNS entries, which allow the rqlite nodes to find each other and automatically bootstrap a new cluster.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: rqlite-svc
spec:
  clusterIP: None
  selector:
    app: rqlite
  ports:
    - protocol: TCP
      port: 4001
      targetPort: 4001
```
Apply the configuration above to your Kubernetes cluster. It will create a DNS entry `rqlite-svc`, which will resolve to the IP addresses of any Pods with the label `app: rqlite`.
### Create a StatefulSet
For an rqlite cluster to function properly in a production environment, the rqlite nodes require a persistent network identifier and storage. This is what a StatefulSet can provide. The example below shows you how to configure a 3-node rqlite cluster.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rqlite
spec:
  selector:
    matchLabels:
      app: rqlite # has to match .spec.template.metadata.labels
  serviceName: "rqlite-svc"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: rqlite # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: rqlite
        image: rqlite/rqlite
        args: ["-disco-mode=dns","-disco-config={\"name\":\"rqlite-svc\"}","-bootstrap-expect","3"]
        ports:
        - containerPort: 4001
          name: rqlite
        readinessProbe:
          httpGet:
            scheme: HTTP
            path: /readyz?noleader
            port: 4001
          initialDelaySeconds: 10
          periodSeconds: 5
        volumeMounts:
        - name: rqlite-file
          mountPath: /rqlite/file
  volumeClaimTemplates:
  - metadata:
      name: rqlite-file
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "standard"
      resources:
        requests:
          storage: 1Gi
```
Note the `args` passed to rqlite. The arguments tell rqlite to use `dns` discovery mode, and to resolve the DNS name `rqlite-svc` to find the IP addresses of other nodes in the cluster. They also tell rqlite to wait until three nodes are available (counting itself as one of those nodes) before attempting to form a cluster.
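To make the discovery step concrete, here is a rough sketch of the idea: resolve a name, collect the distinct addresses, and check whether enough of them are visible. This is an illustration only and is not how rqlite itself is implemented; the function names `discover` and `ready_to_bootstrap` are hypothetical, and the resolver is a parameter so the logic can be tried without a cluster.

```python
import socket


def discover(name, port, resolver=socket.getaddrinfo):
    """Resolve `name` and return the sorted, de-duplicated IP addresses."""
    infos = resolver(name, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def ready_to_bootstrap(addresses, expect):
    """True once at least `expect` distinct addresses are visible."""
    return len(set(addresses)) >= expect


# Inside a Pod in the same namespace, a lookup might look like this:
#   addrs = discover("rqlite-svc", 4001)
#   ready_to_bootstrap(addrs, 3)
```

With a Headless Service, the lookup returns one address per matching Pod rather than a single virtual IP, which is what makes this style of discovery possible.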
## Scaling the cluster
You can grow the cluster at any time, simply by increasing the replica count. Shrinking the cluster, however, will require some manual intervention. As well as reducing the `replicas` value, you also need to [explicitly remove](https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGMT.md#removing-or-replacing-a-node) the deprovisioned nodes, or the Leader will continually attempt to contact those nodes.
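As a sketch of the explicit removal step, the snippet below builds the HTTP request without sending it. It assumes the removal endpoint described in the linked guide is `DELETE /remove` with a JSON body carrying the node ID; check that guide for your rqlite version. The host name `rqlite-0.rqlite-svc`, the node ID `node-3`, and the helper name `build_remove_request` are placeholders.

```python
import json
import urllib.request


def build_remove_request(api_base, node_id):
    """Build (but do not send) a request asking the cluster to drop a node."""
    body = json.dumps({"id": node_id}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/remove",
        data=body,
        method="DELETE",
        headers={"Content-Type": "application/json"},
    )


# Placeholder host and node ID; substitute a live node and the real ID.
req = build_remove_request("http://rqlite-0.rqlite-svc:4001", "node-3")
print(req.get_method(), req.full_url)

# To actually send it, from somewhere that can reach the cluster:
#   urllib.request.urlopen(req)
```

Remove each deprovisioned node in turn, and confirm the cluster is still healthy before moving on to the next one.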
> :warning: **Be careful not to shrink the cluster such that there is no longer a quorum of nodes available. If you do this you will render your cluster unusable, and need to perform a manual recovery.** The manual recovery process is [fully documented](https://github.com/rqlite/rqlite/blob/master/DOC/CLUSTER_MGMT.md#dealing-with-failure).
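
The arithmetic behind the warning is simple enough to check before scaling down. This sketch uses the standard Raft majority rule (more than half of the voting nodes); the function names are hypothetical, and it ignores non-voting nodes, which do not count toward quorum.

```python
def quorum(voters):
    """Smallest number of voting nodes that forms a majority."""
    return voters // 2 + 1


def has_quorum(voters, reachable):
    """True if `reachable` of the `voters` configured nodes form a majority."""
    return reachable >= quorum(voters)


for voters in (3, 5):
    tolerated = voters - quorum(voters)
    print(f"{voters} voters: quorum is {quorum(voters)}, can lose {tolerated}")
```

For example, taking a 3-node cluster straight down to one running node, without first removing the other two from the cluster configuration, leaves one reachable voter out of three, which is below the quorum of two.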