parent 6df797b145
commit e4fed0cee3

@ -1,75 +0,0 @@
# rqlite Cluster Discovery Service

_For full details on how the Discovery Service is implemented using AWS Lambda and DynamoDB, check out [this blog post](http://www.philipotoole.com/building-a-cluster-discovery-service-with-aws-lambda-and-dynamodb/)._

> :warning: **rqlite 7.0 and later does not support this legacy Discovery service.** If you wish to use the legacy Discovery service, you must run rqlite 6.x or earlier. **The legacy Discovery service is also deprecated and may be removed in the future**. However, the [source code and design](https://github.com/rqlite/rqlite-disco) for the Discovery service has been published, which means you can run your own Discovery service if you wish.

To form a rqlite cluster, the joining node must be supplied with the network address of some other node in the cluster. This requirement -- that one must know the network address of other nodes to join a cluster -- can be inconvenient in various environments. For example, if you do not know which network addresses will be assigned ahead of time, creating a cluster for the first time requires the following steps:
* First start one node and specify its network address.
* Let it become the leader.
* Start the next node, passing the network address of the first node to the second.
* Repeat the previous step until you have a cluster of the desired size.

To make all this easier, rqlite also supports _discovery_ mode. In this mode each node registers its network address with an external service, and learns the _join_ addresses of other nodes from the same service.

As a convenience, a free Discovery Service for rqlite is hosted at `discovery.rqlite.com`. Note that this service is provided on an _as-is_ basis, with no guarantees it will always be available (though it has been built in a highly-reliable manner). If you wish to run your own copy of the service, you can deploy the Discovery Service source code yourself.
## Creating a Discovery Service ID

To form a new cluster via discovery, you must first generate a unique Discovery ID for your cluster. This ID is then passed to each node on start-up, allowing the rqlite nodes to automatically connect to each other. To generate an ID using the rqlite Discovery Service, hosted at `discovery.rqlite.com`, execute the following command:

```shell
curl -XPOST -L -w "\n" 'http://discovery.rqlite.com'
```

The output of this command will be something like this:

```json
{
    "created_at": "2017-02-20 01:25:45.589277",
    "disco_id": "809d9ba6-f70b-11e6-9a5a-92819c00729a",
    "nodes": []
}
```

In the example above, the ID `809d9ba6-f70b-11e6-9a5a-92819c00729a` was returned by the service.
This ID is then provided to each node on start-up:

```shell
rqlited -disco-id 809d9ba6-f70b-11e6-9a5a-92819c00729a
```

When a node registers using the ID, it is returned the current list of nodes that have registered using that ID. If the node is the first to access the service using the ID, it will receive a list that contains just itself -- and will subsequently elect itself leader. Subsequent nodes will then receive a list with more than one entry. These nodes will use one of the join addresses in the list to join the cluster.
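As an illustration of the list growing (the node addresses below are invented for this sketch, not actual service output), once two nodes have registered under an ID, a later registration might return a response such as:

```json
{
    "created_at": "2017-02-20 01:25:45.589277",
    "disco_id": "809d9ba6-f70b-11e6-9a5a-92819c00729a",
    "nodes": ["http://host1:4001", "http://host2:4001"]
}
```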
### Controlling the registered join address

By default, each node registers the address passed in via the `-http-addr` option. However, if you instead set `-http-adv-addr` when starting a node, the node will register that address. This can be useful when telling a node to listen on all interfaces, but that it should be contacted at a specific address. For example:

```shell
rqlited -disco-id 809d9ba6-f70b-11e6-9a5a-92819c00729a -http-addr 0.0.0.0:4001 -http-adv-addr host1:4001
```

In this example, other nodes will contact this node at `host1:4001`.
## Caveats

If a node is already part of a cluster, no attempt is made to contact the Discovery service, even if a Discovery ID is passed to the node at startup.

## Example

Create a Discovery Service ID:

```shell
$ curl -XPOST -L -w "\n" 'http://discovery.rqlite.com/'
{
    "created_at": "2017-02-20 01:25:45.589277",
    "disco_id": "b3da7185-725f-461c-b7a4-13f185bd5007",
    "nodes": []
}
```

To automatically form a 3-node cluster, simply pass the ID to 3 nodes, all of which can be started simultaneously via the following commands:

```shell
$ rqlited -disco-id b3da7185-725f-461c-b7a4-13f185bd5007 ~/node.1
$ rqlited -http-addr localhost:4003 -raft-addr localhost:4004 -disco-id b3da7185-725f-461c-b7a4-13f185bd5007 ~/node.2
$ rqlited -http-addr localhost:4005 -raft-addr localhost:4006 -disco-id b3da7185-725f-461c-b7a4-13f185bd5007 ~/node.3
```

_This demonstration shows all 3 nodes running on the same host. In reality you probably wouldn't do this, and then you wouldn't need to select different `-http-addr` and `-raft-addr` ports for each rqlite node._

## Removing registered addresses

If you need to remove an address from the list of registered addresses, perhaps because a node has permanently left a cluster, you can do so via the following command (be sure to pass all the options shown to `curl`):

```shell
$ curl -XDELETE -L --post301 http://discovery.rqlite.com/<disco ID> -H "Content-Type: application/json" -d '{"addr": "<node address>"}'
```

For example:

```shell
$ curl -XDELETE -L --post301 http://discovery.rqlite.com/be0dd310-fe41-11e6-bb97-92e4c2da9b50 -H "Content-Type: application/json" -d '{"addr": "http://192.168.0.1:4001"}'
```
@ -0,0 +1,191 @@
package cluster

import (
	"bytes"
	"crypto/tls"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"math/rand"
	"net/http"
	"os"
	"strings"
	"time"

	httpd "github.com/rqlite/rqlite/http"
)

func init() {
	rand.Seed(time.Now().UnixNano())
}

var (
	// ErrBootTimeout is returned when a boot operation does not
	// complete within the timeout.
	ErrBootTimeout = errors.New("boot timeout")
)

// AddressProvider is the interface types must implement to provide
// addresses to a Bootstrapper.
type AddressProvider interface {
	Lookup() ([]string, error)
}

// Bootstrapper performs a bootstrap of this node.
type Bootstrapper struct {
	provider  AddressProvider
	expect    int
	tlsConfig *tls.Config

	logger   *log.Logger
	Interval time.Duration
}

// NewBootstrapper returns an instance of a Bootstrapper.
func NewBootstrapper(p AddressProvider, expect int, tlsConfig *tls.Config) *Bootstrapper {
	bs := &Bootstrapper{
		provider:  p,
		expect:    expect,
		tlsConfig: &tls.Config{InsecureSkipVerify: true},
		logger:    log.New(os.Stderr, "[cluster-bootstrap] ", log.LstdFlags),
		Interval:  jitter(5 * time.Second),
	}
	if tlsConfig != nil {
		bs.tlsConfig = tlsConfig
	}
	return bs
}

// Boot performs the bootstrapping process for this node. This means it will
// ensure this node becomes part of a cluster. It does this either by
// explicitly joining an existing cluster through one of the given nodes,
// or by notifying those nodes that it exists, allowing a cluster-wide
// bootstrap to take place.
//
// Returns nil if the boot operation was successful, or if done() ever returns
// true. done() is periodically polled by the boot process. Returns an error
// if the boot process encounters an unrecoverable error, or if booting does
// not occur within the given timeout.
//
// id and raftAddr are those of the node calling Boot. All operations
// performed by this function are done as a voting node.
func (b *Bootstrapper) Boot(id, raftAddr string, done func() bool, timeout time.Duration) error {
	timeoutT := time.NewTimer(timeout)
	defer timeoutT.Stop()
	tickerT := time.NewTimer(jitter(time.Millisecond))
	defer tickerT.Stop()

	notifySuccess := false
	for {
		select {
		case <-timeoutT.C:
			return ErrBootTimeout

		case <-tickerT.C:
			if done() {
				b.logger.Printf("boot operation marked done")
				return nil
			}
			tickerT.Reset(jitter(b.Interval)) // Move to longer-period polling

			targets, err := b.provider.Lookup()
			if err != nil {
				b.logger.Printf("provider lookup failed: %s", err.Error())
			}
			if len(targets) < b.expect {
				continue
			}

			// Try an explicit join.
			if j, err := Join("", targets, id, raftAddr, true, 1, 0, b.tlsConfig); err == nil {
				b.logger.Printf("succeeded directly joining cluster via node at %s",
					httpd.RemoveBasicAuth(j))
				return nil
			}

			// Join didn't work, so perhaps perform a notify if we haven't done
			// one yet.
			if !notifySuccess {
				if err := b.notify(targets, id, raftAddr); err != nil {
					b.logger.Printf("failed to notify %s, retrying", targets)
				} else {
					b.logger.Printf("succeeded notifying %s", targets)
					notifySuccess = true
				}
			}
		}
	}
}

func (b *Bootstrapper) notify(targets []string, id, raftAddr string) error {
	// Create and configure the client to connect to the other node.
	tr := &http.Transport{
		TLSClientConfig: b.tlsConfig,
	}
	client := &http.Client{Transport: tr}

	buf, err := json.Marshal(map[string]interface{}{
		"id":   id,
		"addr": raftAddr,
	})
	if err != nil {
		return err
	}

	for _, t := range targets {
		// Check for protocol scheme, and insert default if necessary.
		fullTarget := httpd.NormalizeAddr(fmt.Sprintf("%s/notify", t))

	TargetLoop:
		for {
			resp, err := client.Post(fullTarget, "application/json", bytes.NewReader(buf))
			if err != nil {
				return err
			}
			resp.Body.Close()
			switch resp.StatusCode {
			case http.StatusOK:
				break TargetLoop
			case http.StatusBadRequest:
				// One possible cause is that the target server is listening for HTTPS, but
				// an HTTP attempt was made. Switch the protocol to HTTPS, and try again.
				// This can happen when using various disco approaches, since it doesn't
				// record information about which protocol a registered node is actually using.
				if strings.HasPrefix(fullTarget, "https://") {
					// It's already HTTPS, give up.
					return fmt.Errorf("failed to notify node at %s: %s",
						httpd.RemoveBasicAuth(fullTarget), resp.Status)
				}
				fullTarget = httpd.EnsureHTTPS(fullTarget)
			default:
				return fmt.Errorf("failed to notify node at %s: %s",
					httpd.RemoveBasicAuth(fullTarget), resp.Status)
			}
		}
	}
	return nil
}
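The HTTPS fallback in notify can be shown in isolation. The helper below is an illustrative stand-in for `httpd.EnsureHTTPS` (whose exact implementation is not shown here), demonstrating the protocol upgrade applied when a notify attempt gets 400 Bad Request from an HTTPS-only listener:

```go
package main

import (
	"fmt"
	"strings"
)

// upgradeToHTTPS is a hypothetical sketch of the httpd.EnsureHTTPS behavior:
// rewrite an http:// URL to https:// so the notify can be retried; an
// already-HTTPS URL is returned unchanged.
func upgradeToHTTPS(url string) string {
	if strings.HasPrefix(url, "https://") {
		return url // already HTTPS, nothing to do
	}
	return "https://" + strings.TrimPrefix(url, "http://")
}

func main() {
	fmt.Println(upgradeToHTTPS("http://host1:4001/notify"))  // https://host1:4001/notify
	fmt.Println(upgradeToHTTPS("https://host1:4001/notify")) // https://host1:4001/notify
}
```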
type stringAddressProvider struct {
	ss []string
}

func (s *stringAddressProvider) Lookup() ([]string, error) {
	return s.ss, nil
}

// NewAddressProviderString wraps an AddressProvider around a string slice.
func NewAddressProviderString(ss []string) AddressProvider {
	return &stringAddressProvider{ss}
}

// jitter adds a little bit of randomness to a given duration. This is
// useful to prevent nodes across the cluster performing certain operations
// all at the same time.
func jitter(duration time.Duration) time.Duration {
	return duration + time.Duration(rand.Float64()*float64(duration))
}
@ -0,0 +1,169 @@
package cluster

import (
	"encoding/json"
	"errors"
	"io"
	"net/http"
	"net/http/httptest"
	"reflect"
	"testing"
	"time"
)

func Test_AddressProviderString(t *testing.T) {
	a := []string{"a", "b", "c"}
	p := NewAddressProviderString(a)
	b, err := p.Lookup()
	if err != nil {
		t.Fatalf("failed to lookup addresses: %s", err.Error())
	}
	if !reflect.DeepEqual(a, b) {
		t.Fatalf("failed to get correct addresses")
	}
}

func Test_NewBootstrapper(t *testing.T) {
	bs := NewBootstrapper(nil, 1, nil)
	if bs == nil {
		t.Fatalf("failed to create a simple Bootstrapper")
	}
}

func Test_BootstrapperBootDoneImmediately(t *testing.T) {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.Fatalf("client made HTTP request")
	}))

	done := func() bool {
		return true
	}
	p := NewAddressProviderString([]string{ts.URL})
	bs := NewBootstrapper(p, 1, nil)
	if err := bs.Boot("node1", "192.168.1.1:1234", done, 10*time.Second); err != nil {
		t.Fatalf("failed to boot: %s", err)
	}
}

func Test_BootstrapperBootTimeout(t *testing.T) {
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusServiceUnavailable)
	}))

	done := func() bool {
		return false
	}
	p := NewAddressProviderString([]string{ts.URL})
	bs := NewBootstrapper(p, 1, nil)
	bs.Interval = time.Second
	err := bs.Boot("node1", "192.168.1.1:1234", done, 5*time.Second)
	if err == nil {
		t.Fatalf("no error returned from timed-out boot")
	}
	if !errors.Is(err, ErrBootTimeout) {
		t.Fatalf("wrong error returned")
	}
}

func Test_BootstrapperBootSingleNotify(t *testing.T) {
	tsNotified := false
	var body map[string]string
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/join" {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}

		tsNotified = true
		b, err := io.ReadAll(r.Body)
		if err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}

		if err := json.Unmarshal(b, &body); err != nil {
			w.WriteHeader(http.StatusBadRequest)
			return
		}
	}))

	n := -1
	done := func() bool {
		n++
		if n == 5 {
			return true
		}
		return false
	}

	p := NewAddressProviderString([]string{ts.URL})
	bs := NewBootstrapper(p, 1, nil)
	bs.Interval = time.Second

	err := bs.Boot("node1", "192.168.1.1:1234", done, 60*time.Second)
	if err != nil {
		t.Fatalf("failed to boot: %s", err)
	}

	if tsNotified != true {
		t.Fatalf("notify target not contacted")
	}

	if got, exp := body["id"], "node1"; got != exp {
		t.Fatalf("wrong node ID supplied, exp %s, got %s", exp, got)
	}
	if got, exp := body["addr"], "192.168.1.1:1234"; got != exp {
		t.Fatalf("wrong address supplied, exp %s, got %s", exp, got)
	}
}

func Test_BootstrapperBootMultiNotify(t *testing.T) {
	ts1Join := false
	ts1Notified := false
	ts1 := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/join" {
			ts1Join = true
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		ts1Notified = true
	}))

	ts2Join := false
	ts2Notified := false
	ts2 := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/join" {
			ts2Join = true
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		ts2Notified = true
	}))

	n := -1
	done := func() bool {
		n++
		if n == 5 {
			return true
		}
		return false
	}

	p := NewAddressProviderString([]string{ts1.URL, ts2.URL})
	bs := NewBootstrapper(p, 2, nil)
	bs.Interval = time.Second

	err := bs.Boot("node1", "192.168.1.1:1234", done, 60*time.Second)
	if err != nil {
		t.Fatalf("failed to boot: %s", err)
	}

	if ts1Join != true || ts2Join != true {
		t.Fatalf("all join targets not contacted")
	}

	if ts1Notified != true || ts2Notified != true {
		t.Fatalf("all notify targets not contacted")
	}
}