etcd
Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data.
etcd backup & restore
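# take a snapshot from one member; $ENDPOINT is that member's client URL
ETCDCTL_API=3 etcdctl --endpoints "$ENDPOINT" snapshot save snapshotdb
# the save command should exit with status 0
# verify the snapshot
ETCDCTL_API=3 etcdctl --write-out=table snapshot status snapshotdb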
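The commands above cover only the backup half. A minimal sketch of the restore half follows, assuming the snapshot file is the snapshotdb saved above. The target directory /var/lib/etcd-restored and the helper name restore_cmd are illustrative assumptions, not part of the original notes. The helper only prints the command so it can be reviewed first. Newer etcd releases move this subcommand to etcdutl.

```shell
#!/bin/sh
# Sketch: build the command that restores an etcd data directory from a snapshot.
SNAPSHOT="${SNAPSHOT:-snapshotdb}"                      # file written by `snapshot save`
RESTORE_DIR="${RESTORE_DIR:-/var/lib/etcd-restored}"    # assumed path, adjust as needed

restore_cmd() {
    # Print the restore command instead of running it, so it can be checked first.
    printf 'ETCDCTL_API=3 etcdctl snapshot restore %s --data-dir %s\n' \
        "$SNAPSHOT" "$RESTORE_DIR"
}

restore_cmd
```

To actually restore, run the printed command on a stopped member (for example with eval "$(restore_cmd)") and then start etcd with --data-dir pointing at the restored directory.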
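write process
- the client sends the write to the leader
- the leader receives the request and passes it to the kv store
- the kv store hands the write to raftNode
- the leader's raft.node wraps it in a log entry and passes it to raft.raft
- raft.raft sends the entry to each follower
- the leader persists its own Ready data to storage
- each follower persists the entry and builds a Ready message in response
- once more than half of the members (the leader included) have acknowledged the entry, the leader marks it committed and advances the committed index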
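The last step, where the leader decides what is committed, can be sketched as a small calculation. This is a simplified illustration, not etcd's actual code: the function name commit_index is hypothetical, and real Raft additionally requires the entry to belong to the leader's current term.

```shell
#!/bin/sh
# Sketch: given the highest log index stored on every member (leader included),
# print the largest index that a majority of members already hold.
commit_index() {
    n=$#
    quorum=$(( n / 2 + 1 ))
    # Sort the indexes in descending order; the quorum-th value is stored
    # on at least `quorum` members, so it is safe to commit up to it.
    printf '%s\n' "$@" | sort -rn | sed -n "${quorum}p"
}

# Three members at log indexes 10, 9 and 7: quorum is 2.
commit_index 10 9 7
```

With three members the example prints 9, because index 9 is present on two of them while index 10 exists only on the leader.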
TLS
▪ etcd supports encrypted communication through TLS
▪ TLS can be used between peers and clients
▪ To start a cluster with TLS, each cluster member should have:
  ▪ ca-client.crt: the CA (certificate authority) trusted by the server to sign client certs
    ▪ Only used to authenticate clients if the --client-cert-auth switch is set; otherwise any client can connect
  ▪ node-client.crt: the server public key certificate signed by a CA for use with clients
  ▪ node-client.key: the server private key for use with clients
  ▪ ca-peer.crt: the CA trusted by the server to sign peer certs
    ▪ Only used to authenticate peers if the --peer-client-cert-auth switch is set; otherwise any peer can connect
  ▪ node-peer.crt: the server public key certificate signed by a CA for use with peers
  ▪ node-peer.key: the private key associated with node-peer.crt for use with peers
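A sketch of how the files listed above map onto etcd's TLS flags. The certificate directory /etc/etcd/pki and the helper name etcd_tls_flags are assumptions made for illustration. The flag names are etcd's own.

```shell
#!/bin/sh
# Sketch: print the etcd TLS flags that match the certificate files above.
CERT_DIR="${CERT_DIR:-/etc/etcd/pki}"   # assumed location of the cert files

etcd_tls_flags() {
    cat <<EOF
--trusted-ca-file=$CERT_DIR/ca-client.crt
--cert-file=$CERT_DIR/node-client.crt
--key-file=$CERT_DIR/node-client.key
--client-cert-auth
--peer-trusted-ca-file=$CERT_DIR/ca-peer.crt
--peer-cert-file=$CERT_DIR/node-peer.crt
--peer-key-file=$CERT_DIR/node-peer.key
--peer-client-cert-auth
EOF
}

etcd_tls_flags
```

The output can be appended to a member's etcd command line. Leaving out --client-cert-auth or --peer-client-cert-auth keeps the traffic encrypted but stops the server from verifying who connects.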
etcd migrate/add/remove
Take care to preserve quorum whenever members are migrated, added, or removed.
(°0°)
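The quorum arithmetic behind that warning, as a minimal sketch. The function names quorum and fault_tolerance are made up for this example.

```shell
#!/bin/sh
# Sketch: quorum size and tolerated failures for a cluster of N members.
quorum() {
    # A majority of N members.
    echo $(( $1 / 2 + 1 ))
}

fault_tolerance() {
    # Members that can fail while a majority is still reachable.
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Going from 3 to 4 members raises the quorum from 2 to 3 without tolerating any extra failure, which is why membership is usually changed one member at a time (etcdctl member add, etcdctl member remove) with a health check in between.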
etcd maintenance
monitoring
- is the member up and running?
- does the cluster have a leader?
- leader changes (frequent changes point to an unstable cluster)
- Consensus proposal
A proposal is a request that needs to go through the Raft protocol. Proposals are tracked in four states: committed, applied, pending, and failed. Failures usually have one of two causes: a leader election is failing or quorum has been lost.
When etcd's committed index exceeds the applied index by more than 5000, etcd rejects new requests and returns the error ErrTooManyRequests.
- disk sync duration
Watch wal_fsync_duration_seconds (WAL fsync latency) and backend_commit_duration_seconds (backend commit latency).
- gRPC stats
etcd serves its v3 API over gRPC, so the gRPC server statistics show how requests to each node are behaving.
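Several of the checks above can be read from the member's Prometheus metrics. The sketch below summarizes them from metrics text on stdin. The function name etcd_metrics_summary is hypothetical, and the apply lag it reports is computed from the proposal counters, so treat it as an approximation of the committed/applied index gap rather than the exact value etcd checks internally.

```shell
#!/bin/sh
# Sketch: summarize a few etcd health signals from Prometheus-format metrics on stdin.
etcd_metrics_summary() {
    awk '
        $1 == "etcd_server_has_leader"                     { has_leader = $2 }
        $1 == "etcd_server_leader_changes_seen_total"      { changes = $2 }
        $1 == "etcd_server_proposals_committed_total"      { committed = $2 }
        $1 == "etcd_server_proposals_applied_total"        { applied = $2 }
        $1 == "etcd_server_proposals_pending"              { pending = $2 }
        $1 == "etcd_server_proposals_failed_total"         { failed = $2 }
        $1 == "etcd_disk_wal_fsync_duration_seconds_sum"   { fsync_sum = $2 }
        $1 == "etcd_disk_wal_fsync_duration_seconds_count" { fsync_count = $2 }
        END {
            lag = committed - applied
            printf "has_leader=%d leader_changes=%d\n", has_leader, changes
            printf "pending=%d failed=%d apply_lag=%d\n", pending, failed, lag
            # 5000 is the gap mentioned in the notes above.
            if (lag > 5000)
                print "warning: apply lag above 5000, requests may be rejected"
            # Average WAL fsync latency since the member started.
            if (fsync_count > 0)
                printf "avg_wal_fsync_seconds=%.6f\n", fsync_sum / fsync_count
        }'
}
```

A typical call would be curl -s "$ENDPOINT/metrics" | etcd_metrics_summary, with the client certificate options added to curl when TLS client authentication is enabled.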