Quick start
- download from https://github.com/strimzi/strimzi-kafka-operator/releases or git clone https://github.com/strimzi/strimzi-kafka-operator.git
- change the default namespace in the RoleBinding files to your own (it must match the namespace the operator is deployed into)
sed -i 's/namespace: .*/namespace: sam-strimzi-kafka/' install/cluster-operator/*RoleBinding*.yaml
- change kafka namespace in install/cluster-operator/060-Deployment-strimzi-cluster-operator.yaml
env:
  - name: STRIMZI_NAMESPACE
    value: sam-strimzi-kafka
- deploy CRDs and RBAC resources
kubectl create -f install/cluster-operator/ -n sam-strimzi-kafka
- create kafka cluster
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: sam-strimzi-kafka
spec:
  kafka:
    replicas: 1
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
      - name: external
        port: 9094
        type: nodeport
        tls: false
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
      class: standard
    config:
      offsets.topic.replication.factor: 1
      transaction.state.log.replication.factor: 1
      transaction.state.log.min.isr: 1
      default.replication.factor: 1
      min.insync.replicas: 1
  zookeeper:
    replicas: 1
    storage:
      type: persistent-claim
      size: 100Gi
      deleteClaim: false
      class: standard
  entityOperator:
    topicOperator: {}
    userOperator: {}
- create topic
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: "sam-strimzi-kafka"
spec:
  partitions: 3
  replicas: 1
- access Kafka from outside the cluster through an Ingress or NodePort listener, or from inside with kubectl exec
kubectl exec -i -t sam-testing-kafka-cluster-kafka-0 -n sam-kafka -- /bin/bash
Strimzi
Strimzi uses Operators to deploy and manage Kafka's components and dependencies on Kubernetes.
Components
- zookeeper
broker registration, with heartbeats to keep the broker list updated
maintains the list of topics
performs leader election
access control (topics, consumer groups, users)
- kafka cluster
- kafka connect
source connector: pushes external data into Kafka
sink connector: extracts data out of Kafka
- kafka exporter
extracts data for analysis as Prometheus metrics, including offsets, consumer groups, consumer lag and topics
- kafka mirrormaker
mirrors or replicates topics from one Kafka cluster to another; relies on the Kafka Connect framework
- cruise control
- cluster operator, which deploys and manages:
kafka (including ZooKeeper, Topic Operator, User Operator, Kafka Exporter, and Cruise Control)
kafka connect
kafka mirrormaker
kafka bridge
- entity operator
- topic operator
- user operator
on AWS
- storage-optimized instance types: i3, d3, h1
- node affinity so that Kafka brokers run on these nodes
- anti-affinity so that Kafka brokers and ZooKeeper run on separate nodes
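A minimal sketch of how the two affinity rules above might be expressed in the Kafka resource. It assumes the cluster name sam-strimzi-kafka from the quick start, uses i3.2xlarge purely as an illustrative instance type, and assumes the ZooKeeper pods carry the label strimzi.io/name: sam-strimzi-kafka-zookeeper; check the labels on your own pods before relying on it.

```yaml
# Fragment of the Kafka custom resource; merge into the existing spec.
spec:
  kafka:
    template:
      pod:
        affinity:
          # Schedule brokers only on storage-optimized nodes
          # (the instance type is an illustrative assumption).
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: node.kubernetes.io/instance-type
                      operator: In
                      values:
                        - i3.2xlarge
          # Keep brokers off any node that already runs a ZooKeeper pod
          # of this cluster (label value is an assumption).
          podAntiAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              - labelSelector:
                  matchLabels:
                    strimzi.io/name: sam-strimzi-kafka-zookeeper
                topologyKey: kubernetes.io/hostname
```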
Volume sizing and retention period
- JBOD
use multiple disks in each Kafka broker for storing the commit log
- persistent-claim
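As a sketch, JBOD storage in Strimzi wraps several persistent-claim volumes in one storage block. The two volumes, their sizes and the storage class below are illustrative assumptions carried over from the quick start, not sizing advice.

```yaml
# Fragment of the Kafka custom resource; replaces the single
# persistent-claim storage block used in the quick start.
spec:
  kafka:
    storage:
      type: jbod
      volumes:
        # Each volume needs a unique id and becomes its own PVC per broker.
        - id: 0
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
          class: standard
        - id: 1
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
          class: standard
```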
High availability
- replication
- acks: 0, 1, all (-1)
- rack awareness
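The three points above could be combined roughly as follows. The replica counts are illustrative assumptions (the quick start uses a single broker), and the topology key assumes nodes are labelled with the standard Kubernetes zone label.

```yaml
# Fragment of the Kafka custom resource; values are illustrative.
spec:
  kafka:
    replicas: 3
    # Rack awareness: spread replicas of a partition across zones.
    rack:
      topologyKey: topology.kubernetes.io/zone
    config:
      # Replication: three copies of every partition by default.
      default.replication.factor: 3
      # With producer acks=all, a write succeeds only once this many
      # in-sync replicas have it.
      min.insync.replicas: 2
```

With these values a producer using acks=all can keep writing after losing one broker, while acks=0 or acks=1 trade that durability for lower latency.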
zookeeper performance
- low latency
- SSD
- separate disk for snapshots and logs
- high performance network
- reasonable number of ZooKeeper servers
- isolation of zk from other processes
JVM
- set only resource requests, without limits
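A sketch of what requests without limits might look like for the brokers, with the heap pinned explicitly through jvmOptions. The CPU, memory and heap figures are illustrative assumptions, not recommendations.

```yaml
# Fragment of the Kafka custom resource; figures are illustrative.
spec:
  kafka:
    resources:
      # Only requests are set; no limits block is defined.
      requests:
        cpu: "2"
        memory: 8Gi
    jvmOptions:
      # Fixed heap size, leaving the rest of the memory to the page cache.
      "-Xms": "4g"
      "-Xmx": "4g"
```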
refer -
disaster recovery
A multi-region strategy: services are deployed with backups in geographically distributed data centers.
optimum partition count
How many partitions a topic can usefully have depends directly on the available threads and disk.
security configs
- encryption
communication between brokers, ZooKeeper nodes, operators and the exporter is always encrypted
- authZ, authN
other configs
- open file handles (file descriptor limit)
- max message size
- compression.type
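The two broker-level settings in this list can be set through the config block of the Kafka resource, roughly as below. The 5 MiB value is an illustrative assumption. The open file limit is an operating-system setting and is not shown here.

```yaml
# Fragment of the Kafka custom resource; values are illustrative.
spec:
  kafka:
    config:
      # Largest record batch the broker accepts, in bytes (5 MiB here).
      message.max.bytes: 5242880
      # "producer" keeps whatever codec the producer used instead of
      # recompressing on the broker.
      compression.type: producer
```

If message.max.bytes is raised, consumers and producers usually need matching size settings as well.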
Kafka auto-scaler in KEDA
- relies on consumer groups and on brokers retaining messages
- scales consumers out when producers send more than the consumers can keep up with; the trigger is lagThreshold
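A minimal sketch of a KEDA ScaledObject using the Kafka trigger. The Deployment name my-consumer, the consumer group my-group, the threshold of 100 and the replica bounds are all hypothetical; the bootstrap address assumes the sam-strimzi-kafka cluster and the plain listener from the quick start.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-consumer-scaler          # hypothetical name
  namespace: sam-strimzi-kafka
spec:
  scaleTargetRef:
    name: my-consumer               # hypothetical consumer Deployment
  minReplicaCount: 1
  # More consumers than partitions would sit idle (my-topic has 3).
  maxReplicaCount: 3
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: sam-strimzi-kafka-kafka-bootstrap:9092
        consumerGroup: my-group     # hypothetical consumer group
        topic: my-topic
        # Target lag per replica; more replicas are added above this.
        lagThreshold: "100"
```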
Monitoring
- JMX
- prometheus JMX Exporter
it takes the JMX metrics and exposes them as a Prometheus endpoint
- kafka exporter
- strimzi canary
- burrow for consumer lag
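As a sketch, the JMX Exporter and Kafka Exporter from the list above can both be enabled in the Kafka resource. The ConfigMap name and key are hypothetical and must point at a JMX Exporter rules file you provide; the regexes simply match every topic and group.

```yaml
# Fragment of the Kafka custom resource.
spec:
  kafka:
    # Prometheus JMX Exporter: broker JMX metrics as a Prometheus endpoint.
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics              # hypothetical ConfigMap
          key: kafka-metrics-config.yml    # hypothetical key
  # Kafka Exporter: offsets, consumer groups and consumer lag.
  kafkaExporter:
    topicRegex: ".*"
    groupRegex: ".*"
```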