Job Description:
- 10+ years of experience in infrastructure, including 3+ years as a Kafka Administrator or in a similar role managing Kafka clusters.
- Strong understanding of distributed systems, data streaming, and event-driven architectures.
- Experience with Linux/Unix operating systems and shell scripting.
- Familiarity with cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes) is a plus.
Job Responsibilities:
1. **Cluster Management:**
- Install, configure, and maintain Kafka clusters across various environments (development, testing, production).
- Perform upgrades and patching of Kafka and related components (e.g., ZooKeeper).
- Ensure optimal performance and reliability of Kafka clusters.
2. **Monitoring and Troubleshooting:**
- Monitor Kafka cluster health and performance using tools like Prometheus, Grafana, or proprietary monitoring solutions.
- Diagnose and resolve issues related to Kafka brokers, topics, partitions, and consumers.
- Implement proactive measures to prevent potential issues.
3. **Security and Compliance:**
- Implement and manage security protocols for Kafka, including SSL/TLS encryption, Kerberos authentication, and access control policies.
- Ensure compliance with organizational and industry standards for data security and privacy.
4. **Capacity Planning and Scalability:**
- Perform capacity planning to ensure the Kafka infrastructure can handle current and future workloads.
- Optimize Kafka configurations for performance and scalability based on application requirements.
5. **Backup and Recovery:**
- Develop and maintain disaster recovery plans for Kafka environments.
- Implement and test backup and restore procedures to ensure data integrity and availability.
6. **Collaboration and Support:**
- Work closely with development teams to understand Kafka usage patterns and provide guidance on best practices.
- Provide support for Kafka-related issues, including on-call support as needed.
- Document Kafka infrastructure, configurations, and operational procedures.
7. **Automation and Scripting:**
- Develop automation scripts for routine tasks such as cluster provisioning, monitoring, and maintenance using tools like Ansible, Puppet, or custom scripts.
- Implement CI/CD pipelines for Kafka-related deployments and updates.
Education: Any graduate