Job Overview
We are seeking a highly skilled Telemetry Engineer to join our dynamic team. The ideal candidate will have expert knowledge of Prometheus, Grafana, and Git. This role involves developing and managing telemetry for large-scale datasets and implementing strategies to reduce Mean Time to Resolution (MTTR).
Key Skills
Must Have:
Prometheus Proficiency: Develop, configure, and maintain monitoring solutions using Prometheus. Must have in-depth knowledge of Prometheus metrics, alerts, and query language.
Grafana: Design and implement dashboards in Grafana for real-time data visualization and monitoring. Customize and extend Grafana to meet project requirements.
Telemetry Skills: Create scalable telemetry solutions using Prometheus and Grafana to monitor and analyze large-scale datasets.
Reducing MTTR: Experience in developing telemetry solutions focused on reducing Mean Time to Resolution, enhancing system reliability and performance.
Git: Strong understanding of Git.
Good To Have:
Thanos: Proficient knowledge of Thanos components with demonstrated hands-on experience in configuring, deploying, and managing multiple Thanos components.
Python: Practical proficiency in Python scripting is desirable.
Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field
Proven experience as a Telemetry or Observability Engineer, or in a similar role
Extensive knowledge of Prometheus and Grafana
Strong understanding of telemetry and observability principles
Excellent analytical and problem-solving skills
Strong communication and teamwork abilities
Experience with Splunk and Kubernetes is an added advantage