Description

Job Overview

We are seeking a highly skilled Telemetry Engineer to join our team. The ideal candidate will have expert knowledge of Prometheus, Grafana, and Git. This role involves developing and managing telemetry for large-scale datasets and implementing strategies to reduce Mean Time to Resolution (MTTR).

Key Skills

Must Have:

Prometheus Proficiency: Develop, configure, and maintain monitoring solutions using Prometheus. Must have in-depth knowledge of Prometheus metrics, alerting, and the PromQL query language.

Grafana: Design and implement dashboards in Grafana for real-time data visualization and monitoring. Customize and extend Grafana to meet project requirements.

Telemetry Skills: Create scalable telemetry solutions using Prometheus and Grafana to monitor and analyze large-scale datasets.

Reducing MTTR: Experience in developing telemetry solutions focused on reducing Mean Time to Resolution, enhancing system reliability and performance.

Git: Strong understanding of Git.

Good to Have:

Thanos: Solid knowledge of Thanos components, with demonstrated hands-on experience configuring, deploying, and managing multiple Thanos components.

Python: Practical proficiency in Python scripting.

Qualifications

Bachelor's degree in Computer Science, Information Technology, or a related field

Proven experience as a Telemetry or Observability Engineer, or in a similar role

Extensive knowledge of Prometheus and Grafana

Strong understanding of telemetry and observability principles

Excellent analytical and problem-solving skills

Strong communication and teamwork abilities

Experience with Splunk and Kubernetes is an added advantage

Education

Bachelor's