Roles and Responsibilities:
• Own and build monitoring solutions that cover all platforms, applications, jobs, and the IA ecosystem.
• Implement APM (Application Performance Monitoring) solutions across cloud and on-premises infrastructure.
• Design, implement, and maintain monitoring and observability solutions to provide real-time visibility into the performance and health of our applications and infrastructure.
• Configure and customize APM tools such as Dynatrace to monitor application performance, identify bottlenecks, and optimize resource utilization.
• Develop and maintain dashboards, alerts, and reports that track key metrics and KPIs and support proactive monitoring.
• Collaborate with software development teams to instrument applications for monitoring, tracing, and logging, and integrate with observability platforms.
• Troubleshoot and resolve complex technical issues related to performance, scalability, and reliability, leveraging monitoring and APM tools.
• Evangelize best practices for monitoring, APM, and observability across the organization and provide training and support to teams as needed.
Skills:
• Proficiency in configuring and customizing APM tools such as Dynatrace, New Relic, Datadog, or AppDynamics to monitor application performance and troubleshoot issues.
• Experience with monitoring tools such as Prometheus, Grafana, or ELK stack for infrastructure and application monitoring.
• Understanding of data platforms, batch jobs, and ETL processes.
Experience and Qualification:
• Prior experience working in AWS Cloud, in an environment where data platforms were monitored.
• Training or certification in Dynatrace and AWS Cloud.
• Experience integrating infrastructure components such as servers, storage, network, and security components into monitoring.
• Bachelor's degree.