Responsibilities:
- Lead the design and implementation of Datadog solutions to monitor and manage the organization's diverse infrastructure, applications, and services; collaborate with IT and development teams to ensure effective integration of Datadog into the existing technology stack
- Integrate Datadog SaaS with cloud platforms, container orchestration tools, and on-premises hosts to provide comprehensive monitoring across the entire technology stack; work with Datadog Agent-based integrations, deployments, custom YAML configurations, and revision control options; coordinate with DevOps teams to automate monitoring and response processes
- Serve as the primary point of contact for Datadog-related inquiries, providing expert guidance and best practices; stay updated on the latest Datadog features, integrations, and industry trends to optimize monitoring strategies; act as a highly knowledgeable specialist in one or more Datadog product areas; reproduce technical issues and investigate Datadog's integrations in depth
- Lead a team of monitoring engineers, providing mentorship, training, and technical guidance; collaborate with cross-functional teams to align monitoring strategies with business goals and objectives
- Design and implement performance monitoring solutions using Datadog to identify and address potential bottlenecks and inefficiencies; work closely with system administrators and engineers to optimize resource utilization and enhance overall system performance
- Maintain detailed documentation of Datadog configurations, monitoring policies, and incident response procedures; conduct regular knowledge-sharing sessions to empower the broader technical team with Datadog expertise
Qualifications:
- Minimum of 10 years of recent experience in the DevOps/SRE space, with at least 7 years of Datadog-specific experience
- Relevant certifications such as Datadog Certified Associate or Datadog Certified Professional are preferred
- Experience with SIEM (Security Information and Event Management) migration from tools such as New Relic, Splunk, and AppDynamics
- Extensive hands-on experience with Datadog, including dashboards, alerts, and log analysis; scripting experience using Python, PowerShell, and/or Bash
- Inquisitive, exploratory mindset and excellent knowledge of Windows and Linux administration
- Excellent communication and collaboration skills