Description

Responsibilities:

  • Lead the design and implementation of Datadog solutions to monitor and manage the organization's diverse infrastructure, applications, and services; collaborate with IT and development teams to ensure effective integration of Datadog into the existing technology stack
  • Integrate Datadog SaaS with cloud platforms, container orchestration tools, and on-premises hosts to provide comprehensive monitoring across the entire technology stack; work with Datadog Agent-based integrations, deployments, custom YAML configurations, and revision control options; coordinate with DevOps teams to automate monitoring and response processes
  • Serve as the primary point of contact for Datadog-related inquiries, providing expert guidance and best practices; stay updated on the latest Datadog features, integrations, and industry trends to optimize monitoring strategies; act as a highly knowledgeable specialist in one or more Datadog product areas; reproduce technical issues and dive into Datadog's integrations
  • Lead a team of monitoring engineers, providing mentorship, training, and technical guidance; collaborate with cross-functional teams to align monitoring strategies with business goals and objectives
  • Design and implement performance monitoring solutions using Datadog to identify and address potential bottlenecks and inefficiencies; work closely with system administrators and engineers to optimize resource utilization and enhance overall system performance
  • Maintain detailed documentation of Datadog configurations, monitoring policies, and incident response procedures; conduct regular knowledge-sharing sessions to empower the broader technical team with Datadog expertise

Qualifications:

  • Minimum 10 years of recent experience in the DevOps/SRE space, with at least 7 years of Datadog-specific experience
  • Relevant certifications such as Datadog Certified Associate or Datadog Certified Professional are preferred
  • Experience with SIEM (Security Information and Event Management) migration (from tools such as New Relic, Splunk, or AppDynamics)
  • Extensive hands-on experience with Datadog, including dashboards, alerts, and log analysis; scripting experience using Python, PowerShell, and/or Bash
  • An exploratory, curious mindset and excellent knowledge of Windows and Linux administration
  • Excellent communication and collaboration skills

Education

Graduate in any discipline