We are seeking an experienced AWS Observability Stack Lead who will be responsible for designing, building, and maintaining an observability platform using Terraform. The ideal candidate will implement monitoring and logging solutions using AWS OpenSearch, Grafana, and Prometheus, ensuring robust observability for our cloud infrastructure and applications.

Key Responsibilities:
- Design, implement, and maintain an observability solution on AWS using OpenSearch, Grafana, and Prometheus.
- Develop and manage infrastructure as code using Terraform to provision observability resources.
- Configure OpenSearch for centralized logging and build search indices for monitoring and alerting.
- Set up Prometheus for collecting and storing metrics from various AWS services and applications.
- Integrate Grafana for visualization, dashboards, and alerting mechanisms based on data from Prometheus and OpenSearch.
- Develop automation scripts and CI/CD pipelines for deploying and updating the observability stack.
- Collaborate with DevOps, SRE, and development teams to ensure seamless integration of the observability stack with other systems.
- Continuously optimize the observability stack for performance, scalability, and cost efficiency.
- Provide technical documentation and knowledge transfer to relevant teams.
- Ensure security, compliance, and best practices in all observability solutions deployed in AWS.

Required Skills and Qualifications:
- Strong experience in AWS services: OpenSearch (Elasticsearch), CloudWatch, EC2, VPC, IAM, Lambda, etc.
- Proficiency in Terraform: Ability to develop infrastructure-as-code for AWS, including managing state, modules, and resources.
- Hands-on experience with monitoring and observability tools: Grafana, Prometheus, and OpenSearch.
- Knowledge of Prometheus exporters and scraping metrics from AWS and application workloads.
- Strong expertise in managing logs, metrics, and tracing with observability solutions.
- Proficiency in creating dashboards, visualizations, and alerts using Grafana.
- Familiarity with continuous integration and continuous deployment (CI/CD) pipelines.
- Good understanding of networking and security within AWS (VPC, subnets, security groups, etc.).
- Experience in scripting (e.g., Python, Bash) to automate tasks.
- Certifications: AWS Certified Solutions Architect, AWS Certified DevOps Engineer (Preferred).

Desired Experience:
- Experience working in large-scale AWS environments with complex cloud architecture.
- Familiarity with AWS Lambda for custom log processing or metrics extraction.
- Experience with Helm charts for deploying Prometheus and Grafana in a containerized environment.
- Experience with cloud cost management and optimization strategies.