Job Description:
Mandatory Skills
The AWS Cloud Lead leads and coordinates cloud-related activities: incident management, problem management, change management, monitoring, infrastructure deployment, vulnerability management, service requests, general cloud service maintenance, cost optimization, and support for the target state architecture, all in adherence to the ITSM process.
Key Responsibilities:
Incident Management:
Address issues with tasks that process slowly or fail to process.
Troubleshoot issues after container image updates.
Coordinate with the application team to troubleshoot networking and web UI issues.
Problem Management:
Conduct RCA and implement fixes for recurring job failures.
Perform RCA and tuning for performance issues.
Change Management:
Plan and execute periodic OS patching for EC2 hosts.
Manage periodic maintenance for metadata databases, including patching and upgrades.
Secure, encrypt, and manage container images for rapid deployment.
Coordinate rotation of domain certificate keys.
Monitoring:
Set up CloudWatch alerts and log monitoring.
Configure CloudTrail for unified logging.
Monitor Prometheus graphs for performance and health metrics.
Infrastructure Deployment:
Maintain and enhance infrastructure code written in Terraform.
Develop and maintain CI/CD pipelines using Jenkins for continuous deployment.
Vulnerability Management:
Generate periodic vulnerability scan reports for containers using the Wiz tool.
Mitigate any open vulnerabilities identified.
Service Requests:
Maintain pipelines for Route53 DNS registration.
Modify and maintain Terraform infrastructure code as per requirements.
Deploy container images containing new features through Jenkins CI/CD pipelines, following the SDLC process.
Establish peer connections with new nodes and third-party networks.
Deploy and maintain container images using Helm charts.
Handle IAM key rotation, pool creation requests, connection requests, and resource upgrades to support EKS cluster nodes.
Manage container logs and filesystems.
Create and modify custom roles.
Maintain VPC networking.
General Cloud Service Maintenance:
Update and modify cloud services based on recommendations from central teams.
Cost Optimization:
Review cloud cost usage, identify cost-saving opportunities, and implement optimizations.
Support Target State Architecture Review:
Create standard deployment architecture diagrams.
Document architecture and follow required processes for submission.
Process:
Adhere to the ITSM process for incident, change, and problem management.
Review and enforce adherence to the Client Architecture and to database design and usage standards.
Perform other duties as assigned.
Qualifications
Bachelor's Degree