• Oversee the management and maintenance of cloud infrastructure, ensuring high availability and reliability. Act as the primary point of contact for all cloud infrastructure-related issues and escalations.
• Ensure cloud resources are optimally configured and managed to meet performance and cost objectives.
• Implement and maintain monitoring solutions to track the health and performance of cloud infrastructure.
• Drive major and potential incidents end to end, providing periodic updates to client stakeholders for approvals and recommendations.
• Ensure due diligence and impact analysis for all changes implemented on the cloud platforms.
• Lead and mentor a team of cloud engineers and administrators, fostering a collaborative and high-performing work environment.
• Provide guidance and support to team members, facilitating their professional development and growth.
• Coordinate and manage the team's daily activities, ensuring alignment with organizational goals and priorities.
• Lead the response to cloud-related incidents, ensuring timely resolution and minimal impact on business operations.
• Develop and implement incident management processes and procedures.
• Perform root cause analysis and implement preventive measures to avoid recurrence of issues.
• Identify opportunities to automate repetitive tasks and processes to improve efficiency and reduce operational overhead.
• Develop and implement automation scripts and tools, leveraging Infrastructure as Code (IaC) practices.
• Continuously evaluate and improve cloud operations processes and procedures.
• Ensure cloud infrastructure adheres to security policies, standards, and best practices.
• Implement and maintain security controls to protect cloud resources and data.
• Ensure compliance with regulatory requirements and industry standards (e.g., GDPR, HIPAA).
• Monitor and analyze cloud resource usage, ensuring efficient utilization and avoiding over-provisioning.
• Conduct capacity planning to support future growth and demand.
• Implement cost management strategies to optimize cloud spending.
• Develop and implement disaster recovery and business continuity plans for cloud infrastructure.
• Ensure regular testing and validation of disaster recovery procedures.
• Ensure cloud infrastructure is resilient and can recover quickly from failures or disruptions.
• Work closely with other IT teams, business units, and stakeholders to understand requirements and deliver cloud solutions that meet their needs.
• Collaborate with vendors and service providers to evaluate and integrate new cloud technologies and services.
• Communicate effectively with stakeholders, providing regular updates on cloud operations and performance.
• Maintain comprehensive documentation of cloud infrastructure, configurations, processes, and procedures.
• Generate regular reports on cloud performance, incidents, and operational metrics.
• Ensure documentation is up-to-date and accessible to relevant stakeholders.
Here are some of the detailed responsibilities, primarily for the AWS environment, followed by the Azure and OCI environments.
IAM and User Management
• IAM administration for new and existing users.
• Managing IAM and cloud SSO/Organization/Permission Sets.
Cloud Services Management
• Managing Cloud Services (EC2, EKS, ELB, etc.).
• Managing Cloud Native Network (VPC, Transit Gateway, Route 53, API Gateways, CDN).
• Managing Cloud Native Storage (FSx, EFS, Lustre, EBS, and other options).
• Managing cloud native autoscaling and load balancers.
• Managing public Cloud WAF/Imperva administration.
• Managing CloudTrail, Event Hubs, and guardrails.
• Managing cloud Security Token Service (STS).
• Managing cloud Management Services (ARM, CFT, Systems Manager functions).
• Managing cloud Config.
• Managing SQS, SNS, Kinesis.
• Managing cloud SIEM integrations.
• Managing Cloud Patching.
Cloud Resource and Cost Management
• Managing Cloud Cost Management.
• Managing Cost Explorers.
• Managing Cloud Management Group, IDCS, Organizations, user Groups.
• Managing Auto Scaling Policies.
• Managing AWS/Azure/Oracle Backups.
• Managing CloudWatch Logs Insights, log groups, CloudWatch services, and other cloud monitoring services.
• Managing Step Functions.
• Managing CodePipeline and CodeBuild.
• Managing application migration services.
• Managing AWS Lambda.
• Managing DynamoDB and cloud Database Management (infrastructure).
• Managing cloud object storage.
Cloud Automation and DevOps
• Managing cloud Tags.
• Managing Cloud Rekognition services.
• Managing Cloud Transcribe Services.
• Managing Cloud Comprehend Services.
• Managing EKS and other microservices across the platform.
• Managing cloud Monitoring and Logging across clouds (AWS/Azure/OCI).
• Managing Cloud Elasticsearch.
• Managing Cloud API management.
• Managing Serverless Computing.
• Managing cloud CDN.
• Managing Machine learning and AI.
• Managing Cloud Data Management and Analytics.
Project Activities
• Disaster Recovery Test activities.
• Additional DR Test activities due to Customer Application Requirements.
• Automation of cloud infrastructure build and image build processes.
• Automation of Patch/Inventory/Tag Management.
• Additional Application Deployment.
• Implementing, Managing, and Automating all Cloud enabled Services.
• Managing Image Builder – Create Process for Regular Updates.
• Managing Azure Automation for creating and managing automated tasks and runbooks.
• Managing Automation via Azure DevOps and GitHub.
• Managing Azure DevOps Pipelines.
• Managing Azure DevOps Repos, Projects, and Organizations.
• Managing GitHub and GitLab.
Cloud Security Services
• AWS GuardDuty
• AWS WAF / Imperva WAF
• AWS Inspector
• Key Vault + KMS + Secrets management
• AWS Macie
• Security groups
• Rapid7
• Azure – NSG, route tables
• OCI – VCN, Native Firewalls
OCI Cloud Operations
• OCI Infrastructure Management, GoldenGate, RAC Clusters, Guard Duty, and Essbase Services support.
Azure Cloud Operations
• Azure Subscription, Resource groups, Storage Accounts, Networking and License Management, SSO, App proxy, NDES, AAD Sync.
Qualifications we seek in you!
Minimum Qualifications / Skills
• Bachelor’s degree in Computer Science, Information Technology, Electrical Engineering, or a related field. Advanced degrees or relevant professional training are a plus.
• Solid experience in system administration and cloud operations, including a leadership or senior technical role.
• Proficiency in the AWS cloud platform. Strong working experience with the Azure and OCI cloud platforms.
• Strong understanding of cloud architecture, services, and best practices.
• Experience with cloud management and monitoring tools.
• Proficiency in scripting and automation (e.g., PowerShell, Python, Terraform, Ansible Playbooks/Ansible Tower, CloudFormation, Puppet, Chef).
• Strong knowledge of cloud security principles and practices.
• Proficiency in Windows/Linux Server administration and management.
• Proficiency and working experience in VMware/AD and Azure AD SSO platforms.
• Strong networking skills: DNS, DHCP, PKI, and an understanding of LAN/WAN protocols.
• Effective communication and interpersonal skills, with the ability to interact with stakeholders at all levels.
• Experience in vendor management and contract negotiations.
• A proactive approach to continuous improvement and innovation in data center operations.
Preferred Certifications and experience
• Cloud certifications such as AWS Certified Solutions Architect – Associate or Professional, or Microsoft Certified: Azure Solutions Architect Expert.
• Experience with DevOps practices and tools (CI/CD, Jenkins, Git).
• Familiarity with ITIL or other IT service management frameworks.
• Excellent communication and collaboration skills, with the ability to effectively interact with technical and non-technical stakeholders at all levels of the organization.
• Strong analytical and problem-solving skills, with the ability to identify root causes of issues and implement effective solutions in a timely manner.
• Proven ability to work independently as well as part of a team, with a proactive and self-motivated attitude towards achieving project goals.
Any Graduate