Description:
The Health and Human Services Delivery Center Cloud Operations Center and Support Engineer will be responsible for maintenance and support activities, including, but not limited to, monitoring, troubleshooting, compliance, and optimization of the agency-level cloud environments, including Azure and AWS.
Responsibilities include the following:
- Serves as the primary contact for all agency cloud operations-related queries and issues.
- Provides after-hours support to ensure the seamless operation of agency cloud services.
- Creates, modifies, and deletes cloud alerts to monitor system performance.
- Monitors application workloads to ensure optimal performance.
- Proactively detects problems, manages events, and handles notifications and escalations.
- Manages Major Incident Management (MIM) for agency priority 1 incidents, including coordination, documentation, root cause analysis (RCA), and notifications.
- Implements automated remediation for recurring incidents.
- Updates agency’s hosting and design documents as needed.
- Manages activity documentation and approval chains for both agency-specific and enterprise activities.
- Resolves agency cloud security alerts, ensures compliance with security requirements, and monitors certificate expiration and renewals.
- Implements agency security controls according to organizational standards.
- Performs agency patch management on cluster environments, including Azure Kubernetes Service (AKS) clusters and Kubernetes versions.
- Develops and monitors automated advanced sequences.
- Conducts file-level restorations as needed.
- Identifies cost-saving opportunities, monitors and remediates excessive resource expenditures, and escalates cost-related issues.
- Implements billing and cost management tags for better resource allocation.
- Maintains IT Service Management Knowledge Base portal for reporting and investigation.
- Develops detailed management procedural manuals for each agency.
Requirements:
- Must pass an extensive background check.
- 3-5 years of experience as a System Engineer and/or Cloud Engineer with hands-on experience dealing with implementation, security, and standards/best practices in a cloud environment including Azure and AWS.
- In-depth knowledge of networking, including connectivity to AWS and/or Azure (via Direct Connect and/or ExpressRoute).
- Hands-on experience with Microsoft Azure and Amazon Web Services (AWS) cloud services.
- Administrator certifications in Azure and/or AWS (preferred).
- Strong working knowledge of Azure and/or AWS.
- Ability to work closely with multiple diverse delivery centers and their agencies while anticipating their needs and exceeding their expectations.
- Consistently demonstrates strong organizational, communication, change management, and problem-solving skills.
- Strong knowledge of all cloud technologies and the ability to keep abreast of, and maintain a deep technical understanding of, current and emerging technologies.
- Proven operational experience with large enterprise environments.
- Excellent oral and written communication skills for communicating clearly with clients.
- Bachelor's degree in Computer Science.