Job Code : EWC - 1462
Primary Responsibilities:
- Work in tandem with our internal and external customers to identify and implement appropriate cloud-based solutions.
- Demonstrate exceptional problem-solving skills.
- Help ensure cloud compute uptime, scalability, and maintainability.
- Champion managing cloud environments in accordance with cloud best practices and security guidelines.
- Assist with automated backup and recovery of systems running in cloud environments.
- Participate in an on-call rotation that provides 24x7 platform support.
- Lead vendors through feature tracking and prioritization.
- Stay technically curious; research and implement new technical solutions.
- Support integration with multiple cloud service providers.
Preferred Skills:
- Operational experience with scripting languages (Python, Bash)
- Operational experience with S3 CLI and API scripting
- Operational experience with middleware and telecom
- Operational experience with Linux distributions
- Operational experience with Ethernet, TCP/IP, routing protocols, and general network communications
- Operational experience using or migrating continuous integration (CI) and continuous delivery (CD) pipeline solutions or tools, including Git, Jenkins, CodePipeline, and CodeCommit
- Operational experience with virtualization and container orchestration technologies, such as Kubernetes and Docker (including disconnected installation, Kubernetes administration, and Kubernetes troubleshooting)
- Operational experience with Kubernetes networking, load balancing, and pod security
- Operational experience with Kubernetes operational building blocks (kube-apiserver, kube-scheduler, kube-controller-manager, and etcd)
- Operational experience with Linux systems administration, networking, and troubleshooting
- Operational experience with AWS services including, but not limited to, EC2, ALB/NLB, S3, IAM, Auto Scaling, CloudWatch, RDS, CloudFront, API Gateway, Lambda, DynamoDB, Aurora, Elasticsearch, SSM, KMS, Kinesis, SNS, and SQS
- Operational experience with infrastructure or application automation
- Operational experience with IaC technologies like Terraform and CloudFormation, including activities around automated server and network configuration
- Operational experience supporting cloud-hosted systems in a 24x7 environment, including troubleshooting incidents, identifying root causes, fixing and documenting problems, and implementing preventative measures
- Operational experience with observability tools such as Datadog, ELK, Grafana, and Prometheus
Qualifications:
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related technical field; six years of related experience; or an equivalent combination of education and experience
- 2 or more years of experience supporting public cloud platforms
- Must have excellent verbal and written communication skills
- Operational understanding of EC2, RDS, S3, Lambda, SSM, security groups (SG), VPC, and Route 53 (R53)
- Operational experience with infrastructure as code (IaC) solutions and tools, such as Ansible, Terraform, and CloudFormation
- Conceptual knowledge of DevOps and agile methodologies
- 2 or more years of experience with cloud system integration, support, and automation
- Ability to work well under pressure and manage tight deadlines
- Proven track record of operational process change and improvement
- AWS certifications (Associate level and higher) a plus
- Kubernetes certifications a plus