Job Description:
We are looking for an experienced mid-level Cloud DevOps Engineer to join the Client's Digital Customer Mobile App Channel. In this position, you will be the Cloud DevOps Engineer for the Client's Customer Mobile App API journey to the cloud. Keen attention to detail, problem-solving abilities, and validated knowledge of secure DevOps are crucial. This role reports to the Tech Lead of the Client's Customer Mobile App Development Team.
JOB OVERVIEW AND RESPONSIBILITIES:
- Architect secure, reusable, highly available, and scalable DevOps cloud infrastructure and code pipelines
- Serve as the DevOps engineer for the Client's Mobile App cloud DevOps infrastructure
- Work with developers to determine and prioritize requirements
- Build new and reusable pipelines for development teams to support self-sufficiency in infrastructure and code deployment
- Partner with the internal Platform Ops teams to solve organizational needs
- Focus fully on cloud migration (little to no on-prem support beyond understanding the existing infrastructure)
- Architect reusable and agile pipelines, automate to enable developer independence
- Design and implement build, deployment, and configuration management
- Continuously review and improve implemented designs
- Build and test automation tools for infrastructure provisioning
- Handle code deployments in all environments
- Ensure capacity management, network engineering, and storage management are established for specific computing platforms
- Ensure engineering solutions are clearly communicated for implementation
- Ensure compliance with engineering policies, standards, and procedures within the functional team
- Take ownership of issues and act with a high sense of urgency when required
- Participate in problem management processes to resolve root cause of failures and improve performance and reliability of systems and networks
- Participate in troubleshooting of infrastructure and/or application related issues
- Use site monitoring and log monitoring tools, specifically Datadog
- Build and operate tools for monitoring performance and security
- Demonstrate an innovative mentality by staying aware of new developments in the technology space and identifying which new technologies to adopt to provide value to our business
- Produce lean and well-written technical project documentation and operational runbooks
- Maintain knowledge of current technology by reading technology periodicals, evaluating new technologies, and attending trade shows, technical seminars, and training sessions
- Perform other duties as assigned and required. Duties and responsibilities may change from time to time without notice and include, but are not limited to, the duties described above.
Top 5 Skill sets
- Strong understanding of AWS: CloudFormation scripts, GitHub Actions, and building CI/CD pipelines
- Experience managing and monitoring a large setup with over NA+ services
- Practical experience with High Availability and Cross Region DR Setup from scratch
- Expertise in Python scripting and the ability to learn and adopt new technology
- Ability to work On Call
REQUIRED QUALIFICATIONS - KNOWLEDGE/SKILLS
- BS/BA in Information Technology, engineering, software engineering, related field or equivalent work experience required.
- 4+ years of combined experience in DevOps and/or operations, or relevant IT experience
- Cloud Infrastructure as Code, Automated Deployment Pipelines, Code Version Control, Infrastructure provisioning
- Docker Containerization and hosting (building images, hosting in scalable cloud platforms such as Kubernetes, AWS EKS or ECS Fargate)
- .NET Core, Linux, Windows, Database SQL (MS SQL) or NoSQL technologies in the cloud
- CI/CD tools (CloudFormation, GitHub Actions, GitHub, Terraform, TeamCity/Jenkins, Harness, Artifactory)
- Automated Code Scanning (SAST and DAST, Wiz, Veracode)
- Scripting languages such as JavaScript, PHP, Python, Shell, PowerShell, .NET
- High-volume logging solutions (OpenSearch, Kafka, Fluent Bit/Fluentd)
- Manage monitoring of overall application availability, latency and system health
- Determine alert standards for production environments and implement them
- Work with the development team and management to ensure high availability
- Experience with automated build pipelines and continuous integration, including source control, branching, and merging (Git, SVN, etc.) and repository management
- Supporting 24/7, high-volume, business-critical infrastructure
- Integrity and Trust: Is widely trusted and seen as a direct, truthful individual; can present the unvarnished truth in an appropriate and helpful manner; keeps confidences; admits mistakes; and doesn’t misrepresent themselves for personal gain.
- Teamwork: Works well in a collaborative setting, volunteers for and completes assignments, and acts as a positive team member by contributing to discussions and developing and maintaining relationships.