Job Purpose
As a Development and Operations Engineer IV (P4), you will technically lead a team that builds and maintains cutting-edge enterprise systems and services responsible for digital transformation within Trimble. This role is ideal for someone passionate about building and maintaining highly scalable cloud-native backend systems and services that provide a transformative customer experience.
A DevOps Engineer IV is a subject matter expert with substantial experience building responsive, customer-facing applications using recent technologies and frameworks. The engineer is ready to lead the team, perform support operations, lean into multi-cloud discussions, and stay abreast of constantly changing technologies.
A DevOps Engineer IV works closely with Project Managers, Product Managers, and Architects to assimilate system requirements and to conduct a technical study of each requirement, either independently or as part of a task force, in order to arrive at a work estimate for delivering it.
The DevOps Engineer IV is responsible for conducting performance reviews, independently or jointly with the manager, for the team members in their organization.
Key Responsibilities
Provisioning and maintaining infrastructure in AWS/Azure cloud.
Maintain and improve the current CI/CD pipeline (GitHub Actions workflows and Jenkins)
Apply strong Python knowledge.
Develop new Terraform modules as required.
Application deployment automation using Ansible
Review and maintain IaC repositories in Bitbucket
Fix AWS/Azure non-compliance with security controls and best practices wherever possible
Strong emphasis on DevOps as an engineering discipline with a focus on automation
Handle escalations from internal stakeholders and manage critical issues to resolution
Manage and provide leadership and guidance to a high-performing global team of Site Reliability Engineers.
Teach teams how to adopt reliability engineering practices such as error budgets, blameless retrospectives, and chaos engineering.
Identify problems and opportunities for improvements that are common across many teams and services.
Develop services to handle automatic recovery from incidents and disasters.
Participate in troubleshooting, capacity analysis and planning, and performance analysis
Design cost controls and roll out the cost optimization strategy
Respond to incidents while on call with quick and effective resolutions
Fix compliance issues and address requirements raised by SecOps tools
Required Skills And Experience
Minimum of 10 years of experience in technical and people management.
History of supporting applications, services and infrastructure in Production
Experience in Capacity planning and Cost optimization
Experience with AWS/Azure services
Deep understanding of Linux/Unix operating systems
Experience using a high-level scripting language (Python preferred) and IaC tools (Terraform, CloudFormation)
Containerization experience, from deploying and maintaining the platform to creating and deploying services on it
Experience with SaaS monitoring toolsets (Datadog, SumoLogic, PagerDuty, ELK, InfluxDB, Grafana)
Solid experience in cloud server infrastructure setup and configuration
Excellent written and verbal communication and interpersonal skills in a strong-matrix organizational environment.
Familiarity with commonly accepted software development processes and methodologies.
Desirable Skills And Experience
Azure/AWS Certification (or equivalent in another public cloud)
Experience with microservice architecture
Above-average skills in Python or another high-level programming language
Experience with Atlassian tools: Bitbucket, Jira, and Confluence
Experience with Ansible and Packer
Experience using SQL and NoSQL databases
Experience with Jenkins/Bamboo and Gradle for CI/CD
Experience with Kubernetes is an added advantage
Any Graduate