Description

Role and Responsibilities:

• Deploy, maintain, enhance, and monitor highly scalable infrastructure for a data processing platform using Kubernetes

• Use AWS Cloud and open-source services to address critical business needs

• Ensure the 24/7 availability of the system, with proper alerting and monitoring

• Identify and fix bugs and performance issues in the platform.

• Work with agile teams on setting error budgets, root cause analysis exercises, and blameless post-mortems

• Use continuous integration and delivery (CI/CD) with GitLab CI, Jenkins, Argo CD, Artifactory, and Docker

• Monitor data pipelines and applications, and handle failure recovery

• Set up and monitor application access and connectivity

• Advocate for a DevOps culture of automation, self-service, and engineering best practices to enable development teams

• Configure autoscaling and monitor performance for Kubernetes and the applications running on it, using Prometheus and Grafana or similar tools

• Perform SRE activities such as availability and reliability monitoring and reporting

• Tune, monitor, and configure tools such as Kafka, Spark, Presto, Airflow, and MQTT

• Manage infrastructure as code with Terraform

• Operate and maintain code repositories in GitLab.

Required Qualifications:

• Bachelor’s degree in Computer Science or Computer Engineering

• Minimum of 5 years of experience in DevOps engineering or software development.

• Strong coding and scripting experience with Bash, Python, Go or similar languages.

• Comprehensive experience with AWS including a solid understanding of CI/CD, Amazon S3, EC2, IAM, CloudFormation and Route 53

• Experience with user access, authentication, permission management, and security using LDAP, AD, OIDC, and Kerberos

• Experience with secure infrastructure networking with AWS using different types of Load Balancers, setting up VPCs, subnets, and routing tables

• Experience with auto scaling, performance testing and capacity planning.

• Experience with tools such as Jenkins and Artifactory to build automation, CI/CD, and self-service pipelines.

• Experience owning infrastructure in production, as well as designing and creating build/deploy & monitoring systems using CloudFormation/Terraform

• Experience with RESTful services, the pub/sub communication model, service-oriented architecture, distributed systems, cloud systems (AWS), and microservices architecture platforms.

 

Preferred Qualifications:

• Master’s degree in Computer Science or Computer Engineering

• Experience with configuration management tools such as Puppet, Chef, Kustomize, or Ansible

• Experience with containerization and scheduling, with Docker and Kubernetes.

• Strong distributed systems implementation experience

• Experience with AWS Direct Connect or setting up and maintaining a hybrid cloud

• Experience tuning storage classes, lifecycle rules, instance classes, and throughput to reduce cost without sacrificing performance

• Experience in backend services deployment and management

Education

Bachelor’s Degree