Job Description

· Experience with Linux environments and Bash scripting.

· Experience with Kafka operations and administration.

· The individual will be asked to lead triage sessions with multiple tenants.

· Must have the ability to review client application configuration and architecture.

· Provide onboarding support to prospective tenants in a one-on-one setting.

· Participate in and lead regular engagement forums to interact with tenants and answer their questions.

· The individual will be part of the L3 support team and is expected to handle support activities on weekends and outside business hours.

· Experience with monitoring tools: Grafana, Prometheus, ELK, CloudWatch, and others.

· Must have the ability to query logs for patterns and to write dashboard queries (PromQL).

· Experience with developer tools: Git, Maven, Gradle.

· Must have experience contributing to code repositories to meet automation requirements (pull, push, merge, rebase, clone, and other Git essentials).

· Familiarity with working in an Agile structure (Scrum or Kanban) and with project management tools: Jira, Bitbucket.

· Ability to produce quality documentation and diagrams.

· Regular activities will include creating and revising runbooks on Confluence pages.

 

Essential Skills/Basic Qualifications:

All skills listed are deemed necessary:

· Ability to break down complex problems using appropriate design patterns and best practices, demonstrating a strong knowledge foundation and a sound approach to problem-solving.

· Ability to build automated pipelines that integrate and transport software through reliable, secure, and governed channels from conception to production.

· Ability to build a suite of real-time monitoring and automation software that supports resilient, self-healing platform operations.

· Ability to conduct effective triage and remediation of production incidents, with accompanying customer service and support for all clients of the platform.

Education

Any Graduate