Job Description
· Experience with Linux environments and Bash scripting.
· Experience with Kafka operations and administration.
· The individual will be asked to lead triage sessions with multiple tenants.
· Must have the ability to review client application configuration and architecture.
· Provide onboarding support to prospective tenants in a one-on-one setting.
· Participate in and lead regular engagement forums to interact with tenants and answer their questions.
· The individual will be part of the L3 support team and is expected to handle support activities on weekends and outside business hours.
· Experience with monitoring tools: Grafana, Prometheus, ELK, CloudWatch, and others.
· Must have the ability to query logs for patterns and to write dashboard queries (PromQL).
· Experience with developer tools: Git, Maven, Gradle.
· Must have experience contributing to code repositories to meet automation requirements (pull, push, merge, rebase, clone, and other Git essentials).
· Familiarity with working in an Agile structure (Scrum or Kanban) using project management tools: Jira, Bitbucket.
· Ability to produce quality documentation and diagrams.
· Regular activities will include creating and revising runbooks in Confluence.
Essential Skills/Basic Qualifications:
All skills listed are deemed necessary:
· Ability to break down complex problems using appropriate design patterns and best practices, demonstrating a strong knowledge foundation and a sound approach to problem-solving.
· Ability to build automated pipelines that integrate and transport software through reliable, secure, and governed channels from conception to production.
· Ability to build a suite of real-time monitoring and automation software that encompasses resilient and self-healing platform operations.
· Ability to conduct effective triage and remediation of production incidents, with accompanying customer service and support for all clients of the platform.
Any Graduate