Job Description:
• Responsible for setting up the OnePOC (Operations Center) and running it effectively with an SRE-led approach.
• Enable the POC with all required dashboards from OnePOC products such as Grafana, ELK, NetBrain, SevOne, AppD, and ServiceNow ITOM.
• Ensure full-stack observability with narrative insights and proactive remediation of anomalies.
• Drive adoption of automation bots and conversational AI to avoid manual effort.
• Understand Service Mapping, Event Management and Correlation concepts.
• Ensure turnaround time (TAT) reduction across MTTD, MTTI, MTTr, and MTTR, and take responsibility for preventing and proactively capturing major incidents at the OnePOC.
• Ensure the OnePOC team is fully equipped with all dashboards and telemetry data needed to analyze alerts, metrics, and logs.
• Collaborate with the Service Operations (Compute, Cloud, Network) teams to provide input on root cause and ensure a permanent fix.
• Collaborate with the Tools team to ensure monitoring covers the entire footprint and to identify fine-tuning and optimization opportunities.
• Ensure zero reactive MIMs, and collaborate with other stakeholders to capture and channel proactive MIMs before users report them.
Skills:
• ITIL v3 or ITIL 4 Foundation, plus certification (mandatory).
Any Graduate