Description

Responsibilities

  • Solve key problems that may arise in production systems and create solutions to prevent incidents from recurring.
  • Identify the root cause of errors and exceptions and isolate it to the line of code that may be causing the problem.
  • Understand the health of the various tools used in the product/platform, isolate the root cause of a customer-reported alert to the tool that may be unhealthy, and perform level 1 and level 2 service restoration.
  • Drive P1 incidents to restoration, including sending communications to stakeholders.
  • Prepare operations reports.
  • Document known issues.

Qualifications

  • BS in computer science or equivalent with 6+ years, or MS in computer science or equivalent with 4+ years, of professional software support experience.
  • Experience with object-oriented languages such as Python or Java.
  • Experience with databases.
  • Experience with the incident management process and the ability to resolve level 1 and level 2 issues within the agreed organizational SLO.
  • A sense of ownership and pride in your performance and its impact on the company’s success
  • Critical thinking and problem-solving skills
  • Team player
  • Good time-management skills
  • Great interpersonal and communication skills

Nice to have

  • Knowledge of advanced networking technologies and services including MPLS, VPLS/VPWS, Ethernet, IP/VPN routing protocols and architectures, IP security/SSL, IP multicast, IPv6, and wired/wireless LAN infrastructures is a strong plus.
  • Hands-on knowledge of the Linux operating system (Ubuntu, CentOS, etc.)
  • Real-world experience with cloud technology such as AWS, Azure, or Google Cloud Platform.

Education

Any Graduate