Description

Job Title: Cloud Operations Engineer

Location: Fort Worth, TX (Hybrid)

Duration: Contract (1+ years)

              

Key Responsibilities:

  • Incident and System Management: Collaborate with internal teams and suppliers to analyze and resolve critical IT and Telecom service interruptions, and protect system availability through incident, problem, and change management.
  • System Monitoring and Optimization: Monitor systems for faults, identify optimization opportunities, and implement tools and process changes to improve monitoring and alerting.
  • Incident Response and Root Cause Analysis: Work with major incident response teams on escalations and monitoring during major incidents.

 

Qualifications & Experience:

  • Bachelor’s degree in Computer Science, Information Systems, or Engineering preferred.
  • Solid understanding of cloud architecture and DevOps principles.
  • Strong experience in event monitoring and alerting, DevOps, infrastructure support, or IT major incident management.
  • Experience with monitoring tools (Dynatrace, CloudWatch, Zabbix, SCOM).
  • Experience with DevOps application performance tuning.
  • Strong writing skills for documentation.
  • Proficiency in distributed systems administration (Windows, Unix, Linux, VMware, etc.).
  • Knowledge of ITIL best practices (certification is a plus).
  • Familiarity with the software development lifecycle (SDLC).
  • Experience in SLA/KPI-driven environments.
  • ServiceNow proficiency.
  • General scripting/programming skills (Python, Node.js, Ruby, Perl, Bash/sh).
  • Availability: Able to work in a 24/7 environment and provide on-call support.


 

Education

Bachelor's degree