Job Description

As a Site Reliability Engineer, you’ll play a critical role in designing scalable and reliable systems that meet the operational requirements of our organization.

  • Leverage your understanding of our system to assist in resolving production issues in real time
  • Bring your knowledge and perspective on reliability to preemptive (design review) and corrective (postmortem) discussions
  • Lead service reliability reviews and audits and present findings to stakeholders
  • Find patterns and pain points that hinder Wix’s availability, and produce large-scale solutions
  • Generate and prioritize tasks for infrastructure teams to aid in improving uptime and reducing blast radius
  • Analyze system performance and scalability requirements, identify bottlenecks, and propose and implement solutions that optimize system capacity

Qualifications:

  • You’re a senior R&D employee with 5+ years of experience managing large engineering projects
  • You’re experienced with monitoring, logging, and tracing mechanisms
  • You have an excellent understanding of how web applications work - from browsers and caches to the database, and back
  • You’re skilled in site reliability engineering principles, including scalability, availability, performance, and fault tolerance
  • You’re highly motivated by the idea of automating failure remediation processes
  • You’re great at jumping between multiple tasks and you know how to analyze risk and prioritize accordingly
  • You enjoy critical thinking and problem solving and are seasoned in conflict resolution
  • 3+ years of experience coding and/or running production systems in the cloud is a significant advantage

Education

Any graduate