Description

Maintain the production environment by monitoring availability and taking a holistic view of system health

Ensure highly resilient, low-latency designs that support business continuity in multi-region application deployments

Build software and systems to manage platform infrastructure and applications

Improve reliability, quality, and time-to-market of our suite of software solutions

Monitor at the application level, troubleshoot performance bottlenecks, and recommend application configurations

Optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve

Provide primary operational support and engineering for multiple large distributed software applications

Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding

Partner with development teams to improve services through rigorous testing and release procedures

Participate in system design consulting, platform management, and capacity planning

Create sustainable systems and services through automation and uplifts

Balance feature development speed and reliability with well-defined service level objectives

Basic Qualifications:
Bachelor's degree

6+ years of related experience

Ability to program (structured and object-oriented) in Python or Java

Strong experience with AWS network services (e.g., Route 53, CloudFront, Elastic Load Balancing)

Experience with AWS CloudFormation

Education

Any graduate