Maintain the production environment by monitoring availability and taking a holistic view of system health
Ensure highly resilient, low-latency, business-continuity designs in multi-region application deployments
Build software and systems to manage platform infrastructure and applications
Improve reliability, quality, and time-to-market of our suite of software solutions
Monitor at the application level, troubleshoot performance bottlenecks, and recommend application configurations
Optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Provide primary operational support and engineering for multiple large distributed software applications
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
Partner with development teams to improve services through rigorous testing and release procedures
Participate in system design consulting, platform management, and capacity planning
Create sustainable systems and services through automation and uplifts
Balance feature development speed and reliability with well-defined service level objectives
Basic Qualifications:
Bachelor's degree
6+ years of related experience
Ability to program (structured and object-oriented) in Python or Java
Strong experience with AWS networking services (e.g., Route 53, CloudFront, Elastic Load Balancing)
Experience with AWS CloudFormation