This is the latest SRE job description, as refined by the client, in case you didn't have it:
- General SRE experience, comfortable in development, DevOps, and Cloud Engineering areas
- Comfort leading these areas, mentoring others, and guiding the team toward better practices (e.g. runbooks, documentation, automation)
- Observability, Monitoring, Incident Management and Response
  - Handling incidents to resolution and post-mortem follow-ups
  - Proactively identifying and improving the areas with the greatest impact on system and application stability
  - Experience with tools like Datadog, PagerDuty
- Infrastructure and CI/CD Technologies
  - Spinnaker, Terraform, Jenkins
- Containerization tooling and practices
  - Docker
  - Nice-to-have: Kubernetes
- Comfort with AWS technology
  - Lambda, databases (Aurora, DynamoDB, RDS), SQS, Kinesis, S3
- Ability to be hands-on, both independently and in collaboration with the development team, to identify, define, and implement improvements across all the above areas
- An eye for security and related best practices
  - Identifying and prioritizing security concerns, implementing remediations, and guiding the team toward secure software practices
  - Proper handling of PII
- Experience with Java software development
  - Nice-to-have: Python, TypeScript/Node.js, Scala