Responsibilities:
- Lead the implementation of analytical data lakes in AWS, following accepted best practices
- Intimate knowledge of all phases of AWS infrastructure and concepts such as VPC, subnets, security groups, S3 object stores, RDS, EC2, Glacier, Lambda, IAM, enterprise security, data security, encryption, DevOps, replication and disaster recovery
- Deep experience with security in cloud environments around HIPAA, PHI/PII data, data encryption at rest and in transit, as well as security concepts like tokens, federated security models and secrets management
- Support development of a highly scalable and extensible Big Data platform on cloud which enables collection, storage, modeling, and analysis of massive data sets from numerous channels with an emphasis on proper security and data governance best practices
- Familiarity with DevOps and CI/CD as well as Agile tools and processes including Git, Jenkins, Jira and Confluence
- Provide support to Data Engineering teams on deployment of Hadoop/Spark jobs, work with IT Operations and Information Security teams on monitoring and troubleshooting of incidents to maintain service levels, and coordinate with Vendor teams on installation, bug fixes, upgrades and escalations.
- Benchmark systems and analyze system bottlenecks in a hybrid environment and propose solutions to eliminate them.
- Research and provide recommendations on automating administration tasks and contribute to the evolving systems architecture to meet changing requirements for scaling, reliability, performance and manageability.
- Provide mentoring, knowledge transfer and assist in training for other team members.
- Understand business goals and drivers and translate those into an appropriate technical solution
- Identify explicit and implicit technical assumptions and devise experiments to validate the key resiliency assumptions.
- Participate in the definition of high availability and resilience standards and best practices for services from AWS and other internal/external providers.
- Participate in the assessment of service readiness for tiered business applications and service adoption.
- Participate in the establishment of appropriate monitoring and alerting of service events related to performance, scalability, availability, and reliability.
- Contribute to cloud strategy discussions and decisions on overall Cloud design and best approach for implementing cloud solutions.
- Focus on continuous improvement practices as required to meet system resiliency imperatives
- Act as a liaison with other architects (security, infrastructure, data, etc.) and with delivery teams working primarily within an Agile (Scrum) methodology
- Establish relationships with IT leaders, architects, and technical specialists in advancing proposed resiliency solutions
- Support the adoption of DevOps methodology and Agile project management
- Communicate complicated technical concepts effectively to a broad group of stakeholders.
- Engage with Technical Architects and technical staff to determine the most appropriate technical strategy and designs to meet business needs.
- Demonstrate broad technical leadership across solutions, shaping significant technical direction, exerting influence outside of the immediate team and driving change.
Qualifications:
- Extensive experience in administering on-premises Hadoop and Spark cluster environments with Cloudera’s distribution is required, as well as experience in administering a hybrid environment.
- Intimate knowledge of all phases of AWS infrastructure and concepts such as VPC, subnets, security groups, S3, RDS, EC2, Glacier, Lambda, IAM, security, encryption, DevOps, replication and disaster recovery.
- Knowledge of analytical workspaces such as Jupyter, Zeppelin, RStudio as well as data science analytical methodologies.
- Familiarity with DevOps and CI/CD as well as Agile tools and processes including Git, Jenkins, Jira and Confluence.
- Ability to elicit requirements and communicate clearly with non-technical individuals, development teams, and other ancillary project members.
- Proficient in authoring, editing and presenting technical documents.
- Strong written and oral communication skills; ability to communicate effectively with technical and non-technical staff.
- Desire to mentor younger team members and develop their skills.
- Experience working on multiple concurrent projects.
- Excellent problem-solving skills.
- Be independent and self-driven.
- Bachelor’s degree in Computer Science or related field.
- Must be open to travel.
Preferred skills and education:
- Master’s degree in Computer Science or related field
- Certification in AWS preferred
- Ability to work independently, with minimal supervision.
- Strong knowledge of AWS cloud environment
- Knowledge of AWS cloud monitoring tools (CloudWatch, CloudTrail, Splunk, and other application monitoring tools)
- Knowledge of AWS Identity and Access Management (IAM)
- In-depth, hands-on expertise in Java, MySQL, Linux
- Ability to estimate the financial impact of various solution architecture alternatives
- Must be comfortable working in an open, highly collaborative team
- Strong troubleshooting skills
- Ability to write scripts (Bash, PHP, Python) for automation of solution resiliency validation and verification.
- Excellent oral and written communication skills along with the ability to communicate at all levels.
- Experience with provisioning and configuration tools like AWS CloudFormation/Terraform a plus
- Chaos engineering experience a huge plus.