Description

Required:
• Strong experience in relational and non-relational data architecture
• Strong experience in data classification based on data type
• Strong experience working with Data Scientists, Architects, and business experts to clarify requirements for components and features of the Data Lake. Ability to gather requirements from the various consumers of the Data Lake is strongly desired.
• Strong expertise in designing and developing metadata-driven Data Lake solutions for ingesting historic and near real-time structured, semi-structured, and unstructured data
• 10 or more years of experience in data management
• Minimum 2 years of experience managing changes in an enterprise-grade operational Data Lake
• Hands-on experience developing systems that leverage multiple AWS services, including CloudFormation, Lambda, API Gateway, S3, DynamoDB or any other NoSQL database, Relational Database Service (RDS), Glue Catalog, Glue Crawlers, Glue ETL jobs, Athena, and QuickSight, for data management and analysis
• Strong experience designing and implementing ETL and ELT pipelines is a must
• Experience with AWS EMR is a plus
• Expertise in using tools like Gliffy to create diagrams representing business processes, ERDs, and AWS architecture is a plus
• Experience in developing solutions for data Ingestion, Transformation, Cataloging, In-Place Querying, Storage, and Security using AWS tools and best practices
• Experience in integrating relational/NoSQL databases into an Enterprise Data Lake is a plus
• Experience setting up AWS CloudWatch and CloudTrail for monitoring and optimizing Data Lake environments is a plus
• Proficient in Python, PySpark, and AWS Glue ETL jobs
• Experience writing SQL queries in AWS Athena and mapping relational database queries to Athena
• Experience in de-normalizing/flattening data structures in Parquet/ORC is nice to have
• Good understanding of AWS IAM policies for implementing security best practices in the Data Lake
• Follow and enforce strict standards for code quality, automated testing, infrastructure-as-code and code maintainability
• Robust debugging skills and knowledge of automated testing platforms and unit tests
• Ability to work in an agile, collaborative environment, take the lead on developing stories, and translate requirements into problem statements
• Experience using Agile collaboration tools such as Jira and Confluence
• Strong analytical skills, problem-solving aptitude, and strong presentation and communication skills
• AWS certification is a plus

Education

Bachelor's degree