Role:
Design, implement, and lead Data Architecture, Data Quality, and Data Governance across the organization:
Define data modeling standards and foundational best practices.
Develop and evangelize data quality standards and practices.
Establish data governance processes, procedures, policies, and guidelines to maintain the integrity and security of the data.
Drive successful, organization-wide adoption of data utilization and self-service data platforms.
Create and maintain critical data standards and metadata that allow data to be understood and leveraged as a shared asset.
Develop standards and write template code for sourcing, collecting, and transforming data in streaming and batch pipelines.
Design data schemas, object models, and flow diagrams to structure, store, process, and integrate data.
Provide architectural assessments, strategies, and roadmaps for data management.
Apply hands-on subject matter expertise in the architecture and administration of Big Data platforms and data lake technologies (AWS S3/Hive), drawing on experience with ML and data science platforms.
Implement and manage industry best-practice tools and processes such as data lakes, Databricks, Delta Lake, S3, Spark ETL, Airflow, Hive Catalog, Redshift, Kafka, Kubernetes, Docker, and CI/CD.
Translate big data and analytics requirements into data models that operate at large scale and high performance, and guide data analytics engineers in working with these models.
Define templates and processes for the design and analysis of data models, data flows, and integration.
Lead and mentor Data Analytics team members on best practices, processes, and technologies for data platforms.
Mandatory Qualifications:
B.S. or M.S. in Computer Science or an equivalent degree.
10+ years of hands-on experience in Data Warehousing, ETL, Data Modeling, and Reporting.
7+ years of hands-on experience productionizing and deploying Big Data platforms and applications, including hands-on work with relational/SQL databases, distributed columnar data stores/NoSQL databases, time-series databases, Spark Streaming, Kafka, Hive, Delta Parquet, Avro, and more.
Extensive experience in understanding a variety of complex business use cases and modeling the corresponding data in the data warehouse.
Highly skilled in SQL, Python, Spark, AWS S3, Hive Data Catalog, Parquet, Redshift, Airflow, and Tableau or similar tools.
Proven experience building a custom Enterprise Data Warehouse or implementing tools such as Data Catalogs, Spark, Tableau, Kubernetes, and Docker.
Knowledge of infrastructure requirements such as networking, storage, and hardware optimization, along with hands-on experience with Amazon Web Services (AWS).
Strong verbal and written communication skills and the ability to work effectively across internal and external organizations and virtual teams.
Demonstrated industry leadership in the fields of Data Warehousing, Data Science, and Big Data-related technologies.
Strong understanding of distributed systems and container-based development using Docker and the Kubernetes ecosystem.
Deep knowledge of data structures and algorithms.
Experience working in large teams using CI/CD and agile methodologies.