Job Description

We are seeking a talented Data Science Engineer to join our team and contribute to the development and implementation of advanced data solutions using technologies such as AWS Glue, Python, Spark, Snowflake Data Lake, S3, SageMaker, and machine learning (ML).

As a Data Science Engineer, you will play a crucial role in designing, building, and optimizing data pipelines, machine learning models, and analytics solutions. You will work closely with cross-functional teams to extract actionable insights from data and drive business outcomes.

Key Responsibilities

Develop and maintain ETL pipelines using AWS Glue for data ingestion, transformation, and integration from various sources

Utilize Python and Spark for data preprocessing, feature engineering, and model development

Design and implement a data lake architecture using Snowflake (data lake and data warehouse) and S3 for scalable, efficient storage and processing of structured and unstructured data

Leverage SageMaker for model training, evaluation, deployment, and monitoring in production environments

Collaborate with data scientists, analysts, and business stakeholders to understand requirements, develop predictive models, and generate actionable insights

Conduct exploratory data analysis (EDA) and data visualization to communicate findings and trends effectively

Stay updated with advancements in machine learning algorithms, techniques, and best practices to enhance model performance and accuracy

Ensure data quality, integrity, and security throughout the data lifecycle by implementing robust data governance and compliance measures

Qualifications

Bachelor's degree or higher in Computer Science, Data Science, Statistics, or a related field

5-6 years of experience with AWS services such as Glue, S3, and SageMaker, and with Snowflake Data Lake

Strong programming skills in Python for data manipulation, analysis, and modeling

Experience with distributed computing frameworks like Spark for big data processing

Knowledge of machine learning concepts, algorithms, and tools for regression, classification, clustering, and recommendation systems

Familiarity with SQL and NoSQL databases and with data warehousing concepts

Education

Bachelor's degree