Job Description
We are seeking a talented Data Science Engineer to join our team and contribute to the development and implementation of advanced data solutions using technologies such as AWS Glue, Python, Spark, Snowflake Data Lake, S3, SageMaker, and machine learning (ML).
As a Data Science Engineer, you will play a crucial role in designing, building, and optimizing data pipelines, machine learning models, and analytics solutions. You will work closely with cross-functional teams to extract actionable insights from data and drive business outcomes.
Key Responsibilities
Develop and maintain ETL pipelines using AWS Glue for data ingestion, transformation, and integration from various sources
Utilize Python and Spark for data preprocessing, feature engineering, and model development
Design and implement data lake architecture using Snowflake Data Lake, the Snowflake data warehouse, and S3 for scalable and efficient storage and processing of structured and unstructured data
Leverage SageMaker for model training, evaluation, deployment, and monitoring in production environments
Collaborate with data scientists, analysts, and business stakeholders to understand requirements, develop predictive models, and generate actionable insights
Conduct exploratory data analysis (EDA) and data visualization to communicate findings and trends effectively
Stay updated with advancements in machine learning algorithms, techniques, and best practices to enhance model performance and accuracy
Ensure data quality, integrity, and security throughout the data lifecycle by implementing robust data governance and compliance measures
Qualifications
Bachelor's degree or higher in Computer Science, Data Science, Statistics, or a related field
5-6 years of experience with AWS services such as Glue, S3, and SageMaker, and with Snowflake Data Lake
Strong programming skills in Python for data manipulation, analysis, and modeling
Experience with distributed computing frameworks like Spark for big data processing
Knowledge of machine learning concepts, algorithms, and tools for regression, classification, clustering, and recommendation systems
Familiarity with SQL and NoSQL databases and with data warehousing concepts