Description
As a Data Engineer at Technix-Technology, you will be responsible for designing, building, and maintaining scalable data pipelines and data architectures on the AWS cloud platform. You will work closely with cross-functional teams to ensure data availability, quality, and performance for analytics and business intelligence solutions.
Responsibilities
- Design, develop, and maintain ETL (Extract, Transform, Load) processes and data pipelines on AWS for batch and real-time data (see the first sketch after this list).
- Implement data storage solutions, including data lakes and data warehouses, leveraging AWS services such as S3, Redshift, DynamoDB, RDS, Glue, EC2, and EMR, or third-party managed services such as Snowflake, Databricks, Redis, and Neo4j.
- Select and implement the most suitable data format (Parquet, ORC, Avro, JSON) for processing and accessing the data.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business objectives.
- Optimize and tune data processing and transformation jobs for performance and scalability.
- Ensure data security, compliance, and governance practices are followed.
- Monitor data pipeline health, troubleshoot issues, and implement proactive solutions (see the monitoring sketch after this list).
- Implement standardized, governed, and optimized data models in the consumption layer.
- Select and implement a CI/CD tool for the cloud environment, considering factors such as the preferred cloud provider, integration capabilities, ease of use, scalability, and pricing (e.g., Jenkins, GitLab CI/CD, AWS CodePipeline, Google Cloud Build, Bamboo, Drone, Travis CI, Chef, Puppet, or Bash scripts).
- Maintain documentation and best practices for data engineering processes and standards.
- Stay up to date with emerging AWS technologies and best practices in data engineering.
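For illustration, here is a minimal sketch of the batch ETL pattern referenced above: a PySpark job that reads raw JSON from an S3 landing zone, applies light cleansing, and writes partitioned Parquet to a curated layer. The bucket paths, column names, and app name are hypothetical placeholders, not part of the role; on AWS, this pattern would typically run on EMR or as a Glue job.

```python
# Hypothetical batch ETL sketch (PySpark). All paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-batch-etl").getOrCreate()

# Extract: raw JSON events from the S3 landing zone.
raw = spark.read.json("s3://example-landing-zone/orders/")

# Transform: de-duplicate, drop bad records, derive a partition column.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_total") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: columnar Parquet, partitioned by date, into the curated zone.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-zone/orders/"
)

spark.stop()
```

Parquet is used here because columnar formats generally suit analytical scan patterns; the same job could emit ORC or Avro instead, depending on the access pattern, per the format point above.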
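The pipeline-health monitoring duty is commonly backed by custom CloudWatch metrics that alarms can watch. Below is a minimal boto3 sketch; the namespace, metric names, and pipeline name are hypothetical.

```python
# Hypothetical monitoring sketch: publish per-run health metrics to CloudWatch.
import boto3

cloudwatch = boto3.client("cloudwatch")

def report_pipeline_run(pipeline: str, succeeded: bool, rows_loaded: int) -> None:
    """Publish success/failure and volume metrics for one pipeline run."""
    cloudwatch.put_metric_data(
        Namespace="DataPlatform/Pipelines",  # hypothetical namespace
        MetricData=[
            {
                "MetricName": "RunSucceeded",
                "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
                "Value": 1.0 if succeeded else 0.0,
                "Unit": "Count",
            },
            {
                "MetricName": "RowsLoaded",
                "Dimensions": [{"Name": "Pipeline", "Value": pipeline}],
                "Value": float(rows_loaded),
                "Unit": "Count",
            },
        ],
    )

report_pipeline_run("orders-batch-etl", succeeded=True, rows_loaded=10_000)
```

A CloudWatch alarm on RunSucceeded can then notify the on-call engineer before downstream consumers notice stale data.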
Requirements
- Proven experience as a data engineer with a focus on AWS cloud technologies.
- Strong knowledge of AWS data services, including S3, Redshift, Glue, EMR, EC2, DynamoDB, RDS, Lambda, and Athena.
- Proficiency in programming languages such as Python, Scala, or Java.
- Experience with ETL tools and frameworks, such as Apache Spark.
- Experience with data modeling and SQL.
- Knowledge of data governance, security, and compliance practices.
- Excellent problem-solving and communication skills.
- AWS certifications (e.g., AWS Certified Data Engineer) are a plus.