Job Description:

We are seeking a skilled Data Engineer with 4+ years of experience in data engineering and strong development skills in Java and Python. The ideal candidate will have hands-on experience designing, building, and maintaining data pipelines, as well as working with a wide range of data technologies, including Kafka, S3, CDC (Change Data Capture), Spark, and SQL. The candidate should also be proficient with container technologies such as Docker and have a strong understanding of best practices for scalable and efficient data processing.

Responsibilities:

● Design, develop, and maintain robust data pipelines using a combination of Java, Python, and relevant data technologies.

● Implement and manage real-time data streaming using Kafka and CDC processes to ensure reliable, low-latency data flow.

● Use S3 for scalable storage solutions and manage its integration with data pipelines.

● Develop and optimize big data workloads, using Spark for distributed processing and SQL for querying.

● Ensure smooth setup and maintenance of containerized environments using Docker.

● Collaborate with cross-functional teams to understand data needs and provide data solutions that align with business requirements.

● Troubleshoot and resolve issues in existing pipelines and ensure data quality and integrity.

● Stay up to date with the latest industry trends in data engineering and recommend best practices.

Required Skills and Qualifications:

● 4+ years of experience in data engineering with a focus on data pipeline setup and management.

● Strong development experience in Java and Python.

● Hands-on experience with Kafka, S3, CDC, Spark, and SQL.

● Proficient with container technologies like Docker.

● Experience working in cloud environments and handling large datasets.

● Solid understanding of data architecture and real-time processing.

● Strong problem-solving skills and the ability to work independently as well as part of a team.

Education

Any graduate (bachelor's degree in any discipline)