Job Description:
We are seeking a skilled Data Engineer with 4+ years of experience in data engineering and strong development expertise in Java and Python. The ideal candidate will have hands-on experience designing, building, and maintaining data pipelines, as well as working with a wide range of data technologies, including Kafka, S3, CDC (Change Data Capture), Spark, and SQL. The candidate should also be proficient with container technologies such as Docker and have a strong understanding of best practices for scalable, efficient data processing.
Responsibilities:
● Design, develop, and maintain robust data pipelines using a combination of Java, Python, and relevant data technologies.
● Implement and manage real-time data streaming using Kafka and CDC processes to ensure efficient data flow.
● Work with S3 for scalable storage solutions and manage the integration with data pipelines.
● Develop and optimize big data workloads using Spark for distributed processing and SQL for querying.
● Ensure smooth setup and maintenance of containerized environments using Docker.
● Collaborate with cross-functional teams to understand data needs and provide data solutions that align with business requirements.
● Troubleshoot and resolve issues in existing pipelines and ensure data quality and integrity.
● Stay up-to-date with the latest industry trends in data engineering and recommend best practices.
Required Skills and Qualifications:
● 4+ years of experience in data engineering with a focus on data pipeline setup and management.
● Strong development experience in Java and Python.
● Hands-on experience with Kafka, S3, CDC, Spark, and SQL.
● Proficient with container technologies like Docker.
● Experience working in cloud environments and handling large datasets.
● Solid understanding of data architecture and real-time processing.
● Strong problem-solving skills and the ability to work independently as well as part of a team.
Education: Any Graduate