Description

About the job

Designing, building and deploying data systems, pipelines, and applications

Playing a key part in defining and establishing data pipelines to produce reliable feature sets for Data Analytics and Reporting

Ingesting data from a variety of sources, from relational databases to unstructured data such as text and CSV documents

Setting up database connections to various cloud and on-premises databases using connection methods defined by the relevant tools and client technical requirement guidelines

Working closely with customers on everything from problem scoping and infrastructure provisioning to execution, deployment, and maintenance

Skills

Excellent hands-on experience with PySpark, Python, and Scala

Excellent hands-on experience with data lakes, Databricks, and EMR

Experience with AWS data orchestration tools and technologies

Ability to grasp business stakeholders' challenges and identify solutions for them

Proven success in communicating with users, other technical teams, and senior management to collect requirements, describe data modelling decisions and data engineering strategy

Passion for working with new technologies

Desired Skills and Experience

PySpark, Python, Scala, AWS, Data Lake, Healthcare Domain

Education

Any Graduate