Core skills needed:
Data Engineering tools and techniques (any ETL/ELT skills are good to have)
Python, XML, XPath
SQL
AWS Lambda, SQS, Fargate, S3
DevOps
Must have 10-12 years of experience
Proficiency in Python
Proficiency in Linux/Unix environments
Experience building applications for public cloud environments (AWS preferred)
Experience with AWS DevOps tools (git, Cloud Development Kit, CDK Pipeline) is highly desired
Experience building applications using AWS Serverless technologies such as Lambda, SQS, Fargate, S3 is highly desired
Experience with Apache Airflow is a plus
Experience building containerized applications (Docker, Kubernetes) is a plus
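As a purely illustrative sketch of the AWS Serverless skills listed above (not part of the role description), a minimal Lambda handler consuming an SQS batch might look like the following; the function name and message fields are hypothetical, and SQS's event shape (a "Records" list whose items carry a "body" string) is the only AWS-specific assumption.

```python
import json

def handler(event, context):
    """Hypothetical Lambda entry point for an SQS-triggered function.

    SQS delivers messages in batches under event["Records"]; each
    record's "body" holds the raw message payload, assumed here to
    be JSON with an "id" field.
    """
    processed = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])  # assumed JSON message body
        processed.append(payload.get("id"))
    return {"batchSize": len(processed), "ids": processed}
```

Such a handler can be exercised locally by passing a hand-built event dictionary, without deploying to AWS.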
Position Responsibilities
The Data Engineer is responsible for developing a Life Sciences content curation and delivery system for the purpose of building life sciences databases.
Develops data transformation and integration pipelines and the infrastructure foundations for life sciences content in support of scientific databases and data curation.
Combines strong software development and data engineering skills with a working knowledge of basic biology/chemistry/physics to develop sophisticated informatics solutions that drive efficiencies in content curation and workflow processes. Applies data transformation and other data engineering software development capabilities to help build new scientific information management systems supporting scientific database-building activities.
Competencies/Technologies
Experience with database technologies (NoSQL, relational, property graph, RDF/triple store)
Experience with data engineering tools and techniques is highly desired
Experience working with XML and XPath is highly desired
Experience with MarkLogic/XQuery is a plus
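To illustrate the XML/XPath skills called out above, the following sketch queries a small, made-up life-sciences XML fragment using Python's standard-library xml.etree.ElementTree, which supports a limited XPath subset (full XPath 1.0 would typically require a library such as lxml). The element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# A small, made-up life-sciences XML fragment for illustration.
doc = """
<compounds>
  <compound id="C1"><name>Aspirin</name><formula>C9H8O4</formula></compound>
  <compound id="C2"><name>Caffeine</name><formula>C8H10N4O2</formula></compound>
</compounds>
"""

root = ET.fromstring(doc)

# ElementTree's limited XPath subset covers child paths and
# attribute predicates, which is enough for simple extraction.
names = [el.text for el in root.findall("./compound/name")]
aspirin_formula = root.find("./compound[@id='C1']/formula").text
```

In a curation pipeline, the same pattern scales to streaming parsers (for example ET.iterparse) when documents are too large to hold in memory.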
Education: Any Graduate