As a Senior Data Engineer, you will work with the team to build complex data pipelines using a new, cutting-edge Data Ingestion Platform, as well as several new products in the Big Data/Advanced Analytics space. You will operate in a fast-paced environment where multiple project deliverables must be coordinated and met within specified deadlines. The primary responsibility of this role is to deliver accurate and timely data across various products to enable revenue for the company. The ideal candidate has data analysis and data pipeline development experience with big data tools (such as Spark, Scala, and Hive) and relational databases, along with intermediate SQL skills. Healthcare knowledge is a plus.
Required Skills -
• MUST: 5+ years of experience with data aggregation, standardization, linking, quality check mechanisms, and reporting.
• MUST: 5+ years of experience with big data technologies such as Hadoop and Spark.
• MUST: 5+ years of experience with RDBMS (Oracle, MS SQL Server) and with SQL or other data integration/ETL tools.
• MUST: Solid understanding of Linux environments; strong knowledge of shell scripting and file systems.
• Bachelor’s degree in a relevant field such as Computer Science, Engineering, or a related discipline, with 8-10 years of industry experience.
Job Duties -
Principal Responsibilities and Essential Duties:
• Build data pipelines, per data transformation specifications, that convert source data for loading into the data lake using a proprietary big data processing platform.
• Support and improve current data ingestion processes for our proprietary healthcare data applications and systems.
• Develop and maintain data engineering processes using a variety of tools, including T-SQL, Spark and Scala, and shell scripting. The work generally focuses on data ingestion for healthcare data management, data validation, statistical report generation, and program validation.
• Develop tools and techniques for improving process efficiencies and data performance.
• Review and test data to ensure its accuracy and validity before uploading it to the data lake.
• Data Troubleshooting and Analysis
• Perform data analysis, data mining, and investigations to identify the root cause of issues using several cutting-edge data analysis tools.
• Work with Technical Operations to troubleshoot complex database issues across the entire environment, including OS, storage, and servers. Provide off-hours support to resolve production issues when necessary.
• Mentor junior team members in data engineering and quality best practices.
• Oversee the delivery of business priorities in a Scaled Agile Framework (SAFe) environment.