Description

Data Engineer
Our client is seeking a skilled and experienced Data Engineer with expertise in the Microsoft technology stack and Pentaho integration to join their team. The ideal candidate will be passionate about data and possess strong technical skills to design, develop, and maintain data pipelines, databases, and infrastructure using Microsoft technologies while leveraging Pentaho for advanced data integration and analytics. The Data Engineer will collaborate closely with cross-functional teams to ensure efficient data flow, integrity, and accessibility across various systems and platforms.
Responsibilities:
Design and Develop Data Pipelines: Architect, build, and optimize scalable data pipelines using Microsoft technologies such as Azure Data Factory, Azure Databricks, and SQL Server Integration Services (SSIS), integrating Pentaho components for advanced data transformations, orchestration, and scheduling. Ingest, process, and transform structured and unstructured data from multiple sources, ensuring reliability, efficiency, and scalability.
Database Management: Design, implement, and manage databases using Microsoft SQL Server and Azure SQL Database, integrating Pentaho for enhanced data warehouse automation, management, and optimization. Ensure data integrity, security, and performance optimization.
Data Modeling: Develop and maintain data models, schemas, and metadata definitions that support business requirements and analytical insights, using tools like Azure Data Studio and SQL Server Management Studio, with Pentaho integration for advanced modeling and schema design capabilities. Implement best practices for data modeling and schema design.
Data Integration: Integrate data from various internal and external sources using Microsoft technologies such as Azure Data Factory and Azure Logic Apps, augmented with Pentaho for comprehensive data integration, cleansing, enrichment, and synchronization. Develop connectors and interfaces to facilitate seamless data integration.
Data Transformation and ETL: Perform data transformation, cleansing, and enrichment tasks using SQL queries, stored procedures, and SSIS packages within the Microsoft stack, with Pentaho integration for advanced ETL (Extract, Transform, Load) processes, data quality checks, and validation mechanisms to ensure accuracy and consistency.
Infrastructure Management: Manage and optimize infrastructure resources using Microsoft Azure services and tooling such as Azure Resource Manager (ARM) templates, Azure Virtual Machines, and Azure Kubernetes Service (AKS), incorporating Pentaho components for infrastructure orchestration, scalability, and monitoring.
Monitoring and Troubleshooting: Implement monitoring, logging, and alerting systems using Azure Monitor and Azure Log Analytics to track data pipeline performance within the Microsoft environment, with Pentaho integration for comprehensive monitoring and troubleshooting of Pentaho-based workflows and processes.

Documentation: Develop and maintain comprehensive documentation for data engineering processes, systems, and solutions, including architectural diagrams, data flow diagrams, technical specifications, and user guides. Document configuration details, deployment procedures, and troubleshooting steps for data pipelines, databases, and infrastructure components. Ensure documentation is up to date, accessible, and understandable for both technical and non-technical stakeholders. Collaborate with team members to review and refine documentation standards and templates.
Collaboration and Communication: Collaborate closely with data scientists, analysts, software engineers, and stakeholders to understand data requirements, define technical solutions using Microsoft Azure services and Pentaho integration capabilities, and deliver insights-driven products and services. Communicate effectively with both technical and non-technical stakeholders.
Qualifications:
Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
Proven experience as a Data Engineer or in a similar role, with a strong focus on building and optimizing data pipelines and infrastructure using the Microsoft technology stack.
Proficiency in programming languages such as SQL, Python, or C#, with experience in Microsoft Azure services and tools.
In-depth knowledge of Microsoft SQL Server, Azure SQL Database, Azure Data Factory, Azure Databricks, and SQL Server Integration Services (SSIS), with proficiency in integrating Pentaho for advanced data integration and analytics.
Experience with the Pentaho Data Integration (PDI) tool for designing and executing ETL processes, data integration, and workflow orchestration.
Familiarity with big data technologies and distributed computing frameworks on the Microsoft Azure platform, coupled with Pentaho integration for comprehensive big data analytics and processing.
Strong understanding of data modeling, schema design, and data governance principles, with expertise in leveraging Pentaho for advanced data modeling capabilities.
Excellent analytical, problem-solving, and troubleshooting skills, with keen attention to detail.
Ability to work independently and collaboratively in a fast-paced environment, with strong communication and interpersonal skills.

Education

ANY GRADUATE