Description

Job Code: EWC - 1459

Responsibilities: 

- Hands-on building of ETL pipelines using our internal framework written in Java and Python
- Hands-on design and development of real-time REST APIs or other solutions for streaming data from Graph
- Modify existing application code or interfaces, or build new application components from detailed requirements
- Analysis of requirements, support of the design, development of the code, testing, debugging, deployment, and maintenance of those programs and interfaces; documentation of the work is essential
- Participation in most aspects of programming and application development, including file design, update, storage, and retrieval
- Enhance processes to resolve operational problems and add new functions, taking into consideration schedule, resource constraints, process complexity, dependencies, assumptions, and application structure
- Ability to maintain the developed solution on an ongoing basis is essential
- Ability to follow the existing development methodology and coding standards, and to ensure compliance with internal and external regulatory requirements
- Develop and implement databases, data collection systems, data analytics, and other strategies that optimize statistical efficiency and quality
- Acquire data from primary or secondary data sources and maintain databases/data systems
- Work with management to prioritize business and information needs
- Locate and define new process improvement opportunities
- Document design and data flow for existing and new applications
- Coordinate with multiple teams (QA, Operations, and other development teams) within the organization
- Apply testing methods, including unit and integration testing (PyTest, PyUnit)
- Ability to integrate with large teams, demonstrating strong verbal and written communication skills
- Utilization of software configuration management tools
- Use of code deployment and code versioning tools
- Excellent communication skills

Qualifications: 
 

- Bachelor's degree, preferably with a Computer Science background
- At least 5 years of experience implementing complex ETL pipelines, preferably with the Spark toolset
- At least 5 years of experience with Python, particularly within the data space
- Technical expertise in data models, database design and development, data mining, and segmentation techniques
- Good experience writing complex SQL and ETL processes
- Excellent coding and design skills, particularly in Scala or Python
- Strong practical experience with Unix scripting in at least one of Python, Perl, or shell (bash or zsh)
- Experience with AWS technologies such as EC2, Redshift, CloudFormation, EMR, S3, and AWS analytics services is required
- Experience designing and implementing data pipelines in an on-prem/cloud environment is required
- Experience building and implementing data pipelines using Databricks, on-prem platforms, or similar cloud databases
- Expert-level knowledge of SQL for writing complex, highly optimized queries across large volumes of data
- Hands-on object-oriented programming experience using Python is required
- Professional work experience building real-time data streams using Spark
- Knowledge of or experience with architectural best practices for building data lakes
- Develop and work with APIs
- Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity
- Collaborate with analytics and business teams to improve data models that feed business intelligence tools, increase data accessibility, and foster data-driven decision-making across the organization
- Implement processes and systems to monitor data quality, ensure production data accuracy, and ensure access for key stakeholders and business processes
- Write unit/integration tests, contribute to the engineering wiki, and write documentation
- Perform the data analysis required to troubleshoot data-related issues and assist in their resolution
- Experience developing data integrations and data quality frameworks based on established requirements
- Experience with CI/CD processes and tools (e.g., Concourse, Jenkins)
- Experience with test-driven development, writing unit tests and measuring test coverage using the PyTest, PyUnit, and pytest-cov libraries
- Experience working in an Agile team environment
- Good understanding and use of algorithms and data structures
- Good experience building reusable frameworks
- AWS certification is preferred: AWS Developer, Architect, DevOps, or Big Data
- Excellent communication skills, both verbal and written

Education

ANY GRADUATE