Job Code: EWC-1459
Responsibilities:
Hands-on building of ETL pipelines using our internal framework written in Java and Python
Hands-on development of real-time REST APIs and other solutions for streaming data from Graph (see the sketches following this list)
Modify existing application code or interfaces, or build new application components, from detailed requirements.
Analysis of requirements, and support of the design, development, testing, debugging, deployment, and maintenance of those programs and interfaces.
Documentation of the work is essential.
Participation in most aspects of programming and application development, including file design, update, storage, and retrieval.
Enhance processes to resolve operational problems and add new functions, taking into consideration schedule, resource constraints, process complexity, dependencies, assumptions, and application structure.
Ability to maintain the developed solution on an ongoing basis is essential.
Ability to follow the existing development methodology and coding standards, and to ensure compliance with internal and external regulatory requirements.
Develop and implement databases, data collection systems, data analytics and other strategies that optimize statistical efficiency and quality.
Acquire data from primary or secondary data sources and maintain databases/data systems.
Work with management to prioritize business and information needs
Locate and define new process improvement opportunities
Document design and data flow for existing and new applications being built.
Coordinate with multiple teams (QA, Operations, and other development teams) within the organization.
Apply testing methods, including unit and integration testing (PyTest, PyUnit); a brief illustrative sketch follows this list.
Ability to integrate with large teams, demonstrating strong verbal and written communication skills.
Utilization of software configuration management tools
Use of code deployment and code versioning tools.
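For illustration only, a real-time streaming REST endpoint of the kind described above might look like the minimal Python sketch below. It assumes Flask as the web framework and a hypothetical fetch_graph_records() generator standing in for the actual graph source; neither is specified in this posting.

    # Minimal sketch of a REST endpoint that streams records as they are
    # produced. Flask and fetch_graph_records() are assumptions, not
    # details from this posting.
    import json

    from flask import Flask, Response

    app = Flask(__name__)

    def fetch_graph_records():
        # Hypothetical generator standing in for reads from the graph source.
        for i in range(3):
            yield {"node_id": i, "label": f"node-{i}"}

    @app.route("/stream/nodes")
    def stream_nodes():
        # Stream newline-delimited JSON so clients can consume records as
        # they arrive rather than waiting for the full payload.
        def generate():
            for record in fetch_graph_records():
                yield json.dumps(record) + "\n"
        return Response(generate(), mimetype="application/x-ndjson")

    if __name__ == "__main__":
        app.run(port=8080)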
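Similarly, the unit/integration testing expectation might look like this minimal PyTest sketch; normalize_record() is a hypothetical transform used only for illustration, not part of the internal framework.

    # Minimal sketch of unit-testing an ETL transform with PyTest.
    # normalize_record() is a hypothetical stand-in; run with: pytest test_etl.py

    def normalize_record(record: dict) -> dict:
        # Trim string fields and lowercase the email field.
        out = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        if "email" in out:
            out["email"] = out["email"].lower()
        return out

    def test_normalize_record_trims_and_lowercases():
        raw = {"name": "  Ada Lovelace ", "email": "ADA@EXAMPLE.COM"}
        assert normalize_record(raw) == {"name": "Ada Lovelace", "email": "ada@example.com"}

    def test_normalize_record_leaves_non_strings_untouched():
        assert normalize_record({"age": 36})["age"] == 36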
Qualifications:
Bachelor’s degree, preferably with a Computer Science background.
5+ years of experience implementing complex ETL pipelines, preferably with the Spark toolset.
5+ years of experience with Python, particularly within the data space.
Technical expertise in data models, database design and development, data mining, and segmentation techniques.
Good experience writing complex SQL and building ETL processes.
Excellent coding and design skills, particularly in Scala or Python.
Strong practical experience with Unix scripting in at least one of Python, Perl, or shell (bash or zsh).
Experience with AWS technologies such as EC2, Redshift, CloudFormation, EMR, S3, and AWS analytics services is required.
Experience designing and implementing data pipelines in an on-prem/cloud environment is required.
Experience building and implementing data pipelines using Databricks, on-prem systems, or a similar cloud data platform.
Expert-level knowledge of SQL for writing complex, highly optimized queries across large volumes of data.
Hands-on object-oriented programming experience using Python is required.
Professional work experience building real-time data streams using Spark (a minimal PySpark sketch appears at the end of this posting).
Knowledge of, or experience with, architectural best practices for building data lakes.
Develop and work with APIs
Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity.
Collaborate with analytics and business teams to improve data models that feed business intelligence tools, increase data accessibility, and foster data-driven decision making across the organization.
Implement processes and systems to monitor data quality, ensure production data accuracy, and ensure access for key stakeholders and business processes.
Write unit/integration tests, contribute to the engineering wiki, and create documentation.
Perform the data analysis required to troubleshoot data-related issues and assist in their resolution.
Experience developing data integrations and data quality frameworks based on established requirements.
Experience with CI/CD processes and tools (e.g., Concourse, Jenkins).
Experience with test-driven development: writing unit tests and measuring test coverage using the PyTest, PyUnit, and pytest-cov libraries.
Experience working in an Agile environment.
Good understanding and usage of algorithms and data structures.
Good experience building reusable frameworks.
AWS certification is preferred: AWS Developer, Architect, DevOps, or Big Data.
Excellent communication skills, both verbal and written.
ANY GRADUATE
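For illustration only, the Spark, SQL, and AWS expectations above might combine as in the minimal PySpark batch sketch below. All paths, view names, and columns (example-bucket, orders, customer_id, and so on) are hypothetical placeholders, not details from this posting.

    # Minimal PySpark ETL sketch; paths and schema are assumptions.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("orders_daily_etl").getOrCreate()

    # Extract: read raw JSON events from S3 (placeholder path; s3:// works
    # on EMR, while open-source Spark typically uses s3a://).
    orders = spark.read.json("s3://example-bucket/raw/orders/2024-01-01/")

    # Transform: aggregate with Spark SQL over a temporary view.
    orders.createOrReplaceTempView("orders")
    daily_totals = spark.sql("""
        SELECT customer_id,
               DATE(order_ts) AS order_date,
               SUM(amount)    AS total_amount,
               COUNT(*)       AS order_count
        FROM orders
        WHERE amount > 0
        GROUP BY customer_id, DATE(order_ts)
    """)

    # Load: write partitioned Parquet for downstream consumers.
    (daily_totals.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/daily_order_totals/"))

    spark.stop()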