Title: Data Engineer (BigQuery, Dataproc, Dataflow)
Location: Dallas, TX (Hybrid)
Hire Type: Contract
Additional Job Details :
- The ideal resource would be local to the Dallas, TX area so they can be in the office 1-2 days per week.
- Extensive experience with the BigQuery, Dataproc, and Dataflow platforms on Google Cloud Platform.
- Experience with Azure Databricks is an added advantage (not mandatory).
- Programming experience in Python, shell scripting, PySpark, and other data programming languages.
- Programming experience with the Apache Beam Java SDK for building effective, high-volume data pipelines and deploying them on GCP Dataflow.
- Experience with CI/CD processes to deploy these pipelines in GCP.
Description:
- Advanced working knowledge of SQL, experience with relational databases and query authoring, and working familiarity with a variety of databases.
- Extensive experience with the BigQuery, Dataproc, and Dataflow platforms on Google Cloud Platform. Experience with Azure Databricks is an added advantage (not mandatory).
- Experience with cluster capacity configuration and cloud optimization to meet application demand.
- Programming experience in Python, shell scripting, PySpark, and other data programming languages.
- Programming experience with the Apache Beam Java SDK for building effective, high-volume data pipelines and deploying them on GCP Dataflow, including CI/CD processes to deploy these pipelines in GCP.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with data visualization dashboards, metrics, etc.
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
- Familiarity with deployment tools such as Docker and with building CI/CD pipelines.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- 8+ years' experience in software development and data engineering.
- Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field; a postgraduate/master's degree is preferred.
- Experience in Machine Learning and Data Modeling is a plus.
What are the top 3 skills needed/required?
- Extensive experience with the BigQuery, Dataproc, and Dataflow platforms on Google Cloud Platform. Experience with Azure Databricks is an added advantage (not mandatory).
- Programming experience in Python, shell scripting, PySpark, and other data programming languages.
- Programming experience with the Apache Beam Java SDK for building effective, high-volume data pipelines and deploying them on GCP Dataflow, including CI/CD processes to deploy these pipelines in GCP.
What makes a resource profile stand out to you?
- Strong tenure with previous clients
- Solid hands-on experience on prior assignments
What will this person’s day-to-day responsibilities be?
- Handling business requirements and delivering them using Agile methodology.
How will they contribute to the project?
- Handling business requirements and delivering them using Agile methodology
If hybrid or in office role, how many days a week will the resource need to come into the office?
- 1 or 2 days per week
- Please note that resources who will be working in Bentonville, AR, Reston, VA or some Texas locations must have a VendorSAFE background check completed.
Does this contract have the opportunity to extend or convert to an FTE?
- Yes
- Along with the required skills, it would be great to receive profiles of candidates who are currently working or have only recently finished their assignment.
- We are also looking for candidates with at least 7-8 years of experience, or more, in Big Data.