Description

Title: Data Engineer (BigQuery, DataProc, DataFlow) 

Location: Dallas, TX (Hybrid)

Hire Type: Contract

Additional Job Details :

  • The ideal resource would be local to the Dallas, TX area so they can be in the office 1-2 days per week.
  • They would have extensive experience with the BigQuery, DataProc, and DataFlow platforms on Google Cloud Platform.
  • Experience with Azure Databricks is an added advantage (not mandatory).
  • Programming experience with Python, Shell scripting, PySpark, and other data programming languages.
  • Programming experience with the Apache Beam Java SDK for building heavy data pipelines and deploying them to GCP DataFlow.
  • CI/CD experience for deploying these pipelines in GCP.
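As a rough illustration of the last two bullets (a sketch, not part of the client's requirements), deploying a Beam Java pipeline to GCP DataFlow from CI/CD can be a single Cloud Build step; the main class and bucket below are hypothetical placeholders:

```yaml
# Sketch of a cloudbuild.yaml step that compiles a Beam Java pipeline
# and submits it to DataFlow with the DataflowRunner.
# com.example.MyPipeline and gs://my-bucket are hypothetical placeholders.
steps:
  - name: 'gcr.io/cloud-builders/mvn'
    args:
      - 'compile'
      - 'exec:java'
      - '-Dexec.mainClass=com.example.MyPipeline'
      - '-Dexec.args=--runner=DataflowRunner --project=$PROJECT_ID --region=us-central1 --tempLocation=gs://my-bucket/tmp'
```

Checked into the pipeline repository, a configuration like this lets Cloud Build (or any runner invoking the same Maven command) redeploy the pipeline on each merge.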

Description:

  • Advanced working SQL knowledge and experience with relational databases, including query authoring, as well as working familiarity with a variety of databases.
  • Extensive experience with the BigQuery, DataProc, and DataFlow platforms on Google Cloud Platform. Experience with Azure Databricks is an added advantage (not mandatory).
  • Experience with cluster capacity configuration and cloud optimization to meet application demand.
  • Programming experience with Python, Shell scripting, PySpark, and other data programming languages.
  • Programming experience with the Apache Beam Java SDK for building heavy data pipelines and deploying them to GCP DataFlow, along with the CI/CD processes to deploy these pipelines in GCP.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytical skills related to working with data visualization dashboards, metrics, etc.
  • Experience building processes supporting data transformation, data structures, metadata, dependency, and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
  • Familiarity with deployment tools such as Docker and with building CI/CD pipelines.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • 8+ years' experience in software development and data engineering.
  • Bachelor's degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field; a postgraduate/master's degree is preferred.
  • Experience in Machine Learning and Data Modeling is a plus.

What are the top 3 skills needed/required?

  • Extensive experience with the BigQuery, DataProc, and DataFlow platforms on Google Cloud Platform. Experience with Azure Databricks is an added advantage (not mandatory).
  • Programming experience with Python, Shell scripting, PySpark, and other data programming languages.
  • Programming experience with the Apache Beam Java SDK for building heavy data pipelines and deploying them to GCP DataFlow, along with the CI/CD processes to deploy these pipelines in GCP.

What makes a resource profile stand out to you?

  • Long tenure with previous clients
  • Strong hands-on experience on prior assignments

What will this person’s day-to-day responsibilities be?

  • Handling business requirements and delivering on them using Agile methodology.

How will they contribute to the project?

  • Handling business requirements and delivering on them using Agile methodology.

If hybrid or in office role, how many days a week will the resource need to come into the office?

  • 1 or 2 days per week
  • Please note that resources who will be working in Bentonville, AR, Reston, VA or some Texas locations must have a VendorSAFE background check completed.

Does this contract have the opportunity to extend or convert to an FTE?

  • Yes
  • Along with the required skills, it would be great to see profiles of candidates who are currently working or have only recently finished their assignment.
  • We are also looking for candidates with at least 7-8 years of experience or more in Big Data.

 

Education

Any Graduate