Job ID: JOB_ID_4563
Job Overview
We are seeking a highly skilled and experienced GCP Data Engineer with deep, hands-on architectural and development expertise in the Google Cloud Platform’s big data ecosystem. The ideal candidate will be responsible for designing, building, and optimizing a modern data lakehouse architecture. A primary focus will be on leveraging BigLake, BigQuery, Google Cloud Storage (GCS), and Vertex AI to create seamless, scalable data pipelines and machine learning integrations that drive business intelligence and predictive analytics.
Key Responsibilities
- Lakehouse Architecture & Development: Architect and maintain a scalable data lakehouse using Google Cloud Storage (GCS) as the foundational data lake and BigLake to unify data warehouses and data lakes. Implement fine-grained security (row-level and column-level access controls) and data governance across open file and table formats (Parquet, ORC, Apache Iceberg) using BigLake.
- Data Warehousing & Optimization: Design and manage complex, highly scalable data models within BigQuery. Perform deep performance tuning and cost optimization of BigQuery jobs utilizing clustering, partitioning, materialized views, and slot capacity management.
- AI/ML Integration & MLOps: Collaborate with Data Scientists to operationalize machine learning models using Vertex AI. Build robust data pipelines to feed Vertex AI Feature Store, manage model training workflows, and deploy ML models into production. Utilize BigQuery ML (BQML) for in-database predictive modeling and analytics where appropriate.
- Data Pipeline Engineering: Design, develop, and orchestrate batch and streaming data pipelines (using tools like Dataflow, Dataproc, or Cloud Composer/Airflow) to ingest data from diverse sources into GCS and BigQuery.
- Data Governance & Best Practices: Establish data lifecycle management policies in GCS. Ensure data quality, reliability, and security compliance across the entire GCP big data stack. Mentor junior engineers and lead code/architecture reviews.
Required Qualifications
- Experience: 5+ years of dedicated Data Engineering experience, including at least 3 years focused exclusively on the Google Cloud Platform (GCP).
- Deep GCP Big Data Expertise:
  - BigQuery: Expert-level knowledge of BigQuery architecture, advanced SQL, analytical functions, query profiling, and optimization techniques.
  - BigLake: Proven experience utilizing BigLake for multi-cloud or lakehouse architectures, managing open-source formats (e.g., Apache Iceberg/Parquet), and enforcing unified security policies.
  - GCS: Deep understanding of GCS storage classes, object lifecycle management, and optimizing GCS for big data workloads.
  - Vertex AI: Hands-on experience with Vertex AI pipelines, endpoints, feature stores, or deploying ML models into scalable data environments.
- Programming Skills: Advanced proficiency in Python and SQL. Familiarity with Java, Scala, or Go is a plus.
- Data Orchestration & CI/CD: Experience with orchestration tools (e.g., Apache Airflow, Cloud Composer), modern CI/CD pipelines (e.g., GitHub Actions, Cloud Build), and infrastructure as code (e.g., Terraform).
Keywords
continuous integration, continuous deployment, artificial intelligence, machine learning, golang
Compensation & Location
Salary: $60 – $85 per year (Estimated)
Location: Irving, TX
Recruiter / Company – Contact Information
Recruiter / Employer: Veridian Tech Solutions, Inc.
Email: shant@veridiants.com
Recruiter Notice:
To remove this job posting, please send an email from shant@veridiants.com with the subject: DELETE_JOB_ID_4563