Job ID: JOB_ID_4868
Role Overview
The Data Engineer will play a critical role in building scalable, reliable data pipelines that support real-time and batch processing workflows. You will work closely with cross-functional teams to integrate multiple data sources, build Operational Data Stores and transformations, and make data available in a timely manner for reporting and analytics through dashboards.
Key Responsibilities
Data Ingestion & Integration
- Develop and maintain data ingestion pipelines for service and repair data using Confluent Kafka for event streaming (an illustrative producer sketch follows this list).
- Implement connectors and integrations between Kafka, AWS S3, Google Dataflow, and Snowflake to facilitate batch and real-time data flows.
- Work with APIs and Apigee to securely ingest and distribute data across internal and external systems, including dealer networks.
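For a sense of the day-to-day ingestion work, here is a minimal sketch of a Confluent Kafka producer in Python using the confluent-kafka client. The broker address, topic name, and payload fields are hypothetical placeholders, not details of the actual pipeline.

```python
# Minimal sketch: producing a service/repair event to Kafka.
# Broker address, topic name, and payload fields are hypothetical.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker:9092"})  # assumed broker

def delivery_report(err, msg):
    # Surface delivery failures so pipeline monitoring can pick them up.
    if err is not None:
        print(f"Delivery failed for key {msg.key()}: {err}")

event = {"repair_order_id": "RO-12345", "dealer_id": "D-001", "status": "CLOSED"}
producer.produce(
    "service-repair-events",            # hypothetical topic name
    key=event["repair_order_id"],
    value=json.dumps(event),
    callback=delivery_report,
)
producer.flush()  # block until queued messages are delivered
```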
Data Cleansing & Transformation
- Build and optimize data cleansing, normalization, and transformation pipelines in Google Dataflow for real-time processing (an illustrative pipeline sketch follows this list).
- Design and implement batch transformation jobs within Snowflake, building and maintaining the Operational Data Store (ODS).
- Ensure data quality, consistency, and integrity across all processing stages.
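To illustrate the real-time cleansing work, below is a minimal Apache Beam pipeline of the kind that runs on Google Dataflow, assuming JSON events arrive on a Pub/Sub topic. The topic paths, cleansing rules, and required key field are assumptions for illustration only.

```python
# Minimal sketch: cleanse and normalize streaming JSON records with Apache Beam.
# Topic paths and the required key field are hypothetical.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def cleanse(record):
    # Normalize keys to lowercase and strip whitespace from string values.
    return {k.lower().strip(): (v.strip() if isinstance(v, str) else v)
            for k, v in record.items()}

# streaming=True because Pub/Sub is an unbounded source; add DataflowRunner
# options (project, region, temp_location) to execute on Google Dataflow.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(topic="projects/demo/topics/repairs")
        | "Parse" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
        | "Cleanse" >> beam.Map(cleanse)
        | "DropIncomplete" >> beam.Filter(lambda r: r.get("repair_order_id"))
        | "Serialize" >> beam.Map(lambda r: json.dumps(r).encode("utf-8"))
        | "Write" >> beam.io.WriteToPubSub(topic="projects/demo/topics/repairs-clean")
    )
```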
Data Publishing & Reporting Support
- Publish transformed and aggregated data to internal and external dashboards using APIs, Kafka topics, and Tableau (an illustrative publishing sketch follows this list).
- Collaborate with data analysts and business stakeholders to support reporting and analytics requirements.
- Monitor and troubleshoot data pipelines to ensure high availability and performance.
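As one illustration of the publishing step, the sketch below reads a pre-aggregated Snowflake view and pushes each row onto a Kafka topic that a dashboard could consume. The connection parameters, view name, and topic are all assumed for illustration.

```python
# Minimal sketch: publish Snowflake aggregates to a Kafka topic for dashboards.
# Credentials, warehouse, view, and topic names are hypothetical.
import json
import snowflake.connector
from confluent_kafka import Producer

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="***",   # assumed credentials
    warehouse="REPORTING_WH", database="ODS", schema="PUBLIC",
)
producer = Producer({"bootstrap.servers": "broker:9092"})

with conn.cursor(snowflake.connector.DictCursor) as cur:
    cur.execute(
        "SELECT dealer_id, repair_count, avg_turnaround_days "
        "FROM daily_repair_metrics"                           # hypothetical view
    )
    for row in cur:
        # default=str handles Decimal/date values returned by Snowflake.
        producer.produce("dashboard-repair-metrics", value=json.dumps(row, default=str))

producer.flush()
conn.close()
```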
Collaboration & Documentation
- Partner with data architects, analysts, and external dealer teams to understand data requirements and source systems.
- Document data workflows, processing logic, and integration specifications.
- Adhere to best practices in data security, governance, and compliance.
Required Technologies & Skills
- Event Streaming: Confluent Kafka (proficiency), Kafka Connect connectors
- API Management: Apigee (proficiency)
- Cloud Storage & Data Warehousing: AWS S3, Snowflake
- Data Processing: Google Dataflow
- Programming: SQL, Python (proficiency)
- Batch & Real-Time Pipeline Development
- Data Visualization Support: Tableau (basic understanding for data publishing)
- Experience building Operational Data Stores (ODS) and data transformation pipelines in Snowflake
- Familiarity with truck industry aftersales or automotive service and repair data is a plus
Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
- 3+ years of proven experience in data engineering, especially with streaming and batch data pipelines.
- Hands-on experience with Kafka ecosystem (Confluent Kafka, Connectors) and cloud data platforms (Snowflake, AWS).
- Skilled in Python programming for data processing and automation.
- Experience with Google Cloud Platform services, especially Google Dataflow, is highly desirable.
- Strong understanding of data modeling, ETL/ELT processes, and data quality principles.
- Ability to work collaboratively in cross-functional teams and communicate technical concepts effectively.
Special Requirements
Local candidates in the WA area only. This is an onsite, long-term contract role.
Compensation & Location
Salary: $90,000 – $130,000 per year
Location: Renton, WA
Recruiter / Company – Contact Information
Email: ad@arborteksys.com