Posted 2 hours ago
Job ID: JOB_ID_4427
Job Summary
We are seeking a highly experienced Data Architect with strong Databricks expertise to design, build, and govern scalable data platforms for insurance operations (policies, claims, invoices). The ideal candidate will drive end-to-end data architecture on Databricks Lakehouse, ensuring high data quality, governance, and performance while enabling advanced analytics and business insights.
Key Responsibilities
- Architect and implement scalable Databricks Lakehouse solutions for insurance data (claims, policies, invoices)
- Design and optimize data pipelines using Databricks, Delta Lake, and Spark/PySpark
- Develop and manage Databricks notebooks, workflows, and job orchestration for batch and real-time processing
- Define and enforce data governance, data quality frameworks, and KPI monitoring across datasets
- Design data models (star/snowflake), data mapping, and data lineage frameworks
- Build and maintain Delta Tables with performance optimization (partitioning, indexing, Z-ordering)
- Perform root cause analysis for data inconsistencies and ensure data integrity across systems
- Collaborate with business stakeholders, insurers, MGAs, and product teams to translate requirements into scalable solutions
- Drive performance tuning, cost optimization, and query efficiency across Databricks workloads
- Implement data ingestion frameworks using Auto Loader, Kafka, or batch ingestion pipelines
- Lead cloud data architecture initiatives across AWS / Azure / GCP
- Maintain documentation for data standards, governance policies, and architecture design
Required Skills
- Strong hands-on experience with Databricks (mandatory), including Delta Lake, Unity Catalog, Databricks SQL, and Workflows
- Expertise in Apache Spark / PySpark for large-scale data processing
- Strong proficiency in SQL and Python
- Experience in Databricks Lakehouse architecture & medallion architecture (Bronze, Silver, Gold layers)
- Strong knowledge of data modeling, ETL/ELT frameworks, and data warehousing concepts
- Experience with real-time and batch data processing
- Hands-on experience with data orchestration tools (Airflow preferred)
- Strong understanding of data governance, data quality, and lineage tools
- Experience working with cloud platforms (Azure, AWS, or GCP)
- Knowledge of performance tuning and optimization in Databricks environments
- Experience handling large-scale structured & semi-structured datasets
- Strong stakeholder communication and problem-solving skills
Special Requirements
Remote
Compensation & Location
Salary: $150,000 – $200,000 per year (Estimated)
Location: San Jose, CA
Recruiter / Company – Contact Information
Email: ana@vtekis.com