Posted 2 hours ago
Job ID: JOB_ID_4427
Job Summary
We are seeking a highly experienced Data Architect with strong Databricks expertise to design, build, and govern scalable data platforms for insurance operations (policies, claims, invoices). The ideal candidate will drive end-to-end data architecture on Databricks Lakehouse, ensuring high data quality, governance, and performance while enabling advanced analytics and business insights.
Key Responsibilities
- Architect and implement scalable Databricks Lakehouse solutions for insurance data (claims, policies, invoices)
- Design and optimize data pipelines using Databricks, Delta Lake, and Spark/PySpark
- Develop and manage Databricks notebooks, workflows, and job orchestration for batch and real-time processing
- Define and enforce data governance, data quality frameworks, and KPI monitoring across datasets
- Design data models (star/snowflake), data mapping, and data lineage frameworks
- Build and maintain Delta Tables with performance optimization (partitioning, indexing, Z-ordering)
- Perform root cause analysis for data inconsistencies and ensure data integrity across systems
- Collaborate with business stakeholders, insurers, MGAs, and product teams to translate requirements into scalable solutions
- Drive performance tuning, cost optimization, and query efficiency across Databricks workloads
- Implement data ingestion frameworks using Auto Loader, Kafka, or batch ingestion pipelines
- Lead cloud data architecture initiatives across AWS / Azure / GCP
- Maintain documentation for data standards, governance policies, and architecture design
Required Skills
- Strong hands-on experience with Databricks (mandatory), including Delta Lake, Unity Catalog, Databricks SQL, and Workflows
- Expertise in Apache Spark / PySpark for large-scale data processing
- Strong proficiency in SQL and Python
- Experience in Databricks Lakehouse architecture & medallion architecture (Bronze, Silver, Gold layers)
- Strong knowledge of data modeling, ETL/ELT frameworks, and data warehousing concepts
- Experience with real-time and batch data processing
- Hands-on experience with data orchestration tools (Airflow preferred)
- Strong understanding of data governance, data quality, and lineage tools
- Experience working with cloud platforms (Azure, AWS, or GCP)
- Knowledge of performance tuning and optimization in Databricks environments
- Experience handling large-scale structured & semi-structured datasets
- Strong stakeholder communication and problem-solving skills
Special Requirements
Remote
Compensation & Location
Salary: $150,000 – $200,000 per year (Estimated)
Location: San Jose, CA
Recruiter / Company – Contact Information
Email: ana@vtekis.com