
Job ID: JOB_ID_8162

Job Overview:

We are seeking a highly experienced and motivated AI/ML Architect with deep hands-on expertise in Databricks on AWS. This role is crucial for leading the design and implementation of scalable, high-performance data and machine learning platforms. The ideal candidate will possess strong architectural thinking, excellent engineering execution skills, and the ability to build modern lakehouse systems, optimize large-scale pipelines, and drive analytical and ML capabilities across the organization. This position involves working with large, multi-terabyte datasets, advanced analytics, and end-to-end ML lifecycle management using Databricks, Python, PySpark, and AWS-native services.

Key Responsibilities:

AI/ML & Advanced Analytics:
- Develop, train, and optimize ML models using Python, PySpark, MLflow, and Databricks Machine Learning.
- Conduct exploratory data analysis (EDA) to identify patterns, trends, and insights in large datasets.
- Deploy ML models into production using MLflow, Databricks Workflows, or other MLOps pipelines.
- Build analytics solutions such as forecasting, anomaly detection, segmentation, or recommendation systems.
- Design ML architectures aligned with Databricks Lakehouse on AWS.
Data Engineering & Lakehouse Architecture:
- Architect and build scalable ETL/ELT pipelines using PySpark, SQL, and Databricks Workflows.
- Implement Delta Lake best practices, including OPTIMIZE, ZORDER, partitioning, and schema evolution.
- Design lakehouse layers (Bronze/Silver/Gold) with strong separation of compute and serving layers.
- Optimize cluster performance and jobs using Spark tuning, caching, and shuffle minimization.
- Work with multi-terabyte, time-series, high-velocity data in a distributed environment.
- Ensure robust data availability for downstream ML and analytics workloads.
AWS Cloud Integration:
- Architect end-to-end data and ML solutions using AWS services, including S3 for storage, IAM for identity & access, Glue Catalog for metadata management, and networking for secure, high-throughput data movement.
- Integrate Databricks with AWS-native compute, API layers, and low-latency endpoints.
Business Collaboration & Leadership:
- Translate business problems into scalable analytical or ML architectures.
- Communicate complex statistical and architectural concepts to non-technical stakeholders.
- Collaborate with product, engineering, and business leaders to drive data-informed initiatives.
- Provide design leadership while remaining hands-on in execution.

Must Demonstrate (Critical Competencies):

- Designing Databricks-based lakehouse architectures on AWS (Delta Lake + S3 + Unity Catalog).
- Clear separation of compute vs. serving layers in distributed architectures.
- Low-latency API strategy where Spark is insufficient (e.g., leveraging optimized services or caching).
- Caching strategies to accelerate reads and reduce compute cost.
- Data partitioning, file size tuning, and optimization strategies for large-scale pipelines.
- Experience handling multi-terabyte structured time-series workloads.
- Ability to identify what is architecturally significant in ambiguous business requirements.
- Strong curiosity and a questioning, requirement-probing mindset.
- Player-coach approach: hands-on technical depth plus the ability to guide design.

Skills & Qualifications:

Required:
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, Statistics, or a related field.
- 10+ years of experience in data engineering, ML engineering, or AI/ML architecture roles.
- Deep expertise in Databricks on AWS, including PySpark/Spark SQL, Databricks Notebooks, Delta Lake, Unity Catalog, MLflow, and Databricks Jobs & Workflows.
- Strong programming ability in Python (pandas, numpy, scikit-learn).
- Demonstrated experience with large-scale, multi-terabyte data processing.
- Strong understanding of ML algorithms, distributed systems, and data optimization.
Preferred:
- Experience with MLOps and production deployment pipelines.
- Strong grasp of AWS-native data and compute services.
- Understanding of CI/CD using GitHub Actions, GitLab CI, or similar.
- Familiarity with deep learning frameworks (TensorFlow, PyTorch).

Key Competencies:

- Strong analytical and problem-solving skills.
- Ability to work in fast-paced, highly collaborative environments.
- Excellent communication and presentation abilities.
- Self-driven with exceptional attention to architectural detail.

Special Requirements

Onsite

Compensation & Location

Salary: $150,000 – $200,000 per year (Estimated)

Location: Los Angeles, CA

Recruiter / Company – Contact Information

Recruiter / Employer: LiveMindz

Email: veenj@livemindz.com

Interested in this position? Apply by email to veenj@livemindz.com.
