Job ID: JOB_ID_5634
Job Overview
We have an urgent requirement for a Senior MLOps Technical Lead. The objective is to build an intelligent, data-driven platform that supports the development of next-generation test analytics and test agents. This platform will enable faster insights, improved diagnostics, and scalable infrastructure for Generative AI systems, connecting test stations, line-level data, and pipelines. You will build automated evaluation tools and conduct rigorous statistical analyses to ensure the reliability of both human and AI-based assessment systems.
The role involves benchmarking, adapting, and integrating AI/ML models into existing software systems, and independently running and analyzing ML experiments to deliver measurable improvements.
Must-Have Requirements
- Backend/Systems Experience: 3+ years building production backend or distributed systems (pre-AI experience required).
- Production AI Systems: Has shipped AI/LLM features serving real users at scale (not just prototypes or demos).
- Agentic Systems: Has built AI agents, skills, tools, or MCP (Model Context Protocol) integrations.
- Python Proficiency: Strong Python skills for backend development.
- Secondary Language: Working knowledge of Go, TypeScript, or Rust.
- Cloud Infrastructure: Deep experience with AWS/GCP/Azure, including cost optimization and compute decisions (not just deployment).
- Container & Orchestration: Hands-on with Docker and Kubernetes; capable of building, deploying, debugging, and scaling services independently.
- LLM Integration: Understands token economics, context limits, rate limiting, structured outputs, and API failure modes.
- LLM Evaluation: Understands how to evaluate LLM outputs and the inherent challenges (non-determinism, quality measurement, regression detection).
- Hands-On Engineer: Not just an architect; writes code, debugs production issues, and deploys their own work.
Preferred / Differentiators
- Built multi-step agentic workflows with tool use and function calling.
- Experience with agent orchestration frameworks (LangGraph, CrewAI, or custom).
- Built guardrails, fallbacks, or graceful degradation for AI systems.
- Streaming inference and async agent orchestration.
- Cost/latency optimization: caching, batching, prompt compression.
- ML observability tools: Langfuse, Arize, Braintrust, W&B.
- Retrieval systems (vector search, hybrid search) as a tool, not the focus.
Screening Questions for Candidates
- “Describe a production AI agent or skill system you built. What broke and how did you fix it?”
- “Have you built MCP servers/integrations or custom tool-use systems for LLMs?”
- “How do you evaluate whether an LLM-based feature is working well? What makes this hard?”
- “Walk me through how you’d deploy and scale an AI service on Kubernetes.”
Not a Fit If
- Primarily a model trainer/fine-tuner (we’re not training models).
- AI experience is mainly academic, research, or tutorial-based.
- No production systems experience (only notebooks/demos).
- Looking for an entry-level role with heavy mentorship.
- Background is primarily data science/analytics rather than engineering.
- “Architects” who don’t write or deploy code themselves.
Special Requirements
- Visa: USC/GC only
- Location: Cupertino, CA or Austin, TX (onsite mandatory)
- Screening: 4 specific questions provided (see above)
- Interview Mode: Not specified
Compensation & Location
Salary: $100 – $150 per year (Estimated)
Location: Austin, TX
Recruiter / Company – Contact Information
Recruiter / Employer: [End Client]
Email: .dixit@samsonsoft.com