Job ID: JOB_ID_3184630
About the Role:
Arkhya Tech Inc is seeking a highly skilled and experienced AI Platform Engineer to join our team. This critical role involves designing and building the foundational components that power enterprise-scale Generative AI (GenAI) applications. You will develop and implement data guardrails, model safety tooling, observability pipelines, evaluation harnesses, and standardized logging/monitoring frameworks, enabling safe, reliable, and compliant AI development across many use cases, teams, and business units. The primary objective is to create common platform services that AI teams can build on.
Key Responsibilities:
1. Guardrails, Safety & Governance
- Design and implement robust data guardrail frameworks, including pre-processing, redaction, PII/PHI filtering, Data Loss Prevention (DLP) integration, and prompt defenses.
- Build comprehensive “Model Armor” components, encompassing input validation & sanitization, prompt-injection defenses, harmful content detection & policy enforcement, and output filtering, fact-checking, and grounding checks.
- Integrate essential safety tooling, such as policy engines, classifiers, and DLP APIs/safety models.
- Collaborate closely with Security, Compliance, and Data Privacy teams to ensure all developed frameworks meet stringent enterprise governance requirements.
2. Observability Frameworks
- Develop and maintain advanced observability pipelines utilizing tools like Arize AI, focusing on tracing, quality metrics, dataset drift/hallucination tracking, and embedding monitoring.
- Define and enforce platform-wide standards for LLM call tracing, token usage and cost monitoring, latency and reliability metrics, and prompt/model version tracking.
- Provide reusable SDKs or middleware to facilitate seamless adoption of observability features by engineering teams.
3. Logging, Monitoring & Telemetry
- Design standardized LLM-specific logging schemas, capturing critical information such as inputs/outputs, model metadata, retrieval metadata, safety flags, and user context/attribution.
- Build comprehensive monitoring dashboards to track performance, cost, anomalies, errors, and safety events.
- Implement effective alerting mechanisms and Service Level Objectives (SLOs)/Service Level Indicators (SLIs) for LLM inference systems.
4. Evaluation Infrastructure
- Architect and maintain sophisticated evaluation harnesses for GenAI systems, covering RAG evaluation (faithfulness, relevance, hallucination risk), summarization/QA evaluation, and human-in-the-loop review workflows.
- Integrate automated evaluation pipelines into CI/CD processes.
- Support various evaluation frameworks including RAGAS, G-Eval, rubric scoring, pairwise comparisons, and test case generation.
- Develop reusable tooling to empower teams in writing, running, and tracking model evaluations efficiently.
5. Platform Engineering & Reusable Components
- Develop shared libraries, APIs, and services for prompt management/versioning, embedding pipelines and model wrappers, retrieval adapters, common data loaders and document preprocessing, and tool/function schemas.
- Drive consistency across teams by establishing standards, reference architectures, and best practices.
- Review system designs across various use cases to ensure alignment with platform patterns.
6. Collaboration & Enablement
- Partner with AI engineers, product teams, and data scientists to identify cross-cutting needs and translate them into reusable platform features.
- Create comprehensive documentation, onboarding guides, examples, and developer tooling.
- Conduct internal training sessions (brown bags, workshops) on guardrails, observability, and evaluation frameworks.
Required Qualifications:
- 5–10+ years of software engineering or ML infrastructure experience.
- Strong Python engineering fundamentals (FastAPI, async, typing/Pydantic, testing).
- Experience with model safety/guardrails approaches (prompt injection defense, PII redaction, toxicity filters, policy enforcement).
- Hands-on experience with LLM observability platforms such as Arize AI, LangSmith, or similar.
- Experience creating evaluation frameworks using RAGAS, G-Eval, or custom rubric systems.
- Strong familiarity with vector databases (Pinecone, Weaviate, Milvus), embeddings, and retrieval pipelines.
- Solid understanding of LLM architectures, tokenization, embeddings, context limits, and RAG patterns.
- Experience in cloud environments (GCP preferred), Kubernetes/GKE, containers, and CI/CD.
- Strong understanding of security, governance, DLP, data privacy, RBAC, and enterprise compliance requirements.
Nice to Have:
- Experience with LangChain/LangGraph or LlamaIndex orchestrations.
- Experience with LLM security tooling like Guardrails.ai, Rebuff, Protect AI, or similar.
- Experience with GCP Vertex AI pipelines, Model Monitoring, and Vector Search.
- Familiarity with knowledge graphs, grounding models, and fact-checking models.
- Experience building SDKs or developer frameworks adopted across multiple teams.
- On-prem or hybrid AI deployment experience.
Soft Skills:
- Strong documentation and communication skills.
- Ability to influence engineering teams and standardize best practices.
- Comfortable working across multiple stakeholders including platform, security, ML engineering, and product teams.
Special Requirements:
100% onsite
Compensation & Location:
Salary: $140,000 – $180,000 per year
Location: Charlotte, NC
Recruiter / Company Contact Information:
Recruiter / Employer: Arkhya Tech Inc
Email: naveen@arkhyatech.com