Job ID: JOB_ID_3184630
About the Role:
Arkhya Tech Inc is seeking a highly skilled and experienced AI Platform Engineer to join our team. This critical role involves designing and building the foundational components that power enterprise-scale Generative AI (GenAI) applications. You will develop and implement data guardrails, model safety tooling, observability pipelines, evaluation harnesses, and standardized logging/monitoring frameworks, enabling safe, reliable, and compliant AI development across many use cases, teams, and business units. The primary objective is to create common platform services that AI teams can build on.
Key Responsibilities:
1. Guardrails, Safety & Governance
- Design and implement robust data guardrail frameworks, including pre-processing, redaction, PII/PHI filtering, Data Loss Prevention (DLP) integration, and prompt defenses.
- Build comprehensive “Model Armor” components, encompassing input validation & sanitization, prompt-injection defenses, harmful content detection & policy enforcement, and output filtering, fact-checking, and grounding checks.
- Integrate essential safety tooling, such as policy engines, classifiers, and DLP APIs/safety models.
- Collaborate closely with Security, Compliance, and Data Privacy teams to ensure all developed frameworks meet stringent enterprise governance requirements.
2. Observability Frameworks
- Develop and maintain advanced observability pipelines utilizing tools like Arize AI, focusing on tracing, quality metrics, dataset drift/hallucination tracking, and embedding monitoring.
- Define and enforce platform-wide standards for LLM call tracing, token usage and cost monitoring, latency and reliability metrics, and prompt/model version tracking.
- Provide reusable SDKs or middleware to facilitate seamless adoption of observability features by engineering teams.
3. Logging, Monitoring & Telemetry
- Design standardized LLM-specific logging schemas, capturing critical information such as inputs/outputs, model metadata, retrieval metadata, safety flags, and user context/attribution.
- Build comprehensive monitoring dashboards to track performance, cost, anomalies, errors, and safety events.
- Implement effective alerting mechanisms and Service Level Objectives (SLOs)/Service Level Indicators (SLIs) for LLM inference systems.
4. Evaluation Infrastructure
- Architect and maintain sophisticated evaluation harnesses for GenAI systems, covering RAG evaluation (faithfulness, relevance, hallucination risk), summarization/QA evaluation, and human-in-the-loop review workflows.
- Integrate automated evaluation pipelines into CI/CD processes.
- Support various evaluation frameworks including RAGAS, G-Eval, rubric scoring, pairwise comparisons, and test case generation.
- Develop reusable tooling to empower teams in writing, running, and tracking model evaluations efficiently.
5. Platform Engineering & Reusable Components
- Develop shared libraries, APIs, and services for prompt management/versioning, embedding pipelines and model wrappers, retrieval adapters, common data loaders and document preprocessing, and tool/function schemas.
- Drive consistency across teams by establishing standards, reference architectures, and best practices.
- Review system designs across various use cases to ensure alignment with platform patterns.
6. Collaboration & Enablement
- Partner with AI engineers, product teams, and data scientists to identify cross-cutting needs and translate them into reusable platform features.
- Create comprehensive documentation, onboarding guides, examples, and developer tooling.
- Conduct internal training sessions (brown bags, workshops) on guardrails, observability, and evaluation frameworks.
Required Qualifications:
- 5–10+ years of software engineering or ML infrastructure experience.
- Strong Python engineering fundamentals (FastAPI, async, typing/Pydantic, testing).
- Experience with model safety/guardrails approaches (prompt injection defense, PII redaction, toxicity filters, policy enforcement).
- Hands-on experience with LLM observability platforms such as Arize AI, LangSmith, or similar.
- Experience creating evaluation frameworks using RAGAS, G-Eval, or custom rubric systems.
- Strong familiarity with vector databases (Pinecone, Weaviate, Milvus), embeddings, and retrieval pipelines.
- Solid understanding of LLM architectures, tokenization, embeddings, context limits, and RAG patterns.
- Experience in cloud environments (GCP preferred), Kubernetes/GKE, containers, and CI/CD.
- Strong understanding of security, governance, DLP, data privacy, RBAC, and enterprise compliance requirements.
Nice to Have:
- Experience with LangChain/LangGraph or LlamaIndex orchestrations.
- Experience with LLM security tooling like Guardrails.ai, Rebuff, Protect AI, or similar.
- Experience with GCP Vertex AI pipelines, Model Monitoring, and Vector Search.
- Familiarity with knowledge graphs, grounding models, and fact-checking models.
- Experience building SDKs or developer frameworks adopted across multiple teams.
- On-prem or hybrid AI deployment experience.
Soft Skills:
- Strong documentation and communication skills.
- Ability to influence engineering teams and standardize best practices.
- Comfortable working across multiple stakeholders including platform, security, ML engineering, and product teams.
Special Requirements:
100% onsite
Compensation & Location:
Salary: $140,000 – $180,000 per year
Location: Charlotte, NC
Recruiter / Company Contact Information:
Recruiter / Employer: Arkhya Tech Inc
Email: naveen@arkhyatech.com