Posted 4 hours ago

Job ID: JOB_ID_4580

Key Responsibilities

Architecture & Orchestration

  • Design multi-step agentic workflows with LangGraph (state machines, tools, retries, timeouts) and LangChain (chains, tools, memory).
  • Build guardrails (input/output filtering, red-teaming hooks) and observability (tracing, telemetry, logging, prompt/version tracking).

RAG Pipelines

  • Own ingestion pipelines: chunking, embeddings, document normalization, metadata, and vector DB indexing (e.g., Pinecone, Weaviate, Milvus, FAISS).
  • Implement retrieval strategies: hybrid (BM25 + dense), multi-vector, reranking, query planning, LangGraph retrieval sub-graphs, caching.
  • Build domain-specific adapters (schema, ontology alignment) and grounding with structured tools/knowledge bases.

Vertex AI & Platform Engineering

  • Productionize services on Google Vertex AI (Models, Endpoints, Workbench, Pipelines, Vector Search, Feature Store).
  • Containerize with Docker, orchestrate with Kubernetes/GKE, and automate with CI/CD (GitHub Actions/Cloud Build).

Full-Stack Delivery

  • Build user-facing apps (React/Next.js) and backends (Python/FastAPI, Node/Express), including authentication/authorization and rate limiting.
  • Develop tooling/services (e.g., document loaders, evaluators, red-teaming flows, prompt versioning, synthetic data pipelines).

Evaluation & Reliability

  • Define and automate GenAI evaluation: relevance, faithfulness, hallucination rate, answer-exactness, latency, cost.
  • Use techniques like RAGAS, G-Eval, rubric-based human-in-the-loop, pairwise comparisons, A/B tests, and production feedback loops.

Security, Governance & Cost

  • Implement data privacy controls (PII detection, masking), policy enforcement, prompt hardening, and audit logging.
  • Optimize latency and TCO (embedding/model selection, batching, caching, streaming, adaptive routing, quantization where applicable).

Mentorship & Standards

  • Establish best practices for prompt patterns, orchestration, testing (unit & scenario), and model lifecycle management.
  • Mentor engineers; collaborate with product/design to scope features and deliver business impact.

Required Qualifications

  • 7-10+ years of software engineering experience; 3-5+ years of applied ML/GenAI experience building production systems.
  • Expert with LangChain and LangGraph (tools, agents, state graphs, retries, sub-graphs, observability).
  • Hands-on with Vertex AI (foundation models, Endpoints, Pipelines, Vector Search, Model Garden; IAM & service architectures).
  • Strong RAG practitioner (chunking strategies, embeddings, hybrid retrieval, rerankers like Cohere Rerank or bge-reranker, evaluation).
  • Deep experience with vector databases (Pinecone, Weaviate, Milvus, FAISS) and embedding models (OpenAI, Vertex, Cohere, bge-large).
  • Production backends in Python (FastAPI) or Node.js, plus React/Next.js front-end experience.
  • Solid cloud experience (GCP preferred; AWS/Azure a plus), Docker/Kubernetes, and CI/CD.
  • Strong understanding of GenAI evaluation (RAGAS, G-Eval, rubric scoring), observability (LangSmith, LlamaIndex observability, OpenTelemetry), and prompt/version management.
  • Knowledge of security & governance: PII handling, isolation, data residency, prompt injection defenses, secret management.
  • Excellent communication; proven track record turning ambiguous problem statements into shipped products.

Nice to Have

  • Knowledge graphs (RDF/OWL), retrieval planning, and Toolformer/agent patterns.
  • LLM serving and routing (DG/mixture-of-experts, function/tool calling, Guardrails, Instructor schemas, Pydantic).
  • LlamaIndex experience; structured RAG (SQL/Graph RAG); function/tool calling integrations (databases, SaaS).
  • On-prem/vector-optimized deployments; GPU utilization, quantization, LoRA fine-tuning.
  • Experiment tracking (Weights & Biases), feature stores, offline/online evaluation pipelines.
  • Enterprise integrations (SharePoint, Confluence, Salesforce) and document governance.

Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning, JavaScript, database


Special Requirements

100% onsite in Charlotte, NC. Requires expertise in LangChain, LangGraph, Google Vertex AI, RAG pipelines, and vector databases. Experience with Python (FastAPI), Node.js, React/Next.js, GCP, Docker, Kubernetes, and CI/CD is essential. Knowledge of security, governance, and GenAI evaluation techniques is required.


Compensation & Location

Salary: $130,000 – $200,000 per year (Estimated)

Location: Charlotte, NC


Recruiter / Company – Contact Information

Recruiter / Employer: Vysystems

Email: etha@vysystems.com

