Job ID: JOB_ID_4580
Key Responsibilities
Architecture & Orchestration
- Design multi-step agentic workflows with LangGraph (state machines, tools, retries, timeouts) and LangChain (chains, tools, memory).
- Build guardrails (input/output filtering, red-teaming hooks) and observability (tracing, telemetry, logging, prompt/version tracking).
RAG Pipelines
- Own ingestion pipelines: chunking, embeddings, document normalization, metadata, and vector DB indexing (e.g., Pinecone, Weaviate, Milvus, FAISS).
- Implement retrieval strategies: hybrid (BM25 + dense), multi-vector, reranking, query planning, LangGraph retrieval sub-graphs, caching.
- Build domain-specific adapters (schema, ontology alignment) and grounding with structured tools/knowledge bases.
Vertex AI & Platform Engineering
- Productionize services on Google Vertex AI (Models, Endpoints, Workbench, Pipelines, Vector Search, Feature Store).
- Containerize with Docker, orchestrate with Kubernetes/GKE, and automate with CI/CD (GitHub Actions/Cloud Build).
Full-Stack Delivery
- Build user-facing apps (React/Next.js) and backends (Python/FastAPI, Node/Express), including authentication/authorization and rate limiting.
- Develop tooling/services (e.g., document loaders, evaluators, red-teaming flows, prompt versioning, synthetic data pipelines).
Evaluation & Reliability
- Define and automate GenAI evaluation: relevance, faithfulness, hallucination rate, answer-exactness, latency, cost.
- Use techniques like RAGAS, G-Eval, rubric-based human-in-the-loop, pairwise comparisons, A/B tests, and production feedback loops.
Security, Governance & Cost
- Implement data privacy controls (PII detection, masking), policy enforcement, prompt hardening, and audit logging.
- Optimize latency and TCO (embedding/model selection, batching, caching, streaming, adaptive routing, quantization where applicable).
Mentorship & Standards
- Establish best practices for prompt patterns, orchestration, testing (unit & scenario), and model lifecycle management.
- Mentor engineers; collaborate with product/design to scope features and deliver business impact.
Required Qualifications
- 7-10+ years of software engineering experience; 3-5+ years of applied ML/GenAI experience building production systems.
- Expert in LangChain and LangGraph (tools, agents, state graphs, retries, sub-graphs, observability).
- Hands-on with Vertex AI (foundation models, Endpoints, Pipelines, Vector Search, Model Garden; IAM & service architectures).
- Strong RAG practitioner (chunking strategies, embeddings, hybrid retrieval, rerankers like Cohere Rerank or bge-reranker, evaluation).
- Deep experience with vector databases (Pinecone, Weaviate, Milvus, FAISS) and embedding models (OpenAI, Vertex, Cohere, bge-large).
- Production backends in Python (FastAPI) or Node.js, plus React/Next.js front-end experience.
- Solid cloud experience (GCP preferred; AWS/Azure a plus), Docker/Kubernetes, and CI/CD.
- Strong understanding of GenAI evaluation (RAGAS, G-Eval, rubric scoring), observability (LangSmith/LlamaIndex observability/OpenTelemetry), and prompt/version management.
- Knowledge of security & governance: PII handling, isolation, data residency, prompt injection defenses, secret management.
- Excellent communication; proven track record turning ambiguous problem statements into shipped products.
Nice to Have
- Knowledge graphs (RDF/OWL), retrieval planning, and toolformer/agent patterns.
- LLM serving and routing (DG/mixture-of-experts, function/tool calling, Guardrails, Instructor schemas, Pydantic).
- LlamaIndex experience; structured RAG (SQL/Graph RAG); function/tool calling integrations (databases, SaaS).
- On-prem/vector-optimized deployments; GPU utilization, quantization, LoRA fine-tuning.
- Experiment tracking (Weights & Biases), feature stores, offline/online evaluation pipelines.
- Enterprise integrations (SharePoint, Confluence, Salesforce) and document governance.
Keywords: continuous integration, continuous deployment, artificial intelligence, machine learning, JavaScript, database
Special Requirements
100% onsite in Charlotte, NC. Requires expertise in LangChain, LangGraph, Google Vertex AI, RAG pipelines, and vector databases. Experience with Python (FastAPI), Node.js, React/Next.js, GCP, Docker, Kubernetes, and CI/CD is essential. Knowledge of security, governance, and GenAI evaluation techniques is required.
Compensation & Location
Salary: $130,000 – $200,000 per year (Estimated)
Location: Charlotte, NC
Recruiter / Company – Contact Information
Recruiter / Employer: Vysystems
Email: etha@vysystems.com