Posted 3 hours ago

Job ID: JOB_ID_3843

Role Overview:

Define AI/ML reference architecture and solution blueprints (batch/streaming ML, LLM+RAG, multimodal). Lead end-to-end solution design: data ingestion, feature stores, model training, inference, monitoring. Architect LLM applications (chatbots, copilots, agents, summarization, classification) with RAG, evaluation, safety, and guardrails. Own MLOps/LLMOps: CI/CD for models, model registry, feature store, lineage, observability, drift and cost monitoring.

Key Responsibilities:

  • Choose the right cloud and runtime (managed services vs. self-hosted; GPU/CPU; serverless vs. containerized).
  • Establish security, compliance, and governance (PII handling, encryption, auditability, Responsible AI).
  • Collaborate with product and business stakeholders to translate requirements into architectural decisions and delivery plans.
  • Perform technical spikes/POCs, benchmark models and infrastructure, and lead architecture reviews.
  • Create and maintain standards, patterns, and reusable components; mentor engineers across teams.
  • Drive performance & cost optimization (throughput/latency/SLA/SLO; caching; quantization/distillation; autoscaling).
  • Support vendor/product evaluations (cloud AI services, vector DBs, orchestration frameworks, monitoring).

Required Qualifications:

  • Bachelor's/Master's degree in Computer Science, Engineering, Data/AI, or a related field.
  • 15+ years of overall engineering experience, including 4+ years in AI/ML solution architecture.
  • Proven experience designing and deploying AI systems in production at scale (LLM and/or classical ML).
  • Strong hands-on proficiency in Python and cloud-native architectures (AWS/Azure/GCP).

Must-Have Technical Skills:

AI/ML & LLM Architecture:

  • Designing LLM/RAG systems: retrieval pipelines, chunking strategies, embeddings, reranking, prompt/response orchestration, evaluation and safety.
  • Model lifecycle: fine-tuning, PEFT/LoRA, quantization/distillation, latency and cost management.
  • Classical ML/NLP: feature engineering, model selection, training, cross-validation, metrics, A/B testing.

MLOps / LLMOps:

  • CI/CD for ML (model/version promotion), feature stores, model registry, lineage and drift detection.
  • Inference stacks: PyTorch/TensorFlow, vLLM/TGI/ONNX, GPU orchestration, autoscaling, APM.
  • Pipelines & orchestration: Airflow, Kubeflow, MLflow, SageMaker, Vertex AI, Azure ML.

Special Requirements

  • Visa constraints: None specified.
  • Screening steps: None specified.
  • Interview modes: Hybrid, with 3 days onsite per week.
  • Domain restrictions: None specified.


Compensation & Location

Salary: $65 – $70 per year

Location: Richardson, TX


Recruiter / Company – Contact Information

Recruiter / Employer: AES Inc.

Email: _khan@aesinc.us.com

