Job ID: JOB_ID_6430
About the Role:
We are seeking a highly skilled and experienced Sr. DevOps Engineer with a focus on AI platforms to join our dynamic team. The ideal candidate will be responsible for designing, implementing, and managing scalable and resilient infrastructure on AWS, with a strong emphasis on integrating and supporting AI services and workflows. This role requires a deep understanding of cloud computing, DevOps principles, CI/CD pipelines, and a passion for accelerating AI adoption.
Key Responsibilities:
- Design, implement, and manage scalable and resilient infrastructure on AWS.
- Architect and maintain Windows- and Linux-based environments, ensuring seamless integration with cloud platforms.
- Develop and maintain infrastructure-as-code (IaC) using both AWS CloudFormation/CDK and Terraform/OpenTofu.
- Develop and maintain Configuration Management for Windows & Linux servers using Chef.
- Design, build, and optimize CI/CD pipelines using GitLab CI/CD for .NET applications.
- Integrate and support AI services, including orchestration with AWS Bedrock, Google Agentspace, and other generative AI frameworks, ensuring they can be securely and efficiently consumed by platform services.
- Enable AI/ML workflows by building and optimizing infrastructure pipelines that support large-scale model training, inference, and deployment across AWS and GCP environments.
- Automate model lifecycle management (training, deployment, monitoring) through CI/CD pipelines, ensuring reproducibility and seamless integration with development workflows.
- Collaborate with AI engineering teams to deliver scalable environments, standardized APIs, and infrastructure that accelerate AI adoption at the platform level.
- Implement observability, security, data privacy, and cost-optimization strategies specifically for AI workloads, including monitoring and resource scaling for inference services.
- Implement and enforce security best practices across the infrastructure and deployment processes.
- Collaborate closely with development teams to understand their needs and provide DevOps expertise.
- Troubleshoot and resolve infrastructure and application deployment issues.
- Implement and manage monitoring and logging solutions to ensure system visibility and proactive issue detection.
- Contribute to the development of DevOps standards and best practices, and document them clearly and concisely.
- Stay up to date with the latest industry trends and technologies in cloud computing, DevOps, and security.
- Provide mentorship and guidance to junior team members.
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 5+ years of experience in a DevOps or Site Reliability Engineering (SRE) role.
- 1+ years of experience with AI services and LLMs.
- Extensive hands-on experience with Amazon Web Services (AWS).
- Solid understanding of Windows/Linux Server administration and integration with cloud environments.
- Proven experience with infrastructure-as-code tools, specifically AWS CDK and Terraform.
- Strong experience designing and implementing CI/CD pipelines using GitLab CI/CD.
- Experience deploying and managing .NET applications in cloud environments.
- Deep understanding of security best practices and their implementation in cloud infrastructure and CI/CD pipelines.
- Solid understanding of networking principles (TCP/IP, DNS, load balancing, firewalls) in cloud environments.
- Experience with monitoring and logging tools (e.g., New Relic, CloudWatch).
- Strong scripting skills (e.g., PowerShell, Python, Ruby, Bash).
- Excellent problem-solving and troubleshooting skills.
- Strong communication and collaboration skills.
- Experience with containerization technologies (e.g., Docker, Kubernetes) is a plus.
- Relevant AWS and/or GCP certifications are a plus.
- Experience with the configuration management tool Chef.
Preferred Qualifications:
- Strong understanding of PowerShell and Python scripting.
- Strong background with Amazon EC2 features and services (Auto Scaling and warm pools).
- Understanding of the Windows Server build process using tools like Chocolatey for packages and Packer for AMI/image generation.
Special Requirements:
Visa constraints: Local candidates only. Screening steps: In-person interview (final round). Interview modes: In-person.
Compensation & Location:
Salary: $120,000 – $160,000 per year (Estimated)
Location: Westlake Village, CA
Recruiter / Company – Contact Information:
Email: andu@intellecttechsolutions.com
Recruiter Notice:
To remove this job posting, please send an email from andu@intellecttechsolutions.com with the subject: DELETE_JOB_ID_6430