Posted 2 hours ago
Job ID: JOB_ID_5108
Role: Senior Systems Operations Engineer (SRE)
We are looking for a Senior Systems Operations Engineer (SRE) with a strong background in application production support, automation, and configuration management. The ideal candidate will have experience in high-availability environments, incident response, and problem management with a focus on root cause analysis.
Key Responsibilities:
- Provide application production support in complex, high-availability environments, including incident response and problem management with strong root cause discipline.
- Build and maintain automation and configuration management (Ansible preferred, or similar), applying strong scripting skills (Python, Bash, PowerShell, or similar).
- Administer Linux (RHEL preferred) and/or Windows Server systems supporting enterprise production workloads.
- Apply Git-based version control practices, including pull requests and peer review, with a focus on repeatability and code quality.
- Use monitoring and observability tools such as Splunk, AppDynamics, and ThousandEyes (alerts and dashboards).
- Develop and/or support web applications (preferably Java).
- Work with infrastructure-as-code concepts, including modular design and environment consistency.
- Support hybrid/private cloud platforms and container-adjacent hosting models, with familiarity with OpenShift (OCP) or Kubernetes-based platforms.
- Implement SRE operating practices (reliability metrics, reduction of manual toil, continuous improvement via post-incident learnings).
- Support common middleware platforms and shared services, and build automation patterns that standardize operations and reduce manual intervention.
- Utilize enterprise observability and operational practices (service health dashboards, alert engineering, actionable telemetry).
- Explore responsible AI usage in operations (security, validation, accuracy and appropriate guardrails for automation/agents).
- Communicate effectively across functions and operate within regulated environments.
Required Qualifications:
- 4+ years of application production support in complex, high-availability environments.
- 4+ years of hands-on automation and configuration management experience (Ansible preferred or similar).
- Strong scripting skills (Python, Bash, PowerShell, or similar).
- 4+ years of Linux administration (RHEL preferred) and/or Windows Server administration.
- 4+ years of Git-based version control practices.
- 4+ years of experience with monitoring and observability tools (Splunk, AppDynamics, ThousandEyes).
- 4+ years of experience developing and/or supporting web applications (preferably Java).
- Working experience with infrastructure-as-code concepts.
- Experience supporting hybrid/private cloud platforms and container-adjacent hosting models.
- Experience implementing SRE operating practices.
- Familiarity with enterprise observability and operational practices.
- Exposure to responsible AI usage in operations.
- Strong cross-functional communication skills.
- Experience operating in regulated environments.
Compensation & Location
Salary: $110,000 – $160,000 per year (Estimated)
Location: Charlotte, NC
Recruiter / Company – Contact Information
Email: rma.diksha@testingxperts.com
Recruiter Notice:
To remove this job posting, please send an email from rma.diksha@testingxperts.com with the subject: DELETE_JOB_ID_5108