Job ID: JOB_ID_5187
About the Role: Senior Systems Operations Engineer (SRE)
We are seeking a highly experienced Senior Systems Operations Engineer (SRE) with a strong background in application production support, automation, and infrastructure management. This role is crucial for maintaining and improving the reliability, availability, and performance of our complex, high-availability production environments.
Key Responsibilities:
- Provide application production support in complex, high-availability environments, including incident response and problem management with strong root cause discipline (4+ years of experience).
- Apply hands-on automation and configuration management experience (4+ years; Ansible preferred, or similar), coupled with strong scripting skills (Python, Bash, PowerShell, or similar).
- Administer Linux (RHEL preferred) and/or Windows Server systems supporting enterprise production workloads (4+ years of experience).
- Apply Git-based version control practices, including pull requests and peer review, with a focus on repeatability and code quality (4+ years of experience).
- Use monitoring and observability tools such as Splunk, AppDynamics, and ThousandEyes to build alerts and dashboards (4+ years of experience).
- Develop and/or support web applications, preferably Java (4+ years of experience).
- Work with infrastructure-as-code concepts, including modular design and environment consistency.
- Support hybrid/private cloud platforms and container-adjacent hosting models; familiarity with OpenShift (OCP) or Kubernetes-based platforms.
- Implement SRE operating practices, focusing on reliability metrics, reduction of manual toil, and continuous improvement via post-incident learnings.
- Support common middleware platforms and shared services, building automation patterns that standardize operations and reduce manual intervention.
- Utilize enterprise observability and operational practices, including service health dashboards, alert engineering, and actionable telemetry.
- Gain exposure to responsible AI usage in operations, covering security, validation, accuracy, and appropriate guardrails for automation/agents.
- Communicate effectively across cross-functional teams, with experience operating in regulated environments.
Requirements:
- 10+ years of overall experience in Systems Operations and Engineering.
- Strong understanding of SRE principles and practices.
- Proficiency in Linux (RHEL preferred) and/or Windows Server administration.
- Expertise in scripting languages such as Python, Bash, or PowerShell.
- Experience with automation tools like Ansible.
- Familiarity with Git and CI/CD workflows.
- Experience with monitoring and observability tools (Splunk, AppDynamics, ThousandEyes).
- Knowledge of cloud platforms (hybrid/private) and containerization (OpenShift, Kubernetes).
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
Work Environment:
- Location: Charlotte, NC
- Work Mode: Hybrid (3 days a week onsite)
- Interview Mode: In-person interview
- Preference for local/nearby candidates.
Special Requirements
In-person interview; local/nearby candidates are highly preferred.
Compensation & Location
Salary: $120,000 – $160,000 per year (Estimated)
Location: Charlotte, NC
Recruiter / Company – Contact Information
Recruiter / Employer: Arkhya Tech
Email: usjobs@nvoids.com