Job ID: JOB_ID_5187

About the Role: Senior Systems Operations Engineer (SRE)

We are seeking a highly experienced Senior Systems Operations Engineer (SRE) with a strong background in application production support, automation, and infrastructure management. This role is crucial for maintaining and improving the reliability, availability, and performance of our complex, high-availability production environments.

Key Responsibilities:

  • Provide application production support in complex, high-availability environments (4+ years of experience), including incident response and problem management with strong root cause discipline.
  • Apply hands-on automation and configuration management experience (4+ years; Ansible or similar preferred), coupled with strong scripting skills (Python, Bash, PowerShell, or similar).
  • Administer Linux (RHEL preferred) and/or Windows Server systems supporting enterprise production workloads (4+ years of experience).
  • Follow Git-based version control practices, including pull requests and peer review, with a focus on repeatability and code quality (4+ years of experience).
  • Use monitoring and observability tools such as Splunk, AppDynamics, and ThousandEyes for alerts and dashboards (4+ years of experience).
  • Develop and/or support web applications, preferably Java (4+ years of experience).
  • Work with infrastructure-as-code concepts, including modular design and environment consistency.
  • Support hybrid/private cloud platforms and container-adjacent hosting models; familiarity with OpenShift (OCP) or Kubernetes-based platforms is expected.
  • Implement SRE operating practices, focusing on reliability metrics, reduction of manual toil, and continuous improvement via post-incident learnings.
  • Support common middleware platforms and shared services, building automation patterns that standardize operations and reduce manual intervention.
  • Utilize enterprise observability and operational practices, including service health dashboards, alert engineering, and actionable telemetry.
  • Gain exposure to responsible AI usage in operations, covering security, validation, accuracy, and appropriate guardrails for automation/agents.
  • Communicate effectively across cross-functional teams, with experience operating in regulated environments.

Requirements:

  • 10+ years of overall experience in Systems Operations and Engineering.
  • Strong understanding of SRE principles and practices.
  • Proficiency in Linux (RHEL preferred) and/or Windows Server administration.
  • Expertise in scripting languages such as Python, Bash, or PowerShell.
  • Experience with automation tools like Ansible.
  • Familiarity with Git and CI/CD workflows.
  • Experience with monitoring and observability tools (Splunk, AppDynamics, ThousandEyes).
  • Knowledge of cloud platforms (hybrid/private) and containerization (OpenShift, Kubernetes).
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration abilities.

Work Environment:

  • Location: Charlotte, NC
  • Work Mode: Hybrid (3 days a week onsite)
  • Interview Mode: In-person interview
  • Preference for local/nearby candidates.

Special Requirements

In-person interview required; local or nearby candidates are strongly preferred.


Compensation & Location

Salary: $120,000 – $160,000 per year (Estimated)

Location: Charlotte, NC


Recruiter / Company – Contact Information

Recruiter / Employer: Arkhya Tech

Email: usjobs@nvoids.com


