Posted 9 hours ago

Job ID: JOB_ID_1552

Role Overview: OpenShift / Kubernetes Administrator

We are seeking a highly skilled and proactive OpenShift / Kubernetes Administrator to join our dynamic infrastructure team. This role is critical for supporting high-availability applications in both pre-production and production environments. The successful candidate will work closely with the Sterling team to ensure seamless application deployment, connectivity, and platform stability. This position requires a deep understanding of container orchestration, automated workflows, and incident management within a large-scale enterprise environment.

Key Responsibilities and Technical Duties

  • Collaborate extensively with the Sterling application team to provide end-to-end support for their containerized applications. This includes managing ServiceNow (SNOW) tickets for pre-production environments and Change tickets (CTASKs) for production implementations.
  • Validate and test connectivity between various service endpoints. You will be responsible for ensuring that firewall rules are correctly implemented and that services can communicate across the platform, utilizing both manual testing methods and automated validation jobs.
  • Manage resource allocation by scaling pods and deployments dynamically based on application demand and performance metrics in both pre-production and production clusters.
  • Maintain and update critical Kubernetes and OpenShift objects, including ConfigMaps, Secrets, Routes, and Services. These updates must be performed with precision, either through manual CLI intervention or via CI/CD automated pipelines.
  • Execute comprehensive logging and monitoring strategies. You will be expected to provide deep-dive log data for platform objects, especially when standard logging tools are insufficient, which often requires packet captures (tcpdump) and other network diagnostic tools.
  • Proactively monitor platform health alerts and warnings. You will act as a primary responder in ServiceNow, creating incidents based on platform alerts and driving them toward resolution or escalating to senior engineering teams when necessary.
  • Manage the lifecycle of digital certificates. This includes the timely update and renewal of ingress certificates, route certificates, and wildcard certificates to prevent service interruptions.
  • Support the infrastructure team during platform patching and upgrade cycles. You will assist in validating the environment post-patching to ensure that neither application functionality nor infrastructure stability has been compromised.
  • Develop and maintain robust Disaster Recovery (DR) plans. You will ensure that all Standard Operating Procedures (SOPs) and Knowledge Base articles are up to date to facilitate rapid recovery in the event of a catastrophic failure.
  • Provide first-responder incident management support, collecting forensic data during outages and contributing to Root Cause Analysis (RCA) reports.

Required Technical Expertise

  • Extensive hands-on experience with Red Hat OpenShift Container Platform (OCP) and upstream Kubernetes.
  • Proficiency in both Graphical User Interface (GUI) management and Command Line Interface (CLI) operations (oc/kubectl).
  • Strong understanding of container networking, software-defined networking (SDN), and load balancing (F5/B2X).
  • Experience with ITIL processes and tools like ServiceNow for change and incident management.
  • Ability to script and automate routine tasks using Shell, Python, or Ansible.

Compensation & Location

Salary: $155,000 – $195,000 per year (Estimated)

Location: Raleigh, NC


Recruiter / Company – Contact Information

Recruiter / Employer: Next Level Business Services, Inc.

Email: shashank.kumar1@nlbtech.com


