Posted 2 hours ago

Job ID: JOB_ID_8292

About the Role:

We are seeking a highly experienced Site Reliability Engineer (SRE) to join our dynamic team. This role is critical in maintaining and improving the reliability, scalability, and performance of our production systems. You will be responsible for ensuring our infrastructure is robust, efficient, and capable of handling high-volume traffic and complex operations. This is a fully remote position, offering flexibility and the opportunity to work with cutting-edge technologies.

Key Responsibilities:

  • System Maintenance & Operations: Maintain and manage production systems on AWS and/or GCP, ensuring high availability and performance.
  • Kubernetes Management: Manage, maintain, and debug Kubernetes clusters and containerized applications (Golang, Java, Python).
  • Infrastructure as Code: Use infrastructure-as-code tools such as Terraform, AWS CloudFormation, or Google Cloud Deployment Manager to provision and manage infrastructure.
  • CI/CD Practices: Implement and manage continuous integration and continuous delivery (CI/CD) pipelines using tools like Jenkins, Travis CI, or CircleCI.
  • Monitoring & Logging: Implement and manage comprehensive monitoring and logging solutions using tools like CloudWatch, Stackdriver, Prometheus, Grafana, ELK stack, Datadog, and other relevant services.
  • Troubleshooting & Debugging: Diagnose and resolve complex issues in production environments, ensuring minimal downtime.
  • Performance Optimization: Continuously analyze system performance and identify opportunities for optimization and improvement.
  • Collaboration: Work closely with development and operations teams to ensure seamless deployment and operation of services.
  • Scripting & Automation: Develop and maintain scripts in Python and Shell for automation of operational tasks.

Required Skills and Experience:

  • 14+ years of experience in Site Reliability Engineering or related fields.
  • Strong proficiency in Linux and scripting languages such as Python and Shell.
  • Extensive experience maintaining production systems on AWS and/or GCP.
  • In-depth experience with Kubernetes cluster maintenance, management, and debugging of containerized applications (Golang, Java, Python).
  • Solid understanding of distributed systems and technologies including Kafka, Spark, Storm, Cassandra, Elasticsearch, PostgreSQL, Redis (ElastiCache), ZooKeeper, Nginx, and AWS S3/Google Cloud Storage.
  • Experience with infrastructure as code tools (e.g., Terraform, CloudFormation, Google Cloud Deployment Manager).
  • Proven experience with CI/CD practices and tools (Jenkins, Travis CI, CircleCI, etc.).
  • Hands-on experience with monitoring and logging solutions (CloudWatch, Stackdriver, Prometheus, Grafana, ELK, Datadog, etc.).
  • Familiarity with logging service solutions.

About Lincoln Softech LLC:

Lincoln Softech LLC is a leading IT staffing and consulting firm committed to connecting top talent with exceptional opportunities. We pride ourselves on our E-Verify compliant processes and dedication to client satisfaction. We are an equal opportunity employer.


Special Requirements

Work authorization: USC, GC, GC EAD, or H4 EAD required. Remote work. E-Verify company.


Compensation & Location

Salary: $120,000 – $180,000 per year

Location: Remote


Recruiter / Company – Contact Information

Recruiter / Employer: Lincoln Softech LLC

Email: hu@lincolnsofttech.com

