Posted 2 hours ago
Job ID: JOB_ID_7465
Job Title: Site Reliability Engineer (Python)
Location: Redmond, WA (Onsite)
Duration: 6-12 months
Job Description:
- Infra/Service: Build and maintain components built on large-scale infrastructure (e.g., database systems, distributed queues, deployment platforms).
- Comfortable making changes in mid- to large-scale distributed systems (on the order of 10k servers) and handling their end-to-end life cycle.
- Proactive maintenance through extensive use of monitoring, logging, and metrics dashboards; work with the core infra team to diagnose and resolve issues.
- Proficiency in Python (70%).
- Familiarity with data systems, ML pipelines, and distributed databases.
- Binary packaging and distribution.
- Build and CI/CD.
- Miscellaneous operational tasks related to build and third-party modules.
Typical Workload:
- Monitor and maintain health of various infra services, data pipelines and build streams.
- Provide on-call support during business hours.
- Assess level of urgency and escalate to core team, if necessary.
- Work on infra tasks/bugs.
- Work on items related to infra health and efficiency; these are most likely follow-ups from monitoring/alerts.
Special Requirements
Local candidates only.
Compensation & Location
Salary: $104,000 – $114,400 per year (Estimated)
Location: Redmond, WA
Recruiter / Company – Contact Information
Email: nmukhj@tekskillsinc.com