Job ID: JOB_ID_876
Role Overview
We are seeking a highly experienced Site Reliability Engineer (SRE) with a strong background in Azure and Java development to join our team in the San Francisco Bay Area. This role is designed for a professional with at least 12 years of experience in software development and technical operations, specifically in running large-scale, mission-critical applications. The successful candidate will bridge the gap between development and operations, ensuring that our services are scalable, reliable, and performant.
Key Responsibilities
- Build deep knowledge of the business domain and understand the end-to-end customer journey to better support service availability.
- Partner with engineering and product stakeholders to improve the design, visibility, and scalability of cloud-native services.
- Automate manual operational processes with Python and shell scripting to reduce toil.
- Lead deep-dive investigations into production incidents and facilitate blameless postmortems to drive continuous improvement.
- Improve alert management and decision-making by measuring system health using standardized telemetry and Log Analytics.
- Support planned changes through deployment automation, post-deployment monitoring, and the creation of real-time dashboards.
- Evaluate open-source and vendor products, create proofs of concept, and lead the migration of applications to Azure Cloud.
- Adhere to company controls and security protocols to meet internal and external audit requirements.
- Build value-proposition presentations and case studies to advocate for SRE best practices across the organization.
- Manage and mentor junior team members, ensuring high-quality delivery and stakeholder satisfaction.
Technical Requirements
Candidates must have 10+ years of experience in software development, including 8+ years focused specifically on J2EE, Spring Boot, and microservices. Expertise in Azure Cloud services, including Storage, Kubernetes (AKS), APIM, and Synapse DB, is mandatory. You should have a strong command of Java 8/17 and be proficient with Spring Data JPA and Hibernate. Experience with NoSQL databases such as Cosmos DB and messaging platforms such as Kafka is essential. Candidates should also have experience setting up and debugging Spark jobs for data processing and cleansing. A solid understanding of RDBMS, SQL, and CI/CD tools is required to manage the full application lifecycle.
Location and Experience
This position is strictly for local candidates in the Bay Area, California. Given the seniority of the role, a minimum of 12 years of professional experience in the IT industry is required. Candidates must demonstrate logical and creative problem-solving skills and the ability to manage complex stakeholder relationships. The role involves significant hands-on technical work alongside strategic planning for infrastructure growth and stability.
Special Requirements
Local candidates only; minimum 12 years of experience; expertise in Azure and Java is mandatory.
Compensation & Location
Salary: $185,000 – $245,000 per year (Estimated)
Location: Bay Area, CA
Recruiter / Company – Contact Information
Recruiter / Employer: Longfinch Inc
Email: tkrishna@longfinch.com