Posted 12 hours ago

Job ID: JOB_ID_1053

Position Summary

We are looking for a Senior Site Reliability Engineer (SRE) to join our dynamic team in New York City. This hybrid role is critical to maintaining the reliability, scalability, and performance of our financial trading platforms. You will develop enhancements to our support workflows, with a heavy focus on automation and efficiency. The position requires a blend of systems engineering, software development, and operational expertise to ensure our systems meet the rigorous demands of US market trading sessions. The ideal candidate takes a proactive approach to identifying potential system bottlenecks before they impact users.

Core Responsibilities

  • Support the SRE team in developing and implementing enhancements to the team's support workflows, focusing on automation and efficiency improvements.
  • Handle technical escalations and troubleshoot complex FIX and API connectivity issues.
  • Actively participate in on-call rotations during non-traditional hours to ensure rapid response and resolution.
  • Adhere to and administer incident and change management policies to maintain system reliability.
  • Coordinate incident resolution efforts and implement change management protocols.
  • Work closely with international offices to ensure smooth operation and alignment of SRE practices across time zones.
  • Coordinate Incident Post Mortems and Root Cause Analysis (RCA) to prevent recurrence of system failures.
  • Design, implement, and maintain comprehensive monitoring, logging, and tracing solutions (observability stack).
  • Partner with product and engineering teams to define clear Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
  • Manage error budgets to ensure service reliability meets business needs.

Required Technical Qualifications

  • 5+ years in a senior SRE role or a similar position, demonstrating deep knowledge of site reliability engineering and operations.
  • Knowledge of FIX protocol and messages, with the ability to read and analyze FIX logs.
  • Familiarity with REST APIs and a strong understanding of API integration.
  • Proficiency in Python and scripting for automation and system management.
  • Expertise in SQL and transactional databases, including querying and troubleshooting.
  • In-depth knowledge of core networking concepts including TCP/IP, routing, and DNS.
  • Familiarity with maintaining and troubleshooting systems within both cloud (AWS) and co-location (colo) environments.
  • Availability for flexible work hours and willingness to cover US market trading sessions, including on-call coverage.

Preferred Skills and Experience

  • Experience in the brokerage or financial industry is highly preferred.
  • Proficiency with cloud services, particularly AWS (IAM, EC2, S3, and DynamoDB).
  • Experience maintaining and supporting containerized systems, and familiarity with orchestration tools.
  • Knowledge of Infrastructure as Code (IaC) practices and tools such as Terraform or CloudFormation.
  • Ability to manage and troubleshoot job scheduling tools like Rundeck or Apache Airflow.
  • Advanced skills in managing containerized environments using Kubernetes and OpenShift.
  • Practical experience with Confluent Cloud or Redpanda for event streaming architectures.

Special Requirements

Hybrid role based in New York City. Knowledge of the FIX protocol is required. Experience in the financial or brokerage industry is preferred.


Compensation & Location

Salary: $175,000 – $235,000 per year (Estimated)

Location: New York, NY


Recruiter / Company – Contact Information

Recruiter / Employer: Nitya Software Solutions

Email: vinay.y@nityainc.com

