Job ID: JOB_ID_3242
Role Summary
We are seeking a Senior RabbitMQ Engineer (SME), preferably located in the eastern United States, to support a customer engagement at Dick's Sporting Goods. This is a hands-on staff-augmentation role that combines architecture, technical leadership, and execution. The engineer will serve as the primary RabbitMQ authority, responsible for assessing the current platform, stabilizing it rapidly, and guiding the customer toward a supported, resilient RabbitMQ posture in Azure (on VMs and/or AKS). The engagement is planned for 6 months.
Primary Outcome
The main goal is to stabilize RabbitMQ by April or early May and to deliver a clear modernization recommendation, evaluating Azure Kubernetes Service (AKS), VM-based deployments, and Software as a Service (SaaS) options, along with a practical execution plan.
Engagement Context
- The customer is leaning towards Azure, though this is not fully confirmed.
- RabbitMQ supports multiple product teams, with each team managing its own vhosts/tenants.
- The platform currently has technical debt, notably a non-functioning Disaster Recovery (DR) process.
- The current deployment consists of 3-node clusters across Development, Non-Production, and Production environments.
- Configuration is managed via Ansible playbooks but is bespoke for each product team.
- The peak business load window is in November.
Responsibilities
1) Assessment & Stabilization (Immediate Focus)
- Conduct a thorough current-state review, examining topology, broker configuration, policies, queue types, client connection patterns, and resource thresholds.
- Identify and remediate reliability and performance risks.
- Establish robust operational standards, including monitoring, alerting, runbooks, and on-call readiness.
2) Architecture & Technical Direction
- Define target-state options and their respective tradeoffs, considering Azure VMs, AKS, and SaaS.
- Develop an upgrade strategy to a supported RabbitMQ version, including sequencing, rollout, and rollback plans.
- Recommend best practices for multi-tenant RabbitMQ environments, covering vhosts, permissions, and policy boundaries.
3) DR / Resiliency Improvements
- Diagnose the root cause of the non-functional DR process and implement a pragmatic recovery posture aligned with business requirements.
- Validate failover and recovery procedures through rigorous testing and comprehensive documentation.
4) Platform Enablement & Standardization
- Enhance the maintainability of the Ansible-based configuration and reduce bespoke patterns.
- Create and tune reusable, standardized patterns for vhost provisioning, policies, and operational controls.
- Coach customer engineers, facilitating knowledge transfer and operational ownership.
Required Skills & Experience (Must-Have)
- 7-10+ years of experience with distributed systems and messaging platforms, with expert-level RabbitMQ production experience.
- Strong experience in:
  - Clustering and High Availability (HA) patterns (e.g., quorum queues, mirrored strategies).
  - Performance tuning (memory watermarks, disk alarms, flow control, channel/connection behaviors).
  - Upgrades and lifecycle management (zero/minimal downtime approaches, rollback planning).
  - Incident triage and root cause analysis in high-throughput environments.
- Proficiency in Azure operations (networking, VM patterns; AKS familiarity is strongly preferred).
- Hands-on automation experience (Ansible or similar Infrastructure as Code/configuration management tools).
- Ability to operate as a technical lead, demonstrating clear decision-making, documentation, and stakeholder communication skills.
Preferred / Nice-to-Have
- Experience designing DR for messaging in cloud environments (active/passive and/or multi-region approaches).
- Experience integrating messaging with enterprise integration stacks (e.g., BizTalk patterns).
Deliverables
- Current-state assessment report with a prioritized stabilization plan.
- Implemented stability improvements (configuration tuning, operational guardrails).
- Supported version upgrade plan (and execution, if within scope).
- DR gap analysis report with implemented/tested recovery procedures.
- Recommendation for AKS vs VM vs SaaS, including risk and effort tradeoffs.
- Standardized configuration approach for vhosts/policies, accompanied by documentation and runbooks.
Candidate Profile
- Player/Coach mentality: capable of architecting solutions and executing hands-on tasks efficiently.
- Strong executive communication skills: ability to articulate tradeoffs and risks clearly and concisely.
- Bias for practical outcomes: prioritize stabilization, followed by modernization, with continuous documentation.
Special Requirements
- Visa constraints: US-based preferred.
- Screening steps: Not specified.
- Interview modes: Not specified.
- Domain restrictions: Dick's Sporting Goods customer engagement.
Compensation & Location
Salary: $70 – $100 per year (Estimated)
Location: Remote
Recruiter / Company – Contact Information
Email: nagaraju.tiruttani@kodeva.com
Recruiter Notice:
To remove this job posting, please send an email from nagaraju.tiruttani@kodeva.com with the subject: DELETE_JOB_ID_3242