Job ID: JOB_ID_5684
Job Title: FPGA & Trading Stack Specialist
We are seeking a highly skilled and experienced FPGA & Trading Stack Specialist to join our team for an onsite position in Bala Cynwyd, PA. This role is critical for designing, developing, and optimizing high-performance trading systems and hardware accelerators. The ideal candidate will have a deep understanding of low-latency trading, C++ development, Linux kernel internals, and FPGA design.
Key Responsibilities:
- Design, code, and optimize high-performance C++17/20 trading systems, including market-data handlers, order routing engines, and pre-trade risk services.
- Build lock-free, wait-free, cache-aligned software components and custom memory allocators.
- Develop exchange protocol stacks (NASDAQ, NYSE, CME, ICE, OPRA) and high-throughput feed normalization pipelines.
- Deliver measurable improvements in tick-to-trade latency, tail latency, and throughput.
- Engineer bare-metal, deterministic Linux environments optimized for real-time trading workloads.
- Perform kernel, driver, and interrupt-path optimization, including IRQ routing, RCU tuning, scheduler tuning, and context-switch minimization.
- Implement CPU isolation, NUMA locality strategies, cache-coherent layouts, and huge-page memory architectures.
- Produce stable, low-jitter execution profiles across trading systems.
- Architect and implement kernel-bypass networking stacks using DPDK, Mellanox VMA, and Solarflare OpenOnload.
- Develop RDMA-enabled and multicast market-data pipelines.
- Tune NIC firmware, DMA paths, PCIe configurations, and network queues.
- Build and maintain exchange connectivity platforms and colocation-optimized data paths.
- Design and develop FPGA-accelerated feed handlers, order gateways, and packet-filtering engines.
- Implement ultra-low-latency pipelines using Xilinx UltraScale+/Versal or Intel Stratix/Agilex platforms.
- Collaborate on hardware/software co-design, including PCIe, DMA, HBM, and SmartNIC architectures.
- Deliver nanosecond-scale latency improvements through hardware offload.
- Engineer deterministic trading platforms where timing, jitter, and physical constraints are first-class design inputs.
- Design systems accounting for cache behavior, memory latency, bus contention, and hardware clocks.
- Apply PTP / IEEE-1588 synchronization, hardware timestamping, and rdtsc-based measurement frameworks.
- Build and maintain nanosecond-resolution profiling, tracing, and telemetry tooling.
- Use perf, eBPF, ftrace, flame graphs, and hardware counters to isolate latency.
- Drive continuous reduction of variance, tail latency, and execution jitter.
- Work directly with traders, quants, and exchange operations teams to support strategy requirements.
- Optimize platform behavior for market-data ingestion, order flow, and pre-trade risk controls.
- Support production environments with rapid latency triage and optimization cycles.
Required Technical Expertise:
- 15+ years of experience in high-performance or trading systems.
- Prior experience in HFT, exchanges, or market-data firms.
- Demonstrated history of nanosecond-level optimization.
- Deep coding background with adjacent hardware experience.
- Comfortable debugging production systems under live trading conditions.
- Modern C++17/20 (lock-free, cache-aligned, zero-copy architectures).
- Linux kernel internals (scheduler, IRQs, RCU, huge pages).
- CPU pinning, NUMA engineering, cache topology optimization.
- rdtsc/TSC synchronization, PTP / IEEE-1588.
- Kernel bypass: DPDK, Solarflare OpenOnload, Mellanox VMA.
- RDMA (RoCE, iWARP).
- Multicast market-data optimization.
- Custom TCP/UDP stacks.
- NIC firmware tuning.
- Exchange connectivity stacks.
- Xilinx UltraScale+, Alveo, Versal.
- Intel Stratix, Agilex.
- Vivado, Quartus, ModelSim, Questa.
- Verilog / SystemVerilog / VHDL.
- PCIe, DMA, HBM, on-NIC processing.
- FPGA feed handlers, order gateways, packet filtering.
- Real-time Linux.
- BIOS tuning.
- PCIe lane configuration.
- SR-IOV.
- HugeTLB, transparent huge pages.
- CPU microarchitecture tuning.
- perf, ftrace, flame graphs, eBPF.
- Hardware timestamping.
- Nanosecond-level profiling.
- Jitter elimination.
- Deterministic system design.
- Exchange protocols: NASDAQ, NYSE, CME, ICE, OPRA.
- Market-data normalization.
- Order routing engines.
- Pre-trade risk systems.
- Tick-to-trade optimization.
- Microwave / millimeter-wave trading networks.
- GPS-disciplined clocks.
- Custom NIC firmware.
- Co-location data-center optimization.
- Bare-metal Kubernetes for HFT.
- P4 programmable networking.
- SmartNIC development.
- ASIC prototyping.
Special Requirements
Screening: Candidates should be prepared to provide references for all past experience, along with any requested documentation.
Interview Mode: Not Specified
Domain: Trading Systems, High-Frequency Trading (HFT), Exchanges, Market Data Firms
Visa Constraints: Not Specified
Compensation & Location
Salary: $150,000 – $200,000 per year (Estimated)
Location: Bala Cynwyd, PA
Recruiter / Company – Contact Information
Email: yaansiddhu10.devrabbit@gmail.com