Order Router & Execution Engine
$80M routed · 38ms p99 · zero downtime
A Rust + FastAPI order routing service for a quant trading desk, shipped from a blank repository to live order flow in under twelve weeks — and quietly responsible for $80M of live trading volume in Q1–Q3 2025.
The diagram, walked through in plain language
1. A trading signal arrives
When the trader's chart hits a buy or sell condition, TradingView fires a small webhook (a message over the internet) at our system.
2. The front door checks who's knocking
A FastAPI service confirms the message is genuinely from the trader (signed with a secret key) and isn't a duplicate of one we just handled.
3. A risk check, in milliseconds
The signal goes through a Rust 'risk gate' that asks: is the trader within their daily loss limit? Their position size limit? If anything looks off, the order is refused before it ever reaches a broker.
4. The router picks the cheapest broker
Approved orders go to the order router, which compares fees, spread, and recent fill quality across Alpaca, Interactive Brokers, Binance, and MetaTrader 5 — then picks the best one for this specific trade.
5. Every step is recorded immutably
Position changes, fills, and any later corrections from the broker are written to an audit log nobody can edit. The dashboard reads the same log, so what the trader sees on screen always matches what really happened.
6. If anything fails mid-flight, replay catches it
Orders are queued in Redis before the broker call. If the network drops or the broker times out, the queue replays the order rather than losing it.
The brief
The client was a quant trading desk whose strategy code worked beautifully in backtests and just-okay in paper trading. In production, slippage was eating 30% of expected edge, fills were arriving out of order, and a single bad webhook had once flipped their net position the wrong way for forty seconds.
They needed a layer between strategy and broker that was honest about latency, aware of risk, and boring under load. They had twelve weeks before the next live trading window opened.
The constraints
- Risk gate had to add < 50ms p99 to the order path. Any slower and the strategy edge collapsed.
- Idempotency had to be guaranteed across retries, replays, and broker timeouts — the same TradingView webhook can fire twice within a second on a flaky network.
- Position keeping had to be event-sourced, not derived from mutable broker state — auditors wanted a single immutable log.
- Zero-downtime deploys, because some live windows ran across deploy slots.
- All venue adapters behind one interface, so adding the next broker was a 3-day job, not a 3-week one.
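The "one interface" constraint looks roughly like this. The real adapters are a Rust trait; this Python sketch with invented types (`Order`, `Ack`, `VenueAdapter`) just shows the contract every venue must satisfy.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Order:
    symbol: str
    side: str      # "buy" | "sell"
    qty: float

@dataclass
class Ack:
    venue: str
    order_id: str

class VenueAdapter(ABC):
    """The single interface the router sees, whatever the broker speaks underneath."""
    name: str

    @abstractmethod
    def quote(self, symbol: str) -> tuple[float, float]:
        """Return (bid, ask) for the symbol."""

    @abstractmethod
    def place(self, order: Order) -> Ack:
        """Submit the order; must be safe to retry with the same order."""

    @abstractmethod
    def cancel(self, order_id: str) -> None:
        """Best-effort cancel."""

class PaperVenue(VenueAdapter):
    """Toy in-memory venue, here only to demonstrate the adapter contract."""
    name = "paper"

    def __init__(self) -> None:
        self._n = 0

    def quote(self, symbol: str) -> tuple[float, float]:
        return (99.5, 100.5)

    def place(self, order: Order) -> Ack:
        self._n += 1
        return Ack(venue=self.name, order_id=f"paper-{self._n}")

    def cancel(self, order_id: str) -> None:
        pass
```

Adding a broker then means implementing three methods, not touching the router — which is what turns a 3-week integration into a 3-day one.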
The shape we built
Four clean layers: clients (TradingView webhooks, the strategy runtime, the ops dashboard), the edge (FastAPI + a Rust risk gate), the core (the Rust router, an event-sourced position keeper, P&L pipeline, audit log), and the venues (Alpaca, IBKR, Binance, MT5).
The hot path — from webhook to broker ack — never touches Python after the FastAPI auth check. The risk gate runs in Rust against an in-memory snapshot of positions and exposure, and emits a structured allow/deny in milliseconds. The router lives next door, picks the venue based on a simple cost model (spread, fees, recent fill quality), and writes the order to a Redis Stream before calling the broker. If the broker call dies mid-flight, replay picks it up.
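The write-ahead-then-ack pattern is the heart of the replay guarantee. In this sketch a Python list stands in for the Redis Stream (the real service uses `XADD` and pending-entry reclaim); all names are illustrative.

```python
import uuid

class ReplayQueue:
    """Stand-in for a Redis Stream with consumer-group acks."""

    def __init__(self) -> None:
        self._entries: list[dict] = []   # {"id", "order", "acked"}

    def enqueue(self, order: dict) -> str:
        entry_id = str(uuid.uuid4())
        self._entries.append({"id": entry_id, "order": order, "acked": False})
        return entry_id

    def ack(self, entry_id: str) -> None:
        for e in self._entries:
            if e["id"] == entry_id:
                e["acked"] = True

    def pending(self) -> list[dict]:
        """Orders enqueued but never acked — the candidates for replay."""
        return [e["order"] for e in self._entries if not e["acked"]]

def send_order(queue: ReplayQueue, broker_call, order: dict):
    entry_id = queue.enqueue(order)      # durable write BEFORE the broker call
    try:
        result = broker_call(order)
    except ConnectionError:
        return None                      # entry stays pending; replay retries it
    queue.ack(entry_id)                  # ack only after the broker confirms
    return result
```

Because the queue write happens before the network call and the ack happens after, a crash at any point leaves the order either pending (replayed) or acked (done) — never lost.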
Position keeping is event-sourced: every fill, every cancel, every reconciliation correction is an immutable row. Current position is a fold over those rows. The reporting layer reads the same events into Parquet via Arrow, and the ops dashboard queries that. There is exactly one source of truth.
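"Current position is a fold over those rows" can be shown in miniature. The event names and schema here are illustrative (the real rows live in Postgres); the key point is that a late broker correction is a new row, never an edit.

```python
from functools import reduce

EVENTS = [
    {"type": "fill", "symbol": "AAPL", "qty": 100},
    {"type": "fill", "symbol": "AAPL", "qty": -40},
    # a broker correction arriving hours later is appended, not rewritten:
    {"type": "correction", "symbol": "AAPL", "qty": -5},
]

def apply(positions: dict, event: dict) -> dict:
    """One step of the fold: apply a single immutable event to the state."""
    qty = positions.get(event["symbol"], 0) + event["qty"]
    return {**positions, event["symbol"]: qty}

def current_positions(events) -> dict:
    """Current position = fold(apply, events, empty state)."""
    return reduce(apply, events, {})
```

Since the dashboard, the P&L pipeline, and the auditors all fold the same log, they cannot disagree — that is the "exactly one source of truth" property.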
What was hard
- Broker idiosyncrasies. IBKR's socket protocol versus Binance's REST/websocket split versus Alpaca's clean REST — getting them behind one trait took longer than the entire risk gate.
- Reconciling intraday with broker statements. Brokers correct fills hours after the fact. The event store has to accept these corrections without rewriting history.
- Time. Every layer needed monotonic timestamps the venues didn't provide; we minted our own and stored both.
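The "minted our own and stored both" point on timestamps looks roughly like this sketch (names are assumptions): each event carries the venue's reported time for display and reconciliation, plus a locally minted monotonic stamp that is the only clock trusted for ordering.

```python
import time
from dataclasses import dataclass

@dataclass
class Stamped:
    venue_ts: float        # whatever the broker reported — kept, but not trusted for order
    local_mono_ns: int     # our monotonic stamp — trusted for ordering
    payload: dict

def stamp(venue_ts: float, payload: dict) -> Stamped:
    """Mint a local monotonic timestamp at ingest and store both clocks."""
    return Stamped(venue_ts=venue_ts,
                   local_mono_ns=time.monotonic_ns(),
                   payload=payload)
```

Sorting by `local_mono_ns` stays stable even when a venue's clock jumps backwards mid-session — which is exactly the failure that made fills "arrive out of order".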
What it does today
Live since late February 2025. Through Q3 it has routed $80M of notional across four venues. p99 latency on the risk gate has held at 38ms against a budget of 50ms, and the median sits around 11ms. Peak sustained throughput is 12.4k req/s. There have been zero reconciliation breaks since launch and one production incident, caused by a venue outage during which the system correctly halted trading.
What I'd do differently
I'd push more of the venue cost model out of code and into configuration earlier — the desk's preferences changed three times and each one was a small redeploy when it should have been a database write. I'd also add a shadow-route mode from day one, so new venue adapters can accept traffic in parallel with the live router for a week before going hot.
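What "cost model as configuration" could look like: the weights live in a table (a dict here; a database row in production), so a change of desk preference is a write, not a redeploy. Field names and weights are illustrative.

```python
# Hypothetical weights row — in production this would be read from the database.
COST_WEIGHTS = {"spread": 0.5, "fees": 0.3, "fill_quality": 0.2}

def venue_cost(weights: dict, spread: float, fees: float, fill_quality: float) -> float:
    """Lower is better; fill_quality in [0, 1], where 1.0 means perfect fills."""
    return (weights["spread"] * spread
            + weights["fees"] * fees
            + weights["fill_quality"] * (1.0 - fill_quality))

def pick_venue(weights: dict, quotes: dict) -> str:
    """quotes: venue name -> (spread, fees, fill_quality). Returns cheapest venue."""
    return min(quotes, key=lambda v: venue_cost(weights, *quotes[v]))
```

With the weights externalized, the shadow-route mode mentioned above becomes cheap too: run a second weight set against live quotes and log where it would have routed.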
The stack
- Rust (risk gate, router core)
- Python · FastAPI (edge, ops)
- Postgres (event-sourced positions)
- Redis Streams (replay queue)
- Parquet · Arrow (P&L pipeline)
- AWS (ECS, RDS, S3 WORM)
- GitHub Actions (CI + zero-downtime deploy)
Have a similar problem?
If this shape of engagement fits what you're working on, I'd be happy to scope it.