Quant Backtest Harness
50K parameter combos · 3 engines · one CLI
A strategy backtesting and walk-forward analysis layer over Backtrader, Zipline, and QuantConnect — one CLI, one data contract, three engines underneath, and results you can actually compare.
The diagram, walked through in plain language
1. One command starts a test
A simple CLI takes a strategy name, a date range, and a parameter grid (e.g. 'test these 50,000 combinations of stop-loss and take-profit values').
2. The harness translates inputs for each engine
Backtrader, Zipline, and QuantConnect each want data in their own shape. A small adapter per engine converts our standard format into theirs, so the strategy code does not have to change.
3. Tests run in parallel across many machines
Each parameter combination is a Ray task that runs in its own process, so a 50K-combo sweep finishes in a fraction of the single-machine time.
4. Walk-forward built in
Instead of testing on all history at once (which flatters strategies), the tool can 'train on 24 months, test on the next 3, slide forward by 1 month' repeatedly — the way real trading must work.
5. All results land in one warehouse
Whether the test ran on Backtrader, Zipline, or QuantConnect, the output is normalised into a shared schema and saved to DuckDB.
6. Comparing strategies becomes a SQL query
Researchers ask 'which parameter combination had the best risk-adjusted return last quarter?' against DuckDB instead of stitching Excel sheets together by hand.
The brief
A small prop team was running strategies across three different backtesting engines for three different reasons: Backtrader because it was what the senior quant knew, Zipline for its Pipeline DSL, and QuantConnect when they needed a venue adapter someone else had already maintained.
The problem: the three engines had three different data contracts, three different result shapes, and no shared reporting layer. A “compare these two strategies across engines” question took half a day of manual Excel alignment to answer.
The constraints
- Strategy code could not change. Rewriting 40 strategies to fit a new abstraction was a non-starter.
- Results had to come out in a single schema, regardless of which engine produced them.
- Walk-forward analysis had to be a first-class citizen, not something the user had to stitch together with cron jobs.
- Parameter sweeps had to scale horizontally — 50K-combo runs should not take a weekend.
The shape we built
A thin Python harness with three responsibilities: normalize inputs (one data catalog, engine-specific feeders fan out), dispatch execution (each engine runs in its own process with a well-defined result protocol over Unix sockets), and persist results (everything lands in DuckDB with a common schema, indexed by strategy hash and parameter signature).
Walk-forward is expressed declaratively — “train on 24 months, test on 3, step 1 month” — and compiled down to engine-specific calls. Parameter sweeps run under Ray; the harness submits one Ray task per combo and streams results back into DuckDB as they complete.
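The compilation step for the declarative walk-forward spec is small. A sketch assuming month-aligned window boundaries; the function names and signature are illustrative, not the harness's real API:

```python
from datetime import date

def add_months(d: date, n: int) -> date:
    # Minimal month arithmetic, clamped to the 1st, since this
    # sketch aligns all windows to month boundaries.
    y, m = divmod(d.year * 12 + (d.month - 1) + n, 12)
    return date(y, m + 1, 1)

def walk_forward_windows(start: date, end: date,
                         train_months: int = 24,
                         test_months: int = 3,
                         step_months: int = 1):
    """Compile 'train on 24 months, test on 3, step 1 month' into
    concrete (train_start, train_end, test_start, test_end) tuples.
    Each tuple becomes one engine-specific backtest call."""
    cursor = start
    while True:
        train_end = add_months(cursor, train_months)
        test_end = add_months(train_end, test_months)
        if test_end > end:
            break
        yield cursor, train_end, train_end, test_end
        cursor = add_months(cursor, step_months)

windows = list(walk_forward_windows(date(2018, 1, 1), date(2021, 1, 1)))
```

Each window would then be submitted as its own Ray task, which is what lets a sweep of windows × parameter combos fan out across machines.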
What was hard
- Result alignment. Each engine defines “trade” differently — Backtrader aggregates at position close, Zipline emits on every fill, QuantConnect does whatever you tell it. Normalizing required engine-specific adapters with a shared post-processing layer.
- Time handling across timezones. One engine was UTC, one was exchange-local, one was naive. The harness enforces UTC at the boundary and raises loudly on ambiguity.
- Reproducibility. Sweep results must be byte-identical across reruns. Pinning random seeds, sorting input data deterministically, and eliminating clock-based branches took the last 10% of the project and was worth all of it.
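The timezone rule from the second bullet can be sketched as a single guard function at the boundary (the function name is illustrative; in the harness this lives in the per-engine adapters):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(ts: datetime) -> datetime:
    """Enforce UTC at the harness boundary. Naive timestamps are
    rejected loudly rather than silently assumed to be UTC or local."""
    if ts.tzinfo is None:
        raise ValueError(f"naive timestamp {ts!r}: engine feeds must be tz-aware")
    return ts.astimezone(timezone.utc)

# Exchange-local input (e.g. an NYSE open) normalised at the edge.
nyc_open = to_utc(datetime(2024, 1, 2, 9, 30, tzinfo=ZoneInfo("America/New_York")))
```

Raising on naive timestamps instead of guessing is the whole point: an ambiguous time that slips through shifts every downstream trade by hours.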
What it does today
A single CLI runs the team's entire strategy suite across three engines with identical output. Sweep runtime is 85% lower than the single-engine baseline thanks to parallelism. 14 strategies have been shipped from harness to paper-trading since launch. The DuckDB result warehouse has become the team's analytical workbench — every research question now starts as a SQL query, not a Jupyter notebook.
What I'd do differently
I'd add probabilistic Sharpe ratio and deflated Sharpe ratio to the default result schema from day one. Regular Sharpe is the number everyone asks for; the corrected versions are the numbers that tell you whether to trade. The team computed them by hand for a year before I gave up and made them first-class.
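For reference, the probabilistic Sharpe ratio (Bailey & López de Prado) is cheap to compute once the per-period return statistics are in the schema. A minimal sketch; annualization and the deflated variant are left aside:

```python
from math import erf, sqrt

def probabilistic_sharpe(sr: float, n: int,
                         skew: float = 0.0, kurt: float = 3.0,
                         sr_benchmark: float = 0.0) -> float:
    """Probability that the true Sharpe exceeds `sr_benchmark`,
    given an observed per-period Sharpe `sr` over `n` returns.
    `kurt` is non-excess kurtosis (3.0 for normal returns)."""
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr * sr)
    z = (sr - sr_benchmark) * sqrt(n - 1) / denom
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF
```

An observed Sharpe of zero gives exactly 0.5 against a zero benchmark, and the probability grows with sample size — which is precisely the track-record-length effect plain Sharpe hides.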
- Python
- Backtrader · Zipline · QuantConnect LEAN
- Pandas · Polars
- DuckDB (result warehouse)
- Typer (CLI)
- Ray (parallel sweeps)
Have a similar problem?
If this shape of engagement fits what you're working on, I'd be happy to scope it.