The Eloquent Math Behind the Top Five Trading Strategies

Trading strategies, as a class of human activity, are less numerous than one might think. Strip away the branding, the fund names, the proprietary jargon that hedge-fund managers use to explain why their returns are not luck, and almost every systematic return stream of the last half-century descends from one of five families: trend, mean reversion, market making, volatility, and factor. Each family has a founding equation. Each equation has an honest, intuitive picture. This essay takes the five in turn, writes the math as it was written in the original paper, shows the graph that makes the math obvious, and — because a strategy without its failure modes is a prospectus — names the specific circumstances under which each one stops working.

1 · Why five, not fifty

The space of possible trading strategies is large but not arbitrary. Any tradable signal is a function of three things: the horizon at which it acts (microseconds, days, years), the direction of its view on price (up, down, toward, away from equilibrium), and the cross-section it operates over (a single asset, a pair, a universe). Enumerate the non-trivial combinations and a handful of clusters dominate: directional time-series momentum, reversion to an equilibrium, compensation for carrying inventory or risk, compensation for bearing tail-risk, and cross-sectional selection on firm characteristics. That is the five.

The striking thing is how little the list has changed in forty years. Jegadeesh and Titman documented cross-sectional momentum in 1993; Engle and Granger formalised cointegration in 1987; Avellaneda and Stoikov wrote down the optimal market-making policy in 2008; the variance risk premium was measured and named in the early 2000s; the Fama–French three-factor paper came out in 1993, extended to five in 2015. The toolkit is stable. The branding changes.

2 · The shared lens — everyone is optimising Sharpe

Before any specific strategy, the language every strategy speaks. A policy $π$ maps the information available at time $t$ to a position $w_{t}$ . The realised period return is $w_{t} \cdot r_{t + 1}$ . What the strategy is explicitly or implicitly maximising is the Sharpe ratio of this return stream net of costs:

SR = \frac{E [ w_{t} r_{t + 1} ] - c ( | w_{t} - w_{t - 1} | )}{\sqrt{Var ( w_{t} r_{t + 1} )}}

The Sharpe ratio net of transaction costs c. Every strategy in this essay is a specific answer to: what w_t maximises this?

Each of the five strategies is a specific, defensible ansatz for that policy — a specific mapping from observable quantities to a position. What differs is the quantity. Trend-followers make $w_{t}$ a function of past returns; mean-reverters make it a function of a spread; market-makers make it a function of inventory and depth; volatility traders make it a function of implied minus realised; factor investors make it a function of firm characteristics. The target is the same. The signal differs.
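
The objective can be sketched in a few lines of Python. This is a minimal illustration, not a production backtester: the function name `net_sharpe`, the linear cost model (a fixed charge per unit of turnover), and the flat starting position are all assumptions made here for the example. It divides mean net P&L by its standard deviation, the usual Sharpe convention.

```python
import math
from statistics import mean, pstdev

def net_sharpe(weights, returns, cost_per_unit_turnover=0.0, periods_per_year=1):
    """Sharpe ratio of w_t * r_{t+1} net of a linear turnover cost.

    weights[t] is the position chosen at time t; returns[t] is the return
    realised over the following period (r_{t+1} in the text).
    periods_per_year=1 gives the per-period ratio; 252 would annualise daily data.
    """
    if len(weights) != len(returns):
        raise ValueError("weights and returns must have the same length")
    pnl = []
    prev_w = 0.0  # assumption: the book starts flat
    for w, r in zip(weights, returns):
        cost = cost_per_unit_turnover * abs(w - prev_w)  # c(|w_t - w_{t-1}|), assumed linear
        pnl.append(w * r - cost)
        prev_w = w
    sd = pstdev(pnl)
    if sd == 0.0:
        return float("nan")  # undefined when the P&L never varies
    return mean(pnl) / sd * math.sqrt(periods_per_year)
```

Every strategy below amounts to a different rule for filling in `weights`; the scoring function stays the same.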

3 · Strategy one — Trend / Time-Series Momentum

The thesis. Instruments that have risen tend to continue rising; those that have fallen tend to continue falling, over horizons from one month to twelve. The effect survives in every major futures market, every major equity index, bonds, currencies, and commodities. The canonical modern paper is Moskowitz, Ooi, and Pedersen’s Time Series Momentum (2012), which found the effect in every one of 58 futures contracts studied.

The signal. The simplest version is the volatility-scaled past return over a lookback $L$ , typically twelve months. Cross-sectional variants usually skip the most recent month to avoid a short-run reversal; the sum below, which starts at $k = 1$ , does not:

s_{t} = \frac{\sum_{k = 1}^{L} r_{t - k}}{\hat{σ}_{t} L}

Volatility-scaled time-series momentum signal. Dividing by σ̂_t puts every market on a common risk scale; without it, the loudest instruments dominate.

The position. A common sizing is

w_{t} = sign (s_{t}) \cdot \frac{σ_{target}}{\hat{σ}_{t}}

Sign-and-size momentum: trade the direction, size to a target volatility. Robust and widely used.
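
As a rough sketch of the signal and the sizing rule together: the function below takes a list of past period returns (oldest first) and returns the pair $(s_{t}, w_{t})$ . The name `tsmom_position`, the monthly defaults, the 10% volatility target, and the use of a plain sample standard deviation for the volatility estimate are assumptions for illustration. The lookback includes the most recent period, as the sum from $k = 1$ does.

```python
import math
from statistics import pstdev

def tsmom_position(returns, lookback=12, vol_window=12,
                   target_vol=0.10, periods_per_year=12):
    """Return (signal, weight) from past period returns, oldest first."""
    if len(returns) < max(lookback, vol_window):
        raise ValueError("not enough history")
    window = returns[-lookback:]
    # Annualised volatility estimate (sigma-hat); a simple sample estimate
    # is an assumption here, many desks use an exponentially weighted one.
    sigma_hat = pstdev(returns[-vol_window:]) * math.sqrt(periods_per_year)
    if sigma_hat == 0.0:
        return 0.0, 0.0  # no risk estimate, so take no position
    signal = sum(window) / (sigma_hat * lookback)   # s_t
    direction = (signal > 0) - (signal < 0)         # sign(s_t) as -1, 0, or +1
    weight = direction * target_vol / sigma_hat     # w_t
    return signal, weight
```

Note that the magnitude of the position depends only on the volatility estimate; the signal contributes its sign and nothing else.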

Why it works. Three durable explanations:

Underreaction to news. Diffuse information reaches participants at different speeds; prices adjust gradually, producing autocorrelation in the adjustment interval.
Institutional flows. Risk-budgets rebalance toward winners and away from losers; CTA systematic flows amplify extant trends.
Behavioural. Anchoring and confirmation biases lead holders to sell too early after good news and to hold too long after bad news, smoothing the adjustment path.

Price with a 20-period moving average, signal strip below, and realised P&L. Green strip = long; red = short. The trade is to stay on the side of the signal until it flips.

Where it breaks. Sharp reversals (central-bank shocks, coordinated de-risking, crowded-trade unwinds) hit the trend-follower when positions are largest and produce the strategy's painful gap losses. Between those episodes the equity curve is long-option-like: many small losses in choppy, range-bound markets, paid for by the occasional large trend. Capacity has also compressed the premium: a strategy that earned ~8% annualised in the 1990s earns closer to 3–4% for new entrants today.

4 · Strategy two — Mean Reversion / Statistical Arbitrage

The thesis. Some pairs of instruments are economically linked; when their prices drift apart, they tend to come back together. Formalised as cointegration in Engle and Granger (1987) and as stationarity of a linear combination in Johansen (1991). At the time-series level, the tool of choice is the Ornstein–Uhlenbeck process, the continuous-time analogue of an AR(1) with pull to a mean.

The mathematics. Let $y_{t}$ and $x_{t}$ be two log prices; the cointegrated spread is $z_{t} = y_{t} - β x_{t}$ where $β$ is estimated by OLS (Engle–Granger two-step) or by the leading cointegrating vector (Johansen). If the spread is stationary, we model it as:

d z_{t} = θ (μ - z_{t}) d t + σ d W_{t}

The Ornstein–Uhlenbeck SDE. θ governs the strength of mean reversion; σ the diffusion around μ.

Two closed-form consequences are load-bearing for the strategy. The half-life — the time for a shock to decay by half — is

τ_{1/2} = \frac{ln 2}{θ}

A spread with θ = 0.05 per day has a half-life of about 14 days — usable. One with θ = 0.002 has a half-life of a year — not.

The stationary distribution of the OU process is Gaussian with variance $σ^{2} / (2 θ)$ , giving an unambiguous z-score:

\hat{z}_{t} = \frac{z_{t} - μ}{σ / \sqrt{2 θ}}

The number every pairs trader watches. Enter at |ẑ| ≥ 2; exit near ẑ = 0; stop out if the spread breaches a structural-break threshold.
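
One way to get from a spread series to these numbers, sketched under stated assumptions: sampled at a fixed interval, an OU process is an AR(1), so an OLS fit of $z_{t + 1}$ on $z_{t}$ recovers $θ$ , $μ$ , and the stationary standard deviation $σ / \sqrt{2 θ}$ (the square root of the stationary variance quoted above). The function names `fit_ou` and `zscore` are hypothetical, and the fit ignores estimation error in $β$ .

```python
import math

def fit_ou(spread, dt=1.0):
    """Estimate OU parameters from an evenly sampled spread via an AR(1) fit."""
    x, y = spread[:-1], spread[1:]
    n = len(x)
    if n < 3:
        raise ValueError("need at least four observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx            # AR(1) slope, equal to exp(-theta * dt)
    a = my - b * mx          # AR(1) intercept
    if not 0.0 < b < 1.0:
        raise ValueError("no mean reversion detected (slope outside (0, 1))")
    resid_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    theta = -math.log(b) / dt
    stat_sd = math.sqrt(resid_var / (1.0 - b * b))  # sigma / sqrt(2 * theta)
    return {
        "theta": theta,
        "mu": a / (1.0 - b),
        "sigma": stat_sd * math.sqrt(2.0 * theta),
        "stat_sd": stat_sd,
        "half_life": math.log(2.0) / theta,
    }

def zscore(z, params):
    """Distance of the spread from its mean in stationary standard deviations."""
    return (z - params["mu"]) / params["stat_sd"]
```

A fit that raises on a slope outside (0, 1) is a deliberate choice: a spread that does not revert in the estimation window should not be traded as if it did.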

For the sophisticated operator, Bertram (2010) derives closed-form optimal entry and exit bands that maximise expected profit per unit time net of transaction cost $c$ . The bands widen with $c$ and narrow with $θ$ — exactly as intuition demands.

Two rebased log prices (top), their OLS-residual spread with 2σ entry bands (bottom), and trade markers where |ẑ| ≥ 2.

Where it breaks. Cointegration is not a property of the universe; it is a property of a window. When the economic link between the two instruments breaks (a merger abandonment, a regulatory shift, a business-model divergence), the spread is simply non-stationary going forward and the strategy loses money on every bet waiting for a reversion that will not come. “This time is different” are the mean-reverter’s famous last words.

5 · Strategy three — Market Making

The thesis. A market maker provides immediacy and is paid for it via the bid-ask spread. The canonical continuous-time optimal policy is Avellaneda and Stoikov (2008). The formalisation trades off two competing pulls: earning the spread (which wants tight quotes and many fills) and limiting inventory risk (which wants wide, asymmetric quotes that unwind the position).

The setup. Midprice follows arithmetic Brownian motion: $d S_{t} = σ d W_{t}$ . The market maker quotes bid $S - δ^{b}$ and ask $S + δ^{a}$ , with fills arriving as Poisson processes whose intensities decline with distance from the mid:

λ^{b} (δ^{b}) = A e^{- k δ^{b}}, λ^{a} (δ^{a}) = A e^{- k δ^{a}}

Fill intensity falls exponentially with distance from the mid. A captures market activity; k the market's price sensitivity to distance.

The solution. The market maker chooses $δ^{b}, δ^{a}$ to maximise expected terminal exponential utility. Avellaneda and Stoikov derive the answer as a quote around a shifted reference price — the reservation price — that absorbs inventory risk:

r (s, q, t) = s - q γ σ^{2} (T - t)

Reservation price. If inventory q is long, the MM shifts both quotes down — encouraging asks to fill, discouraging bids. γ is risk aversion.

And the half-spread the MM quotes on each side of the reservation price:

δ^{*} (t) = \frac{γ σ ^{2} ( T - t )}{2} + \frac{1}{γ} ln (1 + \frac{γ}{k})

The optimal half-spread decomposes into two parts. The first — γσ²(T−t)/2 — is the inventory-risk premium. The second — (1/γ)ln(1 + γ/k) — is the liquidity-demand premium.
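
The two formulas combine into a quoting rule that fits in a few lines. A minimal sketch: the function name `avellaneda_stoikov_quotes` and its argument names are chosen here for illustration, and the numbers in the usage note are arbitrary.

```python
import math

def avellaneda_stoikov_quotes(mid, inventory, gamma, sigma, k, time_left):
    """Return (bid, ask) quoted symmetrically around the reservation price.

    mid: current midprice s; inventory: signed position q;
    gamma: risk aversion; sigma: midprice volatility;
    k: decay rate of fill intensity; time_left: T - t.
    """
    # Reservation price: the mid shifted against the current inventory.
    reservation = mid - inventory * gamma * sigma ** 2 * time_left
    # Half-spread: inventory-risk term plus the fill-intensity term.
    half_spread = (0.5 * gamma * sigma ** 2 * time_left
                   + math.log(1.0 + gamma / k) / gamma)
    return reservation - half_spread, reservation + half_spread
```

With `mid=100`, `gamma=0.1`, `sigma=2`, `k=1.5`, and `time_left=1`, a flat book quotes symmetrically around 100, while an inventory of +5 moves both quotes down by the same amount and leaves the width unchanged. The inventory enters only through the reservation price.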

Why it works. Information asymmetry is bounded in normal markets, and the patient provider of immediacy is systematically compensated for bearing the inventory risk that the impatient taker is trying to avoid. The compensation is the spread.

Order book (left) with the market maker's quotes anchored on the reservation price; Poisson fill intensity λ(δ) = A e^(−kδ) on the right, showing the tradeoff between spread and fill rate. The profit-maximising δ is where the spread × intensity product peaks.
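
A quick check of that last claim, under one simplifying assumption the caption leaves implicit: the market maker is risk-neutral and ignores inventory, so expected revenue per unit time on one side is the quoted distance times the fill intensity.

f (δ) = δ \cdot A e^{- k δ}, \quad f' (δ) = A e^{- k δ} (1 - k δ) = 0 \;\Rightarrow\; δ^{*} = \frac{1}{k}

The same value is the small-risk-aversion limit of the second term of the optimal half-spread, since $\frac{1}{γ} ln (1 + \frac{γ}{k}) → \frac{1}{k}$ as $γ → 0$ .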

Where it breaks. Adverse selection. When informed flow is present — the large cash equity trader who knows about the upcoming earnings, the quant fund unwinding a position across every venue at once — the market-maker is always the last to adjust and always on the wrong side of the trade. The protection is inventory-symmetric quoting, rapid cancellation, and reading order-flow statistics in real time. That is why the arms race for co-location and kernel bypass exists: the profitable part of the spread is in the hands of whoever can respond in microseconds rather than milliseconds.

6 · Strategy four — Volatility Risk Premium

The thesis. On equity indices, option-implied volatility systematically exceeds subsequently realised volatility. The difference is a risk premium paid to volatility sellers for bearing tail-gamma risk — a premium on the order of three to five volatility points annualised for the S&P 500 over most of the modern period. Sell volatility, earn the premium, hedge the delta, survive the tails.

The mathematics. The canonical derivation comes from the Itô expansion of a delta-hedged option’s P&L. For a trader short a European call hedged with the Black–Scholes delta at implied volatility $σ_{i}$ , the instantaneous P&L under a realised volatility $σ_{r}$ is:

d Π = \frac{1}{2} S^{2} Γ (σ_{i}^{2} - σ_{r}^{2}) d t + O (d t^{3/2})

The delta-hedged short-option P&L. When implied exceeds realised (σ_i > σ_r), the seller wins the difference, scaled by instantaneous gamma Γ and spot squared.
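
To put numbers on the leading term, here is a small sketch that evaluates it for one hedging interval. The helper names `bs_gamma` and `short_gamma_pnl` are hypothetical, the gamma is the standard Black–Scholes gamma evaluated at implied volatility, and the zero interest rate and daily step are assumptions for the example. It drops the $O (d t^{3/2})$ remainder.

```python
import math

def bs_gamma(spot, strike, vol, time_to_expiry, rate=0.0):
    """Black-Scholes gamma of a European option (same for calls and puts)."""
    sqrt_t = math.sqrt(time_to_expiry)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol ** 2) * time_to_expiry) / (vol * sqrt_t)
    pdf = math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)
    return pdf / (spot * vol * sqrt_t)

def short_gamma_pnl(spot, strike, implied_vol, realised_vol,
                    time_to_expiry, dt=1.0 / 252.0):
    """Leading-order P&L over dt for a delta-hedged short option.

    Implements 0.5 * S^2 * Gamma * (sigma_i^2 - sigma_r^2) * dt.
    """
    gamma = bs_gamma(spot, strike, implied_vol, time_to_expiry)
    return 0.5 * spot ** 2 * gamma * (implied_vol ** 2 - realised_vol ** 2) * dt
```

For an at-the-money option, the sign flips exactly where realised crosses implied: selling at 20 vol earns when the market realises 16 and pays when it realises 30, and the loss side is larger because the payoff is linear in variance, not volatility.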

Integrated over the life of the option, and averaged across many options and time, the expected P&L of the short-vol carrier is positive, provided the implied-realised spread is positive on average, which empirically it is for equity indices. The VIX minus subsequently realised S&P 500 volatility has a long-run mean of roughly three to four vol points, measured over monthly horizons.

The trade. The cleanest implementation is a short variance swap, which pays $σ_{i}^{2} - σ_{r}^{2}$ exactly, per unit of variance notional. The more available one is short at-the-money straddles, delta-hedged daily. The operator earns carry in calm regimes and eats the realisation of tail risk in regime shifts. Fractional-Kelly sizing is essential here: with a payoff this negatively skewed, full Kelly is close to insolvency.

Implied volatility (top, rose) persistently above realised volatility (top, cyan) on the S&P-like series. The shaded wedge between them is the variance risk premium. Cumulative short-vol P&L (amber, bottom) accrues steadily and gives back a visible chunk at the one regime shock — the strategy's characteristic signature.

Where it breaks. The risk that gives the premium its name. VIX spikes on real news — a banking crisis, a pandemic, a geopolitical shock — produce single-day losses that can easily match ten years of carry. Volatility-selling has become a textbook example of the payoff shape called negatively skewed: quiet profit, loud loss. The correct response is position-limit sizing, hard stops, and a permanent humility about VaR.

7 · Strategy five — Factor Investing

The thesis. Over long horizons, certain firm characteristics predict cross-sectional equity returns: small firms beat large, cheap firms beat expensive, profitable firms beat unprofitable, conservatively-invested firms beat aggressively-invested ones, recent winners beat recent losers. The canonical framework is Fama–French (1993, extended 2015), with Carhart (1997) adding momentum.

The regression. A stock’s excess return, in the five-factor-plus-momentum framework, is:

r_{i} - r_{f} = α + β_{MKT} MKT + β_{SMB} SMB + β_{HML} HML + β_{RMW} RMW + β_{CMA} CMA + β_{MOM} MOM + ε

Fama–French–Carhart. MKT is market minus riskless; SMB is small minus big; HML is high book-to-market minus low; RMW is robust minus weak profitability; CMA is conservative minus aggressive investment; MOM is cross-sectional momentum.

The trade. For each factor, construct a long-short portfolio by ranking the investable universe on the relevant characteristic (size, book-to-market, profitability, investment, trailing return) and taking the top quintile long, the bottom quintile short, dollar-neutral. The excess return of this portfolio is the estimated factor premium. Modern practitioners hedge out the market exposure and blend the remaining factors according to their estimated covariance.
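
The ranking step can be sketched as follows. This is a simplified illustration, not the Fama–French construction: the function name `long_short_weights` is hypothetical, names within each leg are equal-weighted (an assumption), and a high characteristic value is taken to mean "long". For a factor such as size, where the low end is the long leg, the caller would pass the negated characteristic.

```python
def long_short_weights(characteristic, quantile=0.2):
    """Map {name: characteristic value} to dollar-neutral portfolio weights.

    Top quantile is held long, bottom quantile short, each leg summing
    to 1 in absolute value. Equal weighting within legs is an assumption.
    """
    ranked = sorted(characteristic, key=characteristic.get)  # low to high
    n_leg = max(1, int(len(ranked) * quantile))
    if 2 * n_leg > len(ranked):
        raise ValueError("universe too small for the requested quantile")
    weights = {name: 0.0 for name in ranked}
    for name in ranked[-n_leg:]:
        weights[name] = 1.0 / n_leg    # long leg
    for name in ranked[:n_leg]:
        weights[name] = -1.0 / n_leg   # short leg
    return weights
```

The return of these weights over the next period is one observation of the factor premium; repeating the sort at each rebalance date gives the factor's return series.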

Sizing by Kelly. Given estimated premium $μ_{f}$ and variance $σ_{f}^{2}$ for factor $f$ , the continuous-Kelly fraction is the familiar ratio:

w_{f}^{*} = \frac{μ _{f}}{σ _{f}^{2}}

Continuous Kelly per factor. In practice, multiply by a fraction (one-quarter or one-half Kelly) because μ̂ is estimated with substantial error and over-sizing a spurious factor is a career-limiting move.
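
The arithmetic is short enough to show directly. In this sketch the function name `kelly_weight` and the quarter-Kelly default are illustrative choices, and the inputs are assumed to be annualised estimates for a single factor treated in isolation.

```python
def kelly_weight(mu, sigma, fraction=0.25):
    """Fractional continuous-Kelly exposure: fraction * mu / sigma^2.

    mu: estimated annualised premium; sigma: estimated annualised volatility.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return fraction * mu / sigma ** 2
```

A factor with an estimated 4% premium and 10% volatility calls for 4x leverage at full Kelly (`fraction=1.0`) and 1x at quarter Kelly. If the true premium is half the estimate, full Kelly at the estimated figure is already twice the optimal bet.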

Illustrative cumulative returns of five long-short factor portfolios over the same window. MOM and QMJ (quality minus junk, a quality factor that is not part of the regression above) compound cleanly; HML runs flat for years with a pronounced drawdown in the late 2010s; MKT is the market reference. The shape is the story.

Where it breaks. Factors crowd, factors decay, and factors can go through decade-long droughts. Value’s 2018–2020 drawdown was a historical near-record; HML returns over the 2010s were a rounding error versus their 1990s magnitude. The academic literature calls this either factor decay or time-varying risk premia, depending on which camp the author belongs to. Either way, single-factor concentration is a risk; multi-factor, volatility-targeted, cost-aware construction is the defensible implementation.

8 · The synthesis — five failures of the EMH

Each of the five strategies corresponds to a specific, named failure of the idealised efficient-markets hypothesis. The failure is the strategy. Naming them helps show why the list is five and not fifty:

Family	Inefficiency	Horizon	Capacity
Momentum	Gradual info diffusion	month – year	High
Mean reversion	Liquidity dislocations	hour – week	Medium
Market making	Inventory-bearer compensation	microsecond – second	Low
VRP	Tail-risk aversion premium	week – month	Medium
Factor	Behavioural / risk-premium sorting	year – decade	High

five strategies · five distinct failures of the idealised efficient-markets hypothesis

The same lens — the Sharpe optimisation of $E [w_{t} r_{t + 1}]$ minus cost — covers every row. What differs is which quantity $w_{t}$ is a function of, and what the posterior distribution of the edge looks like under empirical data. The symmetry is elegant and deceptively deep.

“Markets are not efficient; they are adaptive. The strategies that work today are the ones whose inefficiencies are currently underexploited.”

— Andrew Lo, Adaptive Markets, 2017

9 · Caveats, stated honestly

All five strategies have decayed. The premium on every row of the table is smaller for a new entrant in 2026 than for the same strategy in 1996. The rank order has not changed; the absolute numbers have halved or more.
Transaction costs kill the naïve implementation. The paper Sharpe ratios you read in the academic literature are pre-cost. For retail-scale operators, the correct exercise is to simulate the strategy with realistic slippage and broker commissions and only then decide whether it is tradable. Many are not.
Crowding compresses the premium. Every strategy on this list is a standing allocation for some fraction of the $5-trillion global hedge-fund book. The price of admission is now the infrastructure to run it at low cost; the edge is in the execution, not the signal.
Regime dependence is real. Trend fails in choppy markets; mean reversion fails in structural breaks; market-making fails in high-volatility risk-off; VRP fails at regime transitions; factor investing fails when the dominant driver is macro rather than cross-sectional. No strategy is all-weather. Ensembles and regime filters are not optional.
Estimation error is larger than most operators think. At 126 daily observations, the standard error of an annualised mean return estimate is on the order of the estimate itself. Kelly-sizing a spuriously positive mean is a fast path to insolvency. Most experienced operators size at a quarter of Kelly, sometimes less.
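
The estimation-error point is easy to verify. With independent returns, the standard error of an annualised mean estimated from $n$ observations is the annualised volatility divided by the square root of the sample length in years. The sketch below assumes 252 trading days per year; the function name `annualised_mean_se` and the 15% volatility in the usage note are assumptions for illustration.

```python
import math

def annualised_mean_se(annual_vol, n_obs, periods_per_year=252):
    """Standard error of an annualised mean return estimate.

    Assumes independent, identically distributed period returns.
    """
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    years = n_obs / periods_per_year
    return annual_vol / math.sqrt(years)
```

For a strategy running at 15% annualised volatility, 126 daily observations give a standard error of about 21 percentage points on the annualised mean, several times larger than any premium discussed in this essay. Quadrupling the sample only halves it.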

10 · The eloquence

The ceaseless output of the hedge-fund marketing apparatus would have you believe that trading strategies are a continuous, high-dimensional space that takes a PhD to navigate. It is not. There are five families, they are each one small equation, and they work when they work for reasons that predate most of the PhDs who claim them. The eloquence is not in the complexity of the mathematics. The eloquence is in how little mathematics is needed.

The operator’s job, after all this, is not to invent a sixth family. It is to pick two or three of the five, execute them carefully, size them with respect for estimation error, and compound the returns without blowing up. The math is small. The discipline is the whole game.
