Almost everyone who starts algorithmic trading starts by asking the wrong question. The wrong question is “what’s a good strategy?” The right question, the one that separates traders who still have capital in five years from traders who have a very educational story to tell at dinner parties, is “what will kill a good strategy before it has a chance to work?” The answer is almost never the strategy. The answer, in descending order of how often I’ve watched it happen, is: bad data, unrealistic execution, and wrong position sizing.
This essay is the thing I wish someone had written for me in 2016, when I was happily overfitting moving-average crossovers to the 2013–2015 EUR/USD and congratulating myself on my Sharpe ratio. What follows are the three concepts that, if you understand them properly before you ever put real money on the line, will save you — conservatively — the cost of a nice car. I’m going to keep the math honest and the code runnable. The prerequisite is that you know what a price series is.
Thing #1 — Your data is lying, in ways you can measure
A strategy’s backtest is only as truthful as the data it runs on, and most free historical data has been quietly manipulated by the passage of time. The three specific lies to watch for have names, and every serious dataset is evaluated against them.
Survivorship bias
If your equity universe is “all stocks in the S&P 500 today, backtested from 2005,” congratulations — you have perfect foresight baked into the experiment. The S&P 500 of 2005 had Lehman Brothers, Washington Mutual, and Countrywide in it; your 2026 universe does not. You have removed, from your backtest, every company that failed. The surviving companies overperform by construction.
The correction is a point-in-time universe — the set of tickers that were actually in your target universe on each date — which most free data sources do not give you. Paid sources (CRSP, Norgate, Refinitiv) do. This is, by itself, one of the better reasons to pay for data.
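To make the idea concrete, here is a minimal sketch of a point-in-time lookup. The membership table, its column names, and the dates are all hypothetical stand-ins for what a paid point-in-time source actually provides:

```python
import pandas as pd

# Hypothetical membership table: one row per (ticker, entry, exit) interval.
# Dates are illustrative, not exact index history.
membership = pd.DataFrame({
    "ticker": ["LEH", "WM", "AAPL"],
    "start": pd.to_datetime(["1994-01-03", "1998-06-01", "1982-11-30"]),
    "end":   pd.to_datetime(["2008-09-15", "2008-09-25", "2099-12-31"]),
})

def universe_on(date: str) -> list[str]:
    """Tickers that were actually in the index on `date` -- no foresight."""
    d = pd.Timestamp(date)
    mask = (membership["start"] <= d) & (d <= membership["end"])
    return sorted(membership.loc[mask, "ticker"].tolist())

print(universe_on("2007-06-29"))  # LEH is still in the universe here
print(universe_on("2009-06-30"))  # LEH and WM are gone
```

Backtesting against `universe_on(rebalance_date)` instead of today's member list is the entire fix; the hard part is sourcing the membership intervals, which is what you are paying the data vendor for.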
Look-ahead bias
Look-ahead bias is the subtler cousin. It happens when your strategy at time t uses information that wasn’t actually available at time t. The most common sources:
- Revised fundamentals. The earnings number you see in your database for Q1 2019 is the revised number, published later. Your strategy at the end of Q1 2019 only knew the preliminary number. Using the revised number in a backtest is pretending you had a time machine.
- Close-on-close signals with close execution. If your signal is computed from the close price and you “execute at the close,” you have almost certainly executed at a price nobody could have gotten.
- Bar alignment. Joining intraday data from multiple sources with mismatched timestamps quietly leaks future information into past features. Always lag features by at least one bar and confirm the lag is enforced.
import pandas as pd
# Compute indicator from closes, then LAG by one bar before using it.
df["sma_20"] = df["close"].rolling(20).mean()
df["signal"] = (df["close"] > df["sma_20"]).astype(int)
# At time t, you trade using what you knew at time t-1.
df["position"] = df["signal"].shift(1).fillna(0)
# Return from holding yesterday's decision through today.
df["pnl"] = df["position"] * df["close"].pct_change()
# Audit: does any column depend on a future value?
# Audit: recompute on truncated data; past positions must be unchanged.
head = df.iloc[:-5].copy()
head["sma_20"] = head["close"].rolling(20).mean()
head["position"] = (head["close"] > head["sma_20"]).astype(int).shift(1).fillna(0)
assert head["position"].equals(df["position"].iloc[:-5])

Corporate actions
Stock splits, dividends, mergers, and spinoffs are structural price discontinuities. If you test a strategy on raw prices without adjusting for these, a 2:1 split looks like a 50% overnight loss and your algorithm will “learn” to panic-sell Apple on every split date. Use adjusted close for long-horizon backtests; use raw prices plus a correct event feed when you care about execution realism (because real orders get real dividends, not adjusted ones).
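A toy illustration of why this matters, with made-up prices around a 2:1 split. The back-adjustment convention shown here (divide each price by the total later split factor) is one common choice, not the only one:

```python
import pandas as pd

# Toy series with a 2:1 split between day 2 and day 3 (illustrative prices).
raw = pd.Series([100.0, 102.0, 51.0, 52.0],
                index=pd.date_range("2020-06-01", periods=4))
split_factor = pd.Series([1.0, 1.0, 2.0, 2.0], index=raw.index)  # cumulative

# Back-adjust so the series is continuous in return space.
adjusted = raw * split_factor / split_factor.iloc[-1]

print(raw.pct_change().round(4).tolist())       # raw shows a fake -50% day
print(adjusted.pct_change().round(4).tolist())  # adjusted shows the real move
```

On raw prices the split date looks like a -50% return; on adjusted prices it is flat, which is what actually happened to a holder's wealth (ignoring the dividend side, which needs its own adjustment factor).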
Thing #2 — The market will charge you for every assumption you make about fills
A profitable backtest is a hypothesis about the future of your P&L. A profitable live result is the same hypothesis after the market has charged you for every shortcut you took in the simulator. There are three specific charges, they have names, and — like data problems — you can measure each of them.
Commission & fees
The easiest charge to model, and the one every backtester handles: per-share, per-contract, or basis-point fees, depending on venue. Not where most strategies die. But never zero.
Bid-ask spread
If your backtest prices every trade at the bar’s close, you’re implicitly assuming mid-price execution. Real market orders pay the full spread; real limit orders pay half the spread if they’re lucky and miss the fill entirely if they’re not. A reasonable default for a liquid US equity is ~1 basis point per side. A reasonable default for a mid-cap in a stressed market is 10–50 basis points per side. For crypto altcoins, please do not ask.
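A sketch of how this might enter a fill simulator. The function and its defaults are my own illustration, not any library's API; `spread_bps` is the full quoted spread:

```python
def fill_price(mid: float, side: str, spread_bps: float,
               order_type: str = "market") -> float:
    """Adjust a mid/close price for the spread you actually pay.

    Market orders cross the spread: buys lift the offer, sells hit the bid.
    Limit orders that do fill earn the half-spread instead -- but a backtest
    that assumes they always fill is lying in the other direction.
    """
    half = mid * spread_bps / 2 / 1e4
    if order_type == "limit":
        half = -half  # passive fill earns the half-spread, if it happens
    return mid + half if side == "buy" else mid - half

# Marketable buy: liquid name (1 bp quoted) vs stressed mid-cap (30 bps)
print(fill_price(100.0, "buy", spread_bps=1.0))   # 100.005
print(fill_price(100.0, "buy", spread_bps=30.0))  # 100.15
```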
Slippage and market impact
When you place a market order for 100,000 shares, you do not get the posted bid-ask spread. You walk through the book. The price moves against you as you fill. This is market impact, and it scales — roughly — with the square root of the fraction of daily volume you are trying to consume:

impact ≈ κ · σ_daily · √(Q / V)

where Q is your order size, V is average daily volume, σ_daily is daily volatility, and κ is an empirical constant you calibrate per venue, typically somewhere between 0.1 and 1. For a strategy trading 1% of daily volume in a name with 2% daily volatility, the square-root law with κ in the 0.25–0.5 range predicts 10–20 basis points of impact per round trip. That number matters more than the difference between two mediocre strategies.
Latency
The time between your decision and your fill. For a retail trader over the internet, 100–300 ms is normal. For a colocated HFT shop, it’s measured in microseconds. In those 100 ms, the market moves. Your fill is not your signal price; your fill is your signal price plus E[Δp | τ], the expected adverse move during your latency τ. For slow strategies, this is rounding error. For any strategy reacting to fast news or book imbalance, this is the strategy.
import numpy as np
def trade_cost_bps(
order_size: float,
adv: float, # average daily volume (shares)
daily_vol: float, # daily vol (e.g. 0.02 for 2%)
spread_bps: float = 1.0, # one-way, liquid equities
kappa: float = 0.1, # impact constant; calibrate per venue
latency_adverse_bps: float = 0.5,
) -> float:
"""Round-trip cost estimate in basis points of notional."""
impact_bps = kappa * daily_vol * 1e4 * np.sqrt(order_size / adv)
return 2 * spread_bps + impact_bps + 2 * latency_adverse_bps
# Example: trading 1% of ADV in a 2% daily-vol name
print(trade_cost_bps(order_size=10_000, adv=1_000_000, daily_vol=0.02))
# → ~5 bps round-trip. Compare to your backtested gross return per trade.

Build a cost model that is at worst wrong in the right direction — pessimistic rather than optimistic — and your backtests start telling you something useful.
Thing #3 — Position sizing determines whether you survive
This is the one most introductory material buries. It is, for anyone who actually trades live, the most important.
You can have a genuine edge — a strategy with positive expected return after costs — and still go broke if you size it wrong. You can have a mediocre edge and compound it into a fortune if you size it right. The mathematics that governs this is simple, a hundred years old, and routinely ignored by people who will tell you, with conviction, that their Sharpe of 1.4 is enough.
The Kelly criterion, exactly
For a strategy with excess return μ and volatility σ, the Kelly criterion says the fraction of capital f* that maximizes the long-run logarithm of wealth is:

f* = μ / σ²

For a strategy with excess return 8% and volatility 16% per year, f* = 0.08 / 0.16² ≈ 3.1. You would, under Kelly, run this strategy at 3.1× leverage. Nobody who has actually traded does this. The reason: full Kelly assumes you know μ and σ with certainty. You don’t. You estimated them on a finite sample, and your estimates are wrong by some amount you also do not know.
Why practitioners use fractional Kelly
Because full Kelly at a mis-estimated edge is ruinous. If your true edge is half what you think it is, full Kelly at your estimated fraction earns exactly zero expected log-growth, and any further overestimate turns it negative: you lose money in expectation, despite having a “positive” strategy. The standard defensive move is fractional Kelly: sizing at one-quarter to one-half of the optimal fraction f*.
Half-Kelly gives up about 25% of the expected log-growth rate in exchange for dramatically less sensitivity to estimation error and much shallower drawdowns. Quarter-Kelly gives up more growth for even smoother equity. The right fraction is the one that matches your drawdown tolerance, which is always smaller than you think it is.
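The 25% figure is easy to verify numerically from the expected log-growth rate g(f) = f·μ - (f·σ)²/2:

```python
def log_growth(f: float, mu: float, sigma: float) -> float:
    """Expected log-growth rate of wealth at leverage fraction f."""
    return f * mu - 0.5 * (f * sigma) ** 2

mu, sigma = 0.08, 0.16
f_star = mu / sigma**2                    # full Kelly, ~3.1x here

g_full = log_growth(f_star, mu, sigma)
g_half = log_growth(0.5 * f_star, mu, sigma)
print(g_half / g_full)                    # 0.75: half-Kelly keeps 75% of growth

# Mis-estimation: if the true edge is half what you measured, full Kelly
# at the *estimated* f_star earns zero log-growth (up to float error).
print(log_growth(f_star, mu / 2, sigma))
```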
The risk of ruin
The probability that a fixed-fraction bettor eventually loses all of their capital is shockingly high when leverage is high, even for strategies with positive expected return. For a strategy with win probability p > ½, loss probability q = 1 − p, and symmetric even-money payoffs, the probability of eventual ruin from an initial stake of N units of your bet size is:

P(ruin) = (q / p)^N
The practical takeaway: bet size measured in units of your bankroll is the only variable that matters for survival. Halve your bet size and you double N, which squares your probability of ruin and drives it toward zero. This is why professional traders size small and let compounding do the work.
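A quick numerical check of that squaring behavior, using the even-money model above:

```python
def ruin_probability(p: float, n_units: int) -> float:
    """P(eventual ruin) for even-money bets won with probability p > 1/2,
    starting with n_units bets' worth of bankroll: (q/p)**n."""
    q = 1.0 - p
    return (q / p) ** n_units

p = 0.55                          # a genuine 55/45 edge
print(ruin_probability(p, 10))    # bet 1/10 of bankroll: ~13% chance of ruin
print(ruin_probability(p, 20))    # bet 1/20: ~1.8%, the previous number squared
```

Even with a real edge, betting a tenth of your bankroll per trade carries a double-digit chance of eventually going to zero; betting a twentieth squares that probability away.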
import numpy as np
def half_kelly_size(
mu_annual: float, # expected excess return
sigma_annual: float, # annual volatility
current_dd: float = 0.0, # current drawdown as positive number
dd_tolerance: float = 0.20, # pull risk linearly as we approach this
max_leverage: float = 2.0,
) -> float:
full_kelly = mu_annual / (sigma_annual ** 2)
half = 0.5 * full_kelly
# Pull risk when we're in a drawdown — survive to trade tomorrow.
scale = max(0.0, 1.0 - current_dd / dd_tolerance)
return float(np.clip(half * scale, 0.0, max_leverage))
# Example: μ=8%, σ=16%, flat P&L
print(half_kelly_size(0.08, 0.16)) # → ~1.56
# Same strategy, currently down 10%
print(half_kelly_size(0.08, 0.16, 0.10)) # → ~0.78

The three things, in one sentence each
- Your data is lying: check every dataset for survivorship bias, look-ahead bias, and unadjusted corporate actions before you believe a single backtest number.
- The market charges for every fill assumption: model spread, impact, and latency pessimistically, before you go hunting for alpha.
- Position sizing determines survival: trade a fraction of Kelly, cut risk in drawdowns, and let compounding do the work.
What to do next, in order
If you’re just starting, here’s the sequence that will save you the most time and money:
- Pick one asset class. One. US equities or crypto perps or FX majors. Don’t build a “multi-asset framework” before you’ve traded one strategy in one market.
- Pick one backtester. Vectorized for research, event-driven for validation. See my field guide to backtesting libraries if you want the opinionated comparison.
- Build a cost model first. Before any alpha hunting. If your cost model says a strategy needs 20 bps per trade to be worth running, you will know the moment you find a 3bps edge that you don’t, in fact, have an edge.
- Paper-trade in production for at least a month before any real money. Compare paper-trade results to backtest results on the same dates. If they don’t match within your cost model’s predictions, your cost model is wrong and you’ve just learned something valuable.
- Start at one-tenth of the position size your Kelly math recommends. After three months of live data that matches your expectations, consider scaling up. You will not want to. That’s fine.
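Step 4, the paper-trade reconciliation, can be sketched as a simple per-trade check. `reconcile` is a hypothetical helper and the numbers are purely illustrative:

```python
import numpy as np

def reconcile(backtest_bps: np.ndarray, live_bps: np.ndarray,
              modeled_cost_bps: float) -> str:
    """Compare mean per-trade live results to backtest results.

    If live underperforms the backtest by more than the cost model
    predicts, the cost model (or the data) is wrong.
    """
    gap = float(np.mean(backtest_bps) - np.mean(live_bps))
    if gap <= modeled_cost_bps:
        return f"gap {gap:.1f} bps within modeled {modeled_cost_bps:.1f} bps"
    return f"gap {gap:.1f} bps EXCEEDS modeled {modeled_cost_bps:.1f} bps -- investigate"

# Toy numbers: backtest says 8 bps/trade gross, live shows 4 bps, model says 5.
print(reconcile(np.array([8.0]), np.array([4.0]), modeled_cost_bps=5.0))
```

Run it on matched dates, per strategy; a gap inside the modeled cost is expected friction, a gap outside it is a bug you have not found yet.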
The closing thought
“The amount you wager on each bet is far more important than the edge you have on the bet itself.”
Thorp knew what he was talking about. He built the strategy that beat blackjack, ran one of the best-performing hedge funds of the 20th century, and wrote more honestly about the craft than nearly anyone. He would want you to know, before you lose any money, that the strategy is not the thing. The craft is data you can trust, costs you’ve measured, and size you can sleep at night holding.
If you internalize the three things above, you will make it past the first eighteen months, which is where most algo traders quietly stop. From there, it’s a game of patience, good hygiene, and compounding. Welcome to it. Size small.