Introduction
Market microstructure studies how trading actually happens: how orders interact, how liquidity is supplied, and how prices absorb information in real time. It explains why two markets with the same headline price can offer very different trading conditions once spread, depth, queue position, and price impact matter.
Think of the order book as two lines at a checkout: one side wants to buy, the other wants to sell. The price is set by whoever is at the front of each line. When buyers pile up faster than sellers, the price tends to move up, and the build-up is often visible in the book before the move happens.
The spread is the gap between what buyers offer and what sellers ask. Market makers set it wide when they’re nervous — like a store charging more for something they’re not sure they can restock. A widening spread is an early warning that something is off, before prices visibly react.
Depth is how many people are standing in line behind the front. A long line means a big order won’t shift the price much. A short line means one large trade can send the price flying. A book that looks full at the top but empty underneath is the most dangerous — it feels safe until it isn’t.
Every trade has a buyer and a seller, but one of them pushed: they were in a hurry. Order flow tracks who is pushing and how hard. When a lot of urgent buyers show up at the same time that prices are swinging wildly, being a market maker gets expensive and dangerous.
Every trade nudges the price. Bigger trades nudge more. Impact measures how much price moves per dollar traded. In thin markets that nudge is large, so investors demand higher returns to take the risk of getting in and out. That’s why less liquid assets often have higher expected returns.
In fast markets, milliseconds matter. When quotes are updating furiously but the price barely moves, someone may be flooding the system with orders they never intend to fill in order to slow down competitors. Clustering of trades in time also signals urgency that bar-level averages completely hide.
These metrics form a dependency chain, not a checklist. Volatility sets the regime. Spread and depth define resting liquidity within it. Imbalance and flow describe the pressure being applied. Impact and slippage measure what happens when pressure meets supply. No single number is diagnostic on its own — the signal is always in the combination, and the combination changes with the regime.
The spread is the price of immediacy: what the market maker charges to bear adverse selection risk right now. Huang & Stoll (1997) decomposed it into adverse selection, inventory, and order processing costs, which is why a widening spread is diagnostic, not just expensive. But the average hides the regime change: a one-minute bar with a time-weighted mean spread of 2 bps that was 1 bp for 50 seconds and 7 bps for 10 seconds tells a fundamentally different story than a steady 2 bps. Bollerslev & Melvin (1994) showed spread volatility itself predicts future return volatility, so the instability of the spread matters as much as its level.
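To make the point about averages concrete, here is a minimal sketch that computes a time-weighted mean and standard deviation of the spread within a bar. The function name `spread_summary` and the numbers are illustrative assumptions, chosen so that two very different bars share the same mean.

```python
import numpy as np

def spread_summary(spreads_bps, durations_s):
    """Time-weighted mean and standard deviation of a quoted spread.

    spreads_bps: spread levels observed within the bar (basis points)
    durations_s: how long each level was in force (seconds)
    """
    s = np.asarray(spreads_bps, dtype=float)
    w = np.asarray(durations_s, dtype=float)
    mean = np.average(s, weights=w)
    var = np.average((s - mean) ** 2, weights=w)
    return float(mean), float(np.sqrt(var))

# Illustrative one-minute bars (made-up numbers, not real market data).
steady = spread_summary([2.0], [60])           # 2 bps for the whole minute
bursty = spread_summary([1.0, 7.0], [50, 10])  # 1 bp for 50 s, 7 bps for 10 s

print("steady:", steady)
print("bursty:", bursty)
```

Both bars report the same mean spread, but only the second has a non-zero dispersion, which is the piece of information the mean alone discards.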
Cont, Kukanov & Stoikov (2014) showed that order book imbalance predicts the direction of the next mid-price change, and that the relationship is monotonic: the stronger the lean, the larger the expected move. The mechanism is straightforward. When one side is substantially thicker, the thinner side depletes first and the midpoint shifts. But the real insight is in the layers. L1 (top of book) and L2 (multi-level depth) can tell contradictory stories, and the divergence between them is often more informative than either alone.
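The divergence between top-of-book and deeper levels can be sketched with a simple static depth ratio. This is an assumption made for illustration: the function `imbalance`, its `levels` parameter, and the book sizes are hypothetical, and the ratio is a simpler quantity than the event-based measures used in the academic literature.

```python
def imbalance(bid_sizes, ask_sizes, levels=1):
    """Depth imbalance in [-1, 1] over the first `levels` price levels.

    Positive values mean more resting size on the bid, negative on the ask.
    """
    bid = sum(bid_sizes[:levels])
    ask = sum(ask_sizes[:levels])
    total = bid + ask
    return 0.0 if total == 0 else (bid - ask) / total

# Hypothetical book: heavy bid at the touch, heavy asks underneath.
bids = [5.0, 1.0, 1.0, 1.0, 1.0]
asks = [1.0, 6.0, 6.0, 6.0, 6.0]

top = imbalance(bids, asks, levels=1)   # top of book only
deep = imbalance(bids, asks, levels=5)  # first five levels

print(f"L1 imbalance: {top:+.3f}")
print(f"5-level imbalance: {deep:+.3f}")
```

Here the touch leans toward buyers while the deeper book leans toward sellers, which is the kind of disagreement the paragraph above describes.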
Not all trades carry the same information. Barclay & Warner (1993) found that medium-sized orders — not the largest — account for a disproportionate share of cumulative price impact. Informed traders deliberately size down to avoid detection. This means aggregate volume alone is insufficient: you need to decompose flow by direction, size, and concentration to understand what it is actually telling you.
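Decomposing flow by direction and size can be sketched as follows. The size thresholds ($100 and $1,000 of notional) follow the bucket definitions used by the trade_size dataset on this page; the function names, the tuple layout, and the +1/-1 side convention are assumptions for illustration.

```python
def bucket_of(notional):
    """Classify a trade by notional: under $100, $100 to under $1,000, at least $1,000."""
    if notional < 100:
        return "small"
    if notional < 1000:
        return "medium"
    return "large"

def decompose_flow(trades):
    """Aggregate trades per size bucket.

    trades: iterable of (price, quantity, side), where side is +1 for a
    buyer-initiated trade and -1 for a seller-initiated trade (assumed convention).
    """
    out = {
        name: {"count": 0, "notional": 0.0, "signed_notional": 0.0}
        for name in ("small", "medium", "large")
    }
    for price, quantity, side in trades:
        notional = price * quantity
        bucket = out[bucket_of(notional)]
        bucket["count"] += 1
        bucket["notional"] += notional
        bucket["signed_notional"] += side * notional
    return out

# Made-up trades at a price of 100: notionals of 50, 500, 2000, and 200.
trades = [(100.0, 0.5, 1), (100.0, 5.0, 1), (100.0, 20.0, -1), (100.0, 2.0, -1)]
flow = decompose_flow(trades)
for name, stats in flow.items():
    print(name, stats)
```

Keeping the signed and unsigned notional side by side per bucket makes it possible to see whether directional pressure sits in the medium bucket even when total volume is dominated by large trades.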
Kyle’s lambda and Amihud illiquidity approach the same question from opposite ends: how much does trading move the price? Lambda measures the price change per unit of signed order flow; Amihud measures absolute return per dollar of volume, with no regard to direction. Where they agree, the market is behaving consistently. Where they diverge is where the diagnosis gets interesting: rising Amihud without rising lambda points to exogenous volatility, not informed flow. Rising lambda alone means someone in the order flow knows something the quote does not yet reflect.
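A minimal sketch of both estimators, under common textbook simplifications: lambda is taken as the OLS slope of price change on signed volume, and Amihud as the mean of absolute return divided by dollar volume. The function names and input data are illustrative assumptions, not the definitions used by any particular dataset.

```python
import numpy as np

def kyle_lambda(price_changes, signed_volume):
    """OLS slope (with intercept) of price change on signed volume."""
    x = np.asarray(signed_volume, dtype=float)
    y = np.asarray(price_changes, dtype=float)
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)

def amihud(returns, dollar_volume):
    """Mean of |return| / dollar volume over intervals with positive volume."""
    r = np.abs(np.asarray(returns, dtype=float))
    v = np.asarray(dollar_volume, dtype=float)
    mask = v > 0  # skip empty intervals to avoid division by zero
    return float(np.mean(r[mask] / v[mask]))

# Made-up data: price moves 0.5 per unit of signed volume.
signed_volume = [-2.0, -1.0, 1.0, 2.0]
price_changes = [-1.0, -0.5, 0.5, 1.0]
lam = kyle_lambda(price_changes, signed_volume)

# Made-up data: 1% and 2% absolute returns on $1,000 and $2,000 of volume.
illiq = amihud([0.01, -0.02], [1000.0, 2000.0])

print("lambda:", lam)
print("amihud:", illiq)
```

Lambda uses the sign of the flow while Amihud discards it, which is why the two can diverge when prices move without directional pressure.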
In crypto perpetual futures, funding rate, basis, and open interest are the primary indicators of leverage positioning and directional crowding. When funding, basis, and open interest all point the same direction, the market is building a position that will eventually need to unwind.
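One way to read "all point the same direction" is sketched below. The interpretation is an assumption: long crowding is taken to mean positive funding, positive basis, and rising open interest, and short crowding the mirror image with open interest still rising. The function `crowding_signal` and its inputs are hypothetical.

```python
def crowding_signal(funding_rate, perp_price, index_price, oi_now, oi_prev):
    """Return +1 for crowded longs, -1 for crowded shorts, 0 otherwise.

    Basis is computed as the perpetual's premium over the index price.
    """
    basis = (perp_price - index_price) / index_price
    oi_rising = oi_now > oi_prev
    if funding_rate > 0 and basis > 0 and oi_rising:
        return 1
    if funding_rate < 0 and basis < 0 and oi_rising:
        return -1
    return 0

# Made-up snapshots, not real market data.
long_crowded = crowding_signal(0.0003, 50_100.0, 50_000.0, 1_200.0, 1_000.0)
short_crowded = crowding_signal(-0.0002, 49_900.0, 50_000.0, 1_200.0, 1_000.0)
mixed = crowding_signal(0.0003, 49_900.0, 50_000.0, 1_200.0, 1_000.0)

print(long_crowded, short_crowded, mixed)
```

A three-state flag like this is deliberately coarse; it only marks agreement among the three indicators and says nothing about when an unwind might occur.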
Once the concepts are clear, the next step is to study the actual metric families: order flow composition, top-of-book liquidity, multi-level depth, price impact, and derivatives state variables. Each dataset page includes the academic background, what we measure, and why it matters for research.
/api/v1/data/trade_size
Trades
Requires timestamp=true
Field | Label | Description
small_order_volume | Small Order Volume | Total traded notional from small trades under $100
small_order_count | Small Order Count | Number of trades under $100
medium_order_volume | Medium Order Volume | Total traded notional from medium trades from $100 to under $1,000
medium_order_count | Medium Order Count | Number of trades from $100 to under $1,000
large_order_volume | Large Order Volume | Total traded notional from large trades of at least $1,000
large_order_count | Large Order Count | Number of trades of at least $1,000
volume | Total Volume | Total traded quantity in the interval
n_trades | Trade Count | Number of trades in the interval
small_order_percentage | Small Order % | Ratio of small-trade notional to total traded quantity
medium_order_percentage | Medium Order % | Ratio of medium-trade notional to total traded quantity
large_order_percentage | Large Order % | Ratio of large-trade notional to total traded quantity
small_order_count_percentage | Small Order Count % | Share of trades that were small trades
medium_order_count_percentage | Medium Order Count % | Share of trades that were medium trades
large_order_count_percentage | Large Order Count % | Share of trades that were large trades
trade_amount_mean | Mean Trade Size | Average trade quantity
trade_amount_median | Median Trade Size | Median trade quantity
trade_amount_std | Trade Size Std Dev | Standard deviation of trade quantity
trade_amount_variance | Trade Size Variance | Variance of trade quantity
trade_amount_skewness | Trade Size Skewness | Skewness of the trade-quantity distribution
trade_amount_kurtosis | Trade Size Kurtosis | Kurtosis of the trade-quantity distribution
trade_amount_min | Min Trade Size | Smallest trade quantity in the interval
trade_amount_max | Max Trade Size | Largest trade quantity in the interval
trade_amount_range_ratio | Trade Size Range Ratio | Largest trade quantity divided by smallest trade quantity
trade_amount_cv | Trade Size CV | Coefficient of variation of trade quantity

    from datetime import date
    from aperiodic import get_metrics

    df = get_metrics(
        api_key="YOUR_API_KEY",
        metric="trade_size",
        exchange="binance-futures",
        symbol="perpetual-BTC-USDT:USDT",
        interval="1d",
        start_date=date(2024, 1, 1),
        end_date=date(2024, 3, 1),
    )
    print(df.head())

timestamp (required, string): Timestamp source. 'exchange' uses the exchange-reported timestamp, 'true' uses actual arrival time at our servers. Allowed values: exchange, true
interval (required, string): Aggregation time interval for the data. Allowed values: 1m, 5m, 15m, 30m, 1h, 4h, 1d
exchange (required, string): Source exchange for the data. Allowed values: binance-futures, okx-perps, hyperliquid-perps
symbol (required, string): Trading pair symbol in the format of Atlas' universal symbology: https://github.com/aperiodic-io/atlas
start_date (required, string <date>): Start date for the data range (YYYY-MM-DD format). Data is partitioned by year and month.
end_date (required, string <date>): End date for the data range (YYYY-MM-DD format). Must be greater than or equal to start_date.
Successful response with download URLs for each monthly file
files (object[], required): Array of file information for each month in the requested date range
  year (integer, required): Year of the data file
  month (integer, required): Month of the data file (1-12)
  url (string <uri>, required): Presigned URL for direct file download (valid for 5 minutes). URLs are served from dataset-specific subdomains, e.g. ohlcv.aperiodic.io, trade-metrics.aperiodic.io, l1-metrics.aperiodic.io, l2-metrics.aperiodic.io, derivative-metrics.aperiodic.io.
{
  "files": [
    {
      "year": 2024,
      "month": 1,
      "url": "https://ohlcv.aperiodic.io/binance-futures/1h/BTCUSDT/2024-01.parquet?X-Amz-Expires=300&..."
    },
    {
      "year": 2024,
      "month": 2,
      "url": "https://ohlcv.aperiodic.io/binance-futures/1h/BTCUSDT/2024-02.parquet?X-Amz-Expires=300&..."
    }
  ]
}
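A response of this shape can be turned into an ordered list of monthly download URLs with the standard library alone. This is a minimal sketch: the helper name `monthly_urls` and the placeholder URLs are assumptions, and the download step itself (which would need to happen within the 5-minute validity window of each URL) is left out.

```python
import json

def monthly_urls(response_text):
    """Parse the response body and return (YYYY-MM, url) pairs in date order."""
    payload = json.loads(response_text)
    files = sorted(payload["files"], key=lambda f: (f["year"], f["month"]))
    return [(f"{f['year']:04d}-{f['month']:02d}", f["url"]) for f in files]

# Placeholder body with the same fields as the documented response.
sample = json.dumps({
    "files": [
        {"year": 2024, "month": 2, "url": "https://example.invalid/2024-02.parquet"},
        {"year": 2024, "month": 1, "url": "https://example.invalid/2024-01.parquet"},
    ]
})

for label, url in monthly_urls(sample):
    print(label, url)
```

Sorting by year and month avoids relying on the order in which the files array is returned.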
/api/v1/data/trade_size?timestamp=exchange&interval=1m&exchange=binance-futures
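A request line like the one above can be assembled and checked against the documented allowed values before sending. This is a sketch under assumptions: the helper `build_request_path` is hypothetical, and the host name and authentication are omitted because they are not specified here.

```python
from urllib.parse import urlencode

PATH = "/api/v1/data/trade_size"

# Allowed values as listed in the parameter descriptions on this page.
ALLOWED = {
    "timestamp": {"exchange", "true"},
    "interval": {"1m", "5m", "15m", "30m", "1h", "4h", "1d"},
    "exchange": {"binance-futures", "okx-perps", "hyperliquid-perps"},
}

def build_request_path(**params):
    """Validate enumerated parameters and return the path with a query string."""
    for name, allowed in ALLOWED.items():
        value = params.get(name)
        if value is not None and value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
    return f"{PATH}?{urlencode(params)}"

path = build_request_path(
    timestamp="exchange",
    interval="1m",
    exchange="binance-futures",
)
print(path)
```

Validating locally gives a clear error for a mistyped interval or exchange without spending a request. Note that `urlencode` percent-encodes reserved characters, so a symbol containing a colon would appear encoded in the query string.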