Trade Flow Analysis: How to Calculate VWAP and Detect Whale Trades in Python

Learn to analyze crypto trade data with Python. Calculate VWAP, measure buy/sell pressure, and detect large whale trades using tick-level data.

By CryptoHFTData Team

What you'll learn

  • How to load and process trade data from CryptoHFTData
  • Calculating VWAP (Volume Weighted Average Price)
  • Measuring buy vs sell pressure from trade flow
  • Detecting large "whale" trades

Introduction

Trade data is the raw record of every transaction in a market. Each trade tells you the price, size, and direction—whether a buyer or seller initiated the transaction. This information is essential for understanding market dynamics beyond what candlestick charts can show.

In this tutorial, we'll analyze a full day of BTCUSDT trades from Binance Futures.

Data: BTCUSDT on Binance Futures, August 1, 2025


Loading Trade Data

First, install the SDK and load the data:

pip install cryptohftdata

import cryptohftdata as chd

# Configure with your API key
chd.configure_client(api_key="your_api_key")

# Load trade data
trades = chd.get_trades(
    symbol="BTCUSDT",
    exchange="binance_futures",
    start_date="2025-08-01",
    end_date="2025-08-01",
)

print(f"Loaded {len(trades):,} trades")
# Output: Loaded 4,670,637 trades

That's 4.7 million trades in a single day for BTCUSDT on Binance Futures.


Understanding the Data Structure

Each row represents a single trade execution:

print(trades.columns.tolist())

Key columns:

| Column | Description |
|---|---|
| trade_time | When the trade occurred (Unix timestamp in milliseconds) |
| price | Execution price |
| quantity | Trade size in base currency (BTC) |
| is_buyer_maker | True if the buyer was the passive side |

Understanding is_buyer_maker:

  • is_buyer_maker=True → Seller was the aggressor (sell pressure)
  • is_buyer_maker=False → Buyer was the aggressor (buy pressure)

This tells you who initiated the trade, which is crucial for understanding order flow.
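
To make the mapping concrete, here is a minimal sketch that turns is_buyer_maker into an aggressor side and a signed quantity. The sample trades and the helper name aggressor_side are made up for illustration and are not part of the CryptoHFTData SDK.

```python
def aggressor_side(is_buyer_maker: bool) -> str:
    """Return the side that took liquidity (the aggressor)."""
    # If the buyer was the resting (maker) order, the seller initiated the trade.
    return "sell" if is_buyer_maker else "buy"


# Hypothetical trades: (price, quantity in BTC, is_buyer_maker)
sample_trades = [
    (115000.0, 0.50, False),
    (114999.5, 0.20, True),
    (115000.5, 0.10, False),
]

sides = [aggressor_side(maker) for _, _, maker in sample_trades]

# Positive quantity for aggressive buys, negative for aggressive sells
signed_qty = [
    qty if side == "buy" else -qty
    for (_, qty, _), side in zip(sample_trades, sides)
]
net_qty = sum(signed_qty)

print(sides)              # ['buy', 'sell', 'buy']
print(round(net_qty, 2))  # 0.4
```

A positive net quantity means aggressive buyers traded more size than aggressive sellers in the sample.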


Basic Statistics

import pandas as pd

# Convert types
trades["price"] = pd.to_numeric(trades["price"])
trades["quantity"] = pd.to_numeric(trades["quantity"])
trades["volume_usd"] = trades["price"] * trades["quantity"]
trades["side"] = trades["is_buyer_maker"].map({True: "sell", False: "buy"})

# Total volume
total_btc = trades["quantity"].sum()
total_usd = trades["volume_usd"].sum()
print(f"Total volume: {total_btc:,.0f} BTC (${total_usd/1e9:.1f}B)")
# Output: Total volume: 221,246 BTC ($25.4B)

That is more than $25 billion in notional volume in a single day, which helps explain why futures markets are such an important venue for large traders.


Calculating VWAP

VWAP (Volume Weighted Average Price) is the average price weighted by volume. It's widely used by institutions as a benchmark for execution quality.

VWAP = Σ(price × volume) / Σ(volume)

def calculate_vwap(df, interval="1h"):
    """Calculate VWAP over fixed time intervals."""
    # Assign each trade to the start of its interval (e.g. the top of the hour)
    bucket = pd.to_datetime(df["trade_time"], unit="ms").dt.floor(interval)

    # Sum of price × volume per interval
    pv_sum = (df["price"] * df["quantity"]).groupby(bucket).sum()

    # Sum of volume per interval
    vol_sum = df["quantity"].groupby(bucket).sum()

    # Intervals without any trades do not appear in the result
    return pv_sum / vol_sum

vwap_hourly = calculate_vwap(trades, "1h")
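
As a quick sanity check on the formula, the following dependency-free sketch computes VWAP for three hypothetical fills and compares it with the unweighted mean price. The numbers are invented for illustration.

```python
# Hypothetical fills: (price, quantity)
fills = [(100.0, 1.0), (101.0, 3.0), (104.0, 1.0)]

pv_sum = sum(price * qty for price, qty in fills)  # sum of price × volume
vol_sum = sum(qty for _, qty in fills)             # sum of volume
vwap = pv_sum / vol_sum

# Unweighted mean, for comparison
simple_mean = sum(price for price, _ in fills) / len(fills)

print(round(vwap, 2))         # 101.4
print(round(simple_mean, 2))  # 101.67
```

VWAP sits closer to 101 than the simple mean does because most of the volume traded at that price.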

Sample VWAP vs Close comparison (hourly):

| Hour | VWAP | Close | Diff (bps) |
|---|---|---|---|
| 00:00 | $115,080.50 | $115,366.90 | +24.9 |
| 01:00 | $115,069.85 | $115,270.20 | +17.4 |
| 02:00 | $115,650.72 | $115,908.80 | +22.3 |
| 03:00 | $115,803.96 | $115,595.70 | -18.0 |
| 04:00 | $115,458.22 | $115,648.30 | +16.5 |

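When the close is above VWAP, price finished the hour above its volume-weighted average, which is usually read as buyers gaining the upper hand late in the period. When the close is below VWAP, sellers had the upper hand.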
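
The Diff column is consistent with the close's distance from VWAP expressed in basis points, that is (close - VWAP) / VWAP × 10,000. The helper name diff_bps below is hypothetical. The inputs are taken from the 00:00 and 03:00 rows of the table.

```python
def diff_bps(close: float, vwap: float) -> float:
    """Distance of the close from VWAP in basis points (1 bp = 0.01%)."""
    return (close - vwap) / vwap * 10_000


print(round(diff_bps(115366.90, 115080.50), 1))  # 24.9
print(round(diff_bps(115595.70, 115803.96), 1))  # -18.0
```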

Cumulative VWAP

For execution benchmarking, traders track cumulative VWAP throughout the day:

trades["cum_pv"] = (trades["price"] * trades["quantity"]).cumsum()
trades["cum_vol"] = trades["quantity"].cumsum()
trades["cum_vwap"] = trades["cum_pv"] / trades["cum_vol"]
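
A cumulative sum only yields a meaningful running VWAP if the trades are in time order. The sketch below shows the same calculation on three hypothetical trades using only the standard library. It sorts by timestamp first, on the assumption that input order is not guaranteed.

```python
from itertools import accumulate

# Hypothetical trades as (trade_time_ms, price, quantity), deliberately unsorted
raw = [(3, 102.0, 1.0), (1, 100.0, 2.0), (2, 101.0, 1.0)]

# Sort by timestamp so the running sums follow the trading session
ordered = sorted(raw, key=lambda trade: trade[0])

cum_pv = list(accumulate(price * qty for _, price, qty in ordered))
cum_vol = list(accumulate(qty for _, _, qty in ordered))
cum_vwap = [pv / vol for pv, vol in zip(cum_pv, cum_vol)]

print([round(v, 2) for v in cum_vwap])  # [100.0, 100.33, 100.75]
```

With the DataFrame version, the equivalent precaution would be to sort by trade_time before taking the cumulative sums.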

[Figure: VWAP Analysis]

The chart shows price (blue) and cumulative VWAP (orange). Many traders treat VWAP as a dynamic reference level: price holding above VWAP is commonly read as buyers being in control, and price below it as sellers dominating.


Buy vs Sell Pressure

The is_buyer_maker field reveals the aggressor side of each trade. Aggregating this over time shows where buying or selling pressure is concentrated.

# Calculate per-minute pressure
trades["datetime"] = pd.to_datetime(trades["trade_time"], unit="ms")
df_indexed = trades.set_index("datetime")

buy_vol = df_indexed[df_indexed["side"] == "buy"]["volume_usd"].resample("1min").sum()
sell_vol = df_indexed[df_indexed["side"] == "sell"]["volume_usd"].resample("1min").sum()

# Cumulative delta (buy - sell); fill_value=0 keeps minutes where only one side traded
cumulative_delta = buy_vol.sub(sell_vol, fill_value=0).cumsum()
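
To see what cumulative delta looks like on a small scale, here is a dependency-free sketch over five hypothetical trades. It buckets signed USD volume by minute, fills minutes with no trades with zero, and takes the running total.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical trades: (trade_time_ms, volume_usd, aggressor side)
sample = [
    (0, 1000.0, "buy"),
    (30_000, 400.0, "sell"),
    (60_000, 300.0, "buy"),
    (65_000, 900.0, "sell"),
    (180_000, 200.0, "buy"),
]

# Net (buy - sell) USD volume per minute bucket
delta_per_min = defaultdict(float)
for ts_ms, usd, side in sample:
    minute = ts_ms // 60_000
    delta_per_min[minute] += usd if side == "buy" else -usd

# Walk every minute in the range so that quiet minutes count as zero
minutes = range(min(delta_per_min), max(delta_per_min) + 1)
deltas = [delta_per_min.get(m, 0.0) for m in minutes]
cumulative_delta = list(accumulate(deltas))

print(deltas)            # [600.0, -600.0, 0.0, 200.0]
print(cumulative_delta)  # [600.0, 0.0, 0.0, 200.0]
```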

Overall pressure for 2025-08-01:

| Metric | Value |
|---|---|
| Buy volume | $12.2B (48.1%) |
| Sell volume | $13.2B (51.9%) |
| Net flow | -$965M |
| Avg buy trade | $5,274 |
| Avg sell trade | $5,587 |

Sellers were slightly more aggressive on this day: taker sell volume exceeded taker buy volume by roughly $965 million.

[Figure: Buy/Sell Pressure]

The top panel shows buy (green) vs sell (red) volume over time. The bottom panel shows cumulative delta, the running total of buy volume minus sell volume. A declining cumulative delta indicates persistent sell pressure.


Detecting Whale Trades

Large trades often signal institutional activity or significant market events. However, there's an important nuance: a single large market order may be split into multiple trade events.

When a large market order sweeps through multiple price levels in the order book, each fill at each price level is recorded as a separate trade. These trades share the same event_time timestamp. To measure order sizes more accurately, we group trades by event_time and side. This is an approximation: unrelated orders on the same side that happen to share a timestamp are merged as well.

Grouping Trades by Event Time

def group_trades_by_event_time(df):
    """
    Group trades by event_time to reconstruct original market orders.
    A large market order hitting multiple price levels creates multiple
    trade events with the same event_time.
    """
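    grouped = df.groupby(["event_time", "side"]).agg({
        "price": "mean",       # unweighted mean of the fill prices
        "quantity": "sum",     # total order size in BTC
        "volume_usd": "sum",   # total notional; volume_usd / quantity gives a size-weighted price
        "datetime": "first",   # timestamp of the first fill
    }).reset_index()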
    return grouped

# Group 4.7M trades into ~1.2M orders
orders = group_trades_by_event_time(trades)
print(f"Grouped {len(trades):,} trades into {len(orders):,} orders")
# Output: Grouped 4,670,637 trades into 1,187,028 orders
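
The effect of grouping is easiest to see on a handful of fills. The sketch below uses invented data and only the standard library to merge fills that share an event_time and side. Unlike the function above, which takes the plain mean of fill prices, it derives a size-weighted average price from the totals.

```python
from collections import defaultdict

# Hypothetical fills: (event_time_ms, side, price, quantity)
fills = [
    (1000, "sell", 100.0, 2.0),
    (1000, "sell", 99.0, 1.0),
    (1000, "sell", 98.0, 1.0),
    (1005, "buy", 100.5, 0.5),
]

# Accumulate totals per (event_time, side) key
orders = defaultdict(lambda: {"quantity": 0.0, "volume_usd": 0.0, "fills": 0})
for event_time, side, price, qty in fills:
    order = orders[(event_time, side)]
    order["quantity"] += qty
    order["volume_usd"] += price * qty
    order["fills"] += 1

# Size-weighted average price of each reconstructed order
for order in orders.values():
    order["avg_price"] = order["volume_usd"] / order["quantity"]

sweep = orders[(1000, "sell")]
print(len(orders))                                                # 2
print(sweep["fills"], sweep["quantity"], sweep["avg_price"])      # 3 4.0 99.25
```

The three sell fills collapse into one 4 BTC order at an average of 99.25, whereas the unweighted mean of the three fill prices would be 99.0.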

Now we can find whale orders using the top 0.1%:

threshold = orders["volume_usd"].quantile(0.999)
print(f"Large order threshold: ${threshold:,.0f}")

whale_orders = orders[orders["volume_usd"] >= threshold]
print(f"Whale orders found: {len(whale_orders):,}")
# Output:
# Large order threshold: $1,636,116
# Whale orders found: 1,188
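
For intuition about how a 0.999 quantile picks the threshold, here is a small standard-library sketch. The helper quantile_linear is hypothetical. It interpolates linearly between the two nearest ranks, which is the default behavior of the pandas quantile method, and the order values are synthetic.

```python
import math


def quantile_linear(values, q):
    """Quantile using linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q  # fractional rank of the quantile
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Synthetic order values: 1.0, 2.0, ..., 1001.0
order_values = [float(v) for v in range(1, 1002)]

threshold = quantile_linear(order_values, 0.999)
whales = [v for v in order_values if v >= threshold]

print(round(threshold, 2))  # 1000.0
print(len(whales))          # 2
```

Out of 1,001 synthetic orders, only the two largest clear the threshold, which matches the roughly one-in-a-thousand ratio seen above (1,188 of 1,187,028 orders).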

With grouping applied, the top 0.1% threshold is $1.64 million. Orders of this size most likely come from institutions or other very large participants.

Top 10 Largest Orders

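| Time | Avg Price | Quantity | Value | Side |
|---|---|---|---|---|
| 07:32:56 | $114,800 | 180.52 BTC | $20.7M | Sell |
| 12:46:27 | $115,780 | 89.72 BTC | $10.4M | Buy |
| 16:17:25 | $115,560 | 89.21 BTC | $10.3M | Buy |
| 17:31:01 | $114,835 | 88.69 BTC | $10.2M | Buy |
| 22:34:02 | $113,112 | 88.03 BTC | $10.0M | Sell |
| 12:46:27 | $115,777 | 65.69 BTC | $7.6M | Buy |
| 00:52:23 | $114,330 | 61.71 BTC | $7.1M | Buy |
| 22:43:08 | $112,907 | 62.06 BTC | $7.0M | Sell |
| 15:38:24 | $115,011 | 56.59 BTC | $6.5M | Buy |
| 20:17:12 | $113,184 | 54.26 BTC | $6.1M | Buy |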

The largest order of the day was a $20.7 million sell at 07:32 UTC. Without grouping by event_time, it would have appeared as multiple smaller trades, even though the shared timestamp suggests a single large order sweeping through the book.

Whale Order Summary

whale_buys = whale_orders[whale_orders["side"] == "buy"]["volume_usd"].sum()
whale_sells = whale_orders[whale_orders["side"] == "sell"]["volume_usd"].sum()

print(f"Whale buy volume:  ${whale_buys/1e9:.2f}B")
print(f"Whale sell volume: ${whale_sells/1e9:.2f}B")
print(f"Whale net flow:    ${(whale_buys - whale_sells)/1e6:.0f}M")
# Output:
# Whale buy volume:  $1.39B
# Whale sell volume: $1.73B
# Whale net flow:    $-339M

The top 0.1% of orders (1,188 whale orders) accounted for $3.13 billion in volume, about 12% of the day's total. Whales were net sellers by $339 million.

[Figure: Whale Trades]

The chart shows price with whale orders (top 0.1%, grouped by event_time) overlaid. Green triangles are large buys, red triangles are large sells. Marker size is proportional to order value. Whale activity tends to cluster around key price levels and during periods of high volatility.


Key Takeaways

  1. Volume: BTCUSDT on Binance Futures saw 4.7M trades and $25.4B in volume on the day analyzed
  2. VWAP: A key execution benchmark; price relative to VWAP hints at whether buyers or sellers are in control
  3. Order flow: The is_buyer_maker field reveals who initiated each trade
  4. Trade grouping: Large orders split across price levels share the same event_time, so group by it to reconstruct order sizes
  5. Whale detection: Top 0.1% orders (>$1.6M) point to very large, likely institutional, participants and accounted for 12% of the day's volume

Get Started

pip install cryptohftdata

import cryptohftdata as chd

chd.configure_client(api_key="your_api_key")

trades = chd.get_trades(
    symbol="BTCUSDT",
    exchange="binance_futures",
    start_date="2025-08-01",
    end_date="2025-08-01",
)

Get your API key at cryptohftdata.com.
