Women's basketball with sportsdataverse-py
Welcome! In just a few lines of Python you're about to pull WNBA teams, rosters, schedules, play-by-play, season stats, standings and the draft, all as tidy polars DataFrames that are ready to model.
sportsdataverse.wnba leads with ESPN's rich public API (the espn_wnba_* family) and tops it off with load_wnba_* parquet loaders that hand you whole seasons in one shot. No API key needed.
If you've used the R package wehoop, these names will feel right at home. Let's go hoop!
The toolbox
Most accessors return a tidy polars DataFrame by default (espn_wnba_pbp and espn_wnba_team_stats return dicts, as shown later). Pass return_as_pandas=True for pandas, or raw=True (where supported) for the untouched ESPN JSON. Here's the whole kit:
| Function | What it gives you | Source |
|---|---|---|
| espn_wnba_teams | One row per franchise (grab team_ids) | ESPN |
| espn_wnba_team_roster | A team's active roster for a season | ESPN |
| espn_wnba_schedule | Games + results for a date or date range | ESPN |
| espn_wnba_pbp | Event-level play-by-play for one game | ESPN |
| espn_wnba_player_stats | A player's season stat line (wide) | ESPN |
| espn_wnba_team_stats | A team's season stats (Averages/Totals/Misc) | ESPN |
| espn_wnba_standings | League standings, one row per team | ESPN |
| espn_wnba_draft | Every draft pick for a season | ESPN |
| espn_wnba_game_officials | The refs who worked a game | ESPN |
| load_wnba_schedule | Whole-season schedule (parquet release) | loader |
| load_wnba_player_boxscore | Whole-season player box scores | loader |
| load_wnba_team_boxscore | Whole-season team box scores | loader |
| load_wnba_player_season_stats | Season-aggregated player stats | loader |
| load_wnba_pbp | Whole-season play-by-play | loader |
| load_wnba_shots | Shot-location data | loader |
| load_wnba_standings | Whole-season standings (long) | loader |
| load_wnba_rosters | Whole-season rosters | loader |
| load_wnba_draft | Whole-season draft picks | loader |
| most_recent_wnba_season | The latest season year | helper |

Source key: ESPN = the live ESPN API · loader = bulk parquet loaders · helper = helpers.
Setup
pip install sportsdataverse
That's it: the ESPN endpoints are public, so there's nothing to configure.
import polars as pl
import sportsdataverse as sdv
import sportsdataverse.wnba as wnba
SEASON = 2024 # a complete season, so every cell has data to show
print('most recent WNBA season:', wnba.most_recent_wnba_season())
most recent WNBA season: 2026
ESPN's live endpoints are seasonal and occasionally rate-limited, so a small safe() helper runs each risky call defensively: you get the frame when the feed is up, and a friendly one-liner instead of a traceback when it isn't. The load_wnba_* loaders read static parquet releases and are far less likely to fail, so we let those run bare.
def safe(label, thunk):
    """Run a live call; print a one-liner instead of raising on failure."""
    try:
        out = thunk()
        print(f'✅ {label}')
        return out
    except Exception as e:  # noqa: BLE001 -- demo resilience
        print(f'⚠️ {label}: unavailable right now ({type(e).__name__})')
        return None
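Because the live endpoints are occasionally rate-limited, a failed call is sometimes worth a second try. The sketch below is a hypothetical variant of safe(): the name safe_retry and the attempts / base_delay parameters are illustrative assumptions, not part of sportsdataverse. It retries with exponential backoff before giving up.

```python
import time


def safe_retry(label, thunk, attempts=3, base_delay=1.0):
    """Call thunk() up to `attempts` times, sleeping longer after each failure.

    Hypothetical helper for illustration; not part of sportsdataverse.
    Returns the result of thunk(), or None if every attempt raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            out = thunk()
            print(f'ok: {label} (attempt {attempt})')
            return out
        except Exception as e:  # noqa: BLE001 -- demo resilience
            print(f'retry: {label} failed with {type(e).__name__}')
            if attempt < attempts:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                time.sleep(base_delay * 2 ** (attempt - 1))
    return None
```

It takes the same label and zero-argument callable as safe(), so it could be swapped in wherever a live call proves flaky.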
Teams
espn_wnba_teams returns one row per franchise. The team_id, location, name and abbreviation are the keys you'll reuse to fetch rosters, schedules and stats.
teams = safe('WNBA teams', wnba.espn_wnba_teams)
print('shape:', None if teams is None else teams.shape)
(teams.select(['team_id', 'team_location', 'team_name',
               'team_abbreviation', 'team_display_name']).head(15)
 if teams is not None else 'teams unavailable')
✅ WNBA teams
shape: (15, 14)
shape: (15, 5)

| team_id (str) | team_location (str) | team_name (str) | team_abbreviation (str) | team_display_name (str) |
|---|---|---|---|---|
| 20 | Atlanta | Dream | ATL | Atlanta Dream |
| 19 | Chicago | Sky | CHI | Chicago Sky |
| 18 | Connecticut | Sun | CON | Connecticut Sun |
| 3 | Dallas | Wings | DAL | Dallas Wings |
| 129689 | Golden State | Valkyries | GS | Golden State Valkyries |
| … | … | … | … | … |
| 11 | Phoenix | Mercury | PHX | Phoenix Mercury |
| 132052 | Portland | Fire | POR | Portland Fire |
| 14 | Seattle | Storm | SEA | Seattle Storm |
| 131935 | Toronto | Tempo | TOR | Toronto Tempo |
| 16 | Washington | Mystics | WSH | Washington Mystics |
Team roster: Las Vegas Aces
espn_wnba_team_roster lists active players for one team in a season. The back-to-back champion Aces are team_id=17. Player columns are unprefixed (athlete_id, full_name, jersey, position_abbreviation).
aces = safe('Aces roster', lambda: wnba.espn_wnba_team_roster(team_id=17, season=SEASON))
(aces.select(['athlete_id', 'full_name', 'jersey',
              'position_abbreviation', 'display_height', 'age']).head(12)
 if aces is not None else 'roster unavailable')
✅ Aces roster
shape: (12, 6)

| athlete_id (str) | full_name (str) | jersey (str) | position_abbreviation (str) | display_height (str) | age (i64) |
|---|---|---|---|---|---|
| 4565501 | Janiah Barker | 2 | F | 6' 4" | 22 |
| 4433633 | Kierstan Bell | 1 | F | 6' 1" | 26 |
| 4280892 | Chennedy Carter | 23 | G | 5' 9" | 27 |
| 4281190 | Dana Evans | 11 | G | 5' 6" | 27 |
| 2529122 | Chelsea Gray | 12 | G | 5' 11" | 33 |
| … | … | … | … | … | … |
| 4398776 | NaLyssa Smith | 3 | F | 6' 4" | 25 |
| 3099736 | Stephanie Talbot | 7 | F | 6' 2" | 32 |
| 3142086 | Brianna Turner | 21 | F | 6' 3" | 29 |
| 3149391 | A'ja Wilson | 22 | C | 6' 4" | 29 |
| 4065870 | Jackie Young | 0 | G | 6' 0" | 28 |
Schedule
espn_wnba_schedule takes dates=YYYYMMDD for a single day, or a 'YYYYMMDD-YYYYMMDD' string for a range. Team-name columns are home_display_name / away_display_name, and home_score / away_score come back as strings, so cast them before doing arithmetic.
The range below (Oct 16-20, 2024) is the back half of the 2024 WNBA Finals. Let's cast the scores and derive a winning margin to show a small polars transform.
finals = safe('2024 Finals schedule',
              lambda: wnba.espn_wnba_schedule(dates='20241016-20241020'))
if finals is not None and finals.height:
    out = (finals
           .select(['id', 'home_display_name', 'away_display_name',
                    'home_score', 'away_score', 'status_type_description'])
           .with_columns([
               pl.col('home_score').cast(pl.Int64, strict=False).alias('home_pts'),
               pl.col('away_score').cast(pl.Int64, strict=False).alias('away_pts'),
           ])
           .with_columns((pl.col('home_pts') - pl.col('away_pts')).abs().alias('margin')))
else:
    out = 'schedule unavailable'
out
✅ 2024 Finals schedule
shape: (3, 9)

| id (str) | home_display_name (str) | away_display_name (str) | home_score (str) | … | status_type_description (str) | home_pts (i64) | away_pts (i64) | margin (i64) |
|---|---|---|---|---|---|---|---|---|
| 401726990 | Minnesota Lynx | New York Liberty | 77 | … | Final | 77 | 80 | 3 |
| 401726991 | Minnesota Lynx | New York Liberty | 82 | … | Final | 82 | 80 | 2 |
| 401726992 | New York Liberty | Minnesota Lynx | 67 | … | Final | 67 | 62 | 5 |
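The 'YYYYMMDD-YYYYMMDD' range string that espn_wnba_schedule expects is easy to mistype by hand. A minimal sketch, assuming you start from datetime.date objects; espn_date_range is a hypothetical helper name, not part of sportsdataverse:

```python
from datetime import date


def espn_date_range(start, end):
    """Format two dates as a 'YYYYMMDD-YYYYMMDD' range string."""
    if end < start:
        raise ValueError('end must not be earlier than start')
    # date objects accept strftime codes inside an f-string format spec
    return f'{start:%Y%m%d}-{end:%Y%m%d}'


finals_window = espn_date_range(date(2024, 10, 16), date(2024, 10, 20))
print(finals_window)  # 20241016-20241020
```

The resulting string can be passed as the dates argument in place of the literal used in the schedule call.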
Play-by-play: 2024 Finals Game 5
espn_wnba_pbp returns a dict of component pieces (plays, boxscore, header, winprobability, …). The plays entry is a list of raw ESPN dicts; build a frame with pl.DataFrame(..., infer_schema_length=None). Its columns use raw dot-notation (period.number, clock.displayValue, scoringPlay, type.text).
pbp = safe('Game 5 pbp', lambda: wnba.espn_wnba_pbp(game_id=401726992))
print('dict keys:', list(pbp.keys())[:8] if pbp is not None else None)
if pbp is not None and pbp.get('plays'):
    plays = pl.DataFrame(pbp['plays'], infer_schema_length=None)
    out = plays.select(['period.number', 'clock.displayValue',
                        'type.text', 'text', 'scoringPlay']).head(10)
else:
    plays = None
    out = 'pbp unavailable'
out
✅ Game 5 pbp
dict keys: ['gameId', 'plays', 'winprobability', 'boxscore', 'header', 'format', 'broadcasts', 'videos']
shape: (10, 5)

| period.number (i64) | clock.displayValue (str) | type.text (str) | text (str) | scoringPlay (bool) |
|---|---|---|---|---|
| 1 | 10:00 | Jumpball | Napheesa Collier vs. Jonquel J… | false |
| 1 | 9:35 | Cutting Layup Shot | Napheesa Collier makes 3-foot … | true |
| 1 | 9:12 | Pullup Jump Shot | Sabrina Ionescu misses 24-foot… | false |
| 1 | 9:09 | Defensive Rebound | Bridget Carleton defensive reb… | false |
| 1 | 8:55 | Personal Foul | Betnijah Laney-Hamilton person… | false |
| 1 | 8:50 | Out of Bounds - Lost Ball Turn… | Courtney Williams out of bound… | false |
| 1 | 8:31 | Jump Shot | Breanna Stewart misses 13-foot… | false |
| 1 | 8:28 | Defensive Rebound | Napheesa Collier defensive reb… | false |
| 1 | 8:07 | Cutting Layup Shot | Napheesa Collier makes 3-foot … | true |
| 1 | 8:00 | Lost Ball Turnover | Breanna Stewart lost ball turn… | false |
Filter to scoring plays only to watch the lead change down the stretch.
(plays
 .filter(pl.col('scoringPlay'))
 .select(['period.number', 'clock.displayValue', 'homeScore', 'awayScore', 'text'])
 .tail(8)
 if plays is not None else 'pbp unavailable')
shape: (8, 5)

| period.number (i64) | clock.displayValue (str) | homeScore (i64) | awayScore (i64) | text (str) |
|---|---|---|---|---|
| 4 | 0:5.0 | 59 | 60 | Breanna Stewart makes free thr… |
| 4 | 0:5.0 | 60 | 60 | Breanna Stewart makes free thr… |
| 5 | 4:52 | 63 | 60 | Leonie Fiebich makes 23-foot t… |
| 5 | 3:14 | 65 | 60 | Nyara Sabally makes two point … |
| 5 | 1:51 | 65 | 61 | Kayla McBride makes free throw… |
| 5 | 1:51 | 65 | 62 | Kayla McBride makes free throw… |
| 5 | 0:10.0 | 66 | 62 | Breanna Stewart makes free thr… |
| 5 | 0:10.0 | 67 | 62 | Breanna Stewart makes free thr… |
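The clock.displayValue column is a string, and the output above mixes 'M:SS' values (9:35) with tenths-of-a-second values late in a period (0:5.0, 0:10.0). A minimal sketch of turning either form into seconds remaining in the period; clock_to_seconds is a hypothetical helper, and it assumes only the two formats seen in this output:

```python
def clock_to_seconds(clock):
    """Convert an 'M:SS' or 'M:S.s' clock string to seconds remaining."""
    minutes, seconds = clock.split(':')
    # float() handles both '35' and '5.0'
    return int(minutes) * 60 + float(seconds)


for sample in ['10:00', '9:35', '0:10.0', '0:5.0']:
    print(sample, '->', clock_to_seconds(sample))
```

With a numeric clock in hand, plays can be sorted or filtered by time remaining instead of by string comparison.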
Player season stats: Caitlin Clark
espn_wnba_player_stats returns a single wide row covering ESPN's general / offensive / defensive stat groups (averages and totals). The 2024 Rookie of the Year, Caitlin Clark, is athlete_id=4433403. Pass total=True for season totals instead of per-game averages.
cc = safe('Caitlin Clark stats',
          lambda: wnba.espn_wnba_player_stats(athlete_id=4433403, season=SEASON))
(cc.select(['full_name', 'team_abbreviation', 'general_games_played',
            'offensive_avg_points', 'offensive_avg_assists',
            'general_avg_rebounds', 'offensive_three_point_field_goal_pct'])
 if cc is not None else 'player stats unavailable')
✅ Caitlin Clark stats
shape: (1, 7)

| full_name (str) | team_abbreviation (str) | general_games_played (f64) | offensive_avg_points (f64) | offensive_avg_assists (f64) | general_avg_rebounds (f64) | offensive_three_point_field_go… (f64) |
|---|---|---|---|---|---|---|
| Caitlin Clark | IND | 40.0 | 19.225 | 8.425 | 5.675 | 34.366196 |
Team season stats
espn_wnba_team_stats returns a dict keyed by category: {'Averages', 'Totals', 'Misc'}. Each value is a long frame of stat_name / display_value rows, so index into the dict rather than calling .head() on the return directly.
aces_stats = safe('Aces team stats',
                  lambda: wnba.espn_wnba_team_stats(team_id=17, season=SEASON))
print('categories:', list(aces_stats.keys()) if aces_stats is not None else None)
(aces_stats['Averages'].select(['stat_name', 'abbreviation', 'display_value']).head(10)
 if aces_stats is not None else 'team stats unavailable')
✅ Aces team stats
categories: ['Averages', 'Totals', 'Misc']
shape: (8, 3)

| stat_name (str) | abbreviation (str) | display_value (str) |
|---|---|---|
| Rebounds Per Game | REB | 34.1 |
| Assist To Turnover Ratio | AST/TO | 1.9 |
| Fouls Per Game | PF | 16.5 |
| Games Played | GP | 40 |
| Games Started | GS | 0 |
| Minutes Per Game | MIN | 0.0 |
| Rebounds | REB | 1364 |
| Rebounds | REB | 1364 |
Cookbook: common WNBA tasks
Now for the fun part. These twelve recipes cover everyday tasks you'll reach for constantly; each blends an ESPN call (or a parquet loader) with a few polars expressions. All are runnable Python. The ESPN-backed recipes go through safe(); the loader-backed ones read static releases and run bare.
Recipe 1: Standings table
espn_wnba_standings gives one row per team with wins, losses, win percentage and point differential. Sort by win percentage to get the playoff picture.
standings = safe('2024 standings', lambda: wnba.espn_wnba_standings(season=SEASON))
(standings
 .select(['team_display_name', 'wins', 'losses', 'win_percent', 'point_differential'])
 .sort('win_percent', descending=True)
 .head(8)
 if standings is not None else 'standings unavailable')
✅ 2024 standings
shape: (8, 5)

| team_display_name (str) | wins (i64) | losses (i64) | win_percent (f64) | point_differential (f64) |
|---|---|---|---|---|
| New York Liberty | 32 | 8 | 0.8 | 366.0 |
| Minnesota Lynx | 30 | 10 | 0.75 | 255.0 |
| Connecticut Sun | 28 | 12 | 0.7 | 260.0 |
| Las Vegas Aces | 27 | 13 | 0.675 | 219.0 |
| Seattle Storm | 25 | 15 | 0.625 | 179.0 |
| Indiana Fever | 20 | 20 | 0.5 | -107.0 |
| Phoenix Mercury | 19 | 21 | 0.475 | -132.0 |
| Atlanta Dream | 15 | 25 | 0.375 | -110.0 |
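A standings table is often read alongside a games-behind column, which the frame above does not include. The sketch below applies the usual formula, ((leader wins - wins) + (losses - leader losses)) / 2, to four records copied from the output above. games_behind is a hypothetical helper written in plain Python rather than polars so the arithmetic is easy to follow.

```python
def games_behind(leader_wins, leader_losses, wins, losses):
    """Games behind the leader: half the combined gap in wins and losses."""
    return ((leader_wins - wins) + (losses - leader_losses)) / 2


# (wins, losses) copied from the standings output above
records = {
    'New York Liberty': (32, 8),
    'Minnesota Lynx': (30, 10),
    'Connecticut Sun': (28, 12),
    'Las Vegas Aces': (27, 13),
}

leader = records['New York Liberty']
gb = {team: games_behind(*leader, w, l) for team, (w, l) in records.items()}
for team, value in gb.items():
    print(f'{team}: {value}')
```

The same expression could be written as a polars with_columns step once the leader's wins and losses are known.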
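Recipe 2: Draft board
espn_wnba_draft lists every pick for a season. The 2024 draft was headlined by Caitlin Clark going first overall to the Indiana Fever. In the sample output, team_display_name and athlete_position_abbreviation come back null, and school_name holds the team nickname rather than the school itself.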
draft = safe('2024 draft', lambda: wnba.espn_wnba_draft(season=SEASON))
(draft.select(['overall_pick', 'team_display_name', 'athlete_display_name',
               'athlete_position_abbreviation', 'school_name']).head(10)
 if draft is not None else 'draft unavailable')
✅ 2024 draft
shape: (10, 5)

| overall_pick (i64) | team_display_name (str) | athlete_display_name (str) | athlete_position_abbreviation (str) | school_name (str) |
|---|---|---|---|---|
| 1 | null | Caitlin Clark | null | Hawkeyes |
| 2 | null | Cameron Brink | null | Cardinal |
| 3 | null | Kamilla Cardoso | null | Gamecocks |
| 4 | null | Rickea Jackson | null | Lady Volunteers |
| 5 | null | Jacy Sheldon | null | Buckeyes |
| 6 | null | Aaliyah Edwards | null | Huskies |
| 7 | null | Angel Reese | null | Tigers |
| 8 | null | Alissa Pili | null | Utes |
| 9 | null | Carla Leite | null | France |
| 10 | null | Leila Lacan | null | France |
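Recipe 3: Top 10 scorers of the season
load_wnba_player_boxscore reads a whole season's player box scores from a parquet release (no per-game API calls). Drop did-not-play rows, aggregate points and assists per player with polars, then rank by points per game among players with at least 20 appearances. Game counts above 40 in the output indicate that the release covers more than the 40-game regular season (for example playoff games). The loader reads a static release, so this one runs bare.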
box = wnba.load_wnba_player_boxscore(seasons=[SEASON])
top_scorers = (
    box
    .filter(~pl.col('did_not_play'))
    .group_by(['athlete_display_name', 'team_abbreviation'])
    .agg([
        pl.len().alias('games'),
        pl.col('points').sum().alias('total_points'),
        pl.col('points').mean().round(1).alias('ppg'),
        pl.col('assists').mean().round(1).alias('apg'),
    ])
    .filter(pl.col('games') >= 20)
    .sort('ppg', descending=True)
    .head(10)
)
top_scorers
shape: (10, 6)

| athlete_display_name (str) | team_abbreviation (str) | games (u32) | total_points (i32) | ppg (f64) | apg (f64) |
|---|---|---|---|---|---|
| A'ja Wilson | LV | 44 | 1149 | 26.1 | 2.4 |
| Arike Ogunbowale | DAL | 38 | 845 | 22.2 | 5.1 |
| Napheesa Collier | MIN | 47 | 1000 | 21.3 | 3.4 |
| Kahleah Copper | PHX | 39 | 811 | 20.8 | 2.3 |
| Breanna Stewart | NY | 50 | 1014 | 20.3 | 3.5 |
| Kelsey Mitchell | IND | 42 | 805 | 19.2 | 1.9 |
| Caitlin Clark | IND | 42 | 805 | 19.2 | 8.4 |
| Jewell Loyd | SEA | 39 | 744 | 19.1 | 3.6 |
| Sabrina Ionescu | NY | 50 | 900 | 18.0 | 5.9 |
| Brittney Griner | PHX | 32 | 568 | 17.8 | 2.2 |
Recipe 4: Who worked the whistle?
espn_wnba_game_officials returns the referees assigned to a game, which is handy for officiating studies. Pair a game_id from the schedule with this call.
refs = safe('Game 5 officials',
            lambda: wnba.espn_wnba_game_officials(game_id=401726992, season=SEASON))
if refs is not None and refs.height:
    keep = [c for c in ['full_name', 'display_name', 'position', 'order'] if c in refs.columns]
    out = refs.select(keep) if keep else refs.head()
else:
    out = 'officials unavailable'
out
✅ Game 5 officials
shape: (4, 3)

| full_name (str) | display_name (str) | order (i32) |
|---|---|---|
| Roy Gulbeyan | Roy Gulbeyan | 1 |
| Maj Forsberg | Maj Forsberg | 2 |
| Tim Greene | Tim Greene | 3 |
| Isaac Barnett | Isaac Barnett | 4 |
Recipe 5: Best net rating in the league
load_wnba_team_boxscore carries each team's score and its opponent's score per game. Average points for minus average points against gives a per-game point differential: a rough stand-in for net rating (which is properly measured per 100 possessions) and a handy one-number summary of team quality. We require 20+ games to drop the All-Star exhibition noise.
team_box = wnba.load_wnba_team_boxscore(seasons=[SEASON])
net_rating = (
    team_box
    .group_by(['team_abbreviation', 'team_display_name'])
    .agg([
        pl.len().alias('games'),
        pl.col('team_score').mean().round(1).alias('pts_for'),
        pl.col('opponent_team_score').mean().round(1).alias('pts_against'),
    ])
    .filter(pl.col('games') >= 20)
    .with_columns((pl.col('pts_for') - pl.col('pts_against')).round(1).alias('net'))
    .sort('net', descending=True)
)
net_rating
shape: (12, 6)

| team_abbreviation (str) | team_display_name (str) | games (u32) | pts_for (f64) | pts_against (f64) | net (f64) |
|---|---|---|---|---|---|
| NY | New York Liberty | 52 | 85.0 | 77.0 | 8.0 |
| CON | Connecticut Sun | 47 | 80.4 | 74.5 | 5.9 |
| MIN | Minnesota Lynx | 53 | 82.4 | 77.2 | 5.2 |
| LV | Las Vegas Aces | 46 | 85.5 | 80.7 | 4.8 |
| SEA | Seattle Storm | 42 | 82.7 | 78.8 | 3.9 |
| … | … | … | … | … | … |
| IND | Indiana Fever | 42 | 84.5 | 87.8 | -3.3 |
| PHX | Phoenix Mercury | 42 | 81.9 | 85.5 | -3.6 |
| CHI | Chicago Sky | 40 | 77.4 | 82.5 | -5.1 |
| LA | Los Angeles Sparks | 40 | 78.4 | 85.6 | -7.2 |
| DAL | Dallas Wings | 40 | 84.2 | 92.1 | -7.9 |
Recipe 6: Double-double machines
Count games where a player hit double digits in at least two of the five box-score categories (points, rebounds, assists, steals, blocks): the classic double-double, with triple-doubles (three or more categories) for free. Note that is_dd also counts triple-double games. All from the player box-score loader and a little polars boolean arithmetic.
cats = ['points', 'rebounds', 'assists', 'steals', 'blocks']
double_doubles = (
    box
    .filter(~pl.col('did_not_play'))
    .with_columns(
        sum((pl.col(c) >= 10).cast(pl.Int8) for c in cats).alias('cats10')
    )
    .with_columns([
        (pl.col('cats10') >= 2).alias('is_dd'),
        (pl.col('cats10') >= 3).alias('is_td'),
    ])
    .group_by(['athlete_display_name', 'team_abbreviation'])
    .agg([
        pl.col('is_dd').sum().alias('double_doubles'),
        pl.col('is_td').sum().alias('triple_doubles'),
    ])
    .sort(['double_doubles', 'triple_doubles'], descending=True)
    .head(10)
)
double_doubles
shape: (10, 4)

| athlete_display_name (str) | team_abbreviation (str) | double_doubles (u32) | triple_doubles (u32) |
|---|---|---|---|
| A'ja Wilson | LV | 26 | 0 |
| Angel Reese | CHI | 26 | 0 |
| Breanna Stewart | NY | 22 | 0 |
| Tina Charles | ATL | 21 | 1 |
| Napheesa Collier | MIN | 21 | 0 |
| Alyssa Thomas | CON | 19 | 4 |
| Dearica Hamby | LA | 16 | 0 |
| Aliyah Boston | IND | 15 | 0 |
| Caitlin Clark | IND | 14 | 2 |
| Jonquel Jones | NY | 14 | 0 |
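To see what the cats10 column is doing, it can help to run the same rule on a single stat line in plain Python. This is a sketch only: classify is a hypothetical helper and both stat lines are made-up examples, not rows from the loader.

```python
CATS = ('points', 'rebounds', 'assists', 'steals', 'blocks')


def classify(line):
    """Count categories at 10 or more and flag double- and triple-doubles."""
    cats10 = sum(line[c] >= 10 for c in CATS)  # True counts as 1
    return {'cats10': cats10, 'is_dd': cats10 >= 2, 'is_td': cats10 >= 3}


# Made-up stat lines for illustration
big_night = {'points': 24, 'rebounds': 12, 'assists': 3, 'steals': 1, 'blocks': 2}
all_around = {'points': 13, 'rebounds': 10, 'assists': 11, 'steals': 0, 'blocks': 0}

print(classify(big_night))
print(classify(all_around))
```

Because a triple-double also satisfies the two-category test, it is flagged as both is_dd and is_td, which matches how the polars recipe counts.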
Recipe 7: Most efficient high-volume scorers
Raw points reward volume; true shooting % rewards efficiency. TS% = points / (2 × (FGA + 0.44 × FTA)). Aggregate points and shot attempts from the box-score loader, keep players with real workloads (20+ games and 300+ points), and you've got the league's most efficient scorers.
true_shooting = (
    box
    .filter(~pl.col('did_not_play'))
    .group_by(['athlete_display_name', 'team_abbreviation'])
    .agg([
        pl.len().alias('games'),
        pl.col('points').sum().alias('pts'),
        pl.col('field_goals_attempted').sum().alias('fga'),
        pl.col('free_throws_attempted').sum().alias('fta'),
    ])
    .filter((pl.col('games') >= 20) & (pl.col('pts') >= 300))
    .with_columns(
        (pl.col('pts') / (2 * (pl.col('fga') + 0.44 * pl.col('fta'))) * 100)
        .round(1).alias('ts_pct')
    )
    .sort('ts_pct', descending=True)
    .head(10)
)
true_shooting
shape: (10, 7)

| athlete_display_name (str) | team_abbreviation (str) | games (u32) | pts (i32) | fga (i32) | fta (i32) | ts_pct (f64) |
|---|---|---|---|---|---|---|
| Leonie Fiebich | NY | 52 | 395 | 287 | 38 | 65.0 |
| Jonquel Jones | NY | 51 | 726 | 497 | 145 | 64.7 |
| Bridget Carleton | MIN | 52 | 508 | 379 | 58 | 62.8 |
| Brittney Griner | PHX | 32 | 568 | 403 | 122 | 62.2 |
| Stefanie Dolson | WSH | 39 | 371 | 280 | 42 | 62.1 |
| Sophie Cunningham | PHX | 42 | 347 | 250 | 69 | 61.9 |
| Tiffany Hayes | LV | 39 | 376 | 263 | 100 | 61.2 |
| Teaira McCowan | DAL | 39 | 458 | 323 | 124 | 60.7 |
| Kayla McBride | MIN | 52 | 782 | 572 | 183 | 59.9 |
| A'ja Wilson | LV | 44 | 1149 | 842 | 299 | 59.0 |
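As a quick check on the formula, the true shooting calculation can be reproduced by hand from the totals in the output above. The sketch below uses a hypothetical true_shooting_pct helper in plain Python; the guard for zero attempts is an added assumption, not something the polars recipe does.

```python
def true_shooting_pct(pts, fga, fta):
    """TS% = points / (2 * (FGA + 0.44 * FTA)), expressed as a percentage."""
    denominator = 2 * (fga + 0.44 * fta)
    if denominator == 0:
        return None  # no shot attempts, so TS% is undefined
    return pts / denominator * 100


# Season totals copied from the output above
wilson = true_shooting_pct(pts=1149, fga=842, fta=299)
fiebich = true_shooting_pct(pts=395, fga=287, fta=38)

print(round(wilson, 1))   # 59.0
print(round(fiebich, 1))  # 65.0
```

Both values agree with the ts_pct column, which is a useful sanity check before trusting the ranking.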
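Recipe 8: Where do the threes come from?
load_wnba_shots is event-level shot data with a score_value (the point value of the attempt). Tally made vs. attempted threes per team to see who lives behind the arc and who actually makes them. In the sample output, makes equal attempts for every team (100.0%), which suggests that score_value == 3 matches only made threes in this release; read the counts as made threes and do not rely on three_pt_pct until misses are included.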
shots = wnba.load_wnba_shots(seasons=[SEASON])
threes = (
    shots
    .filter(pl.col('score_value') == 3)
    .group_by('team_id')
    .agg([
        pl.len().alias('three_pt_attempts'),
        pl.col('scoring_play').sum().alias('three_pt_makes'),
    ])
    .with_columns(
        (pl.col('three_pt_makes') / pl.col('three_pt_attempts') * 100)
        .round(1).alias('three_pt_pct')
    )
    .sort('three_pt_attempts', descending=True)
)
# attach readable team abbreviations from the team box score
team_names = team_box.select(['team_id', 'team_abbreviation']).unique()
threes.join(team_names, on='team_id', how='left').select(
    ['team_abbreviation', 'three_pt_attempts', 'three_pt_makes', 'three_pt_pct']
).head(12)
shape: (12, 4)

| team_abbreviation (str) | three_pt_attempts (u32) | three_pt_makes (u32) | three_pt_pct (f64) |
|---|---|---|---|
| NY | 517 | 517 | 100.0 |
| MIN | 488 | 488 | 100.0 |
| LV | 429 | 429 | 100.0 |
| WSH | 389 | 389 | 100.0 |
| IND | 382 | 382 | 100.0 |
| … | … | … | … |
| CON | 282 | 282 | 100.0 |
| SEA | 254 | 254 | 100.0 |
| DAL | 250 | 250 | 100.0 |
| ATL | 249 | 249 | 100.0 |
| CHI | 193 | 193 | 100.0 |
Recipe 9: Head-to-head series
Want every meeting between two clubs? Filter the team box-score loader on team + opponent abbreviations and you get every meeting of the season, playoffs included: scores, dates and who won. Here's New York vs. Minnesota, the eventual 2024 Finals matchup.
head_to_head = (
    team_box
    .filter(
        (pl.col('team_abbreviation') == 'NY')
        & (pl.col('opponent_team_abbreviation') == 'MIN')
    )
    .select(['game_date', 'team_score', 'opponent_team_score', 'team_winner'])
    .sort('game_date')
    .with_columns(
        pl.when(pl.col('team_winner')).then(pl.lit('NY'))
        .otherwise(pl.lit('MIN')).alias('winner')
    )
)
print('NY series record vs MIN:',
      head_to_head['team_winner'].sum(), '-',
      head_to_head.height - head_to_head['team_winner'].sum())
head_to_head
NY series record vs MIN: 4 - 5
shape: (9, 5)

| game_date (date) | team_score (i32) | opponent_team_score (i32) | team_winner (bool) | winner (str) |
|---|---|---|---|---|
| 2024-05-25 | 67 | 84 | false | MIN |
| 2024-06-25 | 89 | 94 | false | MIN |
| 2024-07-02 | 76 | 67 | true | NY |
| 2024-09-15 | 79 | 88 | false | MIN |
| 2024-10-10 | 93 | 95 | false | MIN |
| 2024-10-13 | 80 | 66 | true | NY |
| 2024-10-16 | 80 | 77 | true | NY |
| 2024-10-18 | 80 | 82 | false | MIN |
| 2024-10-20 | 67 | 62 | true | NY |
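The record above leans on the team_winner flag. As a sanity check, the same tally can be recomputed from the scores alone. This is a plain-Python sketch with the nine score pairs copied from the output above; the variable names are illustrative.

```python
# (NY score, MIN score) for each meeting, in date order, from the output above
games = [
    (67, 84), (89, 94), (76, 67), (79, 88), (93, 95),
    (80, 66), (80, 77), (80, 82), (67, 62),
]

ny_wins = sum(ny > mn for ny, mn in games)
min_wins = len(games) - ny_wins
point_margin = sum(ny - mn for ny, mn in games)  # NY points minus MIN points

print(f'NY {ny_wins} - {min_wins} MIN, aggregate margin {point_margin:+d}')
```

The score-based tally agrees with the printed 4 - 5 record.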
Recipe 10: Rolling form: hot and cold streaks
A team's last-5 record tells you who's surging into the playoffs. Sort one team's games by date, then apply rolling_sum to the win flag for a running 5-game window; polars makes the time-series slice a one-liner.
form = (
    team_box
    .filter(pl.col('team_abbreviation') == 'NY')
    .sort('game_date')
    .with_columns(pl.col('team_winner').cast(pl.Int8).alias('won'))
    .with_columns(
        pl.col('won').rolling_sum(window_size=5).alias('wins_last5')
    )
    .select(['game_date', 'opponent_team_abbreviation', 'team_score',
             'opponent_team_score', 'won', 'wins_last5'])
    .tail(12)
)
form
shape: (12, 6)

| game_date (date) | opponent_team_abbreviation (str) | team_score (i32) | opponent_team_score (i32) | won (i8) | wins_last5 (i64) |
|---|---|---|---|---|---|
| 2024-09-19 | ATL | 67 | 78 | 0 | 3 |
| 2024-09-22 | ATL | 83 | 69 | 1 | 3 |
| 2024-09-24 | ATL | 91 | 82 | 1 | 3 |
| 2024-09-29 | LV | 87 | 77 | 1 | 4 |
| 2024-10-01 | LV | 88 | 84 | 1 | 4 |
| … | … | … | … | … | … |
| 2024-10-10 | MIN | 93 | 95 | 0 | 3 |
| 2024-10-13 | MIN | 80 | 66 | 1 | 3 |
| 2024-10-16 | MIN | 80 | 77 | 1 | 3 |
| 2024-10-18 | MIN | 80 | 82 | 0 | 3 |
| 2024-10-20 | MIN | 67 | 62 | 1 | 3 |
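For readers new to rolling windows, here is what a 5-game rolling sum does, sketched in plain Python. This rolling_sum function is a hypothetical stand-in written for illustration, not the polars method; like the polars default, it yields None until a full window is available.

```python
from collections import deque


def rolling_sum(values, window_size):
    """Sum of the most recent `window_size` values; None until the window fills."""
    window = deque(maxlen=window_size)  # oldest value drops off automatically
    out = []
    for value in values:
        window.append(value)
        out.append(sum(window) if len(window) == window_size else None)
    return out


# NY's `won` flags for the five games from 2024-09-19 to 2024-10-01 (output above)
won = [0, 1, 1, 1, 1]
print(rolling_sum(won, 5))
```

The last value is 4, which matches wins_last5 on the 2024-10-01 row.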
Recipe 11: Roster construction by position
load_wnba_rosters hands you every team's full roster. Pivot guards / forwards / centers per team to see how each front office balances its lineup, with no join required.
rosters = wnba.load_wnba_rosters(seasons=[SEASON])
position_mix = (
    rosters
    .group_by(['team_abbreviation', 'position_abbreviation'])
    .agg(pl.len().alias('n'))
    .pivot(values='n', index='team_abbreviation', on='position_abbreviation')
    .fill_null(0)
    .sort('team_abbreviation')
)
position_mix
shape: (12, 4)

| team_abbreviation (str) | G (u32) | F (u32) | C (u32) |
|---|---|---|---|
| ATL | 7 | 4 | 1 |
| CHI | 9 | 3 | 2 |
| CONNECTICU | 8 | 5 | 2 |
| DALLAS | 7 | 6 | 1 |
| IND | 8 | 3 | 2 |
| … | … | … | … |
| MIN | 5 | 8 | 1 |
| NY | 8 | 5 | 2 |
| PHX | 7 | 7 | 1 |
| SEA | 6 | 5 | 3 |
| WSH | 9 | 3 | 2 |
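Recipe 12: Season scoring leaders, the pre-aggregated way
Don't want to roll up box scores yourself? load_wnba_player_season_stats ships ESPN's own season aggregates in long format (category / stat_name / value). Filter to the averages category and the avgPoints stat for an instant scoring leaderboard that you can cross-check against Recipe 3. The sample output includes players who debuted after 2024 (for example Paige Bueckers), so confirm which season the returned rows cover before treating it as a 2024 leaderboard.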
season_stats = wnba.load_wnba_player_season_stats(seasons=[SEASON])
scoring_leaders = (
    season_stats
    .filter(
        (pl.col('category') == 'averages')
        & (pl.col('stat_name') == 'avgPoints')
    )
    .select(['athlete_display_name', 'team_display_name',
             'athlete_position_abbreviation', 'value'])
    .rename({'value': 'ppg'})
    .sort('ppg', descending=True)
    .head(10)
)
scoring_leaders
shape: (10, 4)

| athlete_display_name (str) | team_display_name (str) | athlete_position_abbreviation (str) | ppg (f64) |
|---|---|---|---|
| A'ja Wilson | Las Vegas Aces | C | 21.4 |
| Olivia Miles | Minnesota Lynx | G | 21.0 |
| Breanna Stewart | Seattle Storm | F | 20.5 |
| Arike Ogunbowale | Dallas Wings | G | 19.9 |
| Paige Bueckers | Dallas Wings | G | 19.2 |
| Caitlin Clark | Indiana Fever | G | 18.6 |
| Napheesa Collier | Minnesota Lynx | F | 18.4 |
| Jovana Nogic | Phoenix Mercury | G | 17.5 |
| Kelsey Mitchell | Indiana Fever | G | 17.4 |
| Rhyne Howard | Atlanta Dream | G | 17.1 |
Bulk loaders (load_wnba_*)
The load_wnba_* family reads pre-built parquet releases (whole seasons at once) instead of calling the live API per game, which suits season-long analysis. They return polars by default (return_as_pandas=True for pandas). A few favourites:

| Loader | Whole-season… |
|---|---|
| load_wnba_schedule | schedule + results |
| load_wnba_player_boxscore | player box scores |
| load_wnba_team_boxscore | team box scores |
| load_wnba_player_season_stats | season-aggregated player stats |
| load_wnba_pbp | play-by-play |
| load_wnba_shots | shot locations |

Pass a list of seasons to combine several years in one frame.
sched_2024 = wnba.load_wnba_schedule(seasons=[SEASON])
print('schedule rows:', sched_2024.shape)
box_2024 = wnba.load_wnba_player_boxscore(seasons=[SEASON])
box_2024.select(['game_id', 'game_date', 'athlete_display_name',
                 'team_abbreviation', 'minutes', 'points',
                 'rebounds', 'assists']).head()
schedule rows: (264, 77)
shape: (5, 8)

| game_id (i32) | game_date (date) | athlete_display_name (str) | team_abbreviation (str) | minutes (f64) | points (i32) | rebounds (i32) | assists (i32) |
|---|---|---|---|---|---|---|---|
| 401726992 | 2024-10-20 | Bridget Carleton | MIN | 41.0 | 3 | 6 | 2 |
| 401726992 | 2024-10-20 | Alanna Smith | MIN | 36.0 | 6 | 8 | 2 |
| 401726992 | 2024-10-20 | Napheesa Collier | MIN | 44.0 | 22 | 7 | 2 |
| 401726992 | 2024-10-20 | Kayla McBride | MIN | 43.0 | 21 | 5 | 5 |
| 401726992 | 2024-10-20 | Courtney Williams | MIN | 30.0 | 4 | 4 | 3 |
Where to next
- Pass return_as_pandas=True for a pandas frame, or raw=True (where supported) for the untouched ESPN JSON.
- Full reference: the WNBA section in the sidebar covers ESPN extras, site API, core API and loaders.
- R user? The same surface lives in wehoop.
- Want a deeper stats API? nba_api also covers the WNBA.
Now go chart some buckets!