🏟️ Welcome to sportsdataverse-py: the cross-sport quickstart
One pip install, every major league. sportsdataverse is a single Python
package that speaks to the official, premium native data feeds across the
sporting world (the same endpoints the leagues use to power their own apps),
plus the ESPN mirror and pre-built parquet release loaders. Everything
comes back as a tidy polars DataFrame, ready to model.
This page is your map to the whole package. By the end you'll be able to:
- 🗺️ see every datasource available for every league, with links straight to its tutorial and its reference index;
- 🧠 predict function names you've never seen: sportsdataverse uses one consistent naming contract, so knowing one function tells you the others;
- 🍳 cook through ~20 cross-sport recipes that show the breadth in action.
If you've used the R sisters (hoopR, wehoop, cfbfastR, baseballr, fastRhockey, oddsapiR), the names here will feel like home. Let's take the tour!
🗺️ 1 · The master index: every datasource, every league
Here's the whole package on one page. Each row is a league (or the betting-odds module); each cell tells you which datasource families are wired up. ★ marks the premium native feeds (the leagues' own APIs / tracking systems / Statcast), ✓ means the family is available, and - means it is not. Click a league's tutorial for the deep dive, or its reference for the full function index.
| League | Tutorial · Reference | ESPN (espn_<lg>_*) | Native premium API | Tracking / analytics | Release loaders (load_*) |
|---|---|---|---|---|---|
| 🏀 NBA | tutorial · ref | ✓ | - | - | load_nba_pbp, load_nba_team_boxscore |
| 🏀 WNBA | tutorial · ref | ✓ | - | - | load_wnba_pbp, load_wnba_player_boxscore |
| 🏀 MBB (NCAA M) | tutorial · ref | ✓ | - | - | load_mbb_pbp, load_mbb_team_boxscore |
| 🏀 WBB (NCAA W) | tutorial · ref | ✓ | - | - | load_wbb_pbp, load_wbb_team_boxscore |
| 🏈 NFL | tutorial · ref | ✓ | ★ nfl_* (api.nfl.com) | ★ Next Gen Stats nfl_ngs_* | load_nfl_pbp, load_nfl_player_stats, load_injuries |
| 🏈 CFB (College) | tutorial · ref | ✓ | yahoo_cfb_*, fox_cfb_* | - | load_cfb_pbp |
| ⚾ MLB | tutorial · ref | ✓ | ★ mlb_* (MLB Stats API) | ★ Statcast mlb_statcast_* | load_mlb_pbp, load_mlb_team_boxscore |
| 🏒 NHL | tutorial · ref | ✓ | ★ nhl_* (api-web) | ★ NHL EDGE nhl_edge_* | load_nhl_pbp, load_nhl_team_boxscore |
| 🏒 PWHL (Women's pro) | tutorial · ref | - | ★ pwhl_* (HockeyTech) | corsi / shifts / TOI | load_pwhl_schedules |
| 🏒 AHL (Minor pro) | tutorial · ref | - | ★ ahl_* (HockeyTech) | corsi / shifts / TOI | - |
| 🏒 OHL (CHL junior) | tutorial · ref | - | ★ ohl_* (HockeyTech) | corsi / shifts / TOI | - |
| 🏒 WHL (CHL junior) | tutorial · ref | - | ★ whl_* (HockeyTech) | corsi / shifts / TOI | - |
| 🏒 QMJHL (CHL junior) | tutorial · ref | - | ★ qmjhl_* (HockeyTech) | corsi / shifts / TOI | - |
| 🎲 Betting odds | tutorial · ref | - | ★ toa_* (The Odds API) | line history / props | - |
💡 HockeyTech leagues (AHL/OHL/WHL/QMJHL/PWHL) ship public client keys, so no setup is needed. Only the betting-odds module wants a free ODDS_API_KEY.
🧩 The five function styles
Across all those rows, only five families exist. Learn the shape of each once and you can read any function name in the package:
- Live ESPN wrappers: espn_<lg>_* (e.g. espn_nba_teams, espn_wbb_scoreboard). The same set exists for every ESPN league: teams, rosters, scoreboards, standings, schedules, play-by-play, box scores.
- Native premium API wrappers, the league's own feed: nfl_* (api.nfl.com), mlb_* (MLB Stats API), nhl_* (api-web), pwhl_* / ahl_* / ohl_* / whl_* / qmjhl_* (HockeyTech), toa_* (The Odds API).
- Tracking / analytics feeds, the really premium stuff: mlb_statcast_* (Baseball Savant), nhl_edge_* (player tracking), nfl_ngs_* (Next Gen Stats).
- Release / parquet loaders: load_<sport>_*() reads a pre-built parquet release (fast, reliable, whole-season-at-once): load_nba_pbp, load_mlb_team_boxscore, load_pwhl_schedules, …
- Parser layer: parse_* turns a raw native payload into a tidy frame (e.g. parse_mlb_api_standings). Most wrappers parse for you; the parsers are there when you fetch the raw Dict yourself.
The return contract never changes. Every wrapper gives you polars by
default; pass return_as_pandas=True for a pandas frame, and on the native
APIs pass return_parsed=False for the raw JSON Dict. One contract, every
sport.
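To make the five families concrete, here is a minimal sketch that sorts a function name into its family by prefix alone. The `classify` helper and `FAMILY_RULES` table are hypothetical (they are not part of sportsdataverse); the prefixes themselves come from the list above.

```python
import re

# Prefix rules taken from the five families listed above. Order matters:
# tracking prefixes such as mlb_statcast_ also start with the native mlb_ prefix.
FAMILY_RULES = [
    ("tracking", re.compile(r"^(mlb_statcast|nhl_edge|nfl_ngs)_")),
    ("espn", re.compile(r"^espn_[a-z0-9]+_")),
    ("loader", re.compile(r"^load_")),
    ("parser", re.compile(r"^parse_")),
    ("native", re.compile(r"^(nfl|mlb|nhl|pwhl|ahl|ohl|whl|qmjhl|toa)_")),
]

def classify(name: str) -> str:
    """Return the family a function name belongs to, or 'other'."""
    for family, pattern in FAMILY_RULES:
        if pattern.match(name):
            return family
    return "other"

for fn in ["espn_wbb_scoreboard", "nhl_edge_skater_speed_top_10",
           "load_pwhl_schedules", "parse_mlb_api_standings", "toa_sports"]:
    print(f"{fn:<32} -> {classify(fn)}")
```

Names that fit none of the prefixes (internal helpers, for instance) fall through to "other", which is a quick way to spot anything outside the naming contract.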
Setup
pip install sportsdataverse
# or
uv add sportsdataverse
Every league is a submodule of the umbrella package, and the headline cross-league wrappers + discovery helpers are re-exported at the top level. Let's import it.
import os
import polars as pl
import sportsdataverse as sdv
import sportsdataverse.odds as odds
# Which league modules hang off the top-level package?
# (AHL/OHL/WHL/QMJHL are imported from sportsdataverse.hockey further down.)
[m for m in dir(sdv) if m in
 ("cfb", "nfl", "nba", "wnba", "mbb", "wbb", "nhl", "mlb", "pwhl",
  "ahl", "ohl", "whl", "qmjhl", "odds")]
['cfb', 'mbb', 'mlb', 'nba', 'nfl', 'nhl', 'odds', 'pwhl', 'wbb', 'wnba']
Live endpoints are seasonal and occasionally rate-limited, and the
naming-convention loops below fan out many live calls at once, so a tiny
safe() helper runs every network call defensively. You get the frame when the
feed is up, and a friendly one-liner when it isn't, never a scary traceback.
That keeps this whole page runnable offline or in the off-season.
def safe(label, thunk):
    '''Run a live call; return its result, or print a one-liner and return None.'''
    try:
        out = thunk()
        print(f"✅ {label}")
        return out
    except Exception as e:  # noqa: BLE001 -- demo resilience
        print(f"⚠️ {label}: unavailable right now ({type(e).__name__})")
        return None
# Odds is the only module that wants a (free) key, so guard those cells:
HAS_KEY = bool(os.environ.get("ODDS_API_KEY"))
print("ODDS_API_KEY set:", HAS_KEY,
      "-> odds cells will" + ("" if HAS_KEY else " NOT") + " run live")
ODDS_API_KEY set: False -> odds cells will NOT run live
🧠 2 · The naming-convention superpower
Here's the centerpiece. sportsdataverse names things so predictably that knowing one function name tells you the others. The same style of data is exactly one rename away across every sport: swap the league slug and the call just works. Let's prove it.
The ESPN families are identical across every league
espn_<lg>_teams, espn_<lg>_team_roster, espn_<lg>_scoreboard,
espn_<lg>_standings exist for every ESPN league. A one-line helper +
getattr tours them all and returns the same shape each time.
def teams(league):
    '''Knowing one name (espn_<lg>_teams) gives you all of them.'''
    return getattr(sdv, f"espn_{league}_teams")()
rows = []
for lg in ["nba", "wnba", "nhl", "mlb"]:
    df = safe(f"espn_{lg}_teams", lambda lg=lg: teams(lg))
    rows.append({"league": lg.upper(),
                 "fn": f"espn_{lg}_teams()",
                 "n_teams": None if df is None else df.height,
                 "n_cols": None if df is None else df.width})
pl.DataFrame(rows)  # same columns, same shape: one contract, four leagues
✅ espn_nba_teams
✅ espn_wnba_teams
✅ espn_nhl_teams
✅ espn_mlb_teams
shape: (4, 4)
┌────────┬───────────────────┬─────────┬────────┐
│ league ┆ fn                ┆ n_teams ┆ n_cols │
│ ---    ┆ ---               ┆ ---     ┆ ---    │
│ str    ┆ str               ┆ i64     ┆ i64    │
╞════════╪═══════════════════╪═════════╪════════╡
│ NBA    ┆ espn_nba_teams()  ┆ 30      ┆ 14     │
│ WNBA   ┆ espn_wnba_teams() ┆ 15      ┆ 14     │
│ NHL    ┆ espn_nhl_teams()  ┆ 32      ┆ 14     │
│ MLB    ┆ espn_mlb_teams()  ┆ 30      ┆ 14     │
└────────┴───────────────────┴─────────┴────────┘
Same trick for the scoreboard and standings families: the call is identical, only the slug changes.
def call(family, league, **kw):
    '''Generic dispatcher: call("scoreboard", "nhl") -> espn_nhl_scoreboard().'''
    return getattr(sdv, f"espn_{league}_{family}")(**kw)
board = safe("espn_nfl_scoreboard", lambda: call("scoreboard", "nfl"))
stand = safe("espn_nba_standings", lambda: call("standings", "nba"))
print("NFL scoreboard rows:", None if board is None else board.height,
      "| NBA standings rows:", None if stand is None else getattr(stand, "height", None))
✅ espn_nfl_scoreboard
✅ espn_nba_standings
NFL scoreboard rows: 16 | NBA standings rows: 30
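The loops on this page write their thunks as `lambda lg=lg: ...`. Because safe() calls each thunk immediately, the default argument is a defensive habit there; it starts to matter as soon as thunks are collected first and called later. The sketch below shows the difference with plain strings instead of live calls; `run_all` is a hypothetical helper used only for this illustration.

```python
def run_all(thunks):
    """Call each zero-argument function and collect the results."""
    return [thunk() for thunk in thunks]

leagues = ["nba", "wnba", "nhl", "mlb"]

# Late binding: each lambda looks up `lg` when it is called, and by then
# the comprehension has finished, so every thunk sees the last value.
late = [lambda: f"espn_{lg}_teams" for lg in leagues]

# Default-argument binding: `lg=lg` captures the current value per lambda.
bound = [lambda lg=lg: f"espn_{lg}_teams" for lg in leagues]

print(run_all(late))   # four copies of the same name
print(run_all(bound))  # one name per league
```

Keeping `lg=lg` in every loop means a later refactor that defers the calls (batching, retries, a thread pool) will not silently hit the same league four times.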
📦 The loaders follow one pattern too
load_<sport>_pbp and load_<sport>_team_boxscore read pre-built parquet for
every sport: same signature (seasons=[...]), same return type. Knowing
load_nba_pbp means you already know load_nhl_pbp and load_mlb_pbp.
# A single getattr loop resolves the play-by-play loader for three different sports:
season = 2024
for sport in ["nba", "wnba", "nhl"]:
    fn = getattr(sdv, f"load_{sport}_pbp")  # lookup only; nothing is downloaded here
    print(f"load_{sport}_pbp(seasons=[{season}]) -> signature is identical for every sport")
# (we don't pull all of them here, since that's a lot of parquet; Recipe 3 runs one.)
load_nba_pbp(seasons=[2024]) -> signature is identical for every sport
load_wnba_pbp(seasons=[2024]) -> signature is identical for every sport
load_nhl_pbp(seasons=[2024]) -> signature is identical for every sport
🏒 The HockeyTech leagues share one surface
AHL / OHL / WHL / QMJHL / PWHL all expose <lg>_schedule, <lg>_standings,
<lg>_teams, <lg>_team_roster, and most_recent_<lg>_season. Learn one, you
learned all five.
import sportsdataverse.hockey.ahl as ahl
import sportsdataverse.hockey.ohl as ohl
import sportsdataverse.hockey.whl as whl
import sportsdataverse.hockey.qmjhl as qmjhl
import sportsdataverse.pwhl as pwhl
HOCKEYTECH = {"ahl": ahl, "ohl": ohl, "whl": whl, "qmjhl": qmjhl, "pwhl": pwhl}
rows = []
for lg, mod in HOCKEYTECH.items():
    season = safe(f"most_recent_{lg}_season", getattr(mod, f"most_recent_{lg}_season"))
    rows.append({"league": lg.upper(),
                 "schedule_fn": f"{lg}_schedule()",
                 "standings_fn": f"{lg}_standings()",
                 "season": season})
pl.DataFrame(rows)
✅ most_recent_ahl_season
✅ most_recent_ohl_season
✅ most_recent_whl_season
✅ most_recent_qmjhl_season
✅ most_recent_pwhl_season
shape: (5, 4)
┌────────┬──────────────────┬───────────────────┬────────┐
│ league ┆ schedule_fn      ┆ standings_fn      ┆ season │
│ ---    ┆ ---              ┆ ---               ┆ ---    │
│ str    ┆ str              ┆ str               ┆ i64    │
╞════════╪══════════════════╪═══════════════════╪════════╡
│ AHL    ┆ ahl_schedule()   ┆ ahl_standings()   ┆ 2026   │
│ OHL    ┆ ohl_schedule()   ┆ ohl_standings()   ┆ 2027   │
│ WHL    ┆ whl_schedule()   ┆ whl_standings()   ┆ 2026   │
│ QMJHL  ┆ qmjhl_schedule() ┆ qmjhl_standings() ┆ 2027   │
│ PWHL   ┆ pwhl_schedule()  ┆ pwhl_standings()  ┆ 2027   │
└────────┴──────────────────┴───────────────────┴────────┘
Discovery helpers: when you don't know the name yet
Four top-level helpers let you search the surface instead of guessing:
- list_functions(league=None, search=..., parsers_only=..., wrappers_only=...): list/search every wrapper.
- function_count(league=None): how many functions each league exposes.
- find_team(name, league): fuzzy team lookup (returns the ESPN team dict + id).
- find_athlete(name, league): fuzzy player lookup.
# What does the package know about "scoreboard"? (grouped by league)
hits = sdv.list_functions(search="scoreboard")
for lg, fns in hits.items():
    print(f"{lg:>4}: {', '.join(fns)}")
cfb: espn_cfb_scoreboard, scoreboard_event_parsing, yahoo_cfb_scoreboard
mbb: espn_mbb_scoreboard, scoreboard_event_parsing
mlb: espn_mlb_scoreboard
nba: espn_nba_scoreboard, scoreboard_event_parsing
nfl: espn_nfl_scoreboard, scoreboard_event_parsing
nhl: espn_nhl_scoreboard, nhl_scoreboard, parse_nhl_web_scoreboard, scoreboard_event_parsing
wbb: espn_wbb_scoreboard, scoreboard_event_parsing
wnba: espn_wnba_scoreboard, scoreboard_event_parsing
soccer: espn_soccer_scoreboard
cricket: espn_cricket_scoreboard
epl: espn_epl_scoreboard
laliga: espn_laliga_scoreboard
bundesliga: espn_bundesliga_scoreboard
seriea: espn_seriea_scoreboard
ligue1: espn_ligue1_scoreboard
mls: espn_mls_scoreboard
ligamx: espn_ligamx_scoreboard
ucl: espn_ucl_scoreboard
uel: espn_uel_scoreboard
nwsl: espn_nwsl_scoreboard
wwc: espn_wwc_scoreboard
wc: espn_wc_scoreboard
mch: espn_mch_scoreboard
wch: espn_wch_scoreboard
ufl: espn_ufl_scoreboard
xfl: espn_xfl_scoreboard
cfl: espn_cfl_scoreboard
college_baseball: espn_college_baseball_scoreboard
college_softball: espn_college_softball_scoreboard
# How big is each league's surface?
counts = sdv.function_count()
pl.DataFrame({"league": list(counts.keys()), "n_functions": list(counts.values())}) \
.sort("n_functions", descending=True)
shape: (34, 2)
┌────────┬─────────────┐
│ league ┆ n_functions │
│ ---    ┆ ---         │
│ str    ┆ i64         │
╞════════╪═════════════╡
│ nhl    ┆ 337         │
│ mlb    ┆ 270         │
│ nfl    ┆ 237         │
│ cfb    ┆ 169         │
│ wnba   ┆ 162         │
│ …      ┆ …           │
│ pwhl   ┆ 44          │
│ ahl    ┆ 14          │
│ ohl    ┆ 14          │
│ qmjhl  ┆ 14          │
│ whl    ┆ 14          │
└────────┴─────────────┘
# Fuzzy lookups β no IDs to memorize:
team = sdv.find_team("Lakers", "nba")
ath = sdv.find_athlete("LeBron", "nba")
print("team ->", None if team is None else f"{team['displayName']} (id={team['id']})")
print("athlete ->", None if ath is None else f"{ath['displayName']} (id={ath['id']})")
team -> Los Angeles Lakers (id=13)
athlete -> LeBron James (id=1966)
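This page does not show how find_team and find_athlete match names internally, so the following is only a sketch of one common way to build a fuzzy lookup with the standard library: try a case-insensitive substring match first, then fall back to difflib's similarity ranking. `find_by_name` and the `TEAMS` list are hypothetical stand-ins; the names and ids are copied from outputs on this page.

```python
import difflib

# Stand-in records; names and ids are taken from the outputs on this page.
TEAMS = [
    {"id": "13", "displayName": "Los Angeles Lakers"},
    {"id": "17", "displayName": "New England Patriots"},
    {"id": "10", "displayName": "New York Yankees"},
    {"id": "1", "displayName": "Boston Bruins"},
    {"id": "333", "displayName": "Alabama Crimson Tide"},
]

def find_by_name(query, records, key="displayName", cutoff=0.3):
    """Return the record whose name best matches `query`, or None."""
    q = query.casefold()
    # 1) Cheap path: case-insensitive substring match.
    for rec in records:
        if q in rec[key].casefold():
            return rec
    # 2) Fallback: closest name by difflib similarity ratio.
    names = [rec[key].casefold() for rec in records]
    close = difflib.get_close_matches(q, names, n=1, cutoff=cutoff)
    if not close:
        return None
    return records[names.index(close[0])]

print(find_by_name("Lakers", TEAMS))           # substring hit
print(find_by_name("New York Yankes", TEAMS))  # typo, resolved by similarity
print(find_by_name("zzzz", TEAMS))             # nothing close enough
```

Returning None for a miss mirrors what the recipes on this page guard against with `None if team is None else ...`.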
🍳 3 · Twenty cross-sport recipes
Now the fun part: 20 runnable recipes that show the breadth and the overlap. Every recipe is defensively guarded, so a flaky network or off-season just prints a friendly note instead of erroring. Mix, match, and remix.
Recipe 1: Any league's teams
teams("<lg>") (our helper from above) hits espn_<lg>_teams for any ESPN
league. Here's the WBB team list.
wbb_teams = safe("espn_wbb_teams", lambda: teams("wbb"))
cols = ["team_id", "team_abbreviation", "team_display_name", "team_location"]
(wbb_teams.select([c for c in cols if c in wbb_teams.columns]).head()
if wbb_teams is not None and wbb_teams.height else "WBB teams unavailable right now")
✅ espn_wbb_teams
shape: (5, 4)
┌─────────┬───────────────────┬────────────────────────────┬───────────────────┐
│ team_id ┆ team_abbreviation ┆ team_display_name          ┆ team_location     │
│ ---     ┆ ---               ┆ ---                        ┆ ---               │
│ str     ┆ str               ┆ str                        ┆ str               │
╞═════════╪═══════════════════╪════════════════════════════╪═══════════════════╡
│ 2000    ┆ ACU               ┆ Abilene Christian Wildcats ┆ Abilene Christian │
│ 2005    ┆ AF                ┆ Air Force Falcons          ┆ Air Force         │
│ 2006    ┆ AKR               ┆ Akron Zips                 ┆ Akron             │
│ 2010    ┆ AAMU              ┆ Alabama A&M Bulldogs       ┆ Alabama A&M       │
│ 333     ┆ ALA               ┆ Alabama Crimson Tide       ┆ Alabama           │
└─────────┴───────────────────┴────────────────────────────┴───────────────────┘
Recipe 2: Any league's scoreboard
espn_<lg>_scoreboard() returns today's slate as a tidy frame. Same call for
MLB, NBA, NHL: just change the slug.
sb = safe("espn_mlb_scoreboard", lambda: sdv.espn_mlb_scoreboard())
(sb.head() if sb is not None and getattr(sb, "height", 0)
else "no MLB games on the board right now")
✅ espn_mlb_scoreboard
shape: (5, 50)
┌───────────┬───────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬──────────┐
│ game_id   ┆ uid       ┆ date      ┆ name      ┆ … ┆ away_logo ┆ away_scor ┆ away_winn ┆ away_ran │
│ ---       ┆ ---       ┆ ---       ┆ ---       ┆   ┆ ---       ┆ e         ┆ er        ┆ k        │
│ str       ┆ str       ┆ str       ┆ str       ┆   ┆ str       ┆ ---       ┆ ---       ┆ ---      │
│           ┆           ┆           ┆           ┆   ┆           ┆ str       ┆ bool      ┆ str      │
╞═══════════╪═══════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪══════════╡
│ 401815776 ┆ s:1~l:10~ ┆ 2026-06-1 ┆ Miami     ┆ … ┆ https://a ┆ 2         ┆ false     ┆ null     │
│           ┆ e:4018157 ┆ 6T22:40Z  ┆ Marlins   ┆   ┆ .espncdn. ┆           ┆           ┆          │
│           ┆ 76        ┆           ┆ at Philad ┆   ┆ com/i/tea ┆           ┆           ┆          │
│           ┆           ┆           ┆ elphia …  ┆   ┆ mlo…      ┆           ┆           ┆          │
│ 401815775 ┆ s:1~l:10~ ┆ 2026-06-1 ┆ Kansas    ┆ … ┆ https://a ┆ 4         ┆ false     ┆ null     │
│           ┆ e:4018157 ┆ 6T22:45Z  ┆ City      ┆   ┆ .espncdn. ┆           ┆           ┆          │
│           ┆ 75        ┆           ┆ Royals at ┆   ┆ com/i/tea ┆           ┆           ┆          │
│           ┆           ┆           ┆ Washingt… ┆   ┆ mlo…      ┆           ┆           ┆          │
│ 401815779 ┆ s:1~l:10~ ┆ 2026-06-1 ┆ Toronto   ┆ … ┆ https://a ┆ 6         ┆ true      ┆ null     │
│           ┆ e:4018157 ┆ 6T22:45Z  ┆ Blue Jays ┆   ┆ .espncdn. ┆           ┆           ┆          │
│           ┆ 79        ┆           ┆ at Boston ┆   ┆ com/i/tea ┆           ┆           ┆          │
│           ┆           ┆           ┆ Re…       ┆   ┆ mlo…      ┆           ┆           ┆          │
│ 401815774 ┆ s:1~l:10~ ┆ 2026-06-1 ┆ Chicago   ┆ … ┆ https://a ┆ 2         ┆ false     ┆ null     │
│           ┆ e:4018157 ┆ 6T23:05Z  ┆ White Sox ┆   ┆ .espncdn. ┆           ┆           ┆          │
│           ┆ 74        ┆           ┆ at New    ┆   ┆ com/i/tea ┆           ┆           ┆          │
│           ┆           ┆           ┆ York …    ┆   ┆ mlo…      ┆           ┆           ┆          │
│ 401815777 ┆ s:1~l:10~ ┆ 2026-06-1 ┆ New York  ┆ … ┆ https://a ┆ 3         ┆ false     ┆ null     │
│           ┆ e:4018157 ┆ 6T23:10Z  ┆ Mets at   ┆   ┆ .espncdn. ┆           ┆           ┆          │
│           ┆ 77        ┆           ┆ Cincinnat ┆   ┆ com/i/tea ┆           ┆           ┆          │
│           ┆           ┆           ┆ i Re…     ┆   ┆ mlo…      ┆           ┆           ┆          │
└───────────┴───────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴──────────┘
Recipe 3: Load any sport's season play-by-play 📦
load_<sport>_pbp(seasons=[...]) reads the parquet release. One sport here
(WNBA, a smaller season) to keep the download light.
wnba_pbp = safe("load_wnba_pbp([2024])", lambda: sdv.load_wnba_pbp(seasons=[2024]))
print("WNBA 2024 pbp rows:", None if wnba_pbp is None else wnba_pbp.height)
(wnba_pbp.select([c for c in ["game_id", "period_number", "clock_display_value", "text"]
if c in wnba_pbp.columns]).head()
if wnba_pbp is not None and wnba_pbp.height else "pbp unavailable right now")
✅ load_wnba_pbp([2024])
WNBA 2024 pbp rows: 101501
shape: (5, 4)
┌───────────┬───────────────┬─────────────────────┬─────────────────────────────────┐
│ game_id   ┆ period_number ┆ clock_display_value ┆ text                            │
│ ---       ┆ ---           ┆ ---                 ┆ ---                             │
│ i32       ┆ i32           ┆ str                 ┆ str                             │
╞═══════════╪═══════════════╪═════════════════════╪═════════════════════════════════╡
│ 401726992 ┆ 1             ┆ 10:00               ┆ Napheesa Collier vs. Jonquel J… │
│ 401726992 ┆ 1             ┆ 9:35                ┆ Napheesa Collier makes 3-foot … │
│ 401726992 ┆ 1             ┆ 9:12                ┆ Sabrina Ionescu misses 24-foot… │
│ 401726992 ┆ 1             ┆ 9:09                ┆ Bridget Carleton defensive reb… │
│ 401726992 ┆ 1             ┆ 8:55                ┆ Betnijah Laney-Hamilton person… │
└───────────┴───────────────┴─────────────────────┴─────────────────────────────────┘
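The clock_display_value column above is a string, so most modelling starts by turning it into seconds. A minimal sketch, under two assumptions that the page does not state: regulation periods are 10 minutes long (suggested by the 10:00 opening clock), and clocks under a minute may arrive as plain seconds such as "38.4". The helper names are hypothetical.

```python
PERIOD_SECONDS = 10 * 60  # assumption: 10-minute periods, per the 10:00 opening clock

def clock_to_seconds(clock: str) -> float:
    """Convert an 'M:SS' (or bare 'SS.s') display clock to seconds remaining."""
    if ":" in clock:
        minutes, seconds = clock.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(clock)  # assumed sub-minute format, e.g. "38.4"

def elapsed_in_game(period: int, clock: str) -> float:
    """Seconds elapsed since the opening tip, for regulation periods only."""
    return (period - 1) * PERIOD_SECONDS + (PERIOD_SECONDS - clock_to_seconds(clock))

for period, clock in [(1, "10:00"), (1, "9:35"), (1, "8:55"), (2, "10:00")]:
    print(period, clock, elapsed_in_game(period, clock))
```

Overtime periods are usually shorter than regulation ones, so they would need their own offset before this is used on full games.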
Recipe 4: The same box-score shape for two different sports
load_<sport>_team_boxscore returns the same kind of frame for basketball and
hockey. Load one season of each and compare the shapes.
nba_box = safe("load_nba_team_boxscore([2024])", lambda: sdv.load_nba_team_boxscore(seasons=[2024]))
nhl_box = safe("load_nhl_team_boxscore([2024])", lambda: sdv.load_nhl_team_boxscore(seasons=[2024]))
print("NBA team-box shape:", None if nba_box is None else nba_box.shape)
print("NHL team-box shape:", None if nhl_box is None else nhl_box.shape)
✅ load_nba_team_boxscore([2024])
✅ load_nhl_team_boxscore([2024])
NBA team-box shape: (2640, 57)
NHL team-box shape: (2800, 19)
Recipe 5: Standings for several leagues at once
One loop over espn_<lg>_standings tours basketball, hockey, and baseball.
rows = []
for lg in ["nba", "nhl", "mlb"]:
    df = safe(f"espn_{lg}_standings", lambda lg=lg: getattr(sdv, f"espn_{lg}_standings")())
    rows.append({"league": lg.upper(),
                 "rows": None if df is None else getattr(df, "height", None),
                 "cols": None if df is None else getattr(df, "width", None)})
pl.DataFrame(rows)
✅ espn_nba_standings
✅ espn_nhl_standings
✅ espn_mlb_standings
shape: (3, 3)
┌────────┬──────┬──────┐
│ league ┆ rows ┆ cols │
│ ---    ┆ ---  ┆ ---  │
│ str    ┆ i64  ┆ i64  │
╞════════╪══════╪══════╡
│ NBA    ┆ 30   ┆ 31   │
│ NHL    ┆ 32   ┆ 35   │
│ MLB    ┆ 30   ┆ 46   │
└────────┴──────┴──────┘
Recipe 6: Find a team by name
find_team fuzzy-matches across the ESPN leagues and hands back the team dict
(with its id, ready to feed into a roster call).
for nm, lg in [("Patriots", "nfl"), ("Yankees", "mlb"), ("Bruins", "nhl"), ("Crimson Tide", "cfb")]:
    t = sdv.find_team(nm, lg)
    print(f"{lg:>3} {nm:<14} -> {None if t is None else t['displayName']} (id={None if t is None else t['id']})")
nfl Patriots -> New England Patriots (id=17)
mlb Yankees -> New York Yankees (id=10)
nhl Bruins -> Boston Bruins (id=1)
cfb Crimson Tide -> Alabama Crimson Tide (id=333)
Recipe 7: Find an athlete by name
find_athlete does the same for players, which is great for grabbing an ESPN athlete
id without leaving Python.
for nm, lg in [("Caitlin Clark", "wnba"), ("Patrick Mahomes", "nfl"), ("Connor McDavid", "nhl")]:
    a = sdv.find_athlete(nm, lg)
    print(f"{lg:>4} {nm:<16} -> {None if a is None else a['displayName']} (id={None if a is None else a['id']})")
wnba Caitlin Clark -> None (id=None)
nfl Patrick Mahomes -> Patrick Mahomes (id=3139477)
nhl Connor McDavid -> Connor McDavid (id=3895074)
Recipe 8: A team and its roster, end to end 👥
Chain find_team into espn_<lg>_team_roster: look up an ID by name, then pull the
roster. The roster wrapper is parsed to polars by default.
lal = sdv.find_team("Lakers", "nba")
roster = None
if lal is not None:
    roster = safe(f"espn_nba_team_roster(team_id={lal['id']})",
                  lambda: sdv.espn_nba_team_roster(team_id=lal["id"], return_as_pandas=False))
(roster.head() if roster is not None and getattr(roster, "height", 0)
 else "roster unavailable right now")
✅ espn_nba_team_roster(team_id=13)
shape: (5, 68)
┌─────────┬────────────┬───────────┬───────────┬───┬───────────┬───────────┬───────────┬───────────┐
│ id      ┆ uid        ┆ guid      ┆ first_nam ┆ … ┆ birth_pla ┆ hand_type ┆ hand_abbr ┆ hand_disp │
│ ---     ┆ ---        ┆ ---       ┆ e         ┆   ┆ ce_state  ┆ ---       ┆ eviation  ┆ lay_value │
│ str     ┆ str        ┆ str       ┆ ---       ┆   ┆ ---       ┆ str       ┆ ---       ┆ ---       │
│         ┆            ┆           ┆ str       ┆   ┆ str       ┆           ┆ str       ┆ str       │
╞═════════╪════════════╪═══════════╪═══════════╪═══╪═══════════╪═══════════╪═══════════╪═══════════╡
│ 4278129 ┆ s:40~l:46~ ┆ 9af41ea8- ┆ Deandre   ┆ … ┆ null      ┆ null      ┆ null      ┆ null      │
│         ┆ a:4278129  ┆ a24c-025f ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ -a63f-826 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ 3fb…      ┆           ┆   ┆           ┆           ┆           ┆           │
│ 3945274 ┆ s:40~l:46~ ┆ 583794eb- ┆ Luka      ┆ … ┆ null      ┆ null      ┆ null      ┆ null      │
│         ┆ a:3945274  ┆ 0f38-9bbd ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ -3e25-9dd ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ 33b…      ┆           ┆   ┆           ┆           ┆           ┆           │
│ 4066648 ┆ s:40~l:46~ ┆ 40c1bcf6- ┆ Rui       ┆ … ┆ null      ┆ null      ┆ null      ┆ null      │
│         ┆ a:4066648  ┆ 675b-f217 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ -f97c-1d6 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ 280…      ┆           ┆   ┆           ┆           ┆           ┆           │
│ 4397077 ┆ s:40~l:46~ ┆ 4cd92ac1- ┆ Jaxson    ┆ … ┆ OK        ┆ null      ┆ null      ┆ null      │
│         ┆ a:4397077  ┆ 73ce-653d ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ -c3b1-9c6 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ 8e9…      ┆           ┆   ┆           ┆           ┆           ┆           │
│ 4683774 ┆ s:40~l:46~ ┆ 456f71fd- ┆ Bronny    ┆ … ┆ OH        ┆ null      ┆ null      ┆ null      │
│         ┆ a:4683774  ┆ 2ce5-3f50 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ -8d0d-f30 ┆           ┆   ┆           ┆           ┆           ┆           │
│         ┆            ┆ c01…      ┆           ┆   ┆           ┆           ┆           ┆           │
└─────────┴────────────┴───────────┴───────────┴───┴───────────┴───────────┴───────────┴───────────┘
Recipe 9: polars to pandas in one keyword 🐼
Every wrapper honors return_as_pandas=True. Same data, different frame, which is handy
when the next step (sklearn, statsmodels, seaborn) wants pandas.
teams_pl = safe("espn_wnba_teams (polars)", lambda: sdv.espn_wnba_teams())
teams_pd = safe("espn_wnba_teams (pandas)", lambda: sdv.espn_wnba_teams(return_as_pandas=True))
print("polars:", type(teams_pl).__name__, None if teams_pl is None else teams_pl.shape)
print("pandas:", type(teams_pd).__name__, None if teams_pd is None else teams_pd.shape)
✅ espn_wnba_teams (polars)
✅ espn_wnba_teams (pandas)
polars: DataFrame (15, 14)
pandas: DataFrame (15, 14)
Recipe 10: The return_parsed toggle on a native API
Native API wrappers parse to polars by default; return_parsed=False hands back
the raw JSON Dict straight from the league feed.
parsed = safe("nhl_standings (parsed)", lambda: sdv.nhl.nhl_standings())
raw = safe("nhl_standings (raw dict)", lambda: sdv.nhl.nhl_standings(return_parsed=False))
print("parsed ->", type(parsed).__name__, None if parsed is None else getattr(parsed, "shape", None))
print("raw ->", type(raw).__name__, "(top-level keys:", None if not isinstance(raw, dict) else list(raw.keys())[:4], ")")
✅ nhl_standings (parsed)
✅ nhl_standings (raw dict)
parsed -> DataFrame (32, 84)
raw -> dict (top-level keys: ['wildCardIndicator', 'standingsDateTimeUtc', 'standings'] )
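If you take the raw Dict, you own the flattening. The sketch below shows one plausible approach: walk the "standings" list, flatten nested objects, and convert camelCase keys to the snake_case style the parsed frames use. It is not the package's parser. The top-level keys match the output above, while the per-team fields and values in `raw` are invented for illustration, and `to_snake` / `flatten` are hypothetical helpers.

```python
import re

def to_snake(name: str) -> str:
    """camelCase -> snake_case, e.g. standingsDateTimeUtc -> standings_date_time_utc."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level with snake_case, underscore-joined keys."""
    flat = {}
    for key, value in record.items():
        col = f"{prefix}{to_snake(key)}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{col}_"))
        else:
            flat[col] = value
    return flat

# Hypothetical payload: only the three top-level keys come from the output above.
raw = {
    "wildCardIndicator": True,
    "standingsDateTimeUtc": "2025-01-01T00:00:00Z",
    "standings": [
        {"teamAbbrev": {"default": "AAA"}, "wins": 10, "otLosses": 2},
        {"teamAbbrev": {"default": "BBB"}, "wins": 8, "otLosses": 1},
    ],
}

rows = [flatten(team) for team in raw["standings"]]
print(rows[0])
```

A list of flat dicts like `rows` can be handed straight to `pl.DataFrame(rows)` when polars is available.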
Recipe 11: 🏈 Premium NFL pull (api.nfl.com)
nfl_standings() hits the league's own API and returns one tidy row per team.
nfl_st = safe("nfl_standings (api.nfl.com)", lambda: sdv.nfl.nfl_standings(season=2024, week=18))
cols = ["team_abbr", "team_full_name", "overall_wins", "overall_losses",
"division_name", "conference_name"]
(nfl_st.select([c for c in cols if c in nfl_st.columns]).head(8)
if nfl_st is not None and getattr(nfl_st, "height", 0) else "NFL standings unavailable right now")
✅ nfl_standings (api.nfl.com)
shape: (8, 3)
┌────────────────────┬──────────────┬────────────────┐
│ team_full_name     ┆ overall_wins ┆ overall_losses │
│ ---                ┆ ---          ┆ ---            │
│ str                ┆ i64          ┆ i64            │
╞════════════════════╪══════════════╪════════════════╡
│ Arizona Cardinals  ┆ 8            ┆ 9              │
│ Atlanta Falcons    ┆ 8            ┆ 9              │
│ Baltimore Ravens   ┆ 12           ┆ 5              │
│ Buffalo Bills      ┆ 13           ┆ 4              │
│ Carolina Panthers  ┆ 5            ┆ 12             │
│ Chicago Bears      ┆ 5            ┆ 12             │
│ Cincinnati Bengals ┆ 9            ┆ 8              │
│ Cleveland Browns   ┆ 3            ┆ 14             │
└────────────────────┴──────────────┴────────────────┘
Recipe 12: ⚾ Premium MLB pull (MLB Stats API + parser)
The mlb_standings wrapper used here hands back the raw Dict; pair it with the matching
parse_mlb_api_* for a tidy frame. Here's division standings, parsed.
def mlb_standings():
    raw = sdv.mlb.mlb_standings(league_id="103,104", season=2024)
    return sdv.mlb.parse_mlb_api_standings(raw)
mlb_st = safe("MLB standings (Stats API + parser)", mlb_standings)
keep = ["standings_division_name", "team_name", "wins", "losses", "winning_percentage", "games_back"]
(mlb_st.select([c for c in keep if c in mlb_st.columns]).head(10)
if mlb_st is not None and getattr(mlb_st, "height", 0) else "MLB standings unavailable right now")
✅ MLB standings (Stats API + parser)
shape: (10, 6)
┌─────────────────────────┬───────────┬──────┬────────┬────────────────────┬────────────┐
│ standings_division_name ┆ team_name ┆ wins ┆ losses ┆ winning_percentage ┆ games_back │
│ ---                     ┆ ---       ┆ ---  ┆ ---    ┆ ---                ┆ ---        │
│ str                     ┆ str       ┆ i64  ┆ i64    ┆ str                ┆ str        │
╞═════════════════════════╪═══════════╪══════╪════════╪════════════════════╪════════════╡
│ null                    ┆ Yankees   ┆ 94   ┆ 68     ┆ .580               ┆ -          │
│ null                    ┆ Orioles   ┆ 91   ┆ 71     ┆ .562               ┆ 3.0        │
│ null                    ┆ Red Sox   ┆ 81   ┆ 81     ┆ .500               ┆ 13.0       │
│ null                    ┆ Rays      ┆ 80   ┆ 82     ┆ .494               ┆ 14.0       │
│ null                    ┆ Blue Jays ┆ 74   ┆ 88     ┆ .457               ┆ 20.0       │
│ null                    ┆ Guardians ┆ 92   ┆ 69     ┆ .571               ┆ -          │
│ null                    ┆ Royals    ┆ 86   ┆ 76     ┆ .531               ┆ 6.5        │
│ null                    ┆ Tigers    ┆ 86   ┆ 76     ┆ .531               ┆ 6.5        │
│ null                    ┆ Twins     ┆ 82   ┆ 80     ┆ .506               ┆ 10.5       │
│ null                    ┆ White Sox ┆ 41   ┆ 121    ┆ .253               ┆ 51.5       │
└─────────────────────────┴───────────┴──────┴────────┴────────────────────┴────────────┘
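Note that winning_percentage and games_back arrive as strings in the frame above, with "-" marking the division leader. A small sketch of the clean-up you would typically do before modelling, using three rows copied from that output; `to_float` is a hypothetical helper, and mapping "-" to 0.0 is a choice made here, not something the feed defines.

```python
# Rows copied from the standings printed above.
rows = [
    {"team_name": "Yankees", "wins": 94, "losses": 68,
     "winning_percentage": ".580", "games_back": "-"},
    {"team_name": "Orioles", "wins": 91, "losses": 71,
     "winning_percentage": ".562", "games_back": "3.0"},
    {"team_name": "Red Sox", "wins": 81, "losses": 81,
     "winning_percentage": ".500", "games_back": "13.0"},
]

def to_float(text: str, dash_value: float = 0.0) -> float:
    """Parse '.580' or '3.0'; map the leader's '-' to dash_value."""
    return dash_value if text.strip() == "-" else float(text)

for row in rows:
    row["win_pct"] = to_float(row["winning_percentage"])
    row["gb"] = to_float(row["games_back"])
    # Cross-check the feed's rounded percentage against wins / games played.
    row["win_pct_recomputed"] = round(row["wins"] / (row["wins"] + row["losses"]), 3)

for row in rows:
    print(row["team_name"], row["win_pct"], row["win_pct_recomputed"], row["gb"])
```

The recomputed column doubles as a sanity check: if it ever disagrees with the parsed string, the payload or the parser has changed shape.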
Recipe 13: ⚾ MLB Statcast, the premium tracking firehose
mlb_statcast_search() returns one row per pitch: the raw Baseball Savant tracking
data. Grab a single day and pull a few of the most useful columns.
pitches = safe("mlb_statcast_search (1 day)",
lambda: sdv.mlb.mlb_statcast_search(start_dt="2024-07-01", end_dt="2024-07-01"))
show = [c for c in ["game_date", "player_name", "pitch_type", "release_speed",
"launch_speed", "launch_angle", "events"]
if pitches is not None and c in pitches.columns]
(pitches.select(show).head(10)
if pitches is not None and getattr(pitches, "height", 0) else "no Statcast rows for that day right now")
✅ mlb_statcast_search (1 day)
shape: (10, 7)
┌────────────┬──────────────┬────────────┬──────────────┬──────────────┬──────────────┬───────────┐
│ game_date  ┆ player_name  ┆ pitch_type ┆ release_spee ┆ launch_speed ┆ launch_angle ┆ events    │
│ ---        ┆ ---          ┆ ---        ┆ d            ┆ ---          ┆ ---          ┆ ---       │
│ str        ┆ str          ┆ str        ┆ ---          ┆ f64          ┆ f64          ┆ str       │
│            ┆              ┆            ┆ f64          ┆              ┆              ┆           │
╞════════════╪══════════════╪════════════╪══════════════╪══════════════╪══════════════╪═══════════╡
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 94.8         ┆ 78.0         ┆ 46.0         ┆ field_out │
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 96.6         ┆ null         ┆ null         ┆ null      │
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 96.3         ┆ 73.3         ┆ 20.0         ┆ null      │
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 97.2         ┆ null         ┆ null         ┆ null      │
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 95.6         ┆ null         ┆ null         ┆ null      │
│ 2024-07-01 ┆ Alonso, Pete ┆ FF         ┆ 95.8         ┆ null         ┆ null         ┆ null      │
│ 2024-07-01 ┆ Varsho,      ┆ FF         ┆ 97.4         ┆ null         ┆ null         ┆ strikeout │
│            ┆ Daulton      ┆            ┆              ┆              ┆              ┆           │
│ 2024-07-01 ┆ Varsho,      ┆ KC         ┆ 84.0         ┆ 94.3         ┆ -12.0        ┆ null      │
│            ┆ Daulton      ┆            ┆              ┆              ┆              ┆           │
│ 2024-07-01 ┆ Martinez,    ┆ FF         ┆ 97.5         ┆ null         ┆ null         ┆ strikeout │
│            ┆ J.D.         ┆            ┆              ┆              ┆              ┆           │
│ 2024-07-01 ┆ Martinez,    ┆ FF         ┆ 96.6         ┆ null         ┆ null         ┆ null      │
│            ┆ J.D.         ┆            ┆              ┆              ┆              ┆           │
└────────────┴──────────────┴────────────┴──────────────┴──────────────┴──────────────┴───────────┘
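A first thing to do with pitch-level rows is a group-and-average. The sketch below does it in plain Python on the ten (pitch_type, release_speed) pairs copied from the output above, so it runs without a live Savant call; `mean_speed_by_type` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# (pitch_type, release_speed) pairs copied from the ten rows printed above.
pitches = [("FF", 94.8), ("FF", 96.6), ("FF", 96.3), ("FF", 97.2), ("FF", 95.6),
           ("FF", 95.8), ("FF", 97.4), ("KC", 84.0), ("FF", 97.5), ("FF", 96.6)]

def mean_speed_by_type(rows):
    """Group release speeds by pitch type and average each group."""
    groups = defaultdict(list)
    for pitch_type, speed in rows:
        if speed is not None:  # skip missing readings
            groups[pitch_type].append(speed)
    return {pt: round(mean(speeds), 1) for pt, speeds in groups.items()}

print(mean_speed_by_type(pitches))
```

On the full frame the same idea is a one-liner in polars (a group_by on pitch_type followed by a mean of release_speed), but the dictionary version makes the logic explicit.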
Recipe 14: 🏒 Premium NHL pull (api-web)
nhl_standings() reads the modern NHL api-web feed: one row per team, parsed
to polars.
nhl_st = safe("nhl_standings (api-web)", lambda: sdv.nhl.nhl_standings())
keep = ["team_abbrev", "team_name", "wins", "losses", "ot_losses", "points",
"conference_name", "division_name"]
(nhl_st.select([c for c in keep if c in nhl_st.columns]).head(8)
if nhl_st is not None and getattr(nhl_st, "height", 0) else "NHL standings unavailable right now")
✅ nhl_standings (api-web)
shape: (8, 6)
┌──────┬────────┬───────────┬────────┬─────────────────┬───────────────┐
│ wins ┆ losses ┆ ot_losses ┆ points ┆ conference_name ┆ division_name │
│ ---  ┆ ---    ┆ ---       ┆ ---    ┆ ---             ┆ ---           │
│ i64  ┆ i64    ┆ i64       ┆ i64    ┆ str             ┆ str           │
╞══════╪════════╪═══════════╪════════╪═════════════════╪═══════════════╡
│ 55   ┆ 16     ┆ 11        ┆ 121    ┆ Western         ┆ Central       │
│ 53   ┆ 22     ┆ 7         ┆ 113    ┆ Eastern         ┆ Metropolitan  │
│ 50   ┆ 20     ┆ 12        ┆ 112    ┆ Western         ┆ Central       │
│ 50   ┆ 23     ┆ 9         ┆ 109    ┆ Eastern         ┆ Atlantic      │
│ 50   ┆ 26     ┆ 6         ┆ 106    ┆ Eastern         ┆ Atlantic      │
│ 48   ┆ 24     ┆ 10        ┆ 106    ┆ Eastern         ┆ Atlantic      │
│ 46   ┆ 24     ┆ 12        ┆ 104    ┆ Western         ┆ Central       │
│ 45   ┆ 27     ┆ 10        ┆ 100    ┆ Eastern         ┆ Atlantic      │
└──────┴────────┴───────────┴────────┴─────────────────┴───────────────┘
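A quick way to trust a freshly parsed standings frame is to check it against the league's points rule: two points per win and one per overtime or shootout loss. The sketch below runs that check on the eight rows printed above; the helper names are hypothetical.

```python
# (wins, losses, ot_losses, points) copied from the eight rows printed above.
rows = [
    (55, 16, 11, 121), (53, 22, 7, 113), (50, 20, 12, 112), (50, 23, 9, 109),
    (50, 26, 6, 106), (48, 24, 10, 106), (46, 24, 12, 104), (45, 27, 10, 100),
]

def expected_points(wins: int, ot_losses: int) -> int:
    """NHL standings points: 2 per win, 1 per overtime/shootout loss."""
    return 2 * wins + ot_losses

def games_played(wins: int, losses: int, ot_losses: int) -> int:
    """Every game ends as a win, a regulation loss, or an overtime loss."""
    return wins + losses + ot_losses

mismatches = [r for r in rows if expected_points(r[0], r[2]) != r[3]]
print("rows that fail the points check:", mismatches)
print("games played:", sorted({games_played(w, l, o) for w, l, o, _ in rows}))
```

An empty mismatch list and a single games-played value are cheap evidence that the wins, losses, ot_losses and points columns were mapped to the right payload fields.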
Recipe 15: 🏒 NHL EDGE tracking leaderboard
NHL EDGE is the league's player- and puck-tracking system. The
nhl_edge_skater_speed_top_10 board surfaces the fastest skating bursts.
edge = safe("nhl_edge_skater_speed_top_10",
lambda: sdv.nhl.nhl_edge_skater_speed_top_10(positions="forwards",
sort_by="maxskatingspeed"))
(edge.head(10) if edge is not None and getattr(edge, "height", 0)
else "NHL EDGE leaderboard unavailable right now")
⚠️ nhl_edge_skater_speed_top_10: unavailable right now (NoESPNDataError)
'NHL EDGE leaderboard unavailable right now'
Recipe 16: 🏒 Premium PWHL pull (HockeyTech)
The women's pro league rides the HockeyTech feed. pwhl_standings() returns the
table; load_pwhl_schedules() reads the parquet release for a whole season.
pwhl_st = safe("pwhl_standings", lambda: sdv.pwhl.pwhl_standings(season=sdv.pwhl.most_recent_pwhl_season()))
pwhl_sched = safe("load_pwhl_schedules([2024])", lambda: sdv.pwhl.load_pwhl_schedules(seasons=[2024]))
print("standings rows:", None if pwhl_st is None else getattr(pwhl_st, "height", None),
"| schedule rows:", None if pwhl_sched is None else getattr(pwhl_sched, "height", None))
(pwhl_st.head() if pwhl_st is not None and getattr(pwhl_st, "height", 0)
else "PWHL standings unavailable right now")
⚠️ pwhl_standings: unavailable right now (ValueError)
✅ load_pwhl_schedules([2024])
standings rows: None | schedule rows: 85
'PWHL standings unavailable right now'
Recipe 17: 🏒 Junior hockey, schedules for all four CHL/AHL loops
Because AHL/OHL/WHL/QMJHL share one surface, a single loop tours every league's schedule.
rows = []
for lg, mod in {"ahl": ahl, "ohl": ohl, "whl": whl, "qmjhl": qmjhl}.items():
    season = safe(f"{lg} season", getattr(mod, f"most_recent_{lg}_season"))
    sch = (safe(f"{lg}_schedule", lambda mod=mod, lg=lg: getattr(mod, f"{lg}_schedule")())
           if season else None)
    rows.append({"league": lg.upper(), "season": season,
                 "games": None if sch is None else getattr(sch, "height", None)})
pl.DataFrame(rows)
✅ ahl season
✅ ahl_schedule
✅ ohl season
✅ ohl_schedule
✅ whl season
✅ whl_schedule
✅ qmjhl season
✅ qmjhl_schedule
shape: (4, 3)
┌────────┬────────┬───────┐
│ league ┆ season ┆ games │
│ ---    ┆ ---    ┆ ---   │
│ str    ┆ i64    ┆ i64   │
╞════════╪════════╪═══════╡
│ AHL    ┆ 2026   ┆ 10000 │
│ OHL    ┆ 2027   ┆ 10000 │
│ WHL    ┆ 2026   ┆ 10000 │
│ QMJHL  ┆ 2027   ┆ 10000 │
└────────┴────────┴───────┘
Recipe 18: 🎲 A quick odds peek (key-guarded)
odds.toa_sports() lists every in-season sport/league key. It's free
(doesn't touch your quota). Set a free ODDS_API_KEY to light it up.
if HAS_KEY:
    sports = safe("odds.toa_sports", lambda: odds.toa_sports(all_sports=False))
    out = (sports.select([c for c in ["key", "group", "title"] if c in sports.columns]).head(10)
           if sports is not None and getattr(sports, "height", 0) else "no in-season sports returned")
else:
    out = "set ODDS_API_KEY to run: odds.toa_sports() (free, doesn't touch quota)"
out
"set ODDS_API_KEY to run: odds.toa_sports() (free, doesn't touch quota)"
Recipe 19: 🎲 Live odds for a league (key-guarded)
odds.toa_sports_odds() returns long-format odds, one row per
event × book × market × outcome, which is exactly the shape you want for modelling.
if HAS_KEY:
    board = safe("odds.toa_sports_odds (NFL h2h)",
                 lambda: odds.toa_sports_odds(sport="americanfootball_nfl", regions="us", markets="h2h"))
    keep = ["home_team", "away_team", "bookmaker_key", "market_key", "outcome_name", "outcome_price"]
    out = (board.select([c for c in keep if c in board.columns]).head(10)
           if board is not None and getattr(board, "height", 0) else "no NFL odds on the board right now")
else:
    out = "set ODDS_API_KEY to run: odds.toa_sports_odds(sport='americanfootball_nfl')"
out
"set ODDS_API_KEY to run: odds.toa_sports_odds(sport='americanfootball_nfl')"
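Since the odds cells do not run without a key, here is an offline sketch of why the long format is convenient: finding the best available price per outcome is a single pass over the rows. The column names follow the `keep` list above; the teams, bookmakers and prices are invented, and the prices are assumed to be in a format where a larger number is better for the bettor. `best_prices` is a hypothetical helper.

```python
# Hypothetical long-format rows: one per event x book x market x outcome.
rows = [
    {"home_team": "Home Club", "away_team": "Away Club", "bookmaker_key": "book_a",
     "market_key": "h2h", "outcome_name": "Home Club", "outcome_price": 1.80},
    {"home_team": "Home Club", "away_team": "Away Club", "bookmaker_key": "book_b",
     "market_key": "h2h", "outcome_name": "Home Club", "outcome_price": 1.95},
    {"home_team": "Home Club", "away_team": "Away Club", "bookmaker_key": "book_a",
     "market_key": "h2h", "outcome_name": "Away Club", "outcome_price": 2.10},
    {"home_team": "Home Club", "away_team": "Away Club", "bookmaker_key": "book_b",
     "market_key": "h2h", "outcome_name": "Away Club", "outcome_price": 2.05},
]

def best_prices(rows):
    """For each (matchup, market, outcome), keep the row with the highest price."""
    best = {}
    for r in rows:
        key = (r["home_team"], r["away_team"], r["market_key"], r["outcome_name"])
        if key not in best or r["outcome_price"] > best[key]["outcome_price"]:
            best[key] = r
    return best

for (home, away, market, outcome), r in best_prices(rows).items():
    print(f"{away} at {home} [{market}] {outcome}: {r['outcome_price']} @ {r['bookmaker_key']}")
```

With a wide layout (one column per book) the same question needs a reshape first, which is the practical argument for keeping odds long.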
Recipe 20: Count the whole surface, per league 🔢
function_count() returns the exposed-function tally for every league, a quick
sense of how much each sport gives you. (HockeyTech + odds modules are counted in
their own submodules.)
counts = sdv.function_count()
df = (pl.DataFrame({"league": list(counts.keys()), "n_functions": list(counts.values())})
.sort("n_functions", descending=True))
print("Total wrappers across the counted leagues:", sum(counts.values()))
df
Total wrappers across the counted leagues: 4063
shape: (34, 2)
┌────────┬─────────────┐
│ league ┆ n_functions │
│ ---    ┆ ---         │
│ str    ┆ i64         │
╞════════╪═════════════╡
│ nhl    ┆ 337         │
│ mlb    ┆ 270         │
│ nfl    ┆ 237         │
│ cfb    ┆ 169         │
│ wnba   ┆ 162         │
│ …      ┆ …           │
│ pwhl   ┆ 44          │
│ ahl    ┆ 14          │
│ ohl    ┆ 14          │
│ qmjhl  ┆ 14          │
│ whl    ┆ 14          │
└────────┴─────────────┘
Where to next
You've now seen the whole map: every datasource, the naming contract that makes the package guessable, and 20 recipes spanning ten-plus leagues. Each sport has a dedicated tutorial that leads with its premium endpoints:
- 02_cfb_intro: 🏈 college football
- 03_nfl_intro: 🏈 NFL (api.nfl.com + nflverse)
- 04_nba_intro: 🏀 NBA
- 05_wbb_intro: 🏀 NCAA women's basketball
- 06_mbb_intro: 🏀 NCAA men's basketball
- 07_nhl_intro: 🏒 NHL (api-web + EDGE + ESPN)
- 08_wnba_intro: 🏀 WNBA
- 09_mlb_intro: ⚾ MLB (Stats API + Statcast + ESPN)
- 10_pwhl_intro: 🏒 PWHL
- 11_junior_hockey_intro: 🏒 AHL / OHL / WHL / QMJHL
- 12_odds_intro: 🎲 Betting odds (The Odds API)
Reference indexes: NBA · WNBA · MBB · WBB · NFL · CFB · MLB · NHL · PWHL · AHL · OHL · WHL · QMJHL · Odds.
Part of the SportsDataverse. The names here mirror the R sisters (hoopR, wehoop, cfbfastR, baseballr, fastRhockey, oddsapiR). Now go build something great!