sportsdataverse-py 
See CHANGELOG.md for details.
The goal of sportsdataverse-py is to provide the community with a Python package for working with sports data as a companion to the cfbfastR, hoopR, and wehoop R packages. Beyond easing data aggregation and tidying, sportsdataverse-py also supports benchmarking open-source expected points and win probability metrics for American football.
Quickstart
pip install sportsdataverse
# Today's NBA scoreboard as a polars DataFrame — no kwargs needed via parsed.*
from sportsdataverse.parsed.nba import espn_nba_scoreboard
df = espn_nba_scoreboard() # → polars
# Or via the original module with the return_parsed=True opt-in:
from sportsdataverse.nba import espn_nba_scoreboard
df = espn_nba_scoreboard(return_parsed=True)
print(df.select(["event_id", "home_name", "away_name",
                 "home_score", "away_score"]).head())
# Aaron Judge's 2024 season stats from the official MLB API
from sportsdataverse.mlb import mlb_api_person_stats, parse_mlb_api_person_stats
judge = parse_mlb_api_person_stats(
    mlb_api_person_stats(person_id=592450, stats="season", season=2024)
)
print(judge.select(["stats_group", "stat_home_runs", "stat_avg"]))
# Connor McDavid's 2024-25 EDGE skating speed profile
from sportsdataverse.nhl import nhl_edge_skater_detail, parse_edge_detail
mcdavid = parse_edge_detail(nhl_edge_skater_detail(8478402))
print(mcdavid.select(["player_first_name_default", "top_shot_speed_metric"]))
Every wrapper returns a raw Dict by default; pass
return_parsed=True (ESPN cross-league wrappers) or compose with the
matching parse_* function (NHL / MLB sibling APIs) to get a polars
DataFrame. See Polars / pandas parser layer
below.
Supported leagues and data sources
| League | Module | Surfaces covered |
|---|---|---|
| NBA | sportsdataverse.nba | ESPN (Site v2 + Web v3 + Core v2) — 118 wrappers |
| WNBA | sportsdataverse.wnba | ESPN — 124 wrappers |
| MBB (NCAA M) | sportsdataverse.mbb | ESPN + NCAA-only (bracketology, rankings, recruits) — 121 wrappers |
| WBB (NCAA W) | sportsdataverse.wbb | ESPN + NCAA-only — 126 wrappers |
| CFB | sportsdataverse.cfb | ESPN + NCAA + football-only (QBR) — 123 wrappers |
| NFL | sportsdataverse.nfl | ESPN + football-only (QBR) — 119 wrappers |
| MLB | sportsdataverse.mlb | ESPN + MLB Stats API (statsapi.mlb.com) + Baseball Savant / Statcast — 175 wrappers |
| NHL | sportsdataverse.nhl | api-web.nhle.com/v1/ (game-feed) + NHL EDGE (player tracking) + Stats REST + Records site — 132 wrappers |
| Total | | ~1,030 wrappers |
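
The per-league counts in the table can be added up as a quick arithmetic check against the rounded figure in the Total row. The counts below are copied from the table; nothing else is assumed.

```python
# Per-league wrapper counts, copied from the table above.
wrapper_counts = {
    "NBA": 118,
    "WNBA": 124,
    "MBB": 121,
    "WBB": 126,
    "CFB": 123,
    "NFL": 119,
    "MLB": 175,
    "NHL": 132,
}

# Exact sum of the listed counts.
total = sum(wrapper_counts.values())
print(f"{total:,} wrappers across {len(wrapper_counts)} leagues")
```

The exact sum is 1,038, in line with the approximate total quoted in the table.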
Polars / pandas parser layer
Every wrapper returns a raw Dict by default. The parser layer in
sportsdataverse._common_espn_parsers
(plus matching modules for the MLB and NHL sibling APIs) turns those
payloads into tidy polars (or pandas) DataFrames.
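
As a rough illustration of what turning a nested payload into tidy rows can look like, the sketch below flattens nested dicts into flat records with underscore-joined column names. It is not the library's parser: the `flatten` helper and the payload shape are hypothetical, invented only to show the idea, and it uses the standard library alone.

```python
from typing import Any, Dict, List


def flatten(payload: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into one level, joining keys with underscores."""
    flat: Dict[str, Any] = {}
    for key, value in payload.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical payload shape, invented for illustration only.
raw = {
    "events": [
        {
            "id": "1",
            "home": {"name": "A", "score": 101},
            "away": {"name": "B", "score": 99},
        },
    ]
}

# One flat record per event, ready to hand to a DataFrame constructor.
rows: List[Dict[str, Any]] = [flatten(event) for event in raw["events"]]
print(rows)
```

A list of flat dicts like `rows` is the kind of input that `polars.DataFrame(rows)` or `pandas.DataFrame(rows)` accepts.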
For ESPN wrappers, pass return_parsed=True to get a DataFrame
directly — the raw-Dict contract is unchanged when the kwarg is
omitted, so existing callers are unaffected:
from sportsdataverse.nba import espn_nba_team_roster
raw = espn_nba_team_roster(team_id=13) # → Dict (default)
df = espn_nba_team_roster(team_id=13, return_parsed=True) # → polars
pdf = espn_nba_team_roster(team_id=13,
                           return_parsed=True,
                           return_as_pandas=True)  # → pandas
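
The opt-in pattern above, where omitting the kwarg preserves the raw-Dict contract, can be sketched with a stand-in wrapper. Everything here is hypothetical: `_fetch_roster`, `team_roster`, and the canned payload are invented for illustration and are neither the library's implementation nor ESPN's actual response format.

```python
from typing import Any, Dict, List, Union

Row = Dict[str, Any]


def _fetch_roster(team_id: int) -> Dict[str, Any]:
    """Hypothetical stand-in for the HTTP request; returns a canned payload."""
    return {
        "team": {"id": team_id},
        "athletes": [
            {"athlete_id": 1, "name": "Player One"},
            {"athlete_id": 2, "name": "Player Two"},
        ],
    }


def team_roster(
    team_id: int, return_parsed: bool = False
) -> Union[Dict[str, Any], List[Row]]:
    """Return the raw payload by default; return flat rows only on request."""
    raw = _fetch_roster(team_id)
    if not return_parsed:
        return raw  # existing callers keep receiving the untouched Dict
    # Opt-in path: one flat record per athlete, tagged with the team id.
    return [{"team_id": raw["team"]["id"], **athlete} for athlete in raw["athletes"]]


raw = team_roster(13)                      # Dict, same as before the kwarg existed
rows = team_roster(13, return_parsed=True)  # list of flat records
```

Defaulting the flag to `False` is what keeps the change backward compatible: code written before the flag existed never sees a different return type.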
For NHL / MLB sibling-API wrappers, compose the wrapper with its parser:
from sportsdataverse.nhl import nhl_web_pbp, parse_nhl_web_pbp
df = parse_nhl_web_pbp(nhl_web_pbp(2023030417)) # 331-row polars frame
See the Architecture and Parsers pages for full details.
Installation
sportsdataverse-py can be installed via pip:
pip install sportsdataverse
or from the repo (which may at times be more up to date):
git clone https://github.com/sportsdataverse/sportsdataverse-py
cd sportsdataverse-py
pip install -e .
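
After either install route, one way to confirm which version ended up in the environment is to query the installed distribution metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical, and the distribution name is taken from the `pip install sportsdataverse` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "sportsdataverse") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("sportsdataverse is not installed in this environment")
else:
    print(f"sportsdataverse {version} is installed")
```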
Citations
To cite the sportsdataverse-py Python package in publications, use:
BibTeX Citation
@misc{gilani_sdvpy_2021,
  author = {Gilani, Saiem},
  title = {sportsdataverse-py: The SportsDataverse's Python Package for Sports Data.},
  url = {https://py.sportsdataverse.org},
  year = {2021}
}