sportsdataverse-py

See CHANGELOG.md for details.

The goal of sportsdataverse-py is to provide the community with a Python package for working with sports data, as a companion to the cfbfastR, hoopR, and wehoop R packages. Beyond making data aggregation and tidying easier, the package also supports benchmarking open-source expected points and win probability metrics for American football.

Quickstart

pip install sportsdataverse

# Today's NBA scoreboard as a polars DataFrame — no kwargs needed via parsed.*
from sportsdataverse.parsed.nba import espn_nba_scoreboard
df = espn_nba_scoreboard() # → polars

# Or via the original module with the return_parsed=True opt-in:
from sportsdataverse.nba import espn_nba_scoreboard
df = espn_nba_scoreboard(return_parsed=True)
print(df.select(["event_id", "home_name", "away_name",
                 "home_score", "away_score"]).head())

# Aaron Judge's 2024 season stats from the official MLB API
from sportsdataverse.mlb import mlb_api_person_stats, parse_mlb_api_person_stats
judge = parse_mlb_api_person_stats(
    mlb_api_person_stats(person_id=592450, stats="season", season=2024)
)
print(judge.select(["stats_group", "stat_home_runs", "stat_avg"]))

# Connor McDavid's 2024-25 EDGE skating speed profile
from sportsdataverse.nhl import nhl_edge_skater_detail, parse_edge_detail
mcdavid = parse_edge_detail(nhl_edge_skater_detail(8478402))
print(mcdavid.select(["player_first_name_default", "top_shot_speed_metric"]))

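Wrappers in the league modules (sportsdataverse.nba, sportsdataverse.mlb, and so on) return a raw Dict by default; pass return_parsed=True (ESPN cross-league wrappers) or compose with the matching parse_* function (NHL / MLB sibling APIs) to get a polars DataFrame. The sportsdataverse.parsed.* import shown first returns polars directly, with no kwargs. See the Polars / pandas parser layer section below.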

Supported leagues and data sources

| League | Module | Surfaces covered |
| --- | --- | --- |
| NBA | sportsdataverse.nba | ESPN (Site v2 + Web v3 + Core v2): 118 wrappers |
| WNBA | sportsdataverse.wnba | ESPN: 124 wrappers |
| MBB (NCAA M) | sportsdataverse.mbb | ESPN + NCAA-only (bracketology, rankings, recruits): 121 wrappers |
| WBB (NCAA W) | sportsdataverse.wbb | ESPN + NCAA-only: 126 wrappers |
| CFB | sportsdataverse.cfb | ESPN + NCAA + football-only (QBR): 123 wrappers |
| NFL | sportsdataverse.nfl | ESPN + football-only (QBR): 119 wrappers |
| MLB | sportsdataverse.mlb | ESPN + MLB Stats API (statsapi.mlb.com) + Baseball Savant / Statcast: 175 wrappers |
| NHL | sportsdataverse.nhl | api-web.nhle.com/v1/ (game-feed) + NHL EDGE (player tracking) + Stats REST + Records site: 132 wrappers |
| Total | | ~1,030 wrappers |
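
As a quick way to see which of the league modules listed in the table can be located in the current environment, something like the following sketch may help. The module paths are copied from the table; the `LEAGUE_MODULES` mapping and the `available_leagues` helper are hypothetical names introduced here and are not part of the package.

```python
import importlib.util
from typing import Dict

# League -> module path, copied from the table in this section.
LEAGUE_MODULES: Dict[str, str] = {
    "NBA": "sportsdataverse.nba",
    "WNBA": "sportsdataverse.wnba",
    "MBB": "sportsdataverse.mbb",
    "WBB": "sportsdataverse.wbb",
    "CFB": "sportsdataverse.cfb",
    "NFL": "sportsdataverse.nfl",
    "MLB": "sportsdataverse.mlb",
    "NHL": "sportsdataverse.nhl",
}


def available_leagues(modules: Dict[str, str] = LEAGUE_MODULES) -> Dict[str, bool]:
    """Report, per league, whether its module can be located.

    Note: locating a submodule imports the parent package as a side effect.
    """
    status: Dict[str, bool] = {}
    for league, module_path in modules.items():
        try:
            status[league] = importlib.util.find_spec(module_path) is not None
        except ImportError:
            # Raised when the parent package is missing or fails to import.
            status[league] = False
    return status


for league, ok in available_leagues().items():
    print(f"{league:5s} {'available' if ok else 'missing'}")
```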

Polars / pandas parser layer

Every wrapper returns a raw Dict by default. The parser layer in sportsdataverse._common_espn_parsers (plus matching modules for the MLB and NHL sibling APIs) turns those payloads into tidy polars (or pandas) DataFrames.

For ESPN wrappers, pass return_parsed=True to get a DataFrame directly — the raw-Dict contract is unchanged when the kwarg is omitted, so existing callers are unaffected:

from sportsdataverse.nba import espn_nba_team_roster

raw = espn_nba_team_roster(team_id=13) # → Dict (default)
df = espn_nba_team_roster(team_id=13, return_parsed=True) # → polars
pdf = espn_nba_team_roster(team_id=13,
                           return_parsed=True,
                           return_as_pandas=True) # → pandas

For NHL / MLB sibling-API wrappers, compose the wrapper with its parser:

from sportsdataverse.nhl import nhl_web_pbp, parse_nhl_web_pbp
df = parse_nhl_web_pbp(nhl_web_pbp(2023030417)) # 331-row polars frame
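
If the wrapper-plus-parser composition is repeated often, a small helper can keep call sites short. The sketch below is one possible shape. `fetch_parsed` is a hypothetical name that the package does not provide, and the stub wrapper and parser exist only so the example runs offline.

```python
from typing import Any, Callable, Dict, List, TypeVar

T = TypeVar("T")


def fetch_parsed(
    wrapper: Callable[..., Any],
    parser: Callable[[Any], T],
    *args: Any,
    **kwargs: Any,
) -> T:
    """Call `wrapper` with the given arguments and pass its payload to `parser`."""
    payload = wrapper(*args, **kwargs)
    return parser(payload)


# Stubs with an assumed payload shape. With the library installed, the call
# would instead read: fetch_parsed(nhl_web_pbp, parse_nhl_web_pbp, 2023030417)
def _stub_wrapper(game_id: int) -> Dict[str, Any]:
    return {"id": game_id, "plays": [{"eventId": 1}, {"eventId": 2}]}


def _stub_parser(payload: Dict[str, Any]) -> List[Dict[str, int]]:
    return [{"game_id": payload["id"], "event_id": p["eventId"]} for p in payload["plays"]]


parsed_rows = fetch_parsed(_stub_wrapper, _stub_parser, 2023030417)
```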

See the Architecture and Parsers pages for full details.

Installation

sportsdataverse-py can be installed via pip:

pip install sportsdataverse

or from the repo (which may at times be more up to date):

git clone https://github.com/sportsdataverse/sportsdataverse-py
cd sportsdataverse-py
pip install -e .
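
After either install route, it can be useful to confirm which version ended up in the environment. The following sketch uses only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `sportsdataverse` is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "sportsdataverse") -> Optional[str]:
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
if found:
    print(f"sportsdataverse version: {found}")
else:
    print("sportsdataverse is not installed in this environment")
```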

Our Authors

Citations

To cite the sportsdataverse-py Python package in publications, use:

BibTeX Citation

@misc{gilani_sdvpy_2021,
  author = {Gilani, Saiem},
  title = {sportsdataverse-py: The SportsDataverse's Python Package for Sports Data.},
  url = {https://py.sportsdataverse.org},
  year = {2021}
}