sportsdataverse.nhl package

Submodules

sportsdataverse.nhl.nhl_api module

sportsdataverse.nhl.nhl_api.nhl_api_pbp(game_id: int, **kwargs) → Dict

nhl_api_pbp() - Pull the game by id. Data from API endpoints - nhl/playbyplay, nhl/summary

  • Parameters: game_id (int) – Unique game_id, can be obtained from nhl_api_schedule().
  • Returns: Dictionary of game data with keys - “datetime”, “game”, “gameId”, “gameLink”, “players”, “status”, “teams”, “venues”
  • Return type: Dict

Example

Pull a single game’s metadata via the legacy NHL Stats API endpoint:

from sportsdataverse.nhl import nhl_api_pbp
game = nhl_api_pbp(game_id=2021020079)
sorted(game.keys()) # ['datetime', 'game', 'gameId', 'gameLink', 'players', 'status', 'teams', 'venues']
print(game["gameId"], game["status"]["abstractGameState"])

Inspect the home / away team summary blocks:

game["teams"]["home"]["name"], game["teams"]["away"]["name"]

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_api.nhl_api_schedule(start_date: str, end_date: str, return_as_pandas=False, **kwargs) → DataFrame

nhl_api_schedule() - Pull the schedule by start and end date. Data from API endpoints - nhl/schedule

  • Parameters:
    • start_date (str) – Start date to pull the NHL API schedule.
    • end_date (str) – End date to pull the NHL API schedule.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the schedule for the requested date range.
  • Return type: pl.DataFrame
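
The start_date and end_date strings in the example below use the YYYY-MM-DD form. As a minimal sketch, such a pair can be built from a datetime.date with the standard library; date_window is a hypothetical helper written for illustration and is not part of sportsdataverse:

```python
from datetime import date, timedelta
from typing import Tuple


def date_window(start: date, days: int) -> Tuple[str, str]:
    """Return (start_date, end_date) as ISO strings, with end = start + days."""
    end = start + timedelta(days=days)
    return start.isoformat(), end.isoformat()


start_date, end_date = date_window(date(2021, 10, 23), days=5)
print(start_date, end_date)  # 2021-10-23 2021-10-28
```

The two strings can then be passed as the start_date and end_date arguments.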

Example

Pull a one-week schedule slice:

from sportsdataverse.nhl import nhl_api_schedule
sched = nhl_api_schedule(start_date="2021-10-23", end_date="2021-10-28")
print(sched.shape)
sched.select(["gamePk", "gameDate", "teams.home.team.name", "teams.away.team.name"]).head()

Pandas round-trip:

sched_pd = nhl_api_schedule(
    start_date="2021-10-23", end_date="2021-10-28", return_as_pandas=True
)
sched_pd[["gamePk", "gameDate", "status.detailedState"]].head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_game_rosters module

sportsdataverse.nhl.nhl_game_rosters.espn_nhl_game_rosters(game_id: int, raw=False, return_as_pandas=False, **kwargs) → DataFrame

espn_nhl_game_rosters() - Pull the game rosters by id.

  • Parameters:
    • game_id (int) – Unique game_id, can be obtained from espn_nhl_schedule().
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe of game roster data with columns: ‘athlete_id’, ‘athlete_uid’, ‘athlete_guid’, ‘athlete_type’, ‘first_name’, ‘last_name’, ‘full_name’, ‘athlete_display_name’, ‘short_name’, ‘weight’, ‘display_weight’, ‘height’, ‘display_height’, ‘age’, ‘date_of_birth’, ‘slug’, ‘jersey’, ‘linked’, ‘active’, ‘alternate_ids_sdr’, ‘birth_place_city’, ‘birth_place_state’, ‘birth_place_country’, ‘headshot_href’, ‘headshot_alt’, ‘experience_years’, ‘experience_display_value’, ‘experience_abbreviation’, ‘status_id’, ‘status_name’, ‘status_type’, ‘status_abbreviation’, ‘hand_type’, ‘hand_abbreviation’, ‘hand_display_value’, ‘draft_display_text’, ‘draft_round’, ‘draft_year’, ‘draft_selection’, ‘player_id’, ‘starter’, ‘valid’, ‘did_not_play’, ‘display_name’, ‘ejected’, ‘athlete_href’, ‘position_href’, ‘statistics_href’, ‘team_id’, ‘team_guid’, ‘team_uid’, ‘team_slug’, ‘team_location’, ‘team_name’, ‘team_abbreviation’, ‘team_display_name’, ‘team_short_display_name’, ‘team_color’, ‘team_alternate_color’, ‘is_active’, ‘is_all_star’, ‘logo_href’, ‘logo_dark_href’, ‘game_id’
  • Return type: pl.DataFrame

Example

Pull both teams’ rosters for a single game (Stanley Cup Final 2023):

from sportsdataverse.nhl import espn_nhl_game_rosters
rosters = espn_nhl_game_rosters(game_id=401559395)
print(rosters.shape)
rosters.select(["athlete_display_name", "jersey", "team_abbreviation", "starter"]).head(10)

Just the starters:

import polars as pl
rosters.filter(pl.col("starter") == True).select(["athlete_display_name", "team_abbreviation"])

Pandas round-trip:

rosters_pd = espn_nhl_game_rosters(game_id=401559395, return_as_pandas=True)
rosters_pd[["athlete_display_name", "team_abbreviation", "did_not_play"]].head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_game_rosters.helper_nhl_athlete_items(teams_rosters, **kwargs)

sportsdataverse.nhl.nhl_game_rosters.helper_nhl_game_items(summary)

sportsdataverse.nhl.nhl_game_rosters.helper_nhl_roster_items(items, summary_url, **kwargs)

sportsdataverse.nhl.nhl_game_rosters.helper_nhl_team_items(items, **kwargs)

sportsdataverse.nhl.nhl_loaders module

sportsdataverse.nhl.nhl_loaders.load_nhl_pbp(seasons: List[int], return_as_pandas=False) → DataFrame

Load NHL play by play data going back to 2011

  • Parameters:
    • seasons (list) – Seasons to load. 2011 is the earliest available season.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the play-by-plays available for the requested seasons.
  • Return type: pl.DataFrame
  • Raises: ValueError – If season is less than 2011.
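
The season rules above (a list of seasons, a 2011 floor, and a ValueError below it) can be sketched in plain Python. This is only an illustration under stated assumptions: normalize_seasons and MIN_PBP_SEASON are hypothetical names, not part of sportsdataverse, and the library's own validation may differ.

```python
from typing import Iterable, List, Union

# Hypothetical constant: the earliest play-by-play season named in the docs.
MIN_PBP_SEASON = 2011


def normalize_seasons(
    seasons: Union[int, Iterable[int]], min_season: int = MIN_PBP_SEASON
) -> List[int]:
    """Return seasons as a list of ints, rejecting any season before min_season."""
    # Accept a bare int as shorthand for a one-element list.
    if isinstance(seasons, int):
        seasons = [seasons]
    result = [int(s) for s in seasons]
    for season in result:
        if season < min_season:
            raise ValueError(f"season cannot be less than {min_season}, got {season}")
    return result


print(normalize_seasons(2023))               # [2023]
print(normalize_seasons(range(2018, 2021)))  # [2018, 2019, 2020]
```

Validating before any download starts means a bad season fails fast instead of partway through a multi-season pull.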

Example

Pull a single season’s play-by-play parquet:

from sportsdataverse.nhl import load_nhl_pbp
pbp = load_nhl_pbp(seasons=2023)
print(pbp.shape)

Pull a range of seasons:

pbp = load_nhl_pbp(seasons=range(2018, 2024))
pbp.group_by("season").len().sort("season")

Filter to goal events and round-trip to pandas:

import polars as pl
goals = pbp.filter(pl.col("type_text") == "Goal")
goals_pd = goals.to_pandas()
goals_pd[["season", "period", "time", "text"]].head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_loaders.load_nhl_player_boxscore(seasons: List[int], return_as_pandas=False) → DataFrame

Load NHL player boxscore data

  • Parameters:
    • seasons (list) – Seasons to load. 2011 is the earliest available season.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the player boxscores available for the requested seasons.
  • Return type: pl.DataFrame
  • Raises: ValueError – If season is less than 2011.

Example

Pull player box scores for a single season:

from sportsdataverse.nhl import load_nhl_player_boxscore
pb = load_nhl_player_boxscore(seasons=2023)
print(pb.shape)

Top 10 single-game point performers:

import polars as pl
pb.with_columns(points=pl.col("goals") + pl.col("assists")).sort(
    "points", descending=True
).select(["game_id", "athlete_display_name", "goals", "assists", "points"]).head(10)

Pandas round-trip across multiple seasons:

pb_pd = load_nhl_player_boxscore(seasons=range(2020, 2024), return_as_pandas=True)
pb_pd.groupby("season")[["goals", "assists"]].sum()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_loaders.load_nhl_schedule(seasons: List[int], return_as_pandas=False) → DataFrame

Load NHL schedule data

  • Parameters:
    • seasons (list) – Seasons to load. 2002 is the earliest available season.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the schedule for the requested seasons.
  • Return type: pl.DataFrame
  • Raises: ValueError – If season is less than 2002.

Example

Pull a single season’s schedule:

from sportsdataverse.nhl import load_nhl_schedule
sched = load_nhl_schedule(seasons=2023)
print(sched.shape)

Pull a range of seasons and count by status:

sched = load_nhl_schedule(seasons=range(2018, 2024))
sched.group_by(["season", "status_type_description"]).len().sort(["season", "len"])

Pandas round-trip with a single season:

sched_pd = load_nhl_schedule(seasons=[2023], return_as_pandas=True)
sched_pd[["game_id", "home_name", "away_name", "game_date"]].head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_loaders.load_nhl_team_boxscore(seasons: List[int], return_as_pandas=False) → DataFrame

Load NHL team boxscore data

  • Parameters:
    • seasons (list) – Seasons to load. 2011 is the earliest available season.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the team boxscores available for the requested seasons.
  • Return type: pl.DataFrame
  • Raises: ValueError – If season is less than 2011.

Example

Pull team box scores for a single season:

from sportsdataverse.nhl import load_nhl_team_boxscore
tb = load_nhl_team_boxscore(seasons=2023)
print(tb.shape)

Pull a range of seasons:

tb = load_nhl_team_boxscore(seasons=range(2018, 2024))
tb.group_by("season").len().sort("season")

Tampa Bay Lightning (team_id 14) game-by-game scoring:

import polars as pl
tb.filter(pl.col("team_id") == 14).select(["game_id", "team_score", "opponent_team_score"]).head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_loaders.nhl_teams(return_as_pandas=False) → DataFrame

Load NHL team ID information and logos

  • Parameters: return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing NHL team IDs and logos.
  • Return type: pl.DataFrame

Example

Pull the static teams + logos table:

from sportsdataverse.nhl import nhl_teams
teams = nhl_teams()
print(teams.shape)
teams.head()

Pandas round-trip — convenient for joining against your own roster table:

teams_pd = nhl_teams(return_as_pandas=True)
list(teams_pd.columns)[:10]

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_pbp module

sportsdataverse.nhl.nhl_pbp.espn_nhl_pbp(game_id: int, raw=False, **kwargs) → Dict

espn_nhl_pbp() - Pull the game by id. Data from API endpoints - nhl/playbyplay, nhl/summary

  • Parameters:
    • game_id (int) – Unique game_id, can be obtained from espn_nhl_schedule().
    • raw (bool) – If True, returns the unparsed payload instead of the parsed feed.
  • Returns: Dictionary of game data with keys - “gameId”, “plays”, “boxscore”, “header”, “broadcasts”, “videos”, “playByPlaySource”, “standings”, “leaders”, “seasonseries”, “pickcenter”, “againstTheSpread”, “odds”, “onIce”, “gameInfo”, “season”
  • Return type: Dict

Example

Pull a single game’s parsed feed (Stanley Cup Finals 2023 game):

from sportsdataverse.nhl import espn_nhl_pbp
game = espn_nhl_pbp(game_id=401559395)
list(game.keys()) # 'gameId', 'plays', 'boxscore', ...

Inspect parsed plays and a quick filter on goal events:

import polars as pl
plays = pl.DataFrame(game["plays"])
print(plays.shape)
goals = plays.filter(pl.col("type.text") == "Goal")
goals.select(["period", "time", "text"]).head()

Pull the unparsed payload for custom downstream parsing:

raw = espn_nhl_pbp(game_id=401559395, raw=True)
sorted(raw.keys())[:5]

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_pbp.helper_nhl_game_data(pbp_txt, init)

sportsdataverse.nhl.nhl_pbp.helper_nhl_pbp(game_id, pbp_txt)

sportsdataverse.nhl.nhl_pbp.helper_nhl_pbp_features(game_id, pbp_txt, init)

sportsdataverse.nhl.nhl_pbp.helper_nhl_pickcenter(pbp_txt)

sportsdataverse.nhl.nhl_pbp.nhl_pbp_disk(game_id, path_to_json)

sportsdataverse.nhl.nhl_schedule module

sportsdataverse.nhl.nhl_schedule.espn_nhl_calendar(season=None, ondays=None, return_as_pandas=False, **kwargs) → DataFrame

espn_nhl_calendar - look up the NHL calendar for a given season

  • Parameters:
    • season (int) – Season year to look up. 2002 is the earliest available season.
    • ondays (boolean) – If True, returns only the calendar on-days (dates on which games are played).
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing calendar dates for the requested season.
  • Return type: pl.DataFrame
  • Raises: ValueError – If season is less than 2002.

Example

Calendar dates for a season:

from sportsdataverse.nhl import espn_nhl_calendar
cal = espn_nhl_calendar(season=2023)
print(cal.shape)
cal.head()

Just the on-days (game-played dates), useful for batch loops:

ondays = espn_nhl_calendar(season=2023, ondays=True)
for url in ondays["url"].head(3).to_list():
    print(url)

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_schedule.espn_nhl_schedule(dates=None, season_type=None, limit=500, return_as_pandas=False, **kwargs) → DataFrame

espn_nhl_schedule - look up the NHL schedule for a given date

  • Parameters:
    • dates (int) – Date (YYYYMMDD) or season year (YYYY) to pull the schedule for. 2002 is the earliest available season.
    • season_type (int) – season type, 1 for pre-season, 2 for regular season, 3 for post-season, 4 for all-star, 5 for off-season
    • limit (int) – number of records to return, default: 500.
    • return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing the schedule for the requested date or season. Returns None if there are no games.
  • Return type: pl.DataFrame
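
The dates and season_type arguments above are plain integers. The sketch below shows one way to generate YYYYMMDD keys for a run of days and to name the season-type codes listed in the parameters. Both espn_date_keys and SeasonType are hypothetical helpers written for illustration; neither is part of sportsdataverse.

```python
from datetime import date, timedelta
from enum import IntEnum
from typing import Iterator


class SeasonType(IntEnum):
    """Hypothetical names for the season_type codes listed in the parameters."""
    PRE_SEASON = 1
    REGULAR_SEASON = 2
    POST_SEASON = 3
    ALL_STAR = 4
    OFF_SEASON = 5


def espn_date_keys(start: date, end: date) -> Iterator[int]:
    """Yield one YYYYMMDD integer per day from start to end, inclusive."""
    current = start
    while current <= end:
        yield int(current.strftime("%Y%m%d"))
        current += timedelta(days=1)


keys = list(espn_date_keys(date(2023, 6, 12), date(2023, 6, 14)))
print(keys)                            # [20230612, 20230613, 20230614]
print(int(SeasonType.REGULAR_SEASON))  # 2
```

Each key could then be passed as dates, for example in a loop that skips days for which the call returns None.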

Example

Pull a single date’s slate (YYYYMMDD):

from sportsdataverse.nhl import espn_nhl_schedule
sched = espn_nhl_schedule(dates=20230613) # 2023 Stanley Cup Final game date
print(sched.shape)
sched.select(["game_id", "home_name", "away_name", "status_type_description"]).head()

Pull a regular-season slate from a season-year:

reg = espn_nhl_schedule(dates=2023, season_type=2, limit=500)
reg.group_by("status_type_description").len().sort("len", descending=True)

Pandas round-trip for one date:

espn_nhl_schedule(dates=20230613, return_as_pandas=True).head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_schedule.most_recent_nhl_season()

most_recent_nhl_season - return the season year for “today”.

NHL seasons are labeled by the year in which they end. From October onward, when a new season has just started, the label flips to the next calendar year; otherwise the current calendar year is returned.

  • Returns: A season year suitable for season-aware loaders / schedule helpers.
  • Return type: int
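
The labeling rule described above can be sketched as follows. This is an illustration, not the library's source: season_label_for is a hypothetical name, and the sketch assumes the flip applies to every month from October through December.

```python
from datetime import date
from typing import Optional


def season_label_for(today: Optional[date] = None) -> int:
    """Return the end-year label of the NHL season for the given date."""
    today = today or date.today()
    # Assumption: October, November and December belong to the season
    # that ends in the following calendar year.
    return today.year + 1 if today.month >= 10 else today.year


print(season_label_for(date(2023, 10, 15)))  # 2024
print(season_label_for(date(2024, 3, 1)))    # 2024
print(season_label_for(date(2024, 7, 1)))    # 2024
```

Taking the date as an argument keeps the rule testable without depending on the day the code runs.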

Example

Use as a default season for downstream calls:

from sportsdataverse.nhl import most_recent_nhl_season, espn_nhl_calendar
season = most_recent_nhl_season()
cal = espn_nhl_calendar(season=season)
print(season, cal.height)

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_schedule.scoreboard_event_parsing(event)

sportsdataverse.nhl.nhl_schedule.year_to_season(year)

year_to_season - format a starting year as the canonical YYYY-YY season string.

NHL season strings (used by statsapi / api-web.nhle.com) are of the form "2023-24". This helper converts a starting year (2023) into that string.

  • Parameters: year – Starting calendar year of the season (e.g. 2023).
  • Returns: Season string formatted as "YYYY-YY".
  • Return type: str
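
A minimal sketch of the conversion described above, assuming the suffix is simply the last two digits of the following year. format_season is a hypothetical stand-in and may not match the library's implementation line for line.

```python
def format_season(year: int) -> str:
    """Format a starting year as 'YYYY-YY', e.g. 2023 -> '2023-24'."""
    # The modulo keeps two digits and wraps the century boundary to '00'.
    end_two_digits = (year + 1) % 100
    return f"{year}-{end_two_digits:02d}"


print(format_season(2023))  # 2023-24
print(format_season(2009))  # 2009-10
print(format_season(1999))  # 1999-00
```

The zero-padded format specifier is what keeps seasons such as 1999-00 and 2008-09 at two digits.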

Example

Convert a starting year:

from sportsdataverse.nhl import year_to_season
year_to_season(2023) # '2023-24'
year_to_season(2009) # '2009-10'
year_to_season(1999) # '1999-00'

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

sportsdataverse.nhl.nhl_teams module

sportsdataverse.nhl.nhl_teams.espn_nhl_teams(return_as_pandas=False, **kwargs) → DataFrame

espn_nhl_teams - look up NHL teams

  • Parameters: return_as_pandas (bool) – If True, returns a pandas dataframe. If False, returns a polars dataframe.
  • Returns: Polars dataframe containing teams for the requested league. This function caches by default, so if you want to refresh the data, use the command sportsdataverse.nhl.espn_nhl_teams.cache_clear().
  • Return type: pl.DataFrame
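
To illustrate the caching behaviour mentioned above, here is a small standard-library sketch of how a function wrapped in functools.lru_cache behaves. fetch_teams is a hypothetical stand-in that returns placeholder values; it does not call ESPN and is not the library's code.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def fetch_teams() -> tuple:
    """Hypothetical stand-in for a cached network call."""
    global call_count
    call_count += 1
    return ("team-a", "team-b")  # placeholder data, not real API output


fetch_teams()
fetch_teams()              # served from the cache; the body does not run again
print(call_count)          # 1
fetch_teams.cache_clear()  # drop the cached result
fetch_teams()              # the body runs again
print(call_count)          # 2
```

Repeated calls are cheap until cache_clear() is called, which is why a refresh needs an explicit cache reset.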

Example

Pull the full NHL team directory:

from sportsdataverse.nhl import espn_nhl_teams
teams = espn_nhl_teams()
print(teams.shape)
teams.select(["team_id", "team_abbreviation", "team_display_name"]).head()

Find Tampa Bay Lightning (team_id 14):

import polars as pl
teams.filter(pl.col("team_id") == "14").to_dicts()

Refresh the cache (the call is lru_cache’d) and round-trip to pandas:

espn_nhl_teams.cache_clear()
teams_pd = espn_nhl_teams(return_as_pandas=True)
teams_pd[["team_id", "team_abbreviation", "team_display_name"]].head()

See Also:

  • fastRhockey: R companion package; mirrors this API surface
  • nhl-api-py: alternative Python source for the NHL stats API

Module contents