Version: 0.0.70

⚾ Baseball with sportsdataverse-py

Welcome to the ballpark! 🏟️ In just a few lines of Python you can pull official MLB data (schedules, standings, rosters, box scores, play-by-play) straight from the league's own MLB Stats API, plus pitch-level Statcast tracking from Baseball Savant. Every premium call hands you back a tidy polars DataFrame (or raw JSON when you want it), ready to model. 🚀

If you've used the R package baseballr, or Python's pybaseball, the data shapes will feel right at home. Let's play ball! ⚾

🧰 The toolbox

We lead with the premium sources: the MLB Stats API (mlb_*, backed by statsapi.mlb.com) and the comprehensive Statcast surface (mlb_statcast_*, from Baseball Savant). ESPN (espn_mlb_*) is a handy secondary path. Click any name for the full reference:

| Function | What it gives you | Source |
| --- | --- | --- |
| mlb_schedule · parse_mlb_api_schedule | Games for a date / range, one row per game (with game_pk) | 🟢 MLB Stats API |
| mlb_teams · parse_mlb_api_teams | Every club, one row per team | 🟢 MLB Stats API |
| mlb_standings · parse_mlb_api_standings | Division standings: wins, losses, run diff | 🟢 MLB Stats API |
| mlb_team_roster | A team's roster, one row per player | 🟢 MLB Stats API |
| mlb_person | A player's bio (one tidy row) | 🟢 MLB Stats API |
| mlb_person_stats · parse_mlb_api_person_stats | A player's season stat splits | 🟢 MLB Stats API |
| mlb_boxscore | Full game box score | 🟢 MLB Stats API |
| mlb_play_by_play | Plate-appearance-level play-by-play | 🟢 MLB Stats API |
| mlb_stats_leaders | League leaders for any stat (HR, AVG, ERA, …) | 🟢 MLB Stats API |
| mlb_win_probability | Per-play win probability + WPA for a game | 🟢 MLB Stats API |
| mlb_awards · mlb_award_recipients | Award catalog + season winners (MVP, Cy Young, …) | 🟢 MLB Stats API |
| mlb_draft | Amateur draft board, one row per pick | 🟢 MLB Stats API |
| mlb_statcast_search | Every pitch matching a filter: ~110 cols/pitch; auto date-chunks past the 25k cap; friendly filters (batters_lookup, pitch_type, at_bat_result, …) | 🔵 Statcast |
| mlb_statcast_search_minors · mlb_statcast_search_wbc | Same pitch search for MiLB and the World Baseball Classic | 🔵 Statcast |
| mlb_statcast_leaderboard_* (37 of them), e.g. …_sprint_speed, …_expected_stats, …_bat_tracking, …_outs_above_average | Every Savant leaderboard: expected stats, sprint speed, bat tracking, pitch arsenals/movement/tempo, OAA, arm strength, catcher framing/blocking/throwing, baserunning, park factors, … | 🔵 Statcast |
| mlb_statcast_gamefeed | Savant single-game feed, one tidy row per pitch | 🔵 Statcast |
| mlb_statcast_player | A player's Savant page metrics | 🔵 Statcast |
| espn_mlb_teams · espn_mlb_schedule | ESPN teams / schedule (wide frames) | ⚪ ESPN |
| most_recent_mlb_season | Current season helper | ⚪ helper |

🔌 Setup

pip install sportsdataverse

No API key needed for any of the premium MLB endpoints: the MLB Stats API and Baseball Savant are both public. 🎉

import polars as pl
import sportsdataverse.mlb as mlb

pl.Config.set_tbl_rows(12)
print("most recent MLB season:", mlb.most_recent_mlb_season())
most recent MLB season: 2026

The MLB Stats API and Savant are public and reliable, but they're still live network calls: a date with no games, an offseason day, or a blip can make a call come back empty. So we use a tiny safe() helper: you get the frame when the feed is up, and a friendly one-liner when it isn't (never a scary traceback). 🛟

We also pick a stable completed-season date for our examples so the page renders the same in June as in October.

def safe(label, thunk):
    """Run a live call defensively: return its result, or print a one-liner."""
    try:
        out = thunk()
        print(f"✅ {label}")
        return out
    except Exception as e:  # noqa: BLE001 -- demo resilience
        print(f"⏭️ {label}: unavailable right now ({type(e).__name__})")
        return None

# A known completed regular-season slate, stable for the docs build.
SAMPLE_SEASON = 2024
SAMPLE_DATE = "2024-07-01"  # YYYY-MM-DD for the Stats API
JUDGE_ID = 592450           # Aaron Judge, NYY (our running example player)
YANKEES_ID = 147            # New York Yankees team_id
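When the likely failure is a short network blip, a retrying variant of the helper can be handy. The sketch below is an assumption layered on safe(), not part of sportsdataverse: the name safe_retry and its retries and delay parameters are hypothetical.

```python
import time


def safe_retry(label, thunk, retries=3, delay=0.5):
    """Call thunk up to `retries` times; return its result, or None.

    Hypothetical helper (not part of sportsdataverse). `delay` is the
    pause in seconds between failed attempts.
    """
    for attempt in range(1, retries + 1):
        try:
            out = thunk()
            print(f"ok {label} (attempt {attempt})")
            return out
        except Exception as e:  # noqa: BLE001 -- demo resilience
            print(f"retry {label}: {type(e).__name__} on attempt {attempt}")
            if attempt < retries:
                time.sleep(delay)
    # Every attempt raised: fall back to None, like safe() does.
    return None
```

It takes the same label and zero-argument callable as safe(), so it can be swapped in wherever a call tends to fail intermittently.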

📅 The schedule (MLB Stats API)

mlb_schedule returns the raw JSON dict; its partner parse_mlb_api_schedule flattens it to one row per game. The most important column is game_pk: that's the id you feed to the box score and play-by-play endpoints. Pass a single date=, or a start_date/end_date range, team_id, or season.

schedule = safe(
    "schedule",
    lambda: mlb.parse_mlb_api_schedule(mlb.mlb_schedule(date=SAMPLE_DATE)),
)
cols = ["game_pk", "status_detailed_state",
        "teams_away_team_name", "teams_away_score",
        "teams_home_team_name", "teams_home_score"]
(schedule.select([c for c in cols if c in schedule.columns]).head()
 if schedule is not None else "schedule unavailable right now")

✅ schedule

shape: (3, 6)
┌─────────┬─────────────────┬─────────────────┬─────────────────┬─────────────────┬────────────────┐
│ game_pk ┆ status_detailed ┆ teams_away_team ┆ teams_away_scor ┆ teams_home_team ┆ teams_home_sco │
│ ---     ┆ _state          ┆ _name           ┆ e               ┆ _name           ┆ re             │
│ i64     ┆ ---             ┆ ---             ┆ ---             ┆ ---             ┆ ---            │
│         ┆ str             ┆ str             ┆ i64             ┆ str             ┆ i64            │
╞═════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╪════════════════╡
│ 744914  ┆ Final           ┆ Houston Astros  ┆ 3               ┆ Toronto Blue    ┆ 1              │
│         ┆                 ┆                 ┆                 ┆ Jays            ┆                │
│ 744840  ┆ Final           ┆ New York Mets   ┆ 9               ┆ Washington      ┆ 7              │
│         ┆                 ┆                 ┆                 ┆ Nationals       ┆                │
│ 746535  ┆ Final           ┆ Milwaukee       ┆ 7               ┆ Colorado        ┆ 8              │
│         ┆                 ┆ Brewers         ┆                 ┆ Rockies         ┆                │
└─────────┴─────────────────┴─────────────────┴─────────────────┴─────────────────┴────────────────┘
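To make "flattens it to one row per game" concrete, here is a minimal sketch in plain Python. It is not the library's parser: the helper names (snake, flatten, schedule_rows) are hypothetical, and the payload shape (a dates list whose entries hold a games list) is an assumption about the raw response. The sample payload is hand-built from the first game in the output above.

```python
import re


def snake(name):
    """Convert camelCase (e.g. 'gamePk') to snake_case ('game_pk')."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining key paths with '_'."""
    row = {}
    for key, value in obj.items():
        col = f"{prefix}_{snake(key)}" if prefix else snake(key)
        if isinstance(value, dict):
            row.update(flatten(value, col))
        else:
            row[col] = value
    return row


def schedule_rows(payload):
    """One flat row per game; assumes games sit under payload['dates'][i]['games']."""
    return [flatten(game)
            for day in payload.get("dates", [])
            for game in day.get("games", [])]


# Hand-built payload mirroring the first game in the output above (assumed shape).
payload = {"dates": [{"games": [{
    "gamePk": 744914,
    "status": {"detailedState": "Final"},
    "teams": {
        "away": {"team": {"name": "Houston Astros"}, "score": 3},
        "home": {"team": {"name": "Toronto Blue Jays"}, "score": 1},
    },
}]}]}

rows = schedule_rows(payload)
print(rows[0])
```

Joining each key path with underscores is what yields names such as teams_away_team_name in this sketch, which lines up with the column names shown in the frame.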

🏆 Standings (MLB Stats API)

mlb_standings covers both leagues by default (league_id="103,104"). parse_mlb_api_standings returns one row per team with wins/losses, division rank, and winning percentage.

standings = safe(
    "standings",
    lambda: mlb.parse_mlb_api_standings(mlb.mlb_standings(season=SAMPLE_SEASON)),
)
keep = ["team_name", "standings_division_name", "wins", "losses",
        "winning_percentage", "division_rank"]
(standings.select([c for c in keep if c in standings.columns])
 .sort("wins", descending=True).head(10)
 if standings is not None else "standings unavailable right now")

✅ standings

shape: (10, 6)
┌───────────┬─────────────────────────┬──────┬────────┬────────────────────┬───────────────┐
│ team_name ┆ standings_division_name ┆ wins ┆ losses ┆ winning_percentage ┆ division_rank │
│ ---       ┆ ---                     ┆ ---  ┆ ---    ┆ ---                ┆ ---           │
│ str       ┆ str                     ┆ i64  ┆ i64    ┆ str                ┆ str           │
╞═══════════╪═════════════════════════╪══════╪════════╪════════════════════╪═══════════════╡
│ Dodgers   ┆ null                    ┆ 98   ┆ 64     ┆ .605               ┆ 1             │
│ Phillies  ┆ null                    ┆ 95   ┆ 67     ┆ .586               ┆ 1             │
│ Yankees   ┆ null                    ┆ 94   ┆ 68     ┆ .580               ┆ 1             │
│ Brewers   ┆ null                    ┆ 93   ┆ 69     ┆ .574               ┆ 1             │
│ Padres    ┆ null                    ┆ 93   ┆ 69     ┆ .574               ┆ 2             │
│ Guardians ┆ null                    ┆ 92   ┆ 69     ┆ .571               ┆ 1             │
│ Orioles   ┆ null                    ┆ 91   ┆ 71     ┆ .562               ┆ 2             │
│ Braves    ┆ null                    ┆ 89   ┆ 73     ┆ .549               ┆ 2             │
│ Mets      ┆ null                    ┆ 89   ┆ 73     ┆ .549               ┆ 3             │
│ D-backs   ┆ null                    ┆ 89   ┆ 73     ┆ .549               ┆ 3             │
└───────────┴─────────────────────────┴──────┴────────┴────────────────────┴───────────────┘
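Note that winning_percentage and division_rank arrive as strings in the output above, so numeric comparisons need a conversion first. As a minimal sketch in plain Python (the helper names win_pct and as_baseball_pct are hypothetical), the percentage can be recomputed from wins and losses and formatted the same way:

```python
def win_pct(wins, losses):
    """Winning percentage as a float; 0.0 when no games have been played."""
    games = wins + losses
    return wins / games if games else 0.0


def as_baseball_pct(value):
    """Format like the standings column: three decimals, no leading zero."""
    text = f"{value:.3f}"
    return text[1:] if text.startswith("0") else text


# (team, wins, losses) taken from the standings output above.
records = [("Dodgers", 98, 64), ("Phillies", 95, 67), ("Guardians", 92, 69)]
for team, won, lost in records:
    print(team, as_baseball_pct(win_pct(won, lost)))

# Going the other way, the string form parses directly with float().
print(float(".605"))
```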

🧢 Teams & rosters (MLB Stats API)

mlb_teams + parse_mlb_api_teams lists every club; grab a team_id here. mlb_team_roster then returns a tidy frame directly (one row per player).

teams = safe(
    "teams",
    lambda: mlb.parse_mlb_api_teams(mlb.mlb_teams(season=SAMPLE_SEASON)),
)
(teams.select(["id", "name", "abbreviation", "location_name", "team_name"]).head()
 if teams is not None else "teams unavailable right now")

✅ teams

shape: (5, 5)
┌─────┬──────────────────────┬──────────────┬───────────────┬───────────┐
│ id  ┆ name                 ┆ abbreviation ┆ location_name ┆ team_name │
│ --- ┆ ---                  ┆ ---          ┆ ---           ┆ ---       │
│ i64 ┆ str                  ┆ str          ┆ str           ┆ str       │
╞═════╪══════════════════════╪══════════════╪═══════════════╪═══════════╡
│ 133 ┆ Oakland Athletics    ┆ OAK          ┆ Oakland       ┆ Athletics │
│ 134 ┆ Pittsburgh Pirates   ┆ PIT          ┆ Pittsburgh    ┆ Pirates   │
│ 135 ┆ San Diego Padres     ┆ SD           ┆ San Diego     ┆ Padres    │
│ 136 ┆ Seattle Mariners     ┆ SEA          ┆ Seattle       ┆ Mariners  │
│ 137 ┆ San Francisco Giants ┆ SF           ┆ San Francisco ┆ Giants    │
└─────┴──────────────────────┴──────────────┴───────────────┴───────────┘

roster = safe(
    "Yankees roster",
    lambda: mlb.mlb_team_roster(team_id=YANKEES_ID, season=SAMPLE_SEASON),
)
rcols = ["jersey_number", "person_id", "person_full_name",
         "position_abbreviation", "status_description"]
(roster.select([c for c in rcols if c in roster.columns]).head()
 if roster is not None else "roster unavailable right now")

✅ Yankees roster

shape: (5, 5)
┌───────────────┬───────────┬──────────────────┬───────────────────────┬───────────────────────┐
│ jersey_number ┆ person_id ┆ person_full_name ┆ position_abbreviation ┆ status_description    │
│ ---           ┆ ---       ┆ ---              ┆ ---                   ┆ ---                   │
│ str           ┆ i64       ┆ str              ┆ str                   ┆ str                   │
╞═══════════════╪═══════════╪══════════════════╪═══════════════════════╪═══════════════════════╡
│ 74            ┆ 677076    ┆ Clayton Andrews  ┆ P                     ┆ Minor League Contract │
│ 85            ┆ 690925    ┆ Clayton Beeter   ┆ P                     ┆ Forty Man             │
│ 19            ┆ 542932    ┆ Jon Berti        ┆ 3B                    ┆ Forty Man             │
│ 53            ┆ 641360    ┆ Phil Bickford    ┆ P                     ┆ Minor League Contract │
│ 57            ┆ 595897    ┆ Nick Burdi       ┆ P                     ┆ Minor League Contract │
└───────────────┴───────────┴──────────────────┴───────────────────────┴───────────────────────┘

🧍 Player bio & season stats (MLB Stats API)

mlb_person returns a one-row bio frame. mlb_person_stats returns the raw stat-split dict; parse_mlb_api_person_stats flattens it. Our running example is Aaron Judge (person_id=592450).

bio = safe("Judge bio", lambda: mlb.mlb_person(person_id=JUDGE_ID))
bcols = ["id", "full_name", "primary_number", "birth_date",
         "height", "weight", "mlb_debut_date"]
(bio.select([c for c in bcols if c in bio.columns])
 if bio is not None else "bio unavailable right now")

✅ Judge bio

shape: (1, 7)
┌────────┬─────────────┬────────────────┬────────────┬────────┬────────┬────────────────┐
│ id     ┆ full_name   ┆ primary_number ┆ birth_date ┆ height ┆ weight ┆ mlb_debut_date │
│ ---    ┆ ---         ┆ ---            ┆ ---        ┆ ---    ┆ ---    ┆ ---            │
│ i64    ┆ str         ┆ str            ┆ str        ┆ str    ┆ i64    ┆ str            │
╞════════╪═════════════╪════════════════╪════════════╪════════╪════════╪════════════════╡
│ 592450 ┆ Aaron Judge ┆ 99             ┆ 1992-04-26 ┆ 6' 7"  ┆ 282    ┆ 2016-08-13     │
└────────┴─────────────┴────────────────┴────────────┴────────┴────────┴────────────────┘

hitting = safe(
    "Judge 2024 hitting",
    lambda: mlb.parse_mlb_api_person_stats(
        mlb.mlb_person_stats(person_id=JUDGE_ID, stats="season",
                             group="hitting", season=SAMPLE_SEASON)
    ),
)
scols = ["season", "stat_games_played", "stat_home_runs", "stat_rbi",
         "stat_avg", "stat_obp", "stat_slg", "stat_ops"]
(hitting.select([c for c in scols if c in hitting.columns])
 if hitting is not None else "stats unavailable right now")

✅ Judge 2024 hitting

shape: (1, 8)
┌────────┬─────────────────┬────────────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ season ┆ stat_games_play ┆ stat_home_runs ┆ stat_rbi ┆ stat_avg ┆ stat_obp ┆ stat_slg ┆ stat_ops │
│ ---    ┆ ed              ┆ ---            ┆ ---      ┆ ---      ┆ ---      ┆ ---      ┆ ---      │
│ str    ┆ ---             ┆ i64            ┆ i64      ┆ str      ┆ str      ┆ str      ┆ str      │
│        ┆ i64             ┆                ┆          ┆          ┆          ┆          ┆          │
╞════════╪═════════════════╪════════════════╪══════════╪══════════╪══════════╪══════════╪══════════╡
│ 2024   ┆ 158             ┆ 58             ┆ 144      ┆ .322     ┆ .458     ┆ .701     ┆ 1.159    │
└────────┴─────────────────┴────────────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

🎯 Pitch-level Statcast (Baseball Savant)

Now the fun part: every single pitch. mlb_statcast_search pulls each pitch matching your filter, with 100+ columns (velocity, spin, launch angle, expected stats). Keep windows small (one player, one game, or a 1-2 day slice); a full season runs to hundreds of thousands of pitches. Here's every pitch Aaron Judge saw over a two-day window.

pitches = safe(
    "Judge pitches (2-day)",
    lambda: mlb.mlb_statcast_search(start_dt="2024-07-01", end_dt="2024-07-02",
                                    batters_lookup=JUDGE_ID),
)
if pitches is not None and pitches.height:
    print("shape:", pitches.shape)
    out = pitches.select(["game_date", "player_name", "pitch_type", "release_speed",
                          "launch_speed", "launch_angle", "events", "description"]).head()
else:
    out = "no pitches in that window right now"
out

✅ Judge pitches (2-day)
shape: (11, 119)

shape: (5, 8)
┌────────────┬────────────┬────────────┬───────────┬───────────┬───────────┬───────────┬───────────┐
│ game_date  ┆ player_nam ┆ pitch_type ┆ release_s ┆ launch_sp ┆ launch_an ┆ events    ┆ descripti │
│ ---        ┆ e          ┆ ---        ┆ peed      ┆ eed       ┆ gle       ┆ ---       ┆ on        │
│ str        ┆ ---        ┆ str        ┆ ---       ┆ ---       ┆ ---       ┆ str       ┆ ---       │
│            ┆ str        ┆            ┆ f64       ┆ f64       ┆ f64       ┆           ┆ str       │
╞════════════╪════════════╪════════════╪═══════════╪═══════════╪═══════════╪═══════════╪═══════════╡
│ 2024-07-02 ┆ Judge,     ┆ FC         ┆ 97.1      ┆ 91.0      ┆ 19.0      ┆ single    ┆ hit_into_ │
│            ┆ Aaron      ┆            ┆           ┆           ┆           ┆           ┆ play      │
│ 2024-07-02 ┆ Judge,     ┆ FC         ┆ 97.2      ┆ null      ┆ null      ┆ null      ┆ swinging_ │
│            ┆ Aaron      ┆            ┆           ┆           ┆           ┆           ┆ strike    │
│ 2024-07-02 ┆ Judge,     ┆ SL         ┆ 87.5      ┆ 94.3      ┆ 42.0      ┆ field_out ┆ hit_into_ │
│            ┆ Aaron      ┆            ┆           ┆           ┆           ┆           ┆ play      │
│ 2024-07-02 ┆ Judge,     ┆ FC         ┆ 96.3      ┆ null      ┆ null      ┆ null      ┆ ball      │
│            ┆ Aaron      ┆            ┆           ┆           ┆           ┆           ┆           │
│ 2024-07-02 ┆ Judge,     ┆ SL         ┆ 90.1      ┆ null      ┆ null      ┆ null      ┆ foul      │
│            ┆ Aaron      ┆            ┆           ┆           ┆           ┆           ┆           │
└────────────┴────────────┴────────────┴───────────┴───────────┴───────────┴───────────┴───────────┘
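The toolbox notes that mlb_statcast_search date-chunks large requests on its own. If you would rather control the slices yourself, one way is to split a range into short windows first. The sketch below uses only the standard library; date_windows is a hypothetical helper, not a sportsdataverse function.

```python
from datetime import date, timedelta


def date_windows(start, end, days=2):
    """Yield (start, end) ISO date-string pairs covering [start, end] inclusive."""
    cur = date.fromisoformat(start)
    last = date.fromisoformat(end)
    step = timedelta(days=days)
    while cur <= last:
        # Each window spans `days` calendar days, clipped to the overall end date.
        stop = min(cur + step - timedelta(days=1), last)
        yield cur.isoformat(), stop.isoformat()
        cur = stop + timedelta(days=1)


windows = list(date_windows("2024-07-01", "2024-07-07", days=2))
print(windows)
```

Each pair could then be passed as start_dt and end_dt, in the same way as the two-day call above, with every call wrapped in safe().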

🍳 Cookbook: common baseball tasks

A handful of recipes you'll reach for constantly; every one leads with a premium source.

Recipe 1: A team's schedule + where they sit in the standings 📋

Pull one club's slate with mlb_schedule(team_id=...), then find their row in the standings. Two premium calls, one tidy snapshot.

yanks_sched = safe(
    "Yankees July schedule",
    lambda: mlb.parse_mlb_api_schedule(
        mlb.mlb_schedule(team_id=YANKEES_ID,
                         start_date="2024-07-01", end_date="2024-07-07")
    ),
)
sched_cols = ["game_pk", "official_date", "teams_away_team_name",
              "teams_home_team_name", "teams_away_score", "teams_home_score"]
if yanks_sched is not None and yanks_sched.height:
    games = yanks_sched.select([c for c in sched_cols if c in yanks_sched.columns])
else:
    games = "schedule unavailable right now"

if standings is not None and "team_name" in standings.columns:
    rank = (standings.filter(pl.col("team_name").str.contains("Yankees"))
            .select([c for c in ["team_name", "wins", "losses", "division_rank"]
                     if c in standings.columns]))
else:
    rank = "standings unavailable"
print(rank)
games

✅ Yankees July schedule
shape: (1, 4)
┌───────────┬──────┬────────┬───────────────┐
│ team_name ┆ wins ┆ losses ┆ division_rank │
│ ---       ┆ ---  ┆ ---    ┆ ---           │
│ str       ┆ i64  ┆ i64    ┆ str           │
╞═══════════╪══════╪════════╪═══════════════╡
│ Yankees   ┆ 94   ┆ 68     ┆ 1             │
└───────────┴──────┴────────┴───────────────┘

shape: (6, 6)
┌─────────┬───────────────┬──────────────────┬─────────────────┬─────────────────┬─────────────────┐
│ game_pk ┆ official_date ┆ teams_away_team_ ┆ teams_home_team ┆ teams_away_scor ┆ teams_home_scor │
│ ---     ┆ ---           ┆ name             ┆ _name           ┆ e               ┆ e               │
│ i64     ┆ str           ┆ ---              ┆ ---             ┆ ---             ┆ ---             │
│         ┆               ┆ str              ┆ str             ┆ i64             ┆ i64             │
╞═════════╪═══════════════╪══════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ 745730  ┆ 2024-07-02    ┆ Cincinnati Reds  ┆ New York        ┆ 5               ┆ 4               │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
│ 745728  ┆ 2024-07-03    ┆ Cincinnati Reds  ┆ New York        ┆ 3               ┆ 2               │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
│ 745726  ┆ 2024-07-04    ┆ Cincinnati Reds  ┆ New York        ┆ 8               ┆ 4               │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
│ 745725  ┆ 2024-07-05    ┆ Boston Red Sox   ┆ New York        ┆ 5               ┆ 3               │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
│ 745724  ┆ 2024-07-06    ┆ Boston Red Sox   ┆ New York        ┆ 4               ┆ 14              │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
│ 745723  ┆ 2024-07-07    ┆ Boston Red Sox   ┆ New York        ┆ 3               ┆ 0               │
│         ┆               ┆                  ┆ Yankees         ┆                 ┆                 │
└─────────┴───────────────┴──────────────────┴─────────────────┴─────────────────┴─────────────────┘

Recipe 2: A Statcast leaderboard 🏃

The mlb_statcast_leaderboard_* family wraps Savant's pre-aggregated season leaderboards, which are fast because the heavy lifting happens server-side. Here's the 2024 sprint speed leaderboard, fastest first.

sprint = safe(
    "sprint speed leaderboard",
    lambda: mlb.mlb_statcast_leaderboard_sprint_speed(year=SAMPLE_SEASON, min_opp=10),
)
spcols = ["last_name, first_name", "team", "position", "competitive_runs", "sprint_speed"]
(sprint.select([c for c in spcols if c in sprint.columns])
 .sort("sprint_speed", descending=True).head(10)
 if sprint is not None and sprint.height else "leaderboard unavailable right now")

✅ sprint speed leaderboard

shape: (10, 5)
┌───────────────────────┬──────┬──────────┬──────────────────┬──────────────┐
│ last_name, first_name ┆ team ┆ position ┆ competitive_runs ┆ sprint_speed │
│ ---                   ┆ ---  ┆ ---      ┆ ---              ┆ ---          │
│ str                   ┆ str  ┆ str      ┆ i64              ┆ f64          │
╞═══════════════════════╪══════╪══════════╪══════════════════╪══════════════╡
│ Witt Jr., Bobby       ┆ KC   ┆ SS       ┆ 298              ┆ 30.5         │
│ Rojas, Johan          ┆ PHI  ┆ CF       ┆ 176              ┆ 30.1         │
│ De La Cruz, Elly      ┆ CIN  ┆ SS       ┆ 249              ┆ 30.0         │
│ Fitzgerald, Tyler     ┆ SF   ┆ SS       ┆ 99               ┆ 30.0         │
│ Clase, Jonatan        ┆ TOR  ┆ LF       ┆ 20               ┆ 30.0         │
│ Crow-Armstrong, Pete  ┆ CHC  ┆ CF       ┆ 149              ┆ 30.0         │
│ Scott II, Victor      ┆ STL  ┆ CF       ┆ 62               ┆ 30.0         │
│ Mateo, Jorge          ┆ BAL  ┆ 2B       ┆ 77               ┆ 29.9         │
│ Siri, Jose            ┆ TB   ┆ CF       ┆ 116              ┆ 29.9         │
│ Hampson, Garrett      ┆ KC   ┆ CF       ┆ 89               ┆ 29.8         │
└───────────────────────┴──────┴──────────┴──────────────────┴──────────────┘

Recipe 3: Box score for one game 📊

Take a game_pk from any schedule and pull the full box score with mlb_boxscore. Asking for return_parsed=False gives the raw dict, which carries per-team batting and pitching lines under teams.home / teams.away.

def team_line(game_pk):
    box = mlb.mlb_boxscore(game_pk=game_pk, return_parsed=False)
    rows = []
    for side in ("away", "home"):
        t = box["teams"][side]
        bat = t["teamStats"]["batting"]
        rows.append({"side": side, "team": t["team"]["name"],
                     "runs": bat["runs"], "hits": bat["hits"],
                     "home_runs": bat["homeRuns"], "rbi": bat["rbi"], "avg": bat["avg"]})
    return pl.DataFrame(rows)

# Use a game_pk from the schedule we pulled, or fall back to a known game.
gid = int(schedule["game_pk"][0]) if (schedule is not None and schedule.height) else 744914
box_df = safe(f"boxscore {gid}", lambda: team_line(gid))
out = box_df if box_df is not None else "boxscore unavailable right now"
out

✅ boxscore 744914

shape: (2, 7)
┌──────┬───────────────────┬──────┬──────┬───────────┬─────┬──────┐
│ side ┆ team              ┆ runs ┆ hits ┆ home_runs ┆ rbi ┆ avg  │
│ ---  ┆ ---               ┆ ---  ┆ ---  ┆ ---       ┆ --- ┆ ---  │
│ str  ┆ str               ┆ i64  ┆ i64  ┆ i64       ┆ i64 ┆ str  │
╞══════╪═══════════════════╪══════╪══════╪═══════════╪═════╪══════╡
│ away ┆ Houston Astros    ┆ 3    ┆ 4    ┆ 2         ┆ 3   ┆ .264 │
│ home ┆ Toronto Blue Jays ┆ 1    ┆ 4    ┆ 1         ┆ 1   ┆ .234 │
└──────┴───────────────────┴──────┴──────┴───────────┴─────┴──────┘

Recipe 4: Plate-appearance play-by-play + outcome mix ⚾

mlb_play_by_play returns a dict with an allPlays list, one entry per plate appearance. Flatten it with pl.json_normalize (dot-notation columns), then tally the plate-appearance outcomes.

def pbp_frame(game_pk):
    raw = mlb.mlb_play_by_play(game_pk=game_pk, return_parsed=False)
    return pl.json_normalize(raw["allPlays"], separator=".", max_level=2)

plays = safe(f"play-by-play {gid}", lambda: pbp_frame(gid))
if plays is not None and plays.height:
    pcols = ["about.inning", "about.halfInning", "matchup.batter.fullName",
             "matchup.pitcher.fullName", "result.event"]
    out = plays.select([c for c in pcols if c in plays.columns]).head()
else:
    out = "play-by-play unavailable right now"
out

✅ play-by-play 744914

shape: (5, 5)
┌──────────────┬──────────────────┬────────────────────────┬────────────────────────┬──────────────┐
│ about.inning ┆ about.halfInning ┆ matchup.batter.fullNam ┆ matchup.pitcher.fullNa ┆ result.event │
│ ---          ┆ ---              ┆ e                      ┆ me                     ┆ ---          │
│ i64          ┆ str              ┆ ---                    ┆ ---                    ┆ str          │
│              ┆                  ┆ str                    ┆ str                    ┆              │
╞══════════════╪══════════════════╪════════════════════════╪════════════════════════╪══════════════╡
│ 1            ┆ top              ┆ Alex Bregman           ┆ Yariel Rodríguez       ┆ Flyout       │
│ 1            ┆ top              ┆ Jake Meyers            ┆ Yariel Rodríguez       ┆ Strikeout    │
│ 1            ┆ top              ┆ Yordan Alvarez         ┆ Yariel Rodríguez       ┆ Groundout    │
│ 1            ┆ bottom           ┆ Bo Bichette            ┆ Hunter Brown           ┆ Groundout    │
│ 1            ┆ bottom           ┆ Spencer Horwitz        ┆ Hunter Brown           ┆ Lineout      │
└──────────────┴──────────────────┴────────────────────────┴────────────────────────┴──────────────┘

# Outcome mix for the game: the shape of every plate appearance.
if plays is not None and plays.height and "result.event" in plays.columns:
    out = (plays.group_by("result.event")
           .agg(pl.len().alias("count"))
           .sort("count", descending=True).head(10))
else:
    out = "no play-by-play to summarize right now"
out

shape: (10, 2)
┌──────────────────┬───────┐
│ result.event     ┆ count │
│ ---              ┆ ---   │
│ str              ┆ u32   │
╞══════════════════╪═══════╡
│ Strikeout        ┆ 16    │
│ Groundout        ┆ 14    │
│ Pop Out          ┆ 10    │
│ Flyout           ┆ 7     │
│ Walk             ┆ 6     │
│ Lineout          ┆ 4     │
│ Single           ┆ 3     │
│ Home Run         ┆ 3     │
│ Double           ┆ 2     │
│ Grounded Into DP ┆ 1     │
└──────────────────┴───────┘
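The group-by tally above has a direct standard-library analogue, which can be useful when working from the raw allPlays list without building a frame. This is only an illustrative sketch: the events list is copied from the five plate appearances shown earlier, not fetched live.

```python
from collections import Counter

# result.event values for the five plate appearances shown above.
events = ["Flyout", "Strikeout", "Groundout", "Groundout", "Lineout"]

counts = Counter(events)
total = sum(counts.values())

# most_common() sorts by count, highest first.
for event, n in counts.most_common():
    print(f"{event:<10} {n}  {n / total:.0%}")
```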

Recipe 5: League leaders for any stat 🥇

mlb_stats_leaders gives you the league leaderboard for any category: homeRuns, avg, era, strikeouts, you name it. The leaders come back nested under each category, so we flatten the top-N into a tidy frame. Here's the 2024 home-run race.

def hr_leaders(season, category="homeRuns", group="hitting", n=10):
    raw = mlb.mlb_stats_leaders(leader_categories=category, season=season,
                                stat_group=group, limit=n)
    leaders = raw["leagueLeaders"][0]["leaders"]
    rows = [{"rank": row["rank"], "player": row["person"]["fullName"],
             "team": row.get("team", {}).get("name"), "value": row["value"]}
            for row in leaders]
    return pl.DataFrame(rows)

leaders = safe("2024 HR leaders",
               lambda: hr_leaders(SAMPLE_SEASON, "homeRuns", "hitting", 10))
leaders if leaders is not None else "leaders unavailable right now"

✅ 2024 HR leaders

shape: (10, 4)
┌──────┬───────────────────┬───────────────────────┬───────┐
│ rank ┆ player            ┆ team                  ┆ value │
│ ---  ┆ ---               ┆ ---                   ┆ ---   │
│ i64  ┆ str               ┆ str                   ┆ str   │
╞══════╪═══════════════════╪═══════════════════════╪═══════╡
│ 1    ┆ Aaron Judge       ┆ New York Yankees      ┆ 58    │
│ 2    ┆ Shohei Ohtani     ┆ Los Angeles Dodgers   ┆ 54    │
│ 3    ┆ Anthony Santander ┆ Baltimore Orioles     ┆ 44    │
│ 4    ┆ Juan Soto         ┆ New York Yankees      ┆ 41    │
│ 5    ┆ Marcell Ozuna     ┆ Atlanta Braves        ┆ 39    │
│ 5    ┆ José Ramírez      ┆ Cleveland Guardians   ┆ 39    │
│ 5    ┆ Brent Rooker      ┆ Oakland Athletics     ┆ 39    │
│ 8    ┆ Kyle Schwarber    ┆ Philadelphia Phillies ┆ 38    │
│ 9    ┆ Gunnar Henderson  ┆ Baltimore Orioles     ┆ 37    │
│ 10   ┆ Ketel Marte       ┆ Arizona Diamondbacks  ┆ 36    │
└──────┴───────────────────┴───────────────────────┴───────┘
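Two details in that output are worth a closer look: value comes back as a string, and tied players share a rank (three hitters at 5, then the next at 8). A minimal sketch of that tie rule, often called standard competition ranking, follows. The competition_rank helper is hypothetical and simply reproduces the rank column from the values shown; it is not how the API computes it.

```python
def competition_rank(values):
    """Rank values high-to-low; ties share a rank and the next rank skips."""
    ordered = sorted(values, reverse=True)
    first_seen = {}
    for position, value in enumerate(ordered, start=1):
        # Only the first (best) position of each value is kept.
        first_seen.setdefault(value, position)
    return [first_seen[v] for v in values]


# Home-run totals from the table above, as the strings the frame holds.
raw_values = ["58", "54", "44", "41", "39", "39", "39", "38", "37", "36"]
homers = [int(v) for v in raw_values]  # cast before any numeric work
print(competition_rank(homers))  # [1, 2, 3, 4, 5, 5, 5, 8, 9, 10]
```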

Recipe 6: Who's beating their expected stats? 🎲

Statcast's expected stats ask what should have happened given each ball's exit velocity and launch angle. mlb_statcast_leaderboard_expected_stats hands you ba/est_ba, slg/est_slg, woba/est_woba side by side; sort by the diff to find the luckiest (and unluckiest) hitters.

xstats = safe(
    "expected stats",
    lambda: mlb.mlb_statcast_leaderboard_expected_stats(
        year=SAMPLE_SEASON, type="batter", min="q"),
)
if xstats is not None and xstats.height and "est_woba_minus_woba_diff" in xstats.columns:
    cols = ["last_name, first_name", "pa", "woba", "est_woba",
            "est_woba_minus_woba_diff"]
    # In this output the diff equals woba - est_woba, so the most negative
    # values are hitters falling furthest short of their expected wOBA.
    out = (xstats.select([c for c in cols if c in xstats.columns])
           .sort("est_woba_minus_woba_diff").head(10))
else:
    out = "expected-stats leaderboard unavailable right now"
out

✅ expected stats

shape: (10, 5)
┌───────────────────────┬─────┬───────┬──────────┬──────────────────────────┐
│ last_name, first_name ┆ pa  ┆ woba  ┆ est_woba ┆ est_woba_minus_woba_diff │
│ ---                   ┆ --- ┆ ---   ┆ ---      ┆ ---                      │
│ str                   ┆ i64 ┆ f64   ┆ f64      ┆ f64                      │
╞═══════════════════════╪═════╪═══════╪══════════╪══════════════════════════╡
│ Drury, Brandon        ┆ 360 ┆ 0.217 ┆ 0.264    ┆ -0.047                   │
│ Soto, Juan            ┆ 713 ┆ 0.421 ┆ 0.463    ┆ -0.042                   │
│ Bailey, Patrick       ┆ 448 ┆ 0.281 ┆ 0.322    ┆ -0.041                   │
│ Sosa, Lenyn           ┆ 369 ┆ 0.28  ┆ 0.321    ┆ -0.041                   │
│ Morel, Christopher    ┆ 611 ┆ 0.28  ┆ 0.316    ┆ -0.036                   │
│ Garcia, Maikel        ┆ 626 ┆ 0.27  ┆ 0.305    ┆ -0.035                   │
│ Martinez, J.D.        ┆ 495 ┆ 0.318 ┆ 0.353    ┆ -0.035                   │
│ Harris II, Michael    ┆ 470 ┆ 0.312 ┆ 0.346    ┆ -0.034                   │
│ Margot, Manuel        ┆ 343 ┆ 0.276 ┆ 0.31     ┆ -0.034                   │
│ Kirk, Alejandro       ┆ 386 ┆ 0.297 ┆ 0.329    ┆ -0.032                   │
└───────────────────────┴─────┴───────┴──────────┴──────────────────────────┘

Recipe 7: The fastest bats in baseball 💨

Bat tracking is one of Statcast's newest toys. mlb_statcast_leaderboard_bat_tracking returns average bat speed, swing length, and "hard-swing rate" per hitter; sort by avg_bat_speed to see who's swinging the hardest.

bats = safe(
    "bat tracking",
    lambda: mlb.mlb_statcast_leaderboard_bat_tracking(year=SAMPLE_SEASON, type="batter"),
)
if bats is not None and bats.height and "avg_bat_speed" in bats.columns:
    cols = ["name", "swings_competitive", "avg_bat_speed",
            "hard_swing_rate", "swing_length"]
    out = (bats.select([c for c in cols if c in bats.columns])
           .sort("avg_bat_speed", descending=True).head(10))
else:
    out = "bat-tracking leaderboard unavailable right now"
out
✅ bat tracking

shape: (10, 5)
┌───────────────────┬────────────────────┬───────────────┬─────────────────┬──────────────┐
│ name              ┆ swings_competitive ┆ avg_bat_speed ┆ hard_swing_rate ┆ swing_length │
│ ---               ┆ ---                ┆ ---           ┆ ---             ┆ ---          │
│ str               ┆ i64                ┆ f64           ┆ f64             ┆ f64          │
╞═══════════════════╪════════════════════╪═══════════════╪═════════════════╪══════════════╡
│ Caminero, Junior  ┆ 433                ┆ 79.917064     ┆ 0.896074        ┆ 8.5657       │
│ Walker, Jordan    ┆ 506                ┆ 79.072949     ┆ 0.859684        ┆ 8.319791     │
│ Cruz, Oneil       ┆ 447                ┆ 78.461902     ┆ 0.780761        ┆ 7.622225     │
│ Kurtz, Nick       ┆ 471                ┆ 78.17323      ┆ 0.808917        ┆ 7.771097     │
│ Adell, Jo         ┆ 594                ┆ 77.347233     ┆ 0.703704        ┆ 7.74126      │
│ Smith, Cam        ┆ 470                ┆ 77.064652     ┆ 0.723404        ┆ 7.690357     │
│ Schwarber, Kyle   ┆ 514                ┆ 77.027339     ┆ 0.754864        ┆ 7.471818     │
│ Bauers, Jake      ┆ 353                ┆ 77.014821     ┆ 0.753541        ┆ 7.692569     │
│ Caglianone, Jac   ┆ 431                ┆ 76.845271     ┆ 0.712297        ┆ 7.914317     │
│ Mitchell, Garrett ┆ 333                ┆ 76.823575     ┆ 0.744745        ┆ 7.198839     │
└───────────────────┴────────────────────┴───────────────┴─────────────────┴──────────────┘

Recipe 8: The best gloves: Outs Above Average 🧤

Offense is easy to measure; defense is hard. Statcast's mlb_statcast_leaderboard_outs_above_average credits fielders for the plays they make relative to expectation. Sort by outs_above_average to find the season's best defenders.

oaa = safe(
    "outs above average",
    lambda: mlb.mlb_statcast_leaderboard_outs_above_average(year=SAMPLE_SEASON),
)
if oaa is not None and oaa.height and "outs_above_average" in oaa.columns:
    cols = ["last_name, first_name", "display_team_name",
            "primary_pos_formatted", "outs_above_average",
            "fielding_runs_prevented"]
    out = (oaa.select([c for c in cols if c in oaa.columns])
           .sort("outs_above_average", descending=True).head(10))
else:
    out = "OAA leaderboard unavailable right now"
out
✅ outs above average

shape: (10, 5)
┌───────────────────┬───────────────────┬───────────────────┬───────────────────┬──────────────────┐
│ last_name,        ┆ display_team_name ┆ primary_pos_forma ┆ outs_above_averag ┆ fielding_runs_pr │
│ first_name        ┆ ---               ┆ tted              ┆ e                 ┆ evented          │
│ ---               ┆ str               ┆ ---               ┆ ---               ┆ ---              │
│ str               ┆                   ┆ str               ┆ i64               ┆ i64              │
╞═══════════════════╪═══════════════════╪═══════════════════╪═══════════════════╪══════════════════╡
│ Giménez, Andrés   ┆ Guardians         ┆ 2B                ┆ 20                ┆ 15               │
│ Young, Jacob      ┆ Nationals         ┆ CF                ┆ 20                ┆ 18               │
│ Semien, Marcus    ┆ Rangers           ┆ 2B                ┆ 19                ┆ 14               │
│ Swanson, Dansby   ┆ Cubs              ┆ SS                ┆ 17                ┆ 13               │
│ Siani, Michael    ┆ Cardinals         ┆ CF                ┆ 16                ┆ 14               │
│ Siri, Jose        ┆ Rays              ┆ CF                ┆ 16                ┆ 14               │
│ Witt Jr., Bobby   ┆ Royals            ┆ SS                ┆ 16                ┆ 12               │
│ Lindor, Francisco ┆ Mets              ┆ SS                ┆ 15                ┆ 11               │
│ Santana, Carlos   ┆ Twins             ┆ 1B                ┆ 15                ┆ 11               │
│ Tovar, Ezequiel   ┆ Rockies           ┆ SS                ┆ 15                ┆ 11               │
└───────────────────┴───────────────────┴───────────────────┴───────────────────┴──────────────────┘

Recipe 9: Find the X: the hardest-hit homers 🚀

mlb_statcast_search isn't just for one player: point its filters at an outcome. Pass at_bat_result="home_run" over a short window to pull every homer, then sort by launch_speed to find the ones that were absolutely crushed. (Keep the window small, a couple of days at a time.)

homers = safe(
    "home runs (2-day)",
    lambda: mlb.mlb_statcast_search(start_dt="2024-07-01", end_dt="2024-07-02",
                                    at_bat_result="home_run"),
)
if homers is not None and homers.height and "launch_speed" in homers.columns:
    print("homers in window:", homers.height)
    cols = ["game_date", "player_name", "launch_speed",
            "launch_angle", "hit_distance_sc"]
    out = (homers.select([c for c in cols if c in homers.columns])
           .sort("launch_speed", descending=True).head(10))
else:
    out = "no homers in that window right now"
out
✅ home runs (2-day)
homers in window: 51

shape: (10, 5)
┌────────────┬────────────────────┬──────────────┬──────────────┬─────────────────┐
│ game_date  ┆ player_name        ┆ launch_speed ┆ launch_angle ┆ hit_distance_sc │
│ ---        ┆ ---                ┆ ---          ┆ ---          ┆ ---             │
│ str        ┆ str                ┆ f64          ┆ i64          ┆ i64             │
╞════════════╪════════════════════╪══════════════╪══════════════╪═════════════════╡
│ 2024-07-02 ┆ De La Cruz, Elly   ┆ 114.1        ┆ 21           ┆ 425             │
│ 2024-07-02 ┆ Judge, Aaron       ┆ 112.5        ┆ 25           ┆ 381             │
│ 2024-07-02 ┆ Sánchez, Jesús     ┆ 112.5        ┆ 24           ┆ 448             │
│ 2024-07-02 ┆ Ohtani, Shohei     ┆ 112.0        ┆ 37           ┆ 433             │
│ 2024-07-02 ┆ Soler, Jorge       ┆ 109.0        ┆ 21           ┆ 394             │
│ 2024-07-02 ┆ Rooker, Brent      ┆ 108.7        ┆ 34           ┆ 405             │
│ 2024-07-02 ┆ Turner, Trea       ┆ 108.5        ┆ 20           ┆ 422             │
│ 2024-07-02 ┆ Schneemann, Daniel ┆ 108.3        ┆ 28           ┆ 408             │
│ 2024-07-02 ┆ Riley, Austin      ┆ 108.3        ┆ 36           ┆ 407             │
│ 2024-07-02 ┆ Witt Jr., Bobby    ┆ 108.0        ┆ 24           ┆ 399             │
└────────────┴────────────────────┴──────────────┴──────────────┴─────────────────┘
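
If you want to walk a longer stretch while keeping each request to a couple of days, one option is to generate the start/end pairs yourself. The helper below is a hypothetical sketch: date_windows is not part of sportsdataverse-py, and the two-day default simply mirrors the advice above.

```python
from datetime import date, timedelta


def date_windows(start, end, days=2):
    """Yield (start, end) ISO date strings covering [start, end] in chunks of `days`."""
    cursor = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while cursor <= last:
        # Each window is inclusive on both ends and never runs past `last`.
        window_end = min(cursor + timedelta(days=days - 1), last)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)


windows = list(date_windows("2024-07-01", "2024-07-05"))
for start_dt, end_dt in windows:
    print(start_dt, "to", end_dt)
```

Each pair could then be passed as start_dt and end_dt to mlb_statcast_search, with the resulting frames stacked afterwards.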

Recipe 13: Every pitch of a single game (Savant gamefeed) 🎮

mlb_statcast_gamefeed pulls Baseball Savant's rich single-game feed and tidies it to one row per pitch (pitch type, velocity, plate location, and the batted-ball result) across both teams. Feed it any game_pk from a schedule.

gf = safe(
    f"gamefeed {gid}",
    lambda: mlb.mlb_statcast_gamefeed(game_pk=gid),
)
if gf is not None and gf.height:
    print("pitches tracked:", gf.height)
    gcols = ["inning", "half_inning", "batter_name", "pitcher_name",
             "pitch_type", "start_speed", "launch_speed", "events"]
    out = gf.select([c for c in gcols if c in gf.columns]).head()
else:
    out = "gamefeed unavailable right now"
out
✅ gamefeed 744914
pitches tracked: 244

shape: (5, 8)
┌────────┬────────────┬────────────┬────────────┬────────────┬────────────┬────────────┬───────────┐
│ inning ┆ half_innin ┆ batter_nam ┆ pitcher_na ┆ pitch_type ┆ start_spee ┆ launch_spe ┆ events    │
│ ---    ┆ g          ┆ e          ┆ me         ┆ ---        ┆ d          ┆ ed         ┆ ---       │
│ i64    ┆ ---        ┆ ---        ┆ ---        ┆ str        ┆ ---        ┆ ---        ┆ str       │
│        ┆ str        ┆ str        ┆ str        ┆            ┆ f64        ┆ str        ┆           │
╞════════╪════════════╪════════════╪════════════╪════════════╪════════════╪════════════╪═══════════╡
│ 1      ┆ top        ┆ Alex       ┆ Yariel     ┆ FF         ┆ 95.9       ┆ null       ┆ Flyout    │
│        ┆            ┆ Bregman    ┆ Rodríguez  ┆            ┆            ┆            ┆           │
│ 1      ┆ top        ┆ Alex       ┆ Yariel     ┆ FF         ┆ 94.1       ┆ null       ┆ Flyout    │
│        ┆            ┆ Bregman    ┆ Rodríguez  ┆            ┆            ┆            ┆           │
│ 1      ┆ top        ┆ Alex       ┆ Yariel     ┆ SL         ┆ 85.8       ┆ 92.9       ┆ Flyout    │
│        ┆            ┆ Bregman    ┆ Rodríguez  ┆            ┆            ┆            ┆           │
│ 1      ┆ top        ┆ Jake       ┆ Yariel     ┆ FF         ┆ 94.1       ┆ null       ┆ Strikeout │
│        ┆            ┆ Meyers     ┆ Rodríguez  ┆            ┆            ┆            ┆           │
│ 1      ┆ top        ┆ Jake       ┆ Yariel     ┆ FF         ┆ 95.4       ┆ null       ┆ Strikeout │
│        ┆            ┆ Meyers     ┆ Rodríguez  ┆            ┆            ┆            ┆           │
└────────┴────────────┴────────────┴────────────┴────────────┴────────────┴────────────┴───────────┘

Recipe 10: The biggest swings of a game (WPA) 📈

mlb_win_probability returns every play with the live win-probability before and after, plus Win Probability Added (homeTeamWinProbabilityAdded). Sort by its absolute value to surface the most pivotal moments of the game.

def wpa_swings(game_pk, n=8):
    plays = mlb.mlb_win_probability(game_pk=game_pk, return_parsed=False)
    df = pl.json_normalize(plays, separator=".", max_level=2)
    keep = ["about.inning", "about.halfInning", "result.event",
            "result.description", "homeTeamWinProbabilityAdded"]
    df = df.select([c for c in keep if c in df.columns])
    if "homeTeamWinProbabilityAdded" in df.columns:
        df = (df.with_columns(
                  pl.col("homeTeamWinProbabilityAdded").abs().alias("wpa_abs"))
              .sort("wpa_abs", descending=True).drop("wpa_abs").head(n))
    return df

# Reuse the game_pk we pulled earlier (falls back to a known game).
wpa = safe(f"WPA swings {gid}", lambda: wpa_swings(gid))
wpa if wpa is not None else "win-probability unavailable right now"
✅ WPA swings 744914

shape: (8, 5)
┌──────────────┬──────────────────┬──────────────────┬──────────────────────┬──────────────────────┐
│ about.inning ┆ about.halfInning ┆ result.event     ┆ result.description   ┆ homeTeamWinProbabili │
│ ---          ┆ ---              ┆ ---              ┆ ---                  ┆ tyAdded              │
│ i64          ┆ str              ┆ str              ┆ str                  ┆ ---                  │
│              ┆                  ┆                  ┆                      ┆ f64                  │
╞══════════════╪══════════════════╪══════════════════╪══════════════════════╪══════════════════════╡
│ 8            ┆ bottom           ┆ Double           ┆ Spencer Horwitz      ┆ 23.7                 │
│              ┆                  ┆                  ┆ doubles (3) on…      ┆                      │
│ 8            ┆ bottom           ┆ Groundout        ┆ Daulton Varsho       ┆ -20.4                │
│              ┆                  ┆                  ┆ grounds out sha…     ┆                      │
│ 8            ┆ bottom           ┆ Lineout          ┆ George Springer      ┆ -19.3                │
│              ┆                  ┆                  ┆ lines out to t…      ┆                      │
│ 5            ┆ top              ┆ Home Run         ┆ Jeremy Peña homers   ┆ -14.8                │
│              ┆                  ┆                  ┆ (6) on a fl…         ┆                      │
│ 9            ┆ top              ┆ Home Run         ┆ Yordan Alvarez       ┆ -12.6                │
│              ┆                  ┆                  ┆ homers (17) on …     ┆                      │
│ 8            ┆ bottom           ┆ Walk             ┆ Addison Barger       ┆ 9.8                  │
│              ┆                  ┆                  ┆ walks.               ┆                      │
│ 8            ┆ bottom           ┆ Strikeout        ┆ Bo Bichette strikes  ┆ -9.0                 │
│              ┆                  ┆                  ┆ out swingi…          ┆                      │
│ 7            ┆ top              ┆ Grounded Into DP ┆ Yainer Diaz grounds  ┆ 8.1                  │
│              ┆                  ┆                  ┆ into a dou…          ┆                      │
└──────────────┴──────────────────┴──────────────────┴──────────────────────┴──────────────────────┘
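
The ranking step can also be sketched on plain Python dicts, which helps when reading the sign of the column. The plays below are trimmed copies of rows from the sample output; biggest_swings and the helped label are hypothetical names, and reading positive values as favouring the home team is an assumption based on the column name and the sample rows.

```python
# Plays shaped like the raw win-probability JSON, trimmed to the fields used
# here. Values are copied from the sample output for this recipe.
plays = [
    {"about": {"inning": 8, "halfInning": "bottom"},
     "result": {"event": "Walk"},
     "homeTeamWinProbabilityAdded": 9.8},
    {"about": {"inning": 5, "halfInning": "top"},
     "result": {"event": "Home Run"},
     "homeTeamWinProbabilityAdded": -14.8},
    {"about": {"inning": 8, "halfInning": "bottom"},
     "result": {"event": "Double"},
     "homeTeamWinProbabilityAdded": 23.7},
    {"about": {"inning": 8, "halfInning": "bottom"},
     "result": {"event": "Groundout"},
     "homeTeamWinProbabilityAdded": -20.4},
]


def biggest_swings(plays, n=3):
    """Return the n plays with the largest absolute WPA, largest first."""
    def wpa(play):
        # Treat a missing or null value as zero so sorting never fails.
        return play.get("homeTeamWinProbabilityAdded") or 0.0

    ranked = sorted(plays, key=lambda p: abs(wpa(p)), reverse=True)
    return [
        {
            "inning": p.get("about", {}).get("inning"),
            "event": p.get("result", {}).get("event"),
            "wpa": wpa(p),
            # Assumption: positive values favour the home team.
            "helped": "home" if wpa(p) > 0 else "away" if wpa(p) < 0 else "neither",
        }
        for p in ranked[:n]
    ]


swings = biggest_swings(plays)
for row in swings:
    print(row)
```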

Recipe 11: Season award winners (MVP, Cy Young) 🏅

mlb_awards is the catalog of every award id; mlb_award_recipients names the season's winner for one id. We grab the four marquee awards (AL/NL MVP and AL/NL Cy Young) and stack them into one tidy board.

def award_board(season, award_ids):
    frames = []
    for label, aid in award_ids.items():
        df = mlb.mlb_award_recipients(award_id=aid, season=season)
        if df is not None and df.height:
            name_col = ("player_name_first_last" if "player_name_first_last"
                        in df.columns else "name")
            frames.append(df.select([
                pl.lit(label).alias("award"),
                pl.col("season"),
                pl.col(name_col).alias("winner"),
            ]))
    return pl.concat(frames, how="vertical") if frames else pl.DataFrame()

AWARDS = {"AL MVP": "ALMVP", "NL MVP": "NLMVP",
          "AL Cy Young": "ALCY", "NL Cy Young": "NLCY"}
board = safe("2024 award winners", lambda: award_board(SAMPLE_SEASON, AWARDS))
board if (board is not None and board.height) else "awards unavailable right now"
✅ 2024 award winners

shape: (4, 3)
┌─────────────┬────────┬───────────────┐
│ award       ┆ season ┆ winner        │
│ ---         ┆ ---    ┆ ---           │
│ str         ┆ str    ┆ str           │
╞═════════════╪════════╪═══════════════╡
│ AL MVP      ┆ 2024   ┆ Aaron Judge   │
│ NL MVP      ┆ 2024   ┆ Shohei Ohtani │
│ AL Cy Young ┆ 2024   ┆ Tarik Skubal  │
│ NL Cy Young ┆ 2024   ┆ Chris Sale    │
└─────────────┴────────┴───────────────┘

Recipe 12: The first-round draft board 🎓

mlb_draft returns the amateur draft, organized into rounds of picks. Pass round_=1 and flatten the picks into one row per selection: who went where, and from which school.

def draft_board(year, round_=1):
    raw = mlb.mlb_draft(year=year, round_=round_, return_parsed=False)
    picks = raw["drafts"]["rounds"][0]["picks"]
    rows = [{
        "pick": p.get("pickNumber"),
        "player": p.get("person", {}).get("fullName"),
        "team": p.get("team", {}).get("name"),
        "school": p.get("school", {}).get("name"),
    } for p in picks]
    return pl.DataFrame(rows)

draft = safe("2024 first round", lambda: draft_board(2024, round_=1))
draft.head(12) if (draft is not None and draft.height) else "draft unavailable right now"
✅ 2024 first round

shape: (12, 4)
┌──────┬───────────────────┬──────────────────────┬─────────────────────┐
│ pick ┆ player            ┆ team                 ┆ school              │
│ ---  ┆ ---               ┆ ---                  ┆ ---                 │
│ i64  ┆ str               ┆ str                  ┆ str                 │
╞══════╪═══════════════════╪══════════════════════╪═════════════════════╡
│ 1    ┆ Travis Bazzana    ┆ Cleveland Guardians  ┆ Oregon State        │
│ 2    ┆ Chase Burns       ┆ Cincinnati Reds      ┆ Wake Forest         │
│ 3    ┆ Charlie Condon    ┆ Colorado Rockies     ┆ Georgia             │
│ 4    ┆ Nick Kurtz        ┆ Athletics            ┆ Wake Forest         │
│ 5    ┆ Hagen Smith       ┆ Chicago White Sox    ┆ Arkansas            │
│ 6    ┆ Jac Caglianone    ┆ Kansas City Royals   ┆ Florida             │
│ 7    ┆ JJ Wetherholt     ┆ St. Louis Cardinals  ┆ West Virginia       │
│ 8    ┆ Christian Moore   ┆ Los Angeles Angels   ┆ Tennessee           │
│ 9    ┆ Konnor Griffin    ┆ Pittsburgh Pirates   ┆ Jackson Prep School │
│ 10   ┆ Seaver King       ┆ Washington Nationals ┆ Wake Forest         │
│ 11   ┆ Bryce Rainer      ┆ Detroit Tigers       ┆ Harvard-Westlake HS │
│ 12   ┆ Braden Montgomery ┆ Boston Red Sox       ┆ Texas A&M           │
└──────┴───────────────────┴──────────────────────┴─────────────────────┘
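
The flattening step can be tried offline on a small dict shaped like the raw payload. This is a minimal sketch under stated assumptions: flatten_picks is a hypothetical helper, the nesting (drafts, rounds, picks) is taken from the recipe's own indexing, and the third pick deliberately omits its school key to show the fallback to None.

```python
# Payload shaped like the raw draft JSON used in the recipe. The picks are
# copied from the sample output; the third omits "school" on purpose.
raw = {
    "drafts": {
        "rounds": [
            {"picks": [
                {"pickNumber": 1,
                 "person": {"fullName": "Travis Bazzana"},
                 "team": {"name": "Cleveland Guardians"},
                 "school": {"name": "Oregon State"}},
                {"pickNumber": 2,
                 "person": {"fullName": "Chase Burns"},
                 "team": {"name": "Cincinnati Reds"},
                 "school": {"name": "Wake Forest"}},
                {"pickNumber": 3,
                 "person": {"fullName": "Charlie Condon"},
                 "team": {"name": "Colorado Rockies"}},
            ]}
        ]
    }
}


def flatten_picks(raw):
    """Return one flat dict per pick, across every round in the payload."""
    rounds = (raw.get("drafts") or {}).get("rounds") or []
    return [
        {
            "pick": p.get("pickNumber"),
            "player": p.get("person", {}).get("fullName"),
            "team": p.get("team", {}).get("name"),
            # A missing nested dict falls back to None instead of raising.
            "school": p.get("school", {}).get("name"),
        }
        for rnd in rounds
        for p in rnd.get("picks", [])
    ]


rows = flatten_picks(raw)
for row in rows:
    print(row)
```

Iterating over every round, rather than indexing the first one, also means an empty payload yields an empty list instead of an IndexError.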

📅 A whole season's schedule via ESPN

Want every game in a season without looping over dates? The bulk load_mlb_* release-parquet loaders are still being wired up (they raise a friendly NotImplementedError for now), and they point you to the working path: espn_mlb_schedule with dates=<season year> pulls the slate as one wide frame. Scores come back as strings, so cast them before doing arithmetic. The sample run shown here returned 500 games, fewer than the 2,430 of a full regular season, so check the row count before treating the frame as complete.

season_sched = safe(
    "ESPN 2024 season schedule",
    lambda: mlb.espn_mlb_schedule(dates=2024),
)
if season_sched is not None and season_sched.height:
    print("games:", season_sched.height)
    scols = ["game_id", "away_display_name", "away_score",
             "home_display_name", "home_score", "status_type_completed"]
    out = season_sched.select([c for c in scols if c in season_sched.columns]).head()
else:
    out = "ESPN schedule unavailable right now"
out
✅ ESPN 2024 season schedule
games: 500

shape: (5, 6)
┌───────────┬────────────────────┬────────────┬───────────────────┬────────────┬───────────────────┐
│ game_id   ┆ away_display_name  ┆ away_score ┆ home_display_name ┆ home_score ┆ status_type_compl │
│ ---       ┆ ---                ┆ ---        ┆ ---               ┆ ---        ┆ eted              │
│ str       ┆ str                ┆ str        ┆ str               ┆ str        ┆ ---               │
│           ┆                    ┆            ┆                   ┆            ┆ bool              │
╞═══════════╪════════════════════╪════════════╪═══════════════════╪════════════╪═══════════════════╡
│ 401576167 ┆ Los Angeles        ┆ 14         ┆ San Diego Padres  ┆ 1          ┆ true              │
│           ┆ Dodgers            ┆            ┆                   ┆            ┆                   │
│ 401576169 ┆ Kansas City Royals ┆ 4          ┆ Texas Rangers     ┆ 5          ┆ true              │
│ 401576643 ┆ Chicago White Sox  ┆ 1          ┆ Chicago Cubs      ┆ 8          ┆ true              │
│ 401576170 ┆ San Diego Padres   ┆ 1          ┆ Los Angeles       ┆ 4          ┆ true              │
│           ┆                    ┆            ┆ Dodgers           ┆            ┆                   │
│ 401576168 ┆ Arizona            ┆ 0          ┆ Colorado Rockies  ┆ 3          ┆ true              │
│           ┆ Diamondbacks       ┆            ┆                   ┆            ┆                   │
└───────────┴────────────────────┴────────────┴───────────────────┴────────────┴───────────────────┘

⚪ Secondary path: ESPN teams (espn_mlb_*)

espn_mlb_teams returns one wide polars frame, handy as a cross-check or when you want ESPN's display names and ids alongside the MLB Stats API ones.

espn_teams = safe("ESPN teams", lambda: mlb.espn_mlb_teams())
ecols = ["team_id", "team_location", "team_name", "team_abbreviation", "team_display_name"]
(espn_teams.select([c for c in ecols if c in espn_teams.columns]).head()
 if espn_teams is not None else "ESPN teams unavailable right now")
✅ ESPN teams

shape: (5, 5)
┌─────────┬───────────────┬──────────────┬───────────────────┬──────────────────────┐
│ team_id ┆ team_location ┆ team_name    ┆ team_abbreviation ┆ team_display_name    │
│ ---     ┆ ---           ┆ ---          ┆ ---               ┆ ---                  │
│ str     ┆ str           ┆ str          ┆ str               ┆ str                  │
╞═════════╪═══════════════╪══════════════╪═══════════════════╪══════════════════════╡
│ 29      ┆ Arizona       ┆ Diamondbacks ┆ ARI               ┆ Arizona Diamondbacks │
│ 11      ┆ Athletics     ┆ Athletics    ┆ ATH               ┆ Athletics            │
│ 15      ┆ Atlanta       ┆ Braves       ┆ ATL               ┆ Atlanta Braves       │
│ 1       ┆ Baltimore     ┆ Orioles      ┆ BAL               ┆ Baltimore Orioles    │
│ 2       ┆ Boston        ┆ Red Sox      ┆ BOS               ┆ Boston Red Sox       │
└─────────┴───────────────┴──────────────┴───────────────────┴──────────────────────┘

🎉 Where to next

  • Everything returns polars by default. Pass return_as_pandas=True for a pandas frame, or return_parsed=False on the mlb_* wrappers for raw JSON.
  • Full reference: the MLB pages in the sidebar cover the MLB Stats API + Statcast helpers, the full MLB Stats API surface, and the ESPN core / site / web endpoints.
  • R user? The same data lives in baseballr.
  • Compare conventions with the other league intros (04_nba_intro.ipynb, 07_nhl_intro.ipynb) or the cross-sport 01_quickstart.ipynb.

Now go find the next 60-homer season. ⚾🔥