Version: 0.0.56

MLB — additional Python functions

Hand-written wrappers, loaders, and helpers in sportsdataverse.mlb not covered by the generated API-endpoint reference above.

Statcast

statcast_gamefeed(game_pk: 'int', at_bat_number: 'Optional[int]' = None, **kwargs) -> 'Dict'

GET /gf?game_pk=... — Savant per-game JSON feed (richer than the Stats API live feed).

Returns a dict with team_home, team_away, scoreboard, game_status, … plus per-play pitch tracking and shift positioning details.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_pk | int | | |
| at_bat_number | Optional[int] | None | |

statcast_leaderboard_arm_strength(year: 'Union[int, str]', pos: 'Optional[str]' = None, csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/arm-strength — outfielder + infielder arm-strength leaders.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| pos | Optional[str] | None | |
| csv | bool | True | |
| return_as_pandas | bool | False | |

Returns

| col_name | type | description |
| --- | --- | --- |
| fielder_name | character | |
| player_id | integer | MLBAM player ID. |
| team_name | character | Team name. |
| primary_position | integer | |
| primary_position_name | character | Primary fielding position name. |
| total_throws | integer | |
| total_throws_1b | integer | |
| total_throws_2b | integer | |
| total_throws_3b | integer | |
| total_throws_ss | integer | |
| total_throws_lf | integer | |
| total_throws_cf | integer | |
| total_throws_rf | integer | |
| total_throws_inf | integer | |
| total_throws_of | integer | |
| max_arm_strength | double | |
| arm_1b | double | |
| arm_2b | double | |
| arm_3b | character | |
| arm_ss | double | |
| arm_lf | double | |
| arm_cf | character | |
| arm_rf | double | |
| arm_inf | double | |
| arm_of | double | |
| arm_overall | double | |

statcast_leaderboard_bat_tracking(year: 'Union[int, str]', type_: 'str' = 'batter-swings', min_: 'Optional[Union[int, str]]' = 'q', attack_zone: 'Optional[str]' = None, csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/bat-tracking — swing speed / attack angle (2024+).

Parameters

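| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| type_ | str | 'batter-swings' | |
| min_ | Optional[Union[int, str]] | 'q' | |
| attack_zone | Optional[str] | None | |
| csv | bool | True | |
| return_as_pandas | bool | False | |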

Returns

| col_name | type | description |
| --- | --- | --- |
| id | integer | Id. |
| name | character | Display name. |
| swings_competitive | integer | |
| percent_swings_competitive | double | |
| contact | integer | |
| avg_bat_speed | double | |
| hard_swing_rate | double | |
| squared_up_per_bat_contact | double | |
| squared_up_per_swing | double | |
| blast_per_bat_contact | double | |
| blast_per_swing | double | |
| swing_length | double | Length of the swing path to contact (feet). |
| swords | integer | |
| batter_run_value | double | |
| whiffs | character | |
| whiff_per_swing | double | |
| batted_ball_events | integer | |
| batted_ball_event_per_swing | double | |

statcast_leaderboard_catch_probability(year: 'Union[int, str]', csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/catch_probability — outfielder catch-probability leaderboard.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| csv | bool | True | |
| return_as_pandas | bool | False | |

Returns

| col_name | type | description |
| --- | --- | --- |
| last_name, first_name | character | Player name as "Last, First". |
| player_id | integer | MLBAM player ID. |
| oaa | integer | |
| n_fieldout_5stars | integer | |
| n_opp_5stars | integer | |
| n_5star_percent | double | |
| n_fieldout_4stars | integer | |
| n_opp_4stars | integer | |
| n_4star_percent | double | |
| n_fieldout_3stars | integer | |
| n_opp_3stars | integer | |
| n_3star_percent | double | |
| n_fieldout_2stars | integer | |
| n_opp_2stars | integer | |
| n_2star_percent | double | |
| n_fieldout_1stars | integer | |
| n_opp_1stars | integer | |
| n_1star_percent | double | |

statcast_leaderboard_custom(year: 'Union[int, str]', type_: 'str', selections: 'str', filter_: 'Optional[str]' = None, min_: 'Optional[Union[int, str]]' = 'q', sort: 'Optional[str]' = None, sort_dir: 'str' = 'desc', csv: 'bool' = False, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/custom — build-your-own metric leaderboard.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | season year. |
| type_ | str | | leaderboard type (batter / pitcher / fielder). |
| selections | str | | comma-separated metric ids (e.g. "xba,xslg,xwoba"). |
| filter_ | Optional[str] | None | row filter (e.g. "hand_R"). |
| min_ | Optional[Union[int, str]] | 'q' | minimum threshold; "q" for qualified. |
| sort | Optional[str] | None | metric to sort by. |
| sort_dir | str | 'desc' | sort direction, "desc" or "asc". |
| csv | bool | False | when True, request CSV; otherwise JSON. |
| return_as_pandas | bool | False | |

statcast_leaderboard_expected_statistics(year: 'Union[int, str]', type_: 'str' = 'batter', position: 'Optional[str]' = None, team: 'Optional[str]' = None, min_: 'Optional[Union[int, str]]' = 'q', csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/expected_statistics — xBA / xSLG / xwOBA / xISO leaders.

Parameters

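| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| type_ | str | 'batter' | |
| position | Optional[str] | None | |
| team | Optional[str] | None | |
| min_ | Optional[Union[int, str]] | 'q' | |
| csv | bool | True | |
| return_as_pandas | bool | False | |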

Returns

| col_name | type | description |
| --- | --- | --- |
| last_name, first_name | character | Player name as "Last, First". |
| player_id | integer | MLBAM player ID. |
| year | integer | Season year (YYYY). |
| pa | integer | |
| bip | integer | |
| ba | double | |
| est_ba | double | |
| est_ba_minus_ba_diff | double | |
| slg | double | Slugging percentage. |
| est_slg | double | |
| est_slg_minus_slg_diff | double | |
| woba | double | |
| est_woba | double | |
| est_woba_minus_woba_diff | double | |

statcast_leaderboard_outs_above_average(year: 'Union[int, str]', pos: 'Optional[str]' = None, team: 'Optional[str]' = None, min_: 'Optional[Union[int, str]]' = 'q', csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/outs_above_average — OAA fielding leaderboard.

Parameters

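| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| pos | Optional[str] | None | |
| team | Optional[str] | None | |
| min_ | Optional[Union[int, str]] | 'q' | |
| csv | bool | True | |
| return_as_pandas | bool | False | |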

Returns

| col_name | type | description |
| --- | --- | --- |
| last_name, first_name | character | Player name as "Last, First". |
| player_id | integer | MLBAM player ID. |
| display_team_name | character | |
| year | integer | Season year (YYYY). |
| primary_pos_formatted | character | |
| fielding_runs_prevented | integer | |
| outs_above_average | integer | |
| outs_above_average_infront | integer | |
| outs_above_average_lateral_toward3bline | integer | |
| outs_above_average_lateral_toward1bline | integer | |
| outs_above_average_behind | integer | |
| outs_above_average_rhh | integer | |
| outs_above_average_lhh | integer | |
| actual_success_rate_formatted | character | |
| adj_estimated_success_rate_formatted | character | |
| diff_success_rate_formatted | character | |

statcast_leaderboard_pitch_arsenal(year: 'Union[int, str]', team: 'Optional[str]' = None, min_: 'Optional[Union[int, str]]' = 'q', pitch_hand: 'Optional[str]' = None, csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/pitch-arsenal-stats — per-pitch outcome stats by pitcher.

Parameters

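| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| team | Optional[str] | None | |
| min_ | Optional[Union[int, str]] | 'q' | |
| pitch_hand | Optional[str] | None | |
| csv | bool | True | |
| return_as_pandas | bool | False | |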

statcast_leaderboard_poptime(year: 'Union[int, str]', min2b: 'Optional[int]' = None, min3b: 'Optional[int]' = None, csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/poptime — catcher pop-time leaders.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| min2b | Optional[int] | None | |
| min3b | Optional[int] | None | |
| csv | bool | True | |
| return_as_pandas | bool | False | |

Returns

| col_name | type | description |
| --- | --- | --- |
| entity_name | character | |
| entity_id | integer | |
| team_id | integer | MLBAM team identifier. |
| age | integer | Player age (in years). |
| maxeff_arm_2b_3b_sba | double | |
| exchange_2b_3b_sba | double | |
| pop_2b_sba_count | integer | |
| pop_2b_sba | double | |
| pop_2b_cs | double | |
| pop_2b_sb | double | |
| pop_3b_sba_count | integer | |
| pop_3b_sba | double | |
| pop_3b_cs | double | |
| pop_3b_sb | double | |

statcast_leaderboard_sprint_speed(year: 'Union[int, str]', position: 'Optional[str]' = None, team: 'Optional[str]' = None, min_opp: 'Optional[int]' = None, csv: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs)

GET /leaderboard/sprint_speed — sprint-speed (ft/sec) leaders.

Parameters

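| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| position | Optional[str] | None | |
| team | Optional[str] | None | |
| min_opp | Optional[int] | None | |
| csv | bool | True | |
| return_as_pandas | bool | False | |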

Returns

| col_name | type | description |
| --- | --- | --- |
| last_name, first_name | character | Player name as "Last, First". |
| player_id | integer | MLBAM player ID. |
| team_id | integer | MLBAM team identifier. |
| team | character | Team. |
| position | character | Listed fielding position (e.g. SS, CF, C). |
| age | integer | Player age (in years). |
| competitive_runs | integer | |
| bolts | integer | |
| hp_to_1b | double | |
| sprint_speed | double | |
statcast_player_page(player_id: 'int', stats: 'Optional[str]' = None, **kwargs) -> 'str'

GET /savant-player/{playerId} — Savant player profile page (HTML with embedded JSON).

Returns the raw HTML text. The page embeds JSON blobs under <script id="player-data" type="application/json">…</script> (and a handful of others) that carry the canonical Statcast snapshots for the player. Extracting those blobs is a follow-up — for now the wrapper returns the full HTML so callers can mine it.

TODO: add a sibling statcast_player_data that does the BS4 / regex extraction and returns a typed dict.
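
The extraction step described in the TODO could look roughly like the sketch below. It uses the standard-library html.parser rather than BS4 or a regex, and the helper name extract_script_json is hypothetical, not part of sportsdataverse. The sample HTML and its keys are made up for illustration; only the script id and type come from the description above.

```python
import json
from html.parser import HTMLParser
from typing import Any, List, Optional


class _ScriptJsonParser(HTMLParser):
    """Collect the text of the first <script> tag whose id matches."""

    def __init__(self, script_id: str) -> None:
        super().__init__()
        self._script_id = script_id
        self._capturing = False
        self._done = False
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and not self._done and dict(attrs).get("id") == self._script_id:
            self._capturing = True

    def handle_endtag(self, tag):
        if tag == "script" and self._capturing:
            self._capturing = False
            self._done = True

    def handle_data(self, data):
        # Script bodies can arrive in several pieces, so accumulate them.
        if self._capturing:
            self.chunks.append(data)


def extract_script_json(html: str, script_id: str = "player-data") -> Optional[Any]:
    """Hypothetical helper: parse the JSON embedded in <script id=script_id>."""
    parser = _ScriptJsonParser(script_id)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return json.loads(text) if text else None


# Made-up page fragment; a real profile page is far larger.
sample_html = (
    '<html><body><script id="player-data" type="application/json">'
    '{"playerId": 1, "name": "Example"}'
    "</script></body></html>"
)
player_data = extract_script_json(sample_html)
print(player_data)
```

In practice the html argument would be the string returned by statcast_player_page; the helper returns None when the page has no matching script tag.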

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| player_id | int | | |
| stats | Optional[str] | None | |

statcast_search

GET /statcast_search/csv — pitch-by-pitch Statcast search.

Returns a polars DataFrame of pitches matching the filter set. The Savant endpoint caps results at 25,000 rows per response with no pagination; if the wrapper detects exactly 25,000 rows in the response and raise_on_truncation=True (default), it raises RuntimeError rather than silently returning a partial frame. Use statcast_search_chunked for date ranges that may exceed 25k pitches.
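
The truncation guard described above amounts to a row-count comparison. A minimal sketch, assuming the check runs on the row count of the parsed response; check_truncation is a hypothetical name and not the wrapper's actual internals:

```python
ROW_CAP = 25_000  # per-response cap stated above


def check_truncation(n_rows: int, raise_on_truncation: bool = True) -> bool:
    """Return True when a response looks truncated; optionally raise instead."""
    truncated = n_rows == ROW_CAP
    if truncated and raise_on_truncation:
        raise RuntimeError(
            f"Response has exactly {ROW_CAP} rows and is probably truncated; "
            "narrow the date range or use statcast_search_chunked."
        )
    return truncated


print(check_truncation(1_234))  # a short response is not flagged
```

An exact-match check can misfire on a query that happens to return exactly 25,000 pitches, which is presumably why the behaviour can be switched off with raise_on_truncation=False.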

Most filter args accept either a scalar or an iterable; the wrapper joins iterables with Savant's trailing-pipe convention (e.g. ["FF","SL"] → "FF|SL|").
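
The trailing-pipe convention can be sketched as a small helper. The name to_pipe_list is hypothetical, and treating a scalar as a one-item list (so it also gets a trailing pipe) is an assumption; the text above only specifies the behaviour for iterables.

```python
from typing import Iterable, Optional, Union

Scalar = Union[str, int]


def to_pipe_list(value: Optional[Union[Scalar, Iterable[Scalar]]]) -> str:
    """Join filter values so that every item is followed by a pipe."""
    if value is None:
        return ""
    # Assumption: a scalar is handled like a single-item iterable.
    items = [value] if isinstance(value, (str, int)) else list(value)
    return "".join(f"{item}|" for item in items)


print(to_pipe_list(["FF", "SL"]))
print(to_pipe_list("FF"))
```

Checking for str before iterating matters here: a bare string is itself iterable, so without the check "FF" would be split into "F|F|".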

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_date | str | | |
| end_date | str | | |
| player_type | str | 'batter' | "batter" (default) or "pitcher"; controls which side of the matchup the batters_lookup / pitchers_lookup / team filters apply to. |
| season | Optional[Union[str, Iterable[str]]] | None | |
| game_type | Optional[Union[str, Iterable[str]]] | None | |
| batters_lookup | Optional[Union[int, Iterable[int]]] | None | |
| pitchers_lookup | Optional[Union[int, Iterable[int]]] | None | |
| team | Optional[str] | None | |
| opponent | Optional[str] | None | |
| home_road | Optional[str] | None | "home" / "road". |
| stadium | Optional[Union[int, str]] | None | venue id. |
| pitcher_throws | Optional[str] | None | |
| batter_stands | Optional[str] | None | |
| position | Optional[Union[str, Iterable[str]]] | None | |
| pitch_type | Optional[Union[str, Iterable[str]]] | None | pipe-list of pitch codes ("FF","SL","CU","CH","SI","FC"…). |
| count | Optional[Union[str, Iterable[str]]] | None | pipe-list of pitcher–batter counts (e.g. ["00","11"]). |
| at_bat_result | Optional[Union[str, Iterable[str]]] | None | pipe-list of PA outcomes ("single","home_run","walk"…). |
| batted_ball_type | Optional[Union[str, Iterable[str]]] | None | "fly_ball","ground_ball","line_drive","popup". |
| pitch_result | Optional[Union[str, Iterable[str]]] | None | "called_strike","ball","swinging_strike","foul",…. |
| zone | Optional[Union[str, Iterable[str]]] | None | gameday zone (1-14). |
| outs | Optional[Union[int, Iterable[int]]] | None | |
| inning | Optional[Union[int, Iterable[int]]] | None | |
| runners_on | Optional[Union[str, Iterable[str]]] | None | "none","on_first","on_second","on_third","RISP". |
| flag | Optional[Union[str, Iterable[str]]] | None | special flags ("is_barrel","is_solidcontact","is_putaway"…). |
| return_as_pandas | bool | False | convert the returned polars frame to pandas. |
| raise_on_truncation | bool | True | when True (default), raise if the response has exactly 25,000 rows. |

Returns

polars.DataFrame (or pandas if return_as_pandas=True) with one row per pitch and about 120 columns covering pitch tracking, batted-ball metrics, Statcast outcomes, and game/play context.

| col_name | type | description |
| --- | --- | --- |
| pitch_type | character | Abbreviation of the pitch type thrown (e.g. FF, SL, CH). |
| game_date | character | Game date (YYYY-MM-DD). |
| release_speed | double | Pitch velocity out of the hand (mph). |
| release_pos_x | double | Horizontal release position of the ball, catcher's perspective (feet). |
| release_pos_z | double | Vertical release position of the ball, catcher's perspective (feet). |
| player_name | character | Pitcher (or batter, by query) name, Last, First. |
| batter | integer | MLBAM ID of the batter. |
| pitcher | integer | MLBAM ID of the pitcher. |
| events | character | Plate-appearance result on the pitch that ends it (e.g. single, home_run, walk). |
| description | character | Result of the pitch (e.g. called_strike, ball, swinging_strike, foul). |
| spin_dir | character | Deprecated spin direction field, no longer populated. |
| spin_rate_deprecated | character | Deprecated legacy spin-rate field, no longer populated. |
| break_angle_deprecated | character | Deprecated legacy break-angle field, no longer populated. |
| break_length_deprecated | character | Deprecated legacy break-length field, no longer populated. |
| zone | integer | Strike-zone region the pitch crossed (1-14 Gameday zone). |
| des | character | Full text description of the play. |
| game_type | character | Game type code (R, P, etc.). |
| stand | character | Side of the plate the batter is standing (L or R). |
| p_throws | character | Hand the pitcher throws with (L or R). |
| home_team | character | Home team name. |
| away_team | character | Away team name. |
| type | character | Short pitch-result code (B = ball, S = strike, X = in play). |
| hit_location | integer | Fielder position number that fielded the ball. |
| bb_type | character | Batted-ball type (ground_ball, line_drive, fly_ball, popup). |
| balls | integer | Ball count before the pitch. |
| strikes | integer | Strike count before the pitch. |
| game_year | integer | Season year of the game. |
| pfx_x | double | Horizontal pitch movement from the catcher's perspective (feet). |
| pfx_z | double | Vertical pitch movement from the catcher's perspective (feet). |
| plate_x | double | Horizontal position of the pitch crossing the plate (feet from center). |
| plate_z | double | Vertical position of the pitch crossing the plate (feet above ground). |
| on_3b | character | MLBAM ID of the runner on third base, if any. |
| on_2b | integer | MLBAM ID of the runner on second base, if any. |
| on_1b | integer | MLBAM ID of the runner on first base, if any. |
| outs_when_up | integer | Number of outs when the batter came to the plate. |
| inning | integer | Inning number. |
| inning_topbot | character | Half of the inning (Top or Bot). |
| hc_x | double | Hit coordinate X on the field diagram. |
| hc_y | double | Hit coordinate Y on the field diagram. |
| tfs_deprecated | character | Deprecated time-from-start field, no longer populated. |
| tfs_zulu_deprecated | character | Deprecated Zulu time-from-start field, no longer populated. |
| umpire | character | Deprecated umpire field, no longer populated. |
| sv_id | character | Deprecated Sportvision/Statcast pitch identifier, no longer populated. |
| vx0 | double | Velocity of the pitch in the x-direction at y=50 ft (ft/s). |
| vy0 | double | Velocity of the pitch in the y-direction at y=50 ft (ft/s). |
| vz0 | double | Velocity of the pitch in the z-direction at y=50 ft (ft/s). |
| ax | double | Acceleration of the pitch in the x-direction at y=50 ft (ft/s^2). |
| ay | double | Acceleration of the pitch in the y-direction at y=50 ft (ft/s^2). |
| az | double | Acceleration of the pitch in the z-direction at y=50 ft (ft/s^2). |
| sz_top | double | Top of the batter's strike zone for the pitch (feet). |
| sz_bot | double | Bottom of the batter's strike zone for the pitch (feet). |
| hit_distance_sc | integer | Statcast-measured projected distance of the batted ball (feet). |
| launch_speed | double | Exit velocity of the batted ball (mph). |
| launch_angle | integer | Vertical launch angle of the batted ball (degrees). |
| effective_speed | double | Perceived velocity adjusted for release extension (mph). |
| release_spin_rate | integer | Spin rate of the pitch at release (rpm). |
| release_extension | double | Distance toward the plate at release (feet). |
| game_pk | integer | Unique game identifier. |
| fielder_2 | integer | MLBAM ID of the catcher. |
| fielder_3 | integer | MLBAM ID of the first baseman. |
| fielder_4 | integer | MLBAM ID of the second baseman. |
| fielder_5 | integer | MLBAM ID of the third baseman. |
| fielder_6 | integer | MLBAM ID of the shortstop. |
| fielder_7 | integer | MLBAM ID of the left fielder. |
| fielder_8 | integer | MLBAM ID of the center fielder. |
| fielder_9 | integer | MLBAM ID of the right fielder. |
| release_pos_y | double | Release position of the ball toward the plate (feet). |
| estimated_ba_using_speedangle | double | Expected batting average based on exit velocity and launch angle. |
| estimated_woba_using_speedangle | double | Expected wOBA based on exit velocity and launch angle. |
| woba_value | double | wOBA value assigned to the event. |
| woba_denom | integer | wOBA denominator (plate-appearance weight) for the event. |
| babip_value | integer | BABIP value assigned to the event (0 or 1). |
| iso_value | integer | Isolated power value assigned to the event. |
| launch_speed_angle | integer | Batted-ball classification code (1-6) from exit velocity and angle. |
| at_bat_number | integer | Sequential plate-appearance number within the game. |
| pitch_number | integer | Pitch number within the plate appearance. |
| pitch_name | character | Full name of the pitch type (e.g. 4-Seam Fastball, Slider). |
| home_score | integer | Home team score before the pitch. |
| away_score | integer | Away team score before the pitch. |
| bat_score | integer | Batting team score before the pitch. |
| fld_score | integer | Fielding team score before the pitch. |
| post_away_score | integer | Away team score after the pitch. |
| post_home_score | integer | Home team score after the pitch. |
| post_bat_score | integer | Batting team score after the pitch. |
| post_fld_score | integer | Fielding team score after the pitch. |
| if_fielding_alignment | character | Infield defensive alignment (Standard, Strategic, Infield shift). |
| of_fielding_alignment | character | Outfield defensive alignment (Standard, Strategic, 4th outfielder). |
| spin_axis | integer | Spin axis of the pitch as a clock-face angle (degrees). |
| delta_home_win_exp | double | Change in home team win expectancy on the play. |
| delta_run_exp | double | Change in run expectancy on the play. |
| bat_speed | double | Bat speed at the point of contact (mph). |
| swing_length | double | Length of the swing path to contact (feet). |
| miss_distance | double | |
| estimated_slg_using_speedangle | double | Expected slugging based on exit velocity and launch angle. |
| delta_pitcher_run_exp | double | Change in run expectancy credited to the pitcher. |
| hyper_speed | double | Adjusted (90th-percentile) exit velocity (mph). |
| home_score_diff | integer | Home team score minus away team score before the pitch. |
| bat_score_diff | integer | Batting team score minus fielding team score before the pitch. |
| home_win_exp | double | Home team win expectancy before the play. |
| bat_win_exp | double | Batting team win expectancy before the play. |
| age_pit_legacy | integer | Pitcher age using the legacy calculation. |
| age_bat_legacy | integer | Batter age using the legacy calculation. |
| age_pit | integer | Pitcher age for the season. |
| age_bat | integer | Batter age for the season. |
| n_thruorder_pitcher | integer | Times through the order the pitcher is facing the lineup. |
| n_priorpa_thisgame_player_at_bat | integer | Number of prior plate appearances by the batter in the game. |
| pitcher_days_since_prev_game | integer | Days since the pitcher's previous game appearance. |
| batter_days_since_prev_game | integer | Days since the batter's previous game appearance. |
| pitcher_days_until_next_game | integer | Days until the pitcher's next game appearance. |
| batter_days_until_next_game | integer | Days until the batter's next game appearance. |
| api_break_z_with_gravity | double | Vertical pitch break including gravity (inches). |
| api_break_x_arm | double | Horizontal pitch break to the pitcher's arm side (inches). |
| api_break_x_batter_in | double | Horizontal pitch break toward/away from the batter (inches). |
| arm_angle | double | Pitcher's arm angle at release (degrees). |
| attack_angle | double | Angle of the bat's path at contact (degrees). |
| attack_direction | double | Horizontal direction of the swing at contact (degrees). |
| swing_path_tilt | double | Vertical tilt of the swing path (degrees). |
| intercept_ball_minus_batter_pos_x_inches | double | Horizontal offset of ball-bat intercept from batter position (inches). |
| intercept_ball_minus_batter_pos_y_inches | double | Depth offset of ball-bat intercept from batter position (inches). |

statcast_search_chunked(start_date: 'str', end_date: 'str', *, chunk_days: 'int' = 5, return_as_pandas: 'bool' = False, **kwargs)

Auto-chunk a date range into chunk_days-day windows and concatenate.

Wraps statcast_search and stitches results client-side. Useful for multi-month or full-season pulls that would exceed the 25k row cap in a single request.
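
The windowing step can be illustrated with a short generator. This is a sketch under stated assumptions, not the library's implementation: dates are taken to be ISO "YYYY-MM-DD" strings, windows are inclusive and non-overlapping, and date_windows is a hypothetical name.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_windows(start_date: str, end_date: str, chunk_days: int = 5) -> Iterator[Tuple[str, str]]:
    """Yield inclusive (start, end) windows of at most chunk_days days."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    while start <= end:
        # The last window is clipped so it never runs past end_date.
        stop = min(start + timedelta(days=chunk_days - 1), end)
        yield start.isoformat(), stop.isoformat()
        start = stop + timedelta(days=1)


windows = list(date_windows("2024-04-01", "2024-04-12", chunk_days=5))
for window in windows:
    print(window)
```

Each window would then be passed to statcast_search as start_date / end_date and the resulting frames concatenated. Keeping the windows non-overlapping avoids duplicate pitches in the stitched result.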

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| start_date | str | | |
| end_date | str | | |
| chunk_days | int | 5 | window size in days (default 5, typical for the regular season; use smaller windows for the postseason, when there are more high-event games). |
| return_as_pandas | bool | False | convert the concatenated frame to pandas. |

Returns

polars.DataFrame (or pandas) of all pitches in the range.

MLB Stats API

mlb_api_attendance(team_id: 'Optional[int]' = None, league_id: 'Optional[Union[int, str]]' = None, season: 'Optional[Union[int, str]]' = None, league_list_id: 'Optional[str]' = None, game_type: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/attendance — game attendance figures.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | Optional[int] | None | |
| league_id | Optional[Union[int, str]] | None | |
| season | Optional[Union[int, str]] | None | |
| league_list_id | Optional[str] | None | |
| game_type | Optional[str] | None | |

mlb_api_divisions(sport_id: 'int' = 1, league_id: 'Optional[Union[int, str]]' = None, division_id: 'Optional[int]' = None, **kwargs) -> 'Dict'

GET /api/v1/divisions — list divisions.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sport_id | int | 1 | |
| league_id | Optional[Union[int, str]] | None | |
| division_id | Optional[int] | None | |

mlb_api_draft_prospects(year: 'Union[int, str]', scouting_report: 'Optional[bool]' = None, limit: 'int' = 100, **kwargs) -> 'Dict'

GET /api/v1/draft/prospects/{year} — draft prospect list for a year.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| year | Union[int, str] | | |
| scouting_report | Optional[bool] | None | |
| limit | int | 100 | |

mlb_api_pbp_diff(game_pk: 'int', start_timecode: 'str', end_timecode: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/game/{gamePk}/feed/live/diffPatch — JSON-patch diff of the live feed.

Lets low-bandwidth clients replay in-game state changes instead of refetching the full live feed.
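
To show what consuming a JSON-patch diff involves, here is a minimal applier for the add, replace, and remove operations of RFC 6902. It is a sketch only: the envelope the endpoint wraps around the operations is not described here, the sample state and paths are invented, and apply_patch is a hypothetical helper that ignores the other operation types and root-level paths.

```python
import copy
from typing import Any, Dict, List


def _unescape(token: str) -> str:
    # JSON Pointer escapes: "~1" means "/", "~0" means "~" (in that order).
    return token.replace("~1", "/").replace("~0", "~")


def apply_patch(state: Dict[str, Any], ops: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply add / replace / remove operations to a deep copy of state."""
    doc = copy.deepcopy(state)
    for op in ops:
        tokens = [_unescape(t) for t in op["path"].split("/")[1:]]
        parent: Any = doc
        for tok in tokens[:-1]:
            parent = parent[int(tok)] if isinstance(parent, list) else parent[tok]
        last, kind = tokens[-1], op["op"]
        if isinstance(parent, list):
            if kind == "add":
                # "-" appends; otherwise insert before the given index.
                index = len(parent) if last == "-" else int(last)
                parent.insert(index, op["value"])
            elif kind == "replace":
                parent[int(last)] = op["value"]
            elif kind == "remove":
                del parent[int(last)]
            else:
                raise ValueError(f"unsupported op: {kind}")
        elif kind in ("add", "replace"):
            parent[last] = op["value"]
        elif kind == "remove":
            del parent[last]
        else:
            raise ValueError(f"unsupported op: {kind}")
    return doc


# Invented state and operations, for illustration only.
before = {"count": {"balls": 1, "strikes": 0}, "plays": ["pitch 1"]}
ops = [
    {"op": "replace", "path": "/count/balls", "value": 2},
    {"op": "add", "path": "/plays/-", "value": "pitch 2"},
]
after = apply_patch(before, ops)
print(after)
```

Working on a deep copy keeps the previous snapshot intact, which is convenient when stepping through a game one timecode at a time.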

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_pk | int | | |
| start_timecode | str | | |
| end_timecode | Optional[str] | None | |

mlb_api_pbp_live(game_pk: 'int', language: 'Optional[str]' = None, timecode: 'Optional[str]' = None, hydrate: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1.1/game/{gamePk}/feed/live — live firehose (v1.1).

Top-level keys: copyright, gamePk, link, metaData, gameData, liveData. Includes Statcast metrics where available. The historical name mlb_api_pbp is preserved as an alias.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_pk | int | | |
| language | Optional[str] | None | |
| timecode | Optional[str] | None | |
| hydrate | Optional[str] | None | |
| fields | Optional[str] | None | |

mlb_api_person_stats(person_id: 'int', stats: 'str', group: 'str' = 'hitting', season: 'Optional[Union[int, str]]' = None, season_type: 'Optional[str]' = None, sport_ids: 'Optional[Union[int, List[int]]]' = None, game_type: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/people/{personId}/stats — player aggregate stats.

stats: season, career, yearByYear, vsTeam, vsPlayer, byMonth, byDayOfWeek, homeAndAway, gameLog, lastXGames, …

Parameters

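| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| person_id | int | | |
| stats | str | | |
| group | str | 'hitting' | |
| season | Optional[Union[int, str]] | None | |
| season_type | Optional[str] | None | |
| sport_ids | Optional[Union[int, List[int]]] | None | |
| game_type | Optional[str] | None | |
| fields | Optional[str] | None | |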

mlb_api_schedule(date: 'Optional[str]' = None, start_date: 'Optional[str]' = None, end_date: 'Optional[str]' = None, team_id: 'Optional[int]' = None, opponent_id: 'Optional[int]' = None, season: 'Optional[Union[int, str]]' = None, sport_id: 'int' = 1, game_type: 'Optional[str]' = None, league_id: 'Optional[Union[int, str]]' = None, hydrate: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/schedule — schedule of games for a date, range, team, or season.

Response: dates[].games[].
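
Walking the dates[].games[] structure is a two-level loop. The sketch below runs on a hand-made sample payload; the gamePk key inside each game is an assumption based on the Stats API's naming elsewhere, and iter_games is a hypothetical helper.

```python
from typing import Any, Dict, Iterator


def iter_games(schedule: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every game dict found under dates[].games[]."""
    for day in schedule.get("dates") or []:
        for game in day.get("games") or []:
            yield game


# Hand-made payload; real responses carry many more fields per game.
sample_schedule = {
    "dates": [
        {"date": "2024-04-01", "games": [{"gamePk": 1}, {"gamePk": 2}]},
        {"date": "2024-04-02", "games": []},
    ]
}
game_pks = [game.get("gamePk") for game in iter_games(sample_schedule)]
print(game_pks)
```

With a real response, the schedule argument would be the dict returned by mlb_api_schedule.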

Parameters

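| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| date | Optional[str] | None | |
| start_date | Optional[str] | None | |
| end_date | Optional[str] | None | |
| team_id | Optional[int] | None | |
| opponent_id | Optional[int] | None | |
| season | Optional[Union[int, str]] | None | |
| sport_id | int | 1 | |
| game_type | Optional[str] | None | |
| league_id | Optional[Union[int, str]] | None | |
| hydrate | Optional[str] | None | |
| fields | Optional[str] | None | |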

mlb_api_seasons(sport_id: 'int' = 1, season: 'Optional[Union[int, str]]' = None, all_seasons: 'bool' = False, **kwargs) -> 'Dict'

GET /api/v1/seasons — list of seasons for a sport.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sport_id | int | 1 | |
| season | Optional[Union[int, str]] | None | |
| all_seasons | bool | False | |

mlb_api_standings(league_id: 'Union[int, str, List[int]]' = '103,104', season: 'Optional[Union[int, str]]' = None, date: 'Optional[str]' = None, standings_types: 'Optional[str]' = None, hydrate: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/standings — league standings.

league_id: 103 AL, 104 NL (comma-separated for both, the default). standings_types e.g. regularSeason, wildCard, divisionLeaders.
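
Because league_id accepts an int, a string, or a list of ints, the wrapper presumably normalises it to the comma-separated form before sending the request. A minimal sketch of that normalisation, with normalize_league_id as a hypothetical helper name:

```python
from typing import List, Union


def normalize_league_id(league_id: Union[int, str, List[int]] = "103,104") -> str:
    """Return league ids as the comma-separated string used in the query."""
    if isinstance(league_id, (list, tuple)):
        return ",".join(str(item) for item in league_id)
    return str(league_id)


print(normalize_league_id([103, 104]))
print(normalize_league_id(103))
```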

Parameters

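| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| league_id | Union[int, str, List[int]] | '103,104' | |
| season | Optional[Union[int, str]] | None | |
| date | Optional[str] | None | |
| standings_types | Optional[str] | None | |
| hydrate | Optional[str] | None | |
| fields | Optional[str] | None | |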

mlb_api_stats(stats: 'str', group: 'str', season: 'Optional[Union[int, str]]' = None, sport_id: 'int' = 1, league_id: 'Optional[Union[int, str]]' = None, team_id: 'Optional[int]' = None, player_pool: 'Optional[str]' = None, game_type: 'Optional[str]' = None, limit: 'int' = 50, offset: 'int' = 0, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/stats — generic stats query.

stats selects the slice (season, career, yearByYear, …) and group selects the stat group (hitting, pitching, fielding). Filters: season, team_id, league_id, game_type, player_pool.
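
The limit and offset arguments in the signature suggest page-by-page retrieval. The sketch below shows one way to drive that loop. It is generic: fetch_page is a stand-in callable and the fake data is invented, since the shape of the real response (and where its rows live) is not described here.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, int]


def paginate(fetch_page: Callable[[int, int], List[Row]], limit: int = 50) -> Iterator[Row]:
    """Request pages of size limit until a short page signals the end."""
    offset = 0
    while True:
        rows = fetch_page(limit, offset)
        yield from rows
        if len(rows) < limit:
            break
        offset += limit


# Stand-in for a real request: 120 invented rows served in slices.
_FAKE_ROWS: List[Row] = [{"rank": i} for i in range(1, 121)]


def fake_fetch(limit: int, offset: int) -> List[Row]:
    return _FAKE_ROWS[offset:offset + limit]


all_rows = list(paginate(fake_fetch, limit=50))
print(len(all_rows))
```

In real use, fetch_page would call mlb_api_stats(..., limit=limit, offset=offset) and pull the row list out of the returned dict. When the total is an exact multiple of limit, the loop makes one extra request that returns an empty page.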

Parameters

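| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| stats | str | | |
| group | str | | |
| season | Optional[Union[int, str]] | None | |
| sport_id | int | 1 | |
| league_id | Optional[Union[int, str]] | None | |
| team_id | Optional[int] | None | |
| player_pool | Optional[str] | None | |
| game_type | Optional[str] | None | |
| limit | int | 50 | |
| offset | int | 0 | |
| fields | Optional[str] | None | |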

mlb_api_stats_leaders(leader_categories: 'str', season: 'Optional[Union[int, str]]' = None, leader_game_types: 'Optional[str]' = None, stat_group: 'Optional[str]' = None, league_id: 'Optional[Union[int, str]]' = None, sport_id: 'int' = 1, limit: 'int' = 10, **kwargs) -> 'Dict'

GET /api/v1/stats/leaders — top-N leaders for a stat category.

Parameters

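| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| leader_categories | str | | |
| season | Optional[Union[int, str]] | None | |
| leader_game_types | Optional[str] | None | |
| stat_group | Optional[str] | None | |
| league_id | Optional[Union[int, str]] | None | |
| sport_id | int | 1 | |
| limit | int | 10 | |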

mlb_api_stats_streaks(streak_type: 'str', streak_threshold: 'int' = 1, season: 'Optional[Union[int, str]]' = None, stat_group: 'Optional[str]' = None, active_streak: 'Optional[bool]' = None, sport_id: 'int' = 1, **kwargs) -> 'Dict'

GET /api/v1/stats/streaks — active or historical streaks.

streak_type e.g. hittingStreakOverall, onBaseOverall.

Parameters

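| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| streak_type | str | | |
| streak_threshold | int | 1 | |
| season | Optional[Union[int, str]] | None | |
| stat_group | Optional[str] | None | |
| active_streak | Optional[bool] | None | |
| sport_id | int | 1 | |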

mlb_api_team_leaders(team_id: 'int', leader_categories: 'str', season: 'Optional[Union[int, str]]' = None, leader_game_types: 'Optional[str]' = None, limit: 'int' = 10, **kwargs) -> 'Dict'

GET /api/v1/teams/{teamId}/leaders — team leaders.

leader_categories e.g. homeRuns, battingAverage, wins, earnedRunAverage (comma-separated for multi).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | int | | |
| leader_categories | str | | |
| season | Optional[Union[int, str]] | None | |
| leader_game_types | Optional[str] | None | |
| limit | int | 10 | |

mlb_api_team_stats(team_id: 'int', season: 'Union[int, str]', stats: 'str' = 'season', group: 'str' = 'hitting', sport_ids: 'Optional[Union[int, List[int]]]' = None, game_type: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/teams/{teamId}/stats — team-level stats.

stats: season, career, yearByYear, byMonth, byDayOfWeek, … group: hitting, pitching, fielding.

Parameters

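| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | int | | |
| season | Union[int, str] | | |
| stats | str | 'season' | |
| group | str | 'hitting' | |
| sport_ids | Optional[Union[int, List[int]]] | None | |
| game_type | Optional[str] | None | |
| fields | Optional[str] | None | |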

mlb_api_teams(season: 'Optional[Union[int, str]]' = None, sport_id: 'int' = 1, league_ids: 'Optional[Union[int, List[int], str]]' = None, active_status: 'Optional[str]' = None, all_star_statuses: 'Optional[str]' = None, hydrate: 'Optional[str]' = None, fields: 'Optional[str]' = None, **kwargs) -> 'Dict'

GET /api/v1/teams — list teams. sport_id=1 = MLB.

Parameters

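| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | Optional[Union[int, str]] | None | |
| sport_id | int | 1 | |
| league_ids | Optional[Union[int, List[int], str]] | None | |
| active_status | Optional[str] | None | |
| all_star_statuses | Optional[str] | None | |
| hydrate | Optional[str] | None | |
| fields | Optional[str] | None | |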

Play-by-play, schedule & rosters

espn_mlb_game_rosters(game_id: 'int', raw: 'bool' = False, return_as_pandas: 'bool' = False, **kwargs)

espn_mlb_game_rosters - pull the active game rosters for both teams.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | int | | ESPN game id. |
| raw | bool | False | When True, returns the merged competitor + roster payload dict. |
| return_as_pandas | bool | False | When True, returns a pandas dataframe; otherwise polars. |

Returns

One row per (game × team × athlete) with columns game_id, team_id, home_away, athlete_id, athlete_full_name, athlete_jersey, athlete_position_id, athlete_position_abbreviation, athlete_starter.

Example

from sportsdataverse.mlb import espn_mlb_game_rosters
ros = espn_mlb_game_rosters(game_id=401569461)
print(ros.shape)
print(ros.group_by("home_away").len())

espn_mlb_pbp(game_id: 'int', raw: 'bool' = False, **kwargs) -> 'Dict'

espn_mlb_pbp - pull the full ESPN game-summary payload for one MLB game.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | int | | ESPN game id (the "event id"). Obtainable from espn_mlb_schedule. |
| raw | bool | False | When True, returns the full nested payload unchanged. When False (default), the same payload is returned for now; full parsing into a tidy plays / boxscore dict is not yet implemented. |

Returns

The Site v2 summary payload. Top-level keys typically include header, boxscore, plays, leaders, scoringPlays, gameInfo, winprobability, pickcenter, news, videos, standings, article, seasonseries, broadcasts, predictor.

Example

from sportsdataverse.mlb import espn_mlb_pbp

# Fetch the raw summary payload for one game
game = espn_mlb_pbp(game_id=401569461, raw=True)
print(sorted(game.keys()))
print(game.get("header", {}).get("competitions", [{}])[0].get("date"))

# Iterate the plays array
plays = game.get("plays") or []
print(f"{len(plays)} plays")
for p in plays[:3]:
    print(p.get("text"))

espn_mlb_player_stats(athlete_id: 'int', season: 'int', *, season_type: 'str' = 'regular', total: 'bool' = False, raw: 'bool' = False, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'pl.DataFrame | pd.DataFrame | dict[str, Any]'

Pull an MLB athlete's ESPN season stat line as one wide row.

See sportsdataverse.wbb.espn_wbb_player_stats for full documentation of the wide return shape, the {category}_{stat} stat columns (for baseball: batting_*, pitching_*, fielding_*), the athlete / team metadata blocks, and the season_type / total parameters. For the richer multi-category web-v3 payload use sportsdataverse.mlb.espn_mlb_player_stats_v3.
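
The {category}_{stat} naming can be pictured as flattening a nested list of stat categories into one wide row. The sketch below is illustrative only: the nested input shape (categories with name and stats entries) is an assumption about the raw payload, the sample values are made up, and to_wide_row is a hypothetical helper.

```python
from typing import Any, Dict, List


def to_wide_row(categories: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten category/stat pairs into {category}_{stat} columns."""
    row: Dict[str, Any] = {}
    for category in categories:
        prefix = category["name"]
        for stat in category.get("stats", []):
            row[f"{prefix}_{stat['name']}"] = stat["value"]
    return row


# Made-up input in the assumed nested shape.
sample_categories = [
    {"name": "batting", "stats": [
        {"name": "hits", "value": 150.0},
        {"name": "home_runs", "value": 30.0},
    ]},
    {"name": "pitching", "stats": [{"name": "strikeouts", "value": 0.0}]},
]
wide_row = to_wide_row(sample_categories)
print(wide_row)
```

Prefixing every stat with its category keeps names that occur in more than one category, such as strikeouts for batters and pitchers, from colliding in the single wide row.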

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| athlete_id | int | | ESPN MLB athlete identifier (e.g. 33192 for Aaron Judge). |
| season | int | | Season year, used in the core-v2 path. |
| season_type | str | 'regular' | "regular" (type 2) or "postseason" (type 3). |
| total | bool | False | Forward-compat totals passthrough. |
| raw | bool | False | If True, returns the raw core-v2 statistics JSON dict. |
| return_as_pandas | bool | False | If True, returns a pandas DataFrame; else polars. |

Returns

A single-row wide DataFrame (polars by default). When raw=True returns the raw statistics JSON dict.

col_nametypedescription
seasonintegerSeason year.
season_typecharacterSeason-type id.
totallogicalTotal.
athlete_idintegerUnique ESPN athlete identifier.
athlete_uidcharacterAthlete uid.
athlete_guidcharacterAthlete guid.
athlete_typecharacterAthlete type.
first_namecharacterPlayer first name.
last_namecharacterPlayer last name.
full_namecharacterPlayer's full name.
display_namecharacterDisplay name.
short_namecharacterShort display name.
weightdoubleWeight in pounds.
display_weightcharacterDisplay weight.
heightdoubleHeight (feet and inches).
display_heightcharacterDisplay height.
ageintegerPlayer age (in years).
date_of_birthcharacterDate of birth.
jerseycharacterJersey number worn by the player.
slugcharacterURL-safe identifier.
activelogicalWhether the player is currently active.
position_idintegerUnique position identifier.
position_namecharacterPosition name.
position_display_namecharacterPosition display name.
position_abbreviationcharacterPosition abbreviation.
college_namecharacterCollege name.
status_idintegerStatus id.
status_namecharacterAthlete status name (e.g. 'Active').
batting_games_playeddoubleBatting: games played.
batting_team_games_playeddoubleBatting: team games played.
batting_hit_by_pitchdoubleBatting: times hit by pitch.
batting_ground_ballsdoubleBatting: ground balls.
batting_strikeoutsdoubleBatting: strikeouts.
batting_rb_isdoubleBatting: runs batted in (RBIs).
batting_sac_hitsdoubleBatting: sacrifice hits.
batting_hitsdoubleBatting: hits.
batting_stolen_basesdoubleBatting: stolen bases.
batting_walksdoubleBatting: walks.
batting_catcher_interferencedoubleBatting: catcher interference.
batting_runsdoubleBatting: runs.
batting_gid_psdoubleBatting: grounded into double plays (GIDPs).
batting_sac_fliesdoubleBatting: sacrifice flies.
batting_at_batsdoubleBatting: at bats.
batting_home_runsdoubleBatting: home runs.
batting_grand_slam_home_runsdoubleBatting: grand slam home runs.
batting_runners_left_on_basedoubleBatting: runners left on base.
batting_triplesdoubleBatting: triples.
batting_game_winning_rb_isdoubleBatting: game-winning RBIs.
batting_intentional_walksdoubleBatting: intentional walks.
batting_doublesdoubleBatting: doubles.
batting_fly_ballsdoubleBatting: fly balls.
batting_caught_stealingdoubleBatting: caught stealing.
batting_pitchesdoubleBatting: pitches.
batting_games_starteddoubleBatting: games started.
batting_pinch_at_batsdoubleBatting: pinch at bats.
batting_pinch_hitsdoubleBatting: pinch hits.
batting_player_ratingdoubleBatting: player rating.
batting_is_qualifieddoubleBatting: is qualified.
batting_is_qualified_stealsdoubleBatting: is qualified (steals).
batting_total_basesdoubleBatting: total bases.
batting_plate_appearancesdoubleBatting: plate appearances.
batting_projected_home_runsdoubleBatting: projected home runs.
batting_extra_base_hitsdoubleBatting: extra-base hits.
batting_runs_createddoubleBatting: runs created.
batting_avgdoubleBatting: batting average.
batting_pinch_avgdoubleBatting: pinch average.
batting_slug_avgdoubleBatting: slugging average.
batting_secondary_avgdoubleBatting: secondary average.
batting_on_base_pctdoubleBatting: on-base percentage.
batting_opsdoubleBatting: OPS (on-base plus slugging).
batting_ground_to_fly_ratiodoubleBatting: ground-to-fly ratio.
batting_runs_created_per27_outsdoubleBatting: runs created per 27 outs.
batting_batter_ratingdoubleBatting: batter rating.
batting_at_bats_per_home_rundoubleBatting: at bats per home run.
batting_stolen_base_pctdoubleBatting: stolen-base percentage.
batting_pitches_per_plate_appearancedoubleBatting: pitches per plate appearance.
batting_isolated_powerdoubleBatting: isolated power.
batting_walk_to_strikeout_ratiodoubleBatting: walk-to-strikeout ratio.
batting_walks_per_plate_appearancedoubleBatting: walks per plate appearance.
batting_secondary_avg_minus_badoubleBatting: secondary average minus batting average.
batting_runs_produceddoubleBatting: runs produced.
batting_runs_ratiodoubleBatting: runs ratio.
batting_patience_ratiodouble
batting_bipadouble
batting_mlb_ratingdouble
batting_off_warbrdouble
batting_warbrdouble
fielding_games_playeddouble
fielding_team_games_playeddouble
fielding_double_playsdouble
fielding_opportunitiesdouble
fielding_errorsdouble
fielding_passed_ballsdouble
fielding_assistsdouble
fielding_outfield_assistsdouble
fielding_pickoffsdouble
fielding_putoutsdouble
fielding_outs_on_fielddouble
fielding_triple_playsdouble
fielding_balls_in_zonedouble
fielding_extra_basesdouble
fielding_outs_madedouble
fielding_hitsdouble
fielding_total_basesdouble
fielding_games_starteddouble
fielding_catcher_third_innings_playeddouble
fielding_catcher_caught_stealingdouble
fielding_catcher_stolen_bases_alloweddouble
fielding_catcher_earned_runsdouble
fielding_is_qualifieddouble
fielding_is_qualified_catcherdouble
fielding_is_qualified_pitcherdouble
fielding_successful_chancesdouble
fielding_total_chancesdouble
fielding_full_innings_playeddouble
fielding_part_innings_playeddouble
fielding_fielding_pctdouble
fielding_range_factordouble
fielding_zone_ratingdouble
fielding_catcher_caught_stealing_pctdouble
fielding_catcher_eradouble
fielding_def_warbrdouble
team_idintegerUnique ESPN team identifier.
team_uidcharacterESPN universal team identifier (UID).
team_guidcharacterESPN team GUID.
team_slugcharacterURL-safe team identifier.
team_locationcharacterTeam city / location.
team_namecharacterTeam name.
team_abbreviationcharacterShort team abbreviation (e.g. 'NYY').
team_display_namecharacterFull team display name (e.g. 'New York Yankees').
team_short_display_namecharacterShort team display name.
team_colorcharacterTeam primary color (hex, no leading '#').
team_alternate_colorcharacterTeam alternate color (hex).
team_is_activelogicalTeam is active.
team_logo_hrefcharacterDefault team logo URL.

Example

from sportsdataverse.mlb import espn_mlb_player_stats
df = espn_mlb_player_stats(athlete_id=33192, season=2023)
df.select(["full_name", "team_display_name", "batting_home_runs"])
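
Because the stat columns follow the {category}_{stat} naming pattern, the wide row can be regrouped by category with plain string handling. The following is a minimal sketch, assuming the row has already been converted to a plain dict (for example with polars' df.to_dicts()[0]). The helper name split_by_category and the sample values are hypothetical and are not part of sportsdataverse.

```python
from collections import defaultdict

# Stat-category prefixes the docs name for baseball.
CATEGORIES = ("batting", "pitching", "fielding")


def split_by_category(row):
    """Group {category}_{stat} keys of one wide row into nested dicts.

    Hypothetical helper, not part of sportsdataverse. Keys without a
    known category prefix (metadata such as full_name) go under "meta".
    """
    grouped = defaultdict(dict)
    for key, value in row.items():
        prefix, _, stat = key.partition("_")
        if prefix in CATEGORIES and stat:
            grouped[prefix][stat] = value
        else:
            grouped["meta"][key] = value
    return dict(grouped)


# Made-up values for illustration only; not real ESPN data.
sample_row = {
    "full_name": "Example Player",
    "team_abbreviation": "XYZ",
    "batting_home_runs": 30.0,
    "batting_at_bats": 500.0,
    "fielding_errors": 4.0,
}

grouped = split_by_category(sample_row)
print(sorted(grouped))
print(grouped["batting"])
```

Metadata columns such as team_abbreviation also contain underscores, which is why the sketch checks the prefix against a fixed list of categories instead of splitting every key.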

espn_mlb_schedule(dates=None, season_type=None, limit=500, return_as_pandas=False, **kwargs) -> 'pl.DataFrame'

espn_mlb_schedule - look up the MLB schedule for a given date or season-year.

Parameters

ParameterTypeDefaultDescription
datesintNoneDate filter. Either a calendar date as YYYYMMDD or a season-year (e.g. 2024). When a 4-digit year is passed, the call returns the full season slate (paginated by limit).
season_typeintNoneSeason type — 1 = spring training, 2 = regular, 3 = postseason, 4 = all-star.
limitint500Number of records to return. Default 500.
return_as_pandasboolFalseIf True, returns a pandas dataframe. If False (default), returns a polars dataframe.

Returns

A polars dataframe containing the schedule (pandas if return_as_pandas=True). Returns None if no games are found.

col_nametypedescription
game_idcharacterUnique ESPN game/event identifier.
datecharacterDate in YYYY-MM-DD format.
season_yearintegerSeason year (e.g. 2024).
season_typeintegerSeason-type id.
status_type_statecharacterStatus state (pre/in/post).
status_type_completedlogicalWhether the game is complete.
status_type_descriptioncharacterStatus type description.
venue_idcharacterUnique ESPN venue identifier.
venue_full_namecharacterVenue full name.
venue_citycharacterVenue city.
venue_statecharacterVenue state / province.
home_idcharacterUnique ESPN identifier for the home team.
home_namecharacterHome team name.
home_abbreviationcharacterHome team's abbreviation.
home_display_namecharacterHome team display name.
home_scorecharacterHome team run total.
home_winnerlogicalWhether the home team won.
away_idcharacterUnique ESPN identifier for the away team.
away_namecharacterAway team name.
away_abbreviationcharacterAway team's abbreviation.
away_display_namecharacterAway team display name.
away_scorecharacterAway team run total.
away_winnerlogicalWhether the away team won.

Example

from sportsdataverse.mlb import espn_mlb_schedule
sched = espn_mlb_schedule(dates=20240328)
print(sched.shape)
sched.select(["game_id", "home_name", "away_name", "status_type_description"]).head()

# Pull a regular-season slate from a season-year

reg = espn_mlb_schedule(dates=2024, season_type=2, limit=500)
reg.group_by("status_type_description").len().sort("len", descending=True)

# Pandas round-trip for one date

espn_mlb_schedule(dates=20240328, return_as_pandas=True).head()
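
The dates parameter accepts two shapes: an 8-digit YYYYMMDD calendar date or a 4-digit season-year. Checking the value before making a request can catch typos early. The sketch below mirrors that documented rule; describe_dates_arg is a hypothetical helper for illustration and is not the library's own validation.

```python
from datetime import datetime


def describe_dates_arg(dates):
    """Classify a `dates` value as a season-year or a calendar date.

    Hypothetical helper: 4 digits means a season-year, 8 digits means
    a YYYYMMDD calendar date. Anything else is rejected.
    """
    text = str(dates)
    if len(text) == 4 and text.isdigit():
        return ("season", int(text))
    if len(text) == 8 and text.isdigit():
        # strptime raises ValueError for impossible dates such as 20240230.
        return ("date", datetime.strptime(text, "%Y%m%d").date())
    raise ValueError(f"expected YYYY or YYYYMMDD, got {dates!r}")


print(describe_dates_arg(2024))
print(describe_dates_arg(20240328))
```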

Dataset loaders

load_mlb_pbp(seasons: 'List[int]', return_as_pandas: 'bool' = False)

load_mlb_pbp - planned: load pre-built season-level MLB play-by-play.

TODO: Implement once an MLB-data release pipeline is in place.

Parameters

ParameterTypeDefaultDescription
seasonsList[int]
return_as_pandasboolFalse

load_mlb_player_boxscore(seasons: 'List[int]', return_as_pandas: 'bool' = False)

load_mlb_player_boxscore - planned: load pre-built season-level MLB player boxscores.

Parameters

ParameterTypeDefaultDescription
seasonsList[int]
return_as_pandasboolFalse

load_mlb_rosters(seasons: 'List[int]', return_as_pandas: 'bool' = False)

load_mlb_rosters - planned: load pre-built season-level MLB rosters.

Parameters

ParameterTypeDefaultDescription
seasonsList[int]
return_as_pandasboolFalse

load_mlb_schedule(seasons: 'List[int]', return_as_pandas: 'bool' = False)

load_mlb_schedule - planned: load pre-built season-level MLB schedule.

TODO: Implement once an MLB-data release pipeline is in place.

Parameters

ParameterTypeDefaultDescription
seasonsList[int]
return_as_pandasboolFalse

load_mlb_team_boxscore(seasons: 'List[int]', return_as_pandas: 'bool' = False)

load_mlb_team_boxscore - planned: load pre-built season-level MLB team boxscores.

Parameters

ParameterTypeDefaultDescription
seasonsList[int]
return_as_pandasboolFalse

Utilities & helpers

most_recent_mlb_season() -> 'int'

most_recent_mlb_season - return the most recent / current MLB season year.

MLB seasons fall within a single calendar year. Before April, the previous year is still treated as the "most recent" season, since spring training only starts in late February.

Returns

The most recent MLB season year (e.g. 2024).
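
The "before April" rule can be expressed as a small date check. This is a sketch under one stated assumption: the cutoff is taken to be April 1, and the real function may use a different boundary. The name most_recent_season_sketch is hypothetical.

```python
from datetime import date


def most_recent_season_sketch(today=None):
    """Return the season year under the 'before April' rule.

    Assumption: dates in January through March map to the previous
    year; April onward maps to the current year.
    """
    today = today or date.today()
    return today.year if today.month >= 4 else today.year - 1


print(most_recent_season_sketch(date(2024, 3, 15)))  # 2023
print(most_recent_season_sketch(date(2024, 4, 1)))   # 2024
```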

Other

espn_mlb_teams(return_as_pandas=False, **kwargs) -> 'pl.DataFrame'

espn_mlb_teams - look up MLB teams from ESPN's Site v2 API.

Parameters

ParameterTypeDefaultDescription
return_as_pandasboolFalseIf True, returns a pandas dataframe. If False (default), returns a polars dataframe.

Returns

Polars dataframe containing teams for MLB. This function caches by default, so if you want to refresh the data, use sportsdataverse.mlb.espn_mlb_teams.cache_clear().

col_nametypedescription
team_abbreviationcharacterShort team abbreviation (e.g. 'NYY').
team_alternate_colorcharacterTeam alternate color (hex).
team_colorcharacterTeam primary color (hex, no leading '#').
team_display_namecharacterFull team display name (e.g. 'New York Yankees').
team_idcharacterUnique ESPN team identifier.
team_is_activelogicalTeam is active.
team_is_all_starlogicalTeam is all star.
team_locationcharacterTeam city / location.
team_logosintegerTeam logo metadata.
team_namecharacterTeam name.
team_nicknamecharacterTeam nickname.
team_short_display_namecharacterShort team display name.
team_slugcharacterURL-safe team identifier.
team_uidcharacterESPN universal team identifier (UID).

Example

from sportsdataverse.mlb import espn_mlb_teams
teams = espn_mlb_teams()
print(teams.shape)
teams.select(["team_id", "team_abbreviation", "team_display_name"]).head()

# Find Los Angeles Dodgers (team_id 19)

import polars as pl
teams.filter(pl.col("team_id") == "19").to_dicts()

# Refresh the cache (the call is ``lru_cache``'d) and round-trip to pandas

espn_mlb_teams.cache_clear()
teams_pd = espn_mlb_teams(return_as_pandas=True)
teams_pd[["team_id", "team_abbreviation", "team_display_name"]].head()
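
To show what cache_clear() does on an lru_cache'd function, the sketch below uses a stand-in function with a call counter instead of a network request. fetch_teams_stub and its return value are hypothetical; only functools.lru_cache and its cache_clear() method are real standard-library API.

```python
from functools import lru_cache

calls = {"n": 0}


@lru_cache(maxsize=None)
def fetch_teams_stub():
    """Stand-in for a cached network call; counts real invocations."""
    calls["n"] += 1
    return ("NYY", "LAD")


fetch_teams_stub()
fetch_teams_stub()               # served from the cache, counter unchanged
count_before_clear = calls["n"]

fetch_teams_stub.cache_clear()   # drop the cached result
fetch_teams_stub()               # the function body runs again
count_after_clear = calls["n"]

print(count_before_clear, count_after_clear)
```

Repeated calls reuse the first result until the cache is cleared, so a long-running session that needs fresh team data has to call cache_clear() explicitly.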