CFB — additional Python functions

Hand-written wrappers, loaders, and helpers in sportsdataverse.cfb not covered by the generated API-endpoint reference above.

Play-by-play, schedule & rosters

espn_cfb_game_rosters(game_id: 'int', raw=False, return_as_pandas=False, **kwargs) -> 'pl.DataFrame'

espn_cfb_game_rosters() - Pull the rosters for a game by id.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | int | | Unique game_id, can be obtained from espn_cfb_schedule(). |
| raw | | False | |
| return_as_pandas | bool | False | If True, returns a pandas dataframe. If False, returns a polars dataframe. |

Returns

Polars dataframe of game roster data with columns: 'athlete_id', 'athlete_uid', 'athlete_guid', 'athlete_type', 'first_name', 'last_name', 'full_name', 'athlete_display_name', 'short_name', 'weight', 'display_weight', 'height', 'display_height', 'age', 'date_of_birth', 'slug', 'jersey', 'linked', 'active', 'alternate_ids_sdr', 'birth_place_city', 'birth_place_state', 'birth_place_country', 'headshot_href', 'headshot_alt', 'experience_years', 'experience_display_value', 'experience_abbreviation', 'status_id', 'status_name', 'status_type', 'status_abbreviation', 'hand_type', 'hand_abbreviation', 'hand_display_value', 'draft_display_text', 'draft_round', 'draft_year', 'draft_selection', 'player_id', 'starter', 'valid', 'did_not_play', 'display_name', 'ejected', 'athlete_href', 'position_href', 'statistics_href', 'team_id', 'team_guid', 'team_uid', 'team_slug', 'team_location', 'team_name', 'team_nickname', 'team_abbreviation', 'team_display_name', 'team_short_display_name', 'team_color', 'team_alternate_color', 'is_active', 'is_all_star', 'team_alternate_ids_sdr', 'logo_href', 'logo_dark_href', 'game_id'

| col_name | type | description |
| --- | --- | --- |
| athlete_id | integer | ESPN athlete id. |
| athlete_uid | character | ESPN athlete UID (universal identifier). |
| athlete_guid | character | ESPN athlete GUID. |
| athlete_type | character | Athlete type / class. |
| first_name | character | Athlete first name. |
| last_name | character | Athlete last name. |
| full_name | character | Athlete full name. |
| athlete_display_name | character | Athlete display name. |
| short_name | character | Athlete short name. |
| weight | double | Listed weight (lbs). |
| display_weight | character | Human-readable weight (e.g. 205 lbs). |
| height | double | Listed height (inches). |
| display_height | character | Human-readable height (e.g. 6' 1"). |
| slug | character | URL slug for the athlete. |
| jersey | character | Jersey number. |
| linked | logical | TRUE if the record is linked to a related entity. |
| active | logical | TRUE if the player was active for the game. |
| alternate_ids_sdr | character | Alternate ids sdr. |
| birth_place_city | character | Birth place city. |
| birth_place_state | character | Birth place state. |
| birth_place_country | character | Birth place country. |
| birth_country_alternate_id | character | |
| birth_country_abbreviation | character | Birth country abbreviation. |
| headshot_href | character | URL of the athlete headshot image. |
| headshot_alt | character | Alternative-text label for the headshot. |
| flag_href | character | |
| flag_alt | character | |
| flag_rel | character | |
| experience_years | integer | Years of experience. |
| experience_display_value | character | Experience display value. |
| experience_abbreviation | character | Experience abbreviation. |
| status_id | character | Athlete status id. |
| status_name | character | Athlete status name. |
| status_type | character | Status type. |
| status_abbreviation | character | Status abbreviation. |
| hand_type | character | Hand type. |
| hand_abbreviation | character | Hand abbreviation. |
| hand_display_value | character | Hand display value. |
| age | integer | Player age (in years). |
| date_of_birth | character | Player date of birth (if published). |
| starter | logical | TRUE if the athlete started the game. |
| jersey_right | character | |
| valid | logical | TRUE if the roster entry is flagged valid by ESPN. |
| did_not_play | logical | TRUE if the athlete did not play. |
| display_name | character | Athlete display name. |
| athlete_href | character | |
| position_href | character | |
| statistics_href | character | |
| team_id | integer | ESPN team id. |
| order | integer | Team order within the competition (0 = first). |
| home_away | character | home or away. |
| winner | logical | TRUE if this team won the game. |
| team_guid | character | ESPN team GUID. |
| team_uid | character | ESPN universal team identifier (UID format 's:40~l:...~t:...'). |
| team_slug | character | Team URL slug. |
| team_location | character | Team location / school name. |
| team_name | character | Team nickname. |
| team_nickname | character | Team nickname label. |
| team_abbreviation | character | Team abbreviation. |
| team_display_name | character | Full team display name. |
| team_short_display_name | character | Short team display name. |
| team_color | character | Primary team color. |
| team_alternate_color | character | Alternate team color. |
| is_active | logical | Whether the team is currently active. |
| is_all_star | logical | Whether the team is an all-star team. |
| team_alternate_ids_sdr | character | |
| logo_href | character | URL of the default team logo. |
| logo_dark_href | character | URL of the dark-variant team logo. |
| game_id | integer | ESPN game identifier. |

Example

from sportsdataverse.cfb import espn_cfb_game_rosters
rosters = espn_cfb_game_rosters(game_id=401628334)
print(rosters.shape)

# Pandas round-trip

rosters_pd = espn_cfb_game_rosters(game_id=401628334, return_as_pandas=True)
rosters_pd.head()

# Pipeline next step (filter to game starters)

import polars as pl
starters = espn_cfb_game_rosters(game_id=401628334).filter(
    pl.col("starter") == True
)

espn_cfb_play_participants(game_id: 'int', *, raw: 'bool' = False, return_as_pandas: 'bool' = False, resolve_missing: 'bool' = True, resolve_missing_max: 'int' = 50, **kwargs: 'Any') -> 'pl.DataFrame | pd.DataFrame | dict[str, Any]'

Pull ESPN per-play participants for a college-football game.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | int | | ESPN game / event identifier. |
| raw | bool | False | If True, returns the raw list of play-items dicts (after following pagination) before any flattening. |
| return_as_pandas | bool | False | If True, returns a pandas DataFrame; otherwise polars. |
| resolve_missing | bool | True | If True (default), athletes that the cdn.espn.com sidecar omits are fetched one-by-one from their canonical ESPN $ref URL so the resulting frame has populated *_player_name / *_player_names columns wherever an *_player_id is non-null. Setting this to False skips the extra HTTP fan-out and reproduces the pre-enhancement behavior; rows may then ship with *_player_id populated but *_player_name null on the handful of athletes the sidecar misses (most visible on split sacks, multi-lateral returns, and older games). |
| resolve_missing_max | int | 50 | Hard cap on the number of per-athlete $ref requests issued for a single game. Defaults to 50, which comfortably covers every probed game (typical max is ≤8 unique missing athletes). If breached, a warning is logged and the remaining missing athletes are left with null names. Ignored when resolve_missing=False. |
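The capped fan-out described for resolve_missing_max can be pictured with a minimal sketch. The names resolve_names and fetch_name are hypothetical and not part of sportsdataverse; the library's internals may differ, and a plain dict stands in for the per-athlete HTTP requests.

```python
import logging
from typing import Callable, Dict, Iterable, Optional

logger = logging.getLogger(__name__)


def resolve_names(
    missing_ids: Iterable[str],
    fetch_name: Callable[[str], Optional[str]],
    max_requests: int = 50,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: look up names, issuing at most max_requests fetches."""
    # De-duplicate while preserving order so each athlete costs one request.
    unique_ids = list(dict.fromkeys(missing_ids))
    resolved: Dict[str, Optional[str]] = {}
    for index, athlete_id in enumerate(unique_ids):
        if index >= max_requests:
            logger.warning(
                "request cap of %d reached; %d athletes left unresolved",
                max_requests,
                len(unique_ids) - max_requests,
            )
            break
        resolved[athlete_id] = fetch_name(athlete_id)
    # Athletes past the cap keep a null (None) name.
    for athlete_id in unique_ids:
        resolved.setdefault(athlete_id, None)
    return resolved


# Stand-in lookup table instead of real HTTP requests (made-up ids and names).
fake_directory = {"101": "A. Passer", "102": "B. Rusher", "103": "C. Kicker"}
names = resolve_names(["101", "102", "101", "103"], fake_directory.get, max_requests=2)
print(names)
```

With a cap of 2, the third unique athlete is left with a None name, mirroring the "left with null names" behavior described in the table.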

Returns

Polars (or pandas) DataFrame, one row per play. Columns include game_id, play_id, and two column families for every participant type ESPN ships for the game (typical types: passer, rusher, receiver, tackler, sacked_by, forced_by, pass_defender, kicker, punter, returner, recoverer, scorer, pat_scorer, penalized, assisted_by):

* Scalar: {type}_player_id / {type}_player_name hold the first occurrence of that participant type on the play. Backwards compatible with the legacy regex-extractor shape.
* List: {type}_player_ids / {type}_player_names are List(Utf8) columns containing every occurrence of that participant type on the play, in the order ESPN shipped them. Plays with no participant of a given type carry an empty list [] (not null), which simplifies downstream consumption. This family preserves multi-entry participant types (split sacks where ESPN ships two sackedBy entries, multi-tacklers, etc.) that the scalar family collapses to first-only.

If raw=True, returns the parsed JSON list of play dicts.
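To make the relationship between the scalar and list families concrete, here is a minimal pure-Python sketch. The flatten_play helper and the sample ids and names are hypothetical; it only illustrates the documented shape (first occurrence vs. every occurrence, empty list when absent), not the library's actual flattening code.

```python
from typing import Dict, List, Optional, Tuple

# Subset of participant types, chosen for illustration only.
PARTICIPANT_TYPES = ["passer", "sacked_by"]


def flatten_play(participants: List[Tuple[str, str, str]]) -> Dict[str, object]:
    """participants: (type, athlete_id, athlete_name) tuples in shipped order."""
    row: Dict[str, object] = {}
    for ptype in PARTICIPANT_TYPES:
        ids = [pid for kind, pid, _ in participants if kind == ptype]
        names = [name for kind, _, name in participants if kind == ptype]
        # List family: every occurrence, empty list (not None) when there is none.
        row[f"{ptype}_player_ids"] = ids
        row[f"{ptype}_player_names"] = names
        # Scalar family: first occurrence only, None when there is none.
        first_id: Optional[str] = ids[0] if ids else None
        first_name: Optional[str] = names[0] if names else None
        row[f"{ptype}_player_id"] = first_id
        row[f"{ptype}_player_name"] = first_name
    return row


# A split sack: two sacked_by entries on the same play (made-up athletes).
split_sack = flatten_play(
    [
        ("passer", "11", "Q. Back"),
        ("sacked_by", "55", "D. End"),
        ("sacked_by", "99", "N. Tackle"),
    ]
)
print(split_sack["sacked_by_player_id"], split_sack["sacked_by_player_ids"])
```

The scalar column keeps only the first sacker, while the list column keeps both, which is why the list family is the safer choice for split sacks and multi-tackler plays.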

| col_name | type | description |
| --- | --- | --- |
| game_id | integer | ESPN game identifier. |
| play_id | integer | ESPN play id. |
| kicker_player_name | character | String name for the kicker on FG or kickoff. |
| passer_player_name | character | Name of the passer on a passing play. |
| receiver_player_name | character | Name of the receiver on a passing play. |
| rusher_player_name | character | Name of the rusher on a rushing play. |
| scorer_player_name | character | |
| returner_player_name | character | |
| pass_defender_player_name | character | |
| penalized_player_name | character | |
| sacked_by_player_name | character | |
| pat_scorer_player_name | character | |
| punter_player_name | character | Name of the punter. |
| kicker_player_id | character | Unique identifier for the kicker on FG or kickoff. |
| passer_player_id | character | Unique identifier for the player that attempted the pass. |
| receiver_player_id | character | Unique identifier for the receiver that was targeted on the pass. |
| rusher_player_id | character | Unique identifier for the player that attempted the run. |
| scorer_player_id | character | |
| returner_player_id | character | |
| pass_defender_player_id | character | |
| penalized_player_id | character | |
| sacked_by_player_id | character | |
| pat_scorer_player_id | character | |
| punter_player_id | character | Unique identifier for the punter. |
| kicker_player_names | list of character | |
| passer_player_names | list of character | |
| receiver_player_names | list of character | |
| rusher_player_names | list of character | |
| scorer_player_names | list of character | |
| returner_player_names | list of character | |
| pass_defender_player_names | list of character | |
| penalized_player_names | list of character | |
| sacked_by_player_names | list of character | |
| pat_scorer_player_names | list of character | |
| punter_player_names | list of character | |
| kicker_player_ids | list of character | |
| passer_player_ids | list of character | |
| receiver_player_ids | list of character | |
| rusher_player_ids | list of character | |
| scorer_player_ids | list of character | |
| returner_player_ids | list of character | |
| pass_defender_player_ids | list of character | |
| penalized_player_ids | list of character | |
| sacked_by_player_ids | list of character | |
| pat_scorer_player_ids | list of character | |
| punter_player_ids | list of character | |

Example

from sportsdataverse.cfb import espn_cfb_play_participants
participants = espn_cfb_play_participants(game_id=401628334)
print(participants.shape)

# Skip the per-athlete fan-out for speed

participants_fast = espn_cfb_play_participants(
    game_id=401628334,
    resolve_missing=False,
)

# Pipeline next step (join onto play-by-play frame)

from sportsdataverse.cfb import CFBPlayProcess
pbp = CFBPlayProcess(gameId=401628334).espn_cfb_pbp()
plays = pbp["plays"]
joined = plays.join(participants, how="left", left_on="id", right_on="play_id")

espn_cfb_player_stats(athlete_id: 'int', season: 'int', *, season_type: 'str' = 'regular', total: 'bool' = False, raw: 'bool' = False, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'pl.DataFrame | pd.DataFrame | dict[str, Any]'

Pull a college-football athlete's ESPN season stat line.

See sportsdataverse.wbb.espn_wbb_player_stats for full documentation of the wide return shape, the {category}_{stat} stat columns (for football: passing_*, rushing_*, receiving_*, scoring_*, ...), the athlete / team metadata blocks, and the season_type / total parameters. For the richer multi-category web-v3 payload use sportsdataverse.cfb.espn_cfb_player_stats_v3.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| athlete_id | int | | ESPN college-football athlete identifier. |
| season | int | | Season year, used in the core-v2 path. |
| season_type | str | 'regular' | "regular" (type 2) or "postseason" (type 3). |
| total | bool | False | Forward-compat totals passthrough. |
| raw | bool | False | If True, returns the raw core-v2 statistics JSON dict. |
| return_as_pandas | bool | False | If True, returns a pandas DataFrame; else polars. |

Returns

A single-row wide DataFrame (polars by default). When raw=True returns the raw statistics JSON dict.

| col_name | type | description |
| --- | --- | --- |
| season | integer | Season (4-digit year). |
| season_type | character | ESPN season type (2 = regular, 3 = postseason). |
| total | logical | Total. |
| athlete_id | integer | ESPN athlete id. |
| athlete_uid | character | ESPN athlete UID (universal identifier). |
| athlete_guid | character | ESPN athlete GUID. |
| athlete_type | character | Athlete type / class. |
| first_name | character | Athlete first name. |
| last_name | character | Athlete last name. |
| full_name | character | Athlete full name. |
| display_name | character | Athlete display name. |
| short_name | character | Athlete short name. |
| weight | double | Listed weight (lbs). |
| display_weight | character | Human-readable weight (e.g. 205 lbs). |
| height | double | Listed height (inches). |
| display_height | character | Human-readable height (e.g. 6' 1"). |
| age | integer | Player age (in years). |
| date_of_birth | character | Player date of birth (if published). |
| jersey | character | Jersey number. |
| slug | character | URL slug for the athlete. |
| active | logical | TRUE if the athlete is active. |
| position_id | integer | ESPN position id. |
| position_name | character | Position name (e.g. Quarterback). |
| position_display_name | character | Human-readable position name. |
| position_abbreviation | character | Position abbreviation (e.g. QB). |
| college_name | character | College name. |
| status_id | integer | Athlete status id. |
| status_name | character | Athlete status name. |
| general_fumbles | double | |
| general_fumbles_lost | double | |
| general_fumbles_touchdowns | double | |
| general_games_played | double | Games Played. |
| general_offensive_two_pt_returns | double | |
| general_offensive_fumbles_touchdowns | double | |
| general_defensive_fumbles_touchdowns | double | |
| passing_avg_gain | double | |
| passing_completion_pct | double | |
| passing_completions | double | Pass completions. |
| passing_espnqb_rating | double | |
| passing_interception_pct | double | |
| passing_interceptions | double | |
| passing_long_passing | double | |
| passing_net_passing_yards | double | |
| passing_net_passing_yards_per_game | double | |
| passing_net_total_yards | double | |
| passing_net_yards_per_game | double | |
| passing_passing_attempts | double | |
| passing_passing_big_plays | double | |
| passing_passing_first_downs | double | |
| passing_passing_fumbles | double | |
| passing_passing_fumbles_lost | double | |
| passing_passing_touchdown_pct | double | |
| passing_passing_touchdowns | double | |
| passing_passing_yards | double | |
| passing_passing_yards_after_catch | double | |
| passing_passing_yards_at_catch | double | |
| passing_passing_yards_per_game | double | |
| passing_qb_rating | double | |
| passing_sacks | double | |
| passing_sack_yards_lost | double | |
| passing_team_games_played | double | |
| passing_total_offensive_plays | double | |
| passing_total_points_per_game | double | |
| passing_total_touchdowns | double | |
| passing_total_yards | double | |
| passing_total_yards_from_scrimmage | double | |
| passing_two_point_pass_convs | double | |
| passing_two_pt_pass | double | |
| passing_two_pt_pass_attempts | double | |
| passing_yards_from_scrimmage_per_game | double | |
| passing_yards_per_completion | double | |
| passing_yards_per_game | double | |
| passing_yards_per_pass_attempt | double | |
| passing_net_yards_per_pass_attempt | double | |
| passing_qbr | double | ESPN Quarterback Rating (QBR) for the player's season stat line. |
| passing_adj_qbr | double | |
| passing_quarterback_rating | double | |
| rushing_avg_gain | double | |
| rushing_espnrb_rating | double | |
| rushing_long_rushing | double | |
| rushing_net_total_yards | double | |
| rushing_net_yards_per_game | double | |
| rushing_rushing_attempts | double | |
| rushing_rushing_big_plays | double | |
| rushing_rushing_first_downs | double | |
| rushing_rushing_fumbles | double | |
| rushing_rushing_fumbles_lost | double | |
| rushing_rushing_touchdowns | double | |
| rushing_rushing_yards | double | |
| rushing_rushing_yards_per_game | double | |
| rushing_stuffs | double | |
| rushing_stuff_yards_lost | double | |
| rushing_team_games_played | double | |
| rushing_total_offensive_plays | double | |
| rushing_total_points_per_game | double | |
| rushing_total_touchdowns | double | |
| rushing_total_yards | double | |
| rushing_total_yards_from_scrimmage | double | |
| rushing_two_point_rush_convs | double | |
| rushing_two_pt_rush | double | |
| rushing_two_pt_rush_attempts | double | |
| rushing_yards_from_scrimmage_per_game | double | |
| rushing_yards_per_game | double | |
| rushing_yards_per_rush_attempt | double | |
| receiving_avg_gain | double | |
| receiving_espnwr_rating | double | |
| receiving_long_reception | double | |
| receiving_net_total_yards | double | |
| receiving_net_yards_per_game | double | |
| receiving_receiving_big_plays | double | |
| receiving_receiving_first_downs | double | |
| receiving_receiving_fumbles | double | |
| receiving_receiving_fumbles_lost | double | |
| receiving_receiving_targets | double | |
| receiving_receiving_touchdowns | double | |
| receiving_receiving_yards | double | |
| receiving_receiving_yards_after_catch | double | |
| receiving_receiving_yards_at_catch | double | |
| receiving_receiving_yards_per_game | double | |
| receiving_receptions | double | |
| receiving_team_games_played | double | |
| receiving_total_offensive_plays | double | |
| receiving_total_points_per_game | double | |
| receiving_total_touchdowns | double | |
| receiving_total_yards | double | |
| receiving_total_yards_from_scrimmage | double | |
| receiving_two_point_rec_convs | double | |
| receiving_two_pt_reception | double | |
| receiving_two_pt_reception_attempts | double | |
| receiving_yards_from_scrimmage_per_game | double | |
| receiving_yards_per_game | double | |
| receiving_yards_per_reception | double | |
| scoring_defensive_points | double | |
| scoring_field_goals | double | |
| scoring_kick_extra_points | double | |
| scoring_kick_extra_points_made | double | |
| scoring_misc_points | double | |
| scoring_passing_touchdowns | double | |
| scoring_receiving_touchdowns | double | |
| scoring_return_touchdowns | double | |
| scoring_rushing_touchdowns | double | |
| scoring_total_points | double | |
| scoring_total_points_per_game | double | |
| scoring_total_touchdowns | double | |
| scoring_total_two_point_convs | double | |
| scoring_two_point_pass_convs | double | |
| scoring_two_point_rec_convs | double | |
| scoring_two_point_rush_convs | double | |
| scoring_one_pt_safeties_made | double | |
| team_id | integer | ESPN team id. |
| team_uid | character | ESPN universal team identifier (UID format 's:40~l:...~t:...'). |
| team_guid | character | ESPN team GUID. |
| team_slug | character | Team URL slug. |
| team_location | character | Team location / school name. |
| team_name | character | Team nickname. |
| team_abbreviation | character | Team abbreviation. |
| team_display_name | character | Full team display name. |
| team_short_display_name | character | Short team display name. |
| team_color | character | Primary team color. |
| team_alternate_color | character | Alternate team color. |
| team_is_active | logical | TRUE if the team is currently active. |
| team_logo_href | character | Default team logo URL. |

Example

from sportsdataverse.cfb import espn_cfb_player_stats
df = espn_cfb_player_stats(athlete_id=4426338, season=2023)
df.select(["full_name", "team_display_name", "passing_passing_yards"])

espn_cfb_schedule(dates=None, week=None, season_type=None, groups=None, limit=500, return_as_pandas=False, **kwargs) -> 'pl.DataFrame'

espn_cfb_schedule - look up the college football schedule for a given season

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dates | int | None | Used to define different seasons. 2002 is the earliest available season. |
| week | int | None | Week of the schedule. |
| season_type | int | None | 2 for regular season, 3 for post-season, 4 for off-season. |
| groups | int | None | Used to define different divisions. 80 is FBS, 81 is FCS. |
| limit | int | 500 | Number of records to return, default: 500. |
| return_as_pandas | bool | False | If True, returns a pandas dataframe. If False, returns a polars dataframe. |

Returns

Polars dataframe containing schedule dates for the requested season. Returns None if no games are found.

| col_name | type | description |
| --- | --- | --- |
| id | character | ESPN event id for the game. |
| uid | character | ESPN global unique identifier. |
| date | character | Game date. |
| attendance | integer | Reported attendance at the game. |
| time_valid | logical | Whether the start time is confirmed. |
| date_valid | logical | |
| neutral_site | logical | TRUE/FALSE flag for if the game took place at a neutral site. |
| conference_competition | logical | Conference competition. |
| play_by_play_available | logical | Whether play-by-play data is available. |
| recent | logical | Whether the game is recent. |
| start_date | character | Game start timestamp (ISO 8601, UTC). |
| broadcast | character | Broadcast network short name. |
| highlights | character | Game highlight urls. |
| notes_type | character | Notes type. |
| notes_headline | character | Notes headline. |
| broadcast_market | character | Broadcast market label (e.g. 'national', 'home'). |
| broadcast_name | character | Broadcast name. |
| type_id | character | Competition type id. |
| type_abbreviation | character | Competition type abbreviation. |
| venue_id | character | Referencing venue id. |
| venue_full_name | character | Venue full name. |
| venue_address_city | character | Venue address city. |
| venue_address_country | character | |
| venue_indoor | logical | Whether the home venue is indoors. |
| status_clock | double | Game clock in seconds. |
| status_display_clock | character | Status display clock. |
| status_period | integer | Current period. |
| status_type_id | character | Unique identifier for status type. |
| status_type_name | character | Status type name. |
| status_type_state | character | Status state (pre/in/post). |
| status_type_completed | logical | Whether the game is complete. |
| status_type_description | character | Status type description. |
| status_type_detail | character | Status type detail. |
| status_type_short_detail | character | Status type short detail. |
| format_regulation_periods | integer | Format regulation periods. |
| home_id | character | Home team referencing id. |
| home_uid | character | Home team's uid. |
| home_location | character | Home team's location. |
| home_name | character | Home team name (nickname). |
| home_abbreviation | character | Home team's abbreviation. |
| home_display_name | character | Home team display name. |
| home_short_display_name | character | Home short display name. |
| home_color | character | Home team primary color hex. |
| home_alternate_color | character | Color code (hex) for home alternate. |
| home_is_active | logical | Whether the home team is active. |
| home_venue_id | character | Unique identifier for home venue. |
| home_logo | character | Home team logo URL. |
| home_conference_id | character | Unique identifier for home conference. |
| home_score | character | Home team score. |
| home_current_rank | integer | |
| home_linescores | integer | |
| home_records | character | |
| away_id | character | Away team referencing id. |
| away_uid | character | Away team's uid. |
| away_location | character | Away team's location. |
| away_name | character | Away team name (nickname). |
| away_abbreviation | character | Away team's abbreviation. |
| away_display_name | character | Away team display name. |
| away_short_display_name | character | Away short display name. |
| away_color | character | Away team primary color hex. |
| away_alternate_color | character | Color code (hex) for away alternate. |
| away_is_active | logical | Whether the away team is active. |
| away_venue_id | character | Unique identifier for away venue. |
| away_logo | character | Away team logo URL. |
| away_conference_id | character | Unique identifier for away conference. |
| away_score | character | Away team score. |
| away_current_rank | integer | |
| away_linescores | integer | |
| away_records | character | |
| game_id | integer | ESPN game identifier. |
| season | integer | Season (4-digit year). |
| season_type | integer | ESPN season type (2 = regular, 3 = postseason). |
| week | integer | Game week of the season. |
| venue_address_state | character | Venue address state / region. |
| groups_id | character | Unique identifier for groups. |
| groups_name | character | Groups name. |
| groups_short_name | character | Groups short name. |
| groups_is_conference | logical | Groups is conference. |

Example

from sportsdataverse.cfb import espn_cfb_schedule
slate = espn_cfb_schedule()
print(slate.shape if slate is not None else "no games")

# Pull a specific week of FBS games

week5 = espn_cfb_schedule(dates=2023, week=5, season_type=2)

# Pipeline next step (extract finals only)

import polars as pl
finals = espn_cfb_schedule(dates=2023, week=5).filter(
    pl.col("status_type_completed") == True
)

Dataset loaders

load_cfb_betting_lines(return_as_pandas=False) -> 'pl.DataFrame'

Load college football betting lines information

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| return_as_pandas | bool | False | If True, returns a pandas dataframe. If False, returns a polars dataframe. |

Returns

Polars dataframe containing betting lines available for the available seasons.

| col_name | type | description |
| --- | --- | --- |
| id | double | Identifier for the betting-line record. |
| game_id | integer | ESPN game identifier. |
| season | double | Season (4-digit year). |
| game_desc | character | |
| date_time | character | |
| market_type | character | Betting market type. |
| abbr | character | |
| lines | double | |
| odds | integer | |
| opening_lines | double | |
| opening_odds | integer | |
| book | character | |
| season_type | character | ESPN season type (2 = regular, 3 = postseason). |
| week | integer | Game week of the season. |

Example

from sportsdataverse.cfb import load_cfb_betting_lines
lines = load_cfb_betting_lines()
print(lines.shape)

# Pandas round-trip

lines_pd = load_cfb_betting_lines(return_as_pandas=True)
lines_pd.head()

# Pipeline next step (filter to one provider in 2023)

import polars as pl
# The documented column for the sportsbook / provider is "book".
consensus_2023 = load_cfb_betting_lines().filter(
    (pl.col("season") == 2023) & (pl.col("book") == "consensus")
)

Utilities & helpers

CFBPlayProcess(gameId=0, raw=False, path_to_json='/', return_keys=None, odds_override=None, **kwargs)

Process ESPN college-football play-by-play feeds into a tidy game-level dictionary.

Wraps the ESPN playbyplay / summary endpoints (or a local JSON dump) and pipes the result through a chain of feature-engineering steps -- down/distance, play-type flags, EPA, WPA, QBR, drive aggregation, and an advanced box score. Use run_processing_pipeline() for the full feature set or run_cleaning_pipeline() for a lighter clean.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| gameId | | 0 | ESPN game id. |
| raw | | False | If True, espn_cfb_pbp() returns the (allowlisted) summary verbatim. |
| path_to_json | | '/' | Directory for cfb_pbp_disk() offline loads. |
| return_keys | | None | Optional subset of result keys to return. |
| odds_override | | None | Optional dict {gameSpread, overUnder, homeFavorite, gameSpreadAvailable} that short-circuits odds resolution (sets odds_source="injected") so offline rebuilds never hit the live core-odds endpoint or fall back to defaults. Validated and coerced here. |
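The "validated and coerced" step for odds_override might look roughly like the sketch below. Everything beyond the four key names is an assumption made for illustration: the helper name coerce_odds_override, the requirement that all four keys be present, and the float / bool target types are not confirmed by the source.

```python
from typing import Any, Dict, Mapping

# The four keys named in the parameter description.
REQUIRED_KEYS = ("gameSpread", "overUnder", "homeFavorite", "gameSpreadAvailable")


def coerce_odds_override(override: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical validator: check keys, then coerce to assumed types."""
    missing = [key for key in REQUIRED_KEYS if key not in override]
    if missing:
        raise ValueError(f"odds_override is missing keys: {missing}")
    return {
        # Assumed numeric fields.
        "gameSpread": float(override["gameSpread"]),
        "overUnder": float(override["overUnder"]),
        # Assumed boolean flags.
        "homeFavorite": bool(override["homeFavorite"]),
        "gameSpreadAvailable": bool(override["gameSpreadAvailable"]),
    }


clean = coerce_odds_override(
    {"gameSpread": "-3.5", "overUnder": 51, "homeFavorite": 1, "gameSpreadAvailable": True}
)
print(clean)
```

Coercing up front means an offline rebuild fails fast on a malformed override instead of part-way through the pipeline.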

Example

from sportsdataverse.cfb import CFBPlayProcess
proc = CFBPlayProcess(gameId=401628334)
proc.espn_cfb_pbp()
result = proc.run_processing_pipeline()
len(result["plays"])

# Offline replay from a JSON dump

proc = CFBPlayProcess(gameId=401628334, path_to_json="./pbp_dump")
proc.cfb_pbp_disk()
result = proc.run_processing_pipeline()

Methods

CFBPlayProcess.cfb_pbp_disk()

Load a previously cached ESPN summary JSON for this game from disk.

Reads {path_to_json}/{gameId}.json where path_to_json was passed to the CFBPlayProcess constructor.
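The path convention above can be sketched with the standard library alone. The helper load_cached_summary is hypothetical (it is not the method's real implementation), and the payload written to the temporary directory is a made-up stub so the sketch runs offline.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def load_cached_summary(path_to_json: str, game_id: int) -> Dict[str, Any]:
    """Read {path_to_json}/{game_id}.json and return the parsed contents."""
    cache_file = Path(path_to_json) / f"{game_id}.json"
    with cache_file.open("r", encoding="utf-8") as handle:
        return json.load(handle)


# Round-trip against a temporary directory instead of a real cache.
with tempfile.TemporaryDirectory() as tmp_dir:
    stub_payload = {"gameId": 401628334, "plays": []}
    stub_file = Path(tmp_dir) / "401628334.json"
    stub_file.write_text(json.dumps(stub_payload), encoding="utf-8")
    loaded = load_cached_summary(tmp_dir, 401628334)

print(loaded["gameId"])
```

A missing file raises FileNotFoundError in this sketch; how the real method reports a missing cache file is not stated in the source.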

Returns

Parsed JSON contents, also stored on self.json.

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334, path_to_json="./cache")
pbp = game.cfb_pbp_disk()
print(list(pbp.keys()))

CFBPlayProcess.cfb_pbp_json(**kwargs)

Return the JSON payload currently attached to this CFBPlayProcess instance.

Returns

The cached JSON payload (self.json).

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
cached = game.cfb_pbp_json()

CFBPlayProcess.corrupt_pbp_check()

Heuristic check for corrupt or incomplete play-by-play.

Flags games with zero plays, fewer than 50 plays for a completed game, or more than 500 plays for a completed game -- all of which historically indicate ESPN delivered a malformed PBP payload that should not be processed downstream.
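The thresholds above reduce to a small predicate. This is a sketch only: the function name looks_corrupt and its two arguments are hypothetical, and it assumes the zero-plays rule applies regardless of whether the game is completed.

```python
def looks_corrupt(n_plays: int, completed: bool) -> bool:
    """Return True when the play count suggests a malformed PBP payload."""
    # Zero plays is always suspicious (assumed to apply to any game state).
    if n_plays == 0:
        return True
    # Completed games are expected to fall within 50..500 plays inclusive.
    if completed and (n_plays < 50 or n_plays > 500):
        return True
    return False


for n_plays, completed in [(0, False), (30, True), (30, False), (150, True), (501, True)]:
    print(n_plays, completed, looks_corrupt(n_plays, completed))
```

An in-progress game with only 30 plays is not flagged, since the low-count rule is stated only for completed games.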

Returns

True if PBP looks corrupt and the processing pipeline should be skipped, False otherwise.

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
if not game.corrupt_pbp_check():
    game.run_processing_pipeline()

CFBPlayProcess.create_box_score(play_df)

Build a per-team and per-player advanced box score from a processed plays frame.

Triggers run_processing_pipeline first if it hasn't already run, so the input play_df is expected to be the post-pipeline plays frame.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| play_df | pl.DataFrame | | The plays frame produced by run_processing_pipeline (with EPA, WPA and play-type flags already populated). |

Returns

Box-score sections, each a list of records — "pass" / "rush" / "receiver" (per-player advanced + EPA lines), "team" and "situational" (per-team), "defensive" and "defensive_players" (team- and player-level havoc), "specialists" (kicking / punting / return players), "turnover", "drives", and the ESPN-sourced "espn_team" / "espn_players" totals.

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
processed = game.run_processing_pipeline()
box = game.create_box_score(game.plays_json)
print(list(box.keys()))

CFBPlayProcess.espn_cfb_pbp(**kwargs)

espn_cfb_pbp() - Pull the game by id. Data from API endpoints: college-football/playbyplay, college-football/summary

Returns

Dictionary of game data with keys - "gameId", "plays", "boxscore", "header", "broadcasts", "videos", "playByPlaySource", "standings", "leaders", "timeouts", "homeTeamSpread", "overUnder", "pickcenter", "againstTheSpread", "odds", "predictor", "winprobability", "espnWP", "gameInfo", "season"

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
pbp = game.espn_cfb_pbp()
print(list(pbp.keys()))

# Pull only the raw ESPN summary payload (skip cleaning)

raw_pbp = CFBPlayProcess(gameId=401628334, raw=True).espn_cfb_pbp()

# Pipeline next step (run the full processing pipeline for advanced features)

game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
processed = game.run_processing_pipeline() # adds EPA, WPA, box score

CFBPlayProcess.run_cleaning_pipeline()

Run the lighter cleaning pipeline (no EPA/WPA/QBR/box-score).

Same per-play feature engineering as run_processing_pipeline through add_spread_time, but stops short of the modeling steps. Use this when you only need cleaned plays and don't need expected points or win probability columns.

Returns

Cleaned game payload (no advBoxScore key).

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
cleaned = game.run_cleaning_pipeline()
print(len(cleaned["plays"]))

CFBPlayProcess.run_processing_pipeline()

Run the full play-by-play processing pipeline.

Applies every scoring/feature step in order: down detection, play type flags, rush/pass flags, team score variables, new play types, penalty setup, play category flags, yardage cols, player cols, after cols, spread time, EPA, WPA, drive data, and QBR. Also produces an advanced box score and stores it under advBoxScore on the returned dict.

Idempotent -- subsequent calls return the cached self.json.

Returns

The fully-processed game payload. If the constructor was given return_keys, only those keys are returned.
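The idempotent behavior and the return_keys filter can be illustrated with a toy class. TinyPipeline is hypothetical and far simpler than CFBPlayProcess; the ran_pipeline flag and run_count counter are illustrative names, not attributes confirmed by the source.

```python
from typing import Any, Dict, List, Optional


class TinyPipeline:
    """Toy stand-in showing run-once caching plus key filtering."""

    def __init__(self, payload: Dict[str, Any], return_keys: Optional[List[str]] = None):
        self.json = dict(payload)
        self.return_keys = return_keys
        self.ran_pipeline = False
        self.run_count = 0  # counts how often the expensive work actually ran

    def run(self) -> Dict[str, Any]:
        if not self.ran_pipeline:
            # Expensive feature engineering would happen here, exactly once.
            self.run_count += 1
            self.json["advBoxScore"] = {"n_plays": len(self.json.get("plays", []))}
            self.ran_pipeline = True
        if self.return_keys is None:
            return self.json
        # Return only the requested keys that exist on the cached payload.
        return {key: self.json[key] for key in self.return_keys if key in self.json}


pipeline = TinyPipeline({"plays": [1, 2, 3], "header": {}}, return_keys=["plays", "advBoxScore"])
first = pipeline.run()
second = pipeline.run()
print(sorted(first), pipeline.run_count)
```

The second call does no extra work and returns the same filtered view, which is the property the "Idempotent" note describes.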

Example

from sportsdataverse.cfb import CFBPlayProcess
game = CFBPlayProcess(gameId=401628334)
game.espn_cfb_pbp()
processed = game.run_processing_pipeline()
print(processed["advBoxScore"].keys())

# Pipeline next step (return only selected keys)

game = CFBPlayProcess(gameId=401628334, return_keys=["plays", "advBoxScore"])
game.espn_cfb_pbp()
trimmed = game.run_processing_pipeline()

most_recent_cfb_season()

Return the most recent college football season year based on today's date.

The college football season starts in mid-August. If today is on or after August 15 (or any day in September or later), this returns the current calendar year. Otherwise, it returns the previous calendar year.
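The date rule above can be written as a short standalone function. The name season_for and its optional today argument are hypothetical, added so the rule can be checked against fixed dates; the library function itself takes no arguments.

```python
import datetime as dt
from typing import Optional


def season_for(today: Optional[dt.date] = None) -> int:
    """Return the season year for a date, using the August 15 cutoff."""
    today = today or dt.date.today()
    # Tuple comparison: (month, day) on or after (8, 15) means the new season.
    if (today.month, today.day) >= (8, 15):
        return today.year
    return today.year - 1


print(season_for(dt.date(2024, 8, 14)))  # day before the cutoff
print(season_for(dt.date(2024, 8, 15)))  # cutoff day
```

Comparing (month, day) tuples covers both "on or after August 15" and "any day in September or later" in a single condition.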

Returns

The most recent CFB season year.

Example

from sportsdataverse.cfb import most_recent_cfb_season
year = most_recent_cfb_season()
print(year)

# Combine with the loaders for a "current season" pull

from sportsdataverse.cfb import load_cfb_schedule, most_recent_cfb_season
sched = load_cfb_schedule(seasons=[most_recent_cfb_season()])

Other

cfb_odds_events_crosswalk(season: 'Optional[int]' = None, week: 'Optional[int]' = None, *, sport: 'str' = 'americanfootball_ncaaf', api_key: 'Optional[str]' = None, season_type: 'int' = 2, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'DataFrameT'

Match The Odds API CFB events to ESPN game ids.

Pulls the upcoming/live events for sport from The Odds API and the ESPN scoreboard for (season, week), then joins them on the order-independent team matchup so each odds event id maps to its ESPN event id. Because The Odds API only lists near-term events, this is most useful for the current/upcoming week.
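An order-independent matchup join can be sketched as follows. The key format (lowercased, sorted, pipe-joined team names), the helper name matchup_key, and the sample teams and ids are all assumptions for illustration; the library's actual matchup_key values and team-name normalization may differ.

```python
from typing import Dict, List, Optional


def matchup_key(team_a: str, team_b: str) -> str:
    """Build a key that is identical regardless of home/away order."""
    first, second = sorted(name.strip().lower() for name in (team_a, team_b))
    return f"{first}|{second}"


# Made-up events: the two feeds disagree on which side is listed as home.
odds_events = [
    {"odds_event_id": "evt-1", "home_team": "Alpha State", "away_team": "Beta Tech"},
    {"odds_event_id": "evt-2", "home_team": "Gamma U", "away_team": "Delta College"},
]
espn_games = [
    {"espn_game_id": 1001, "home_team": "Beta Tech", "away_team": "Alpha State"},
]

espn_by_key: Dict[str, int] = {
    matchup_key(game["home_team"], game["away_team"]): game["espn_game_id"]
    for game in espn_games
}

crosswalk: List[Dict[str, Optional[object]]] = []
for event in odds_events:
    key = matchup_key(event["home_team"], event["away_team"])
    crosswalk.append(
        {
            "matchup_key": key,
            "odds_event_id": event["odds_event_id"],
            # None when ESPN has no game for this matchup.
            "espn_game_id": espn_by_key.get(key),
        }
    )

print(crosswalk)
```

Keeping unmatched odds events with a null espn_game_id matches the "one row per odds event" shape described in Returns.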

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | Optional[int] | None | ESPN season year for the schedule side. Defaults to the most recent CFB season. |
| week | Optional[int] | None | ESPN schedule week. When None, ESPN returns its default (current) slate. |
| sport | str | 'americanfootball_ncaaf' | The Odds API sport key. Defaults to "americanfootball_ncaaf". |
| api_key | Optional[str] | None | The Odds API key; falls back to the ODDS_API_KEY env var. |
| season_type | int | 2 | ESPN season type (2 regular, 3 post-season). Defaults to 2. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. |

Returns

A polars DataFrame (pandas when return_as_pandas=True), one row per odds event, with columns matchup_key, odds_event_id, espn_game_id, home_team, away_team, commence_time, espn_date, matched_sources.

Example

import polars as pl
from sportsdataverse.cfb import cfb_odds_events_crosswalk
xwalk = cfb_odds_events_crosswalk(season=2024, week=5)
# Keep only odds events that found an ESPN game
matched = xwalk.filter(pl.col("espn_game_id").is_not_null())

cfb_rosters_crosswalk(espn_team_id: 'Union[int, str]', fox_team_id: 'Union[int, str]', *, season: 'Optional[int]' = None, providers: 'Optional[Sequence[str]]' = None, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'DataFrameT'

Build the ESPN x Fox x Yahoo player-id crosswalk for one team.

Fetches the selected providers' players for the team, matches them on normalized name (with jersey as a confidence signal), and returns each player's ESPN, Fox, and Yahoo athlete ids side by side. Use cfb_teams_crosswalk first to translate an ESPN team id into the matching Fox team id.

ESPN and Fox provide full rosters, so the default is ("espn", "fox"). Yahoo is opt-in (pass providers=("espn", "fox", "yahoo")) because it has no roster endpoint — its only player feed is the season stat-leaderboard (sportsdataverse.cfb.yahoo_cfb_player_season_stats), which is the league's top ~200 players (roughly one per team) and frequently includes no player for a given team at all. When selected, the team is resolved by matching Yahoo's (abbreviated) team name against the ESPN team's name; if it can't be resolved, the Yahoo columns are simply null.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| espn_team_id | Union[int, str] | | ESPN team id (e.g. 194 for Ohio State). |
| fox_team_id | Union[int, str] | | Fox Bifrost team id (e.g. 25 for Ohio State). |
| season | Optional[int] | None | Season year for the Yahoo player-stats leg. Defaults to the most recent CFB season. Unused when Yahoo isn't selected. |
| providers | Optional[Sequence[str]] | None | Which sources to include: any of "espn", "fox", "yahoo". None (default) uses ("espn", "fox"); add "yahoo" explicitly for its (sparse) leg, or pass a single source. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. |

Returns

A polars DataFrame (pandas when return_as_pandas=True) with columns person_key, espn_athlete_id, fox_athlete_id, yahoo_athlete_id, name, espn_jersey, fox_jersey, espn_position, fox_position, yahoo_position, match_method, matched_sources. match_method reflects the ESPN/Fox jersey agreement: name_jersey (agree), name (name only), name_jersey_conflict (jerseys differ — review), or unmatched.
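The four match_method labels can be read as a small decision rule. The sketch below is one plausible reading, not the library's code: `classify_match` is a hypothetical helper, and treating a missing jersey on either side as a name-only match is an assumption.

```python
from typing import Optional


def classify_match(
    name_matched: bool,
    espn_jersey: Optional[str],
    fox_jersey: Optional[str],
) -> str:
    """Hypothetical mapping from name/jersey agreement to a match_method label."""
    if not name_matched:
        return "unmatched"
    # Assumption: without both jerseys there is nothing to compare.
    if espn_jersey is None or fox_jersey is None:
        return "name"
    if str(espn_jersey) == str(fox_jersey):
        return "name_jersey"
    # Same name but different numbers: flag for manual review.
    return "name_jersey_conflict"


print(classify_match(True, "7", "7"))
print(classify_match(True, "7", None))
print(classify_match(True, "7", "12"))
print(classify_match(False, None, "12"))
```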

Example

import polars as pl
from sportsdataverse.cfb import cfb_rosters_crosswalk
xwalk = cfb_rosters_crosswalk(espn_team_id=194, fox_team_id=25, season=2024)
# Keep only players found in both ESPN and Fox
matched = xwalk.filter(pl.col("matched_sources") == "espn+fox")

# Just ESPN vs Fox (skip Yahoo's partial leg)

espn_fox = cfb_rosters_crosswalk(194, 25, providers=("espn", "fox"))

cfb_schedule_crosswalk(season: 'int', week: 'Optional[int]' = None, *, season_type: 'int' = 2, providers: 'Optional[Sequence[str]]' = None, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'DataFrameT'

Build the ESPN x Fox x Yahoo CFB game-id crosswalk.

Each ESPN game is keyed by its order-independent team matchup, and the Fox and Yahoo games are mapped onto it, so each row pairs the ESPN event id with the Fox Bifrost event id and the Yahoo dotted game id. Where a provider has no game, its columns are None and matched_sources records who contributed — so regular season, conference championships, bowls, and the CFP all flow through the same call, degrading gracefully when a source lacks a game.

Two modes:

  • Full season (week omitted): pulls every ESPN game (regular weeks + bowls + CFP), Fox's full season, and Yahoo's full season, and matches on team + date (date disambiguates rematches — a regular-season game vs a conference-championship or CFP rematch of the same teams).
  • Single week (week given): just that week's slate, matched on team.

Each provider leg is best-effort: a Fox outage, a Yahoo per-week parser hiccup, or Fox's offseason-projected CFP matchups simply leave that provider's columns null rather than failing the call.
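To see why full-season mode adds the date to the key, the sketch below indexes two synthetic games between the same teams, first by team pair only and then by team pair plus date. The `game_key` helper, team names, ids, and dates are all made up for illustration and are not taken from the library.

```python
from typing import Dict, List, Optional, Tuple


def game_key(home: str, away: str, game_date: Optional[str] = None) -> Tuple[str, ...]:
    """Hypothetical key: sorted team pair, optionally followed by the date."""
    teams = tuple(sorted((home.lower(), away.lower())))
    return teams + ((game_date,) if game_date else ())


# Synthetic games: the same two teams meet twice in one season.
games: List[Dict[str, str]] = [
    {"id": "g1", "home": "Team A", "away": "Team B", "date": "2024-10-19"},
    {"id": "g2", "home": "Team B", "away": "Team A", "date": "2024-12-07"},
]

by_team = {game_key(g["home"], g["away"]): g["id"] for g in games}
by_team_date = {game_key(g["home"], g["away"], g["date"]): g["id"] for g in games}

print(len(by_team))       # 1: the rematch overwrote the first game
print(len(by_team_date))  # 2: the date keeps both games distinct
```

Within a single week the same two teams are not expected to meet twice, which is presumably why single-week mode can match on team alone.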

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | | Season year (e.g. 2024). |
| week | Optional[int] | None | Schedule week number for single-week mode; omit (None) for the whole season. |
| season_type | int | 2 | ESPN season type for single-week mode: 2 regular, 3 post-season (week=1 bowls, week=999 CFP). Ignored in full-season mode. Defaults to 2. |
| providers | Optional[Sequence[str]] | None | Which sources to include: any of "espn", "fox", "yahoo". None (default) uses all three; pass a subset for a pairwise crosswalk (e.g. ("espn", "fox")) or a single source. Unselected providers are not fetched and surface as null columns. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. |

Returns

A polars DataFrame (pandas when return_as_pandas=True) with columns matchup_key, espn_game_id, fox_game_id, yahoo_game_id, yahoo_global_game_id, home_team, away_team, espn_date, fox_date, yahoo_date, matched_sources.

Example

import polars as pl
from sportsdataverse.cfb import cfb_schedule_crosswalk
full = cfb_schedule_crosswalk(2024)
# Keep only games that all three providers list
all_three = full.filter(pl.col("matched_sources") == "espn+fox+yahoo")

# Or just one week

wk5 = cfb_schedule_crosswalk(2024, 5)

cfb_teams_crosswalk(*, season: 'Optional[int]' = None, week: 'int' = 1, providers: 'Optional[Sequence[str]]' = None, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'DataFrameT'

Build the ESPN x Fox x Yahoo CFB team-id crosswalk.

Fetches the selected provider team directories, normalizes each team name to a shared key, and full-outer-joins them so every row carries each provider's id, name, and abbreviation (None where a provider has no match). The matched_sources column records which providers contributed.
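The normalize-then-outer-join idea can be sketched with plain dictionaries. The snippet below is a simplified illustration under stated assumptions: `norm_key` and `outer_join` are hypothetical helpers, the normalization rule is a guess, and "Example Tech" with id 9999 is synthetic. The Ohio State ids (ESPN 194, Fox 25) are the ones quoted elsewhere on this page.

```python
from typing import Dict, List


def norm_key(name: str) -> str:
    """Hypothetical normalizer: lowercase, letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def outer_join(directories: Dict[str, Dict[str, object]]) -> List[Dict[str, object]]:
    """Full outer join of provider directories on the normalized team name."""
    keys = sorted({norm_key(n) for d in directories.values() for n in d})
    rows: List[Dict[str, object]] = []
    for key in keys:
        row: Dict[str, object] = {"norm_key": key}
        sources = []
        for provider, directory in directories.items():
            # None marks a provider that has no team under this key.
            team_id = next(
                (tid for name, tid in directory.items() if norm_key(name) == key),
                None,
            )
            row[f"{provider}_team_id"] = team_id
            if team_id is not None:
                sources.append(provider)
        row["matched_sources"] = "+".join(sources)
        rows.append(row)
    return rows


rows = outer_join(
    {
        "espn": {"Ohio State": 194},
        "fox": {"OHIO STATE": 25, "Example Tech": 9999},
    }
)
for row in rows:
    print(row)
```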

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | Optional[int] | None | Season year used only to fetch Yahoo's embedded team directory (Yahoo has no standalone teams endpoint). Defaults to the most recent CFB season. |
| week | int | 1 | Schedule week used for the Yahoo scoreboard fetch. Defaults to 1. The embedded directory is the full league list regardless. |
| providers | Optional[Sequence[str]] | None | Which sources to include: any of "espn", "fox", "yahoo". None (default) uses all three; pass a subset for a pairwise crosswalk (e.g. ("espn", "fox")) or a single source. Unselected providers are not fetched and surface as null columns. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. |

Returns

A polars DataFrame (pandas when return_as_pandas=True) with columns norm_key, espn_team_id, espn_team, espn_abbreviation, fox_team_id, fox_team, fox_abbreviation, yahoo_team_id, yahoo_team, yahoo_abbreviation, matched_sources.

Example

import polars as pl
from sportsdataverse.cfb import cfb_teams_crosswalk
xwalk = cfb_teams_crosswalk(season=2024)
row = xwalk.filter(pl.col("espn_team_id") == 194) # Ohio State

# Pairwise — just ESPN vs Fox

espn_fox = cfb_teams_crosswalk(providers=("espn", "fox"))

espn_cfb_teams(groups=None, return_as_pandas=False, **kwargs) -> 'pl.DataFrame'

espn_cfb_teams - look up the college football teams

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| groups | int | None | Used to define different divisions. 80 is FBS, 81 is FCS. |
| return_as_pandas | bool | False | If True, returns a pandas dataframe. If False, returns a polars dataframe. |

Returns

Polars dataframe containing one row per team for the requested group. This function caches by default; to refresh the data, call sportsdataverse.cfb.espn_cfb_teams.clear_cache().

| col_name | type | description |
| --- | --- | --- |
| team_abbreviation | character | Team abbreviation; team_detail = TRUE only. |
| team_alternate_color | character | Alternate team color; team_detail = TRUE only. |
| team_color | character | Primary team color; team_detail = TRUE only. |
| team_display_name | character | Full team display name; team_detail = TRUE only. |
| team_id | character | ESPN team id. |
| team_is_active | logical | TRUE if the team is currently active. |
| team_is_all_star | logical | TRUE if the row represents an All-Star team. |
| team_location | character | Team location / school name; team_detail = TRUE only. |
| team_logos | integer | Team logo metadata. |
| team_name | character | Team nickname; team_detail = TRUE only. |
| team_nickname | character | Team nickname label; team_detail = TRUE only. |
| team_short_display_name | character | Short team display name; team_detail = TRUE only. |
| team_slug | character | Team slug for the stat row. |
| team_uid | character | ESPN universal team identifier (UID format 's:40~l:...~t:...'). |

Example

from sportsdataverse.cfb import espn_cfb_teams
teams = espn_cfb_teams()
print(teams.shape)

# Pull FCS teams (group 81)

fcs = espn_cfb_teams(groups=81, return_as_pandas=True)
fcs.head()

# Pipeline next step (build an abbreviation lookup)

teams = espn_cfb_teams()
abbr_map = dict(zip(teams["team_id"], teams["team_abbreviation"]))

fox_cfb_boxscore(game_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB boxscore (long: one row per player-stat).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/event/{game_id}/data (the boxscore block).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | Union[int, str] | | Fox Bifrost event id (e.g. "41616"). |
| return_parsed | bool | True | If True (default) flatten the per-team stat tables to long form; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_boxscore
df = fox_cfb_boxscore("41616")

fox_cfb_league_leaders(category: 'str' = 'passing', who: 'str' = 'player', page: 'int' = 0, group_id: 'Union[int, str]' = '2', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB statistical leaders (one row per player/team).

Endpoint: GET .../bifrost/v1/cfb/league/stats-con/{who}/{category}/{page}

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| category | str | 'passing' | Stat category -- passing, rushing, receiving, defense, kicking, returning, scoring, yardage (team adds downs, turnovers). Defaults to "passing". |
| who | str | 'player' | "player" or "team". Defaults to "player". |
| page | int | 0 | 0-based result page. Defaults to 0. |
| group_id | Union[int, str] | '2' | Conference/group filter. Defaults to "2". |
| return_parsed | bool | True | If True (default) flatten the leader tables to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_league_leaders
df = fox_cfb_league_leaders("passing")

fox_cfb_odds(game_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB game odds six-pack (spread / to win / total per team).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/event/{game_id}/odds

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | Union[int, str] | | Fox Bifrost event id (e.g. "41616"). |
| return_parsed | bool | True | If True (default) flatten the six-pack market to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default; empty when no market is posted), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_odds
df = fox_cfb_odds("41616")

fox_cfb_pbp(game_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB play-by-play (one row per play).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/event/{game_id}/data

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | Union[int, str] | | Fox Bifrost event id (e.g. "41616") -- not the ESPN id. |
| return_parsed | bool | True | If True (default) flatten the pbp layout to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_pbp
df = fox_cfb_pbp("41616")

fox_cfb_play_process(event_id, odds_override: 'Optional[Dict[str, Any]]' = None, process: 'bool' = True, raw: 'bool' = False, **kwargs) -> 'Dict[str, Any]'

Build a processed CFB play-by-play game from FoxSports as a backup to ESPN.

Where sportsdataverse.cfb.cfb_fox_ext.fox_cfb_pbp returns the raw Fox play-by-play rows, this runs Fox data through the full ESPN play processor: it fetches FoxSports Bifrost cfb/event/{event_id}/data, adapts it into the ESPN-summary shape via fox_to_espn_summary, and runs the same sportsdataverse.cfb.cfb_pbp.CFBPlayProcess pipeline ESPN games use -- producing EPA / WPA / advanced box score. The result carries source="fox" so downstream consumers know the provenance (and that text-derived columns are lower fidelity than the ESPN path).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| event_id | | | FoxSports CFB event id (e.g. 41616). |
| odds_override | Optional[Dict[str, Any]] | None | Optional {gameSpread, overUnder, homeFavorite, gameSpreadAvailable} dict. Fox does not expose a clean pre-game spread, so when omitted a neutral pick'em line is used (EPA is unaffected; only the WP model's spread term is neutralized). |
| process | bool | True | If True (default) run the full sportsdataverse.cfb.cfb_pbp.CFBPlayProcess.run_processing_pipeline (EPA/WPA/box). If False run the lighter sportsdataverse.cfb.cfb_pbp.CFBPlayProcess.run_cleaning_pipeline. |
| raw | bool | False | If True skip the processor entirely and return the adapted ESPN-summary dict (the input the processor would consume). |

Returns

The processed game payload (same keys as CFBPlayProcess.run_processing_pipeline) with an added source="fox" key. When raw=True, the adapted summary dict.

Example

from sportsdataverse.cfb import fox_cfb_play_process
game = fox_cfb_play_process(41616)
print(len(game["plays"]), game["source"])

fox_cfb_schedule(season: 'Optional[int]' = None, *, segment_id: 'Optional[str]' = None, group_id: 'Union[int, str]' = '2', return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB full-season schedule (one row per game).

Fox lists games behind a two-step selector -> segment flow: scoreboard/main enumerates the season's segments (its selectionGroupList), and league/scores-segment/{segmentId} returns the games for one segment. Pass a season to scrape the whole season -- every regular week plus conference championships, bowls, and every College Football Playoff round -- enumerated from the live selector and unioned, deduplicated by game_id.

Segment ids encode the phase, not an ESPN-style integer week: "{season}-{week}-1" for a regular-season week, "{season}-bowls-2" for the bowls, "{season}-cfp-2" for the CFP (conference championships fall in the final regular-season week). Pass segment_id to fetch just one of them.
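The three segment-id patterns above can be assembled with a small helper. This is a convenience sketch only: `fox_segment_id` and its `phase` argument are hypothetical names, not part of sportsdataverse, and the helper does not check that a segment actually exists on Fox's side.

```python
from typing import Optional


def fox_segment_id(season: int, week: Optional[int] = None, phase: str = "regular") -> str:
    """Hypothetical builder for the documented Fox segment-id patterns."""
    if phase == "regular":
        if week is None:
            raise ValueError("a regular-season segment needs a week number")
        return f"{season}-{week}-1"
    if phase in ("bowls", "cfp"):
        return f"{season}-{phase}-2"
    raise ValueError(f"unknown phase: {phase!r}")


print(fox_segment_id(2025, week=5))         # 2025-5-1
print(fox_segment_id(2025, phase="bowls"))  # 2025-bowls-2
print(fox_segment_id(2025, phase="cfp"))    # 2025-cfp-2
```

The resulting string is what the segment_id parameter of fox_cfb_schedule expects.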

The numeric game_id is the Fox Bifrost event id that fox_cfb_pbp / fox_cfb_odds accept; week_label is the section title.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | Optional[int] | None | Season year -> scrape the full season. Ignored when segment_id is given; if both are None the current segment is returned. |
| segment_id | Optional[str] | None | Explicit Fox segment id (e.g. "2025-5-1", "2025-cfp-2") -> fetch just that segment. |
| group_id | Union[int, str] | '2' | Conference/division group filter. Defaults to "2" (FBS). |
| return_parsed | bool | True | If True (default) flatten to a DataFrame; if False return the raw JSON (a single segment's dict, or a {segment_id: dict} map in full-season mode). |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default) with columns game_id, date, status, week_label, home_team, home_team_id, away_team, away_team_id, segment_id; a pandas DataFrame when return_as_pandas=True; or raw JSON when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_schedule
season = fox_cfb_schedule(2025)

# Fetch just one segment (a week, or the playoff)

wk5 = fox_cfb_schedule(segment_id="2025-5-1")
cfp = fox_cfb_schedule(segment_id="2025-cfp-2")

fox_cfb_standings(team_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB conference standings for a team's conference.

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/team/{team_id}/standings (the league-wide league/standings endpoint returns header-only tables, so standings are keyed by team).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | Union[int, str] | | Fox Bifrost team id (e.g. "11" = Miami (FL)). |
| return_parsed | bool | True | If True (default) flatten the standings tables to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_standings
df = fox_cfb_standings("11")

fox_cfb_team_gamelog(team_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB team game log -- tidy long: one row per (game, stat).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/team/{team_id}/gamelog

The endpoint groups team per-game stats by category (passing, rushing, defense, ...) and season-type split; this wrapper flattens them to the columns team_id, season_type, category, game_id, game_date, opponent, stat, value.
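Because the output is tidy long (one row per game and stat), a common next step is pivoting it back to one record per game. The sketch below does that with plain dictionaries on synthetic rows; the stat labels and values are invented for illustration and only the column names come from the description above.

```python
from collections import defaultdict
from typing import Dict, List

# Synthetic long-form rows using a subset of the documented columns.
long_rows: List[Dict[str, str]] = [
    {"game_id": "1", "category": "passing", "stat": "YDS", "value": "250"},
    {"game_id": "1", "category": "rushing", "stat": "YDS", "value": "120"},
    {"game_id": "2", "category": "passing", "stat": "YDS", "value": "310"},
]

# One dict per game, with a "<category>_<stat>" entry for each long row.
wide: Dict[str, Dict[str, str]] = defaultdict(dict)
for row in long_rows:
    wide[row["game_id"]][f'{row["category"]}_{row["stat"]}'] = row["value"]

for game_id, stats in wide.items():
    print(game_id, stats)
```

Prefixing the stat with its category keeps labels that repeat across categories (here, YDS) from overwriting each other.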

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | Union[int, str] | | Fox Bifrost team id (e.g. "11" = Miami (FL)). |
| return_parsed | bool | True | If True (default) flatten to long form; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_team_gamelog
df = fox_cfb_team_gamelog("11")

fox_cfb_team_roster(team_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB team roster (one row per player).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/team/{team_id}/roster

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | Union[int, str] | | Fox Bifrost team id (e.g. "11" = Miami (FL)); discover via the league team directory (cfb/league/teamnav). |
| return_parsed | bool | True | If True (default) flatten the position-group tables to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_team_roster
df = fox_cfb_team_roster("11")

fox_cfb_team_stats(team_id: 'Union[int, str]', *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB team stat leaders (one row per category leader).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/team/{team_id}/stats

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| team_id | Union[int, str] | | Fox Bifrost team id (e.g. "11" = Miami (FL)). |
| return_parsed | bool | True | If True (default) flatten the leader sections to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_team_stats
df = fox_cfb_team_stats("11")

fox_cfb_teams(*, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Fox Sports CFB team directory (one row per team).

Endpoint: GET https://api.foxsports.com/bifrost/v1/cfb/league/teamnav

The team-nav payload is the canonical Fox directory: it maps every team's Bifrost id to its abbreviation, full name, and web slug. This is the lookup you need to translate a human team name into the numeric team_id the other fox_cfb_* wrappers expect, and it is the Fox side of sportsdataverse.cfb.cfb_teams_crosswalk.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| return_parsed | bool | True | If True (default) flatten the nav items to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default) with columns fox_team_id, abbreviation, name, slug, color, logo_url; a pandas DataFrame when return_as_pandas=True; or the raw JSON dict when return_parsed=False.

Example

from sportsdataverse.cfb import fox_cfb_teams
teams = fox_cfb_teams()
fox_id = dict(zip(teams["abbreviation"], teams["fox_team_id"]))

fox_to_espn_summary(fox_data: 'Dict[str, Any]') -> 'Dict[str, Any]'

Adapt a Fox cfb/event/{id}/data payload into the ESPN-summary shape.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| fox_data | Dict[str, Any] | | Parsed JSON from api.foxsports.com/bifrost/v1/cfb/event/{id}/data. |

Returns

A dict shaped like ESPN's college-football/summary response (header + drives + stub pickcenter/boxscore/...), ready to assign onto CFBPlayProcess(...).json.

get_cfb_teams(return_as_pandas=False) -> 'pl.DataFrame'

Load college football team ID information and logos

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| return_as_pandas | bool | False | If True, returns a pandas dataframe. If False, returns a polars dataframe. |

Returns

Polars dataframe containing teams available.

| col_name | type | description |
| --- | --- | --- |
| team_id | integer | ESPN team id. |
| school | character | Team name. |
| mascot | character | Team mascot. |
| abbreviation | character | Team abbreviation. |
| alt_name1 | character | Team alternate name 1 (as it appears in play_text). |
| alt_name2 | character | Team alternate name 2 (as it appears in play_text). |
| alt_name3 | character | Team alternate name 3 (as it appears in play_text). |
| conference | character | Conference of the team. |
| division | character | Division in the conference for the team. |
| color | character | Primary team color (hex, no #). |
| alt_color | character | Team color (alternate). |
| logo | character | Team or league logo URL. |
| logo_dark | character | Dark-mode logo URL. |

Example

from sportsdataverse.cfb import get_cfb_teams
teams = get_cfb_teams()
print(teams.shape)

# Pandas round-trip

teams_pd = get_cfb_teams(return_as_pandas=True)
teams_pd.head()

# Pipeline next step (build a team_id to logo URL map)

teams = get_cfb_teams()
logo_map = dict(zip(teams["team_id"], teams["logo"]))

scoreboard_event_parsing(event)

Internal helper that flattens an ESPN scoreboard event dict into a shape suitable for pd.json_normalize.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| event | dict | | A single scoreboard events[*] entry from the ESPN college-football scoreboard API. |

Returns

The same event dict, mutated in place with home/away copies of the competitors and trimmed of unused link/odds keys.

Example

from sportsdataverse.cfb import espn_cfb_schedule
sched = espn_cfb_schedule(dates=2023, week=5)

yahoo_cfb_boxscore(game_id: 'Union[int, str]', *, return_parsed: 'bool' = False, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> 'Dict[str, Any]'

Yahoo CFB boxscore — raw JSON passthrough (parsing not yet implemented).

Wraps the editorial boxscore/{game_id} resource. The payload uses a normalized decoder-dictionary schema (player_stats[playerId][variation][stat_type]=value joined against the stat_types/stat_categories dictionaries). Flattening that into tidy frames is a follow-up; until then this returns the raw JSON dict and fails fast if a parsed frame is requested rather than silently ignoring return_parsed.
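Until parsing lands, the nested player_stats map can be flattened by hand. The sketch below walks a synthetic payload shaped like the description above (player id, then variation, then stat type) and joins each stat against a stat_types dictionary. The exact layout of stat_types entries, the "name" field, and every id and value here are assumptions; inspect a real payload before relying on them.

```python
from typing import Any, Dict, List


def flatten_player_stats(boxscore: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical flattener: one row per (player, variation, stat_type)."""
    stat_types = boxscore.get("stat_types", {})
    rows: List[Dict[str, Any]] = []
    for player_id, variations in boxscore.get("player_stats", {}).items():
        for variation, stats in variations.items():
            for stat_type, value in stats.items():
                rows.append(
                    {
                        "player_id": player_id,
                        "variation": variation,
                        "stat_type": stat_type,
                        # Assumed: each stat_types entry carries a readable name.
                        "stat_name": stat_types.get(stat_type, {}).get("name"),
                        "value": value,
                    }
                )
    return rows


# Synthetic payload; keys mimic the described schema, values are made up.
sample = {
    "stat_types": {"st.1": {"name": "Passing Yards"}},
    "player_stats": {
        "p.100": {"v.1": {"st.1": "250", "st.2": "3"}},
    },
}
rows = flatten_player_stats(sample)
for row in rows:
    print(row)
```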

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| game_id | Union[int, str] | | Dotted Yahoo game id (e.g. "ncaaf.g.202509200023"). |
| return_parsed | bool | False | Must be False (the default). Passing True raises NotImplementedError because parsing is not implemented. |
| return_as_pandas | bool | False | Accepted for signature parity with the sibling wrappers; has no effect while only raw output is supported. |

Returns

The raw editorial boxscore JSON as a dict (service.boxscore).

Example

from sportsdataverse.cfb import yahoo_cfb_boxscore
raw = yahoo_cfb_boxscore("ncaaf.g.202509200023")

yahoo_cfb_player_season_stats(season: 'int' = 2024, *, league_structure: 'str' = 'ncaaf.struct.div.1', count: 'int' = 200, qualified: 'bool' = False, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB player season stats (modern; one wide row per player).

Wraps the shangrila leagueStatsIndividual query, which returns every stat group (passing/rushing/receiving/...) in one call, pivoted wide with one column per statId. NCAAF data is available 2013-present.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | 2024 | Season year (2013-present). Defaults to 2024. |
| league_structure | str | 'ncaaf.struct.div.1' | Yahoo league-structure id (division filter). Defaults to "ncaaf.struct.div.1" (FBS). |
| count | int | 200 | Maximum number of players to request. Defaults to 200. |
| qualified | bool | False | Restrict to qualified leaders only. Defaults to False. |
| return_parsed | bool | True | If True (default) flatten to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A wide polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False. Includes a self-describing season column.

Example

from sportsdataverse.cfb import yahoo_cfb_player_season_stats
df = yahoo_cfb_player_season_stats(season=2024)

yahoo_cfb_player_season_stats_legacy(season: 'int' = 2024, category: 'str' = 'Passing', sort_stat: 'str' = 'PASSING_YARDS', *, league_structure: 'str' = 'ncaaf.struct.div.1', count: 'int' = 200, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB legacy per-category player leaders (one wide row per player).

Wraps the legacy seasonStatsFootball{Category}Ncaaf query (one stat category per call), pivoted wide with one column per statId.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | 2024 | Season year (2013-present). Defaults to 2024. |
| category | str | 'Passing' | Stat category, one of {"Passing", "Rushing", "Receiving", "Defense", "Kicking", "Punting", "Returns"}. Defaults to "Passing". |
| sort_stat | str | 'PASSING_YARDS' | Required FootballStatId to sort by (see the catalog vocab). Defaults to "PASSING_YARDS". |
| league_structure | str | 'ncaaf.struct.div.1' | Yahoo league-structure id (division filter). Defaults to "ncaaf.struct.div.1" (FBS). |
| count | int | 200 | Maximum number of players to request. Defaults to 200. |
| return_parsed | bool | True | If True (default) flatten to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A wide polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False. Includes self-describing season and category columns.

Example

from sportsdataverse.cfb import yahoo_cfb_player_season_stats_legacy
df = yahoo_cfb_player_season_stats_legacy(
    season=2024, category="Rushing", sort_stat="RUSHING_YARDS"
)

yahoo_cfb_scoreboard(season: 'int', week: 'int' = 1, *, count: 'int' = 500, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB scoreboard (one row per game).

Wraps the editorial scoreboard resource and flattens the games map. season is required — there is no meaningful default for a weekly scoreboard and the API has no concept of "current season". The full raw payload also carries teams/leagues/odds maps (use return_parsed=False).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | | Season year (required). |
| week | int | 1 | Schedule week number. Defaults to 1. |
| count | int | 500 | Maximum number of games to request. Defaults to 500. |
| return_parsed | bool | True | If True (default) flatten the games map to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default) with one row per game, a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False. Includes self-describing season and week columns.

Example

from sportsdataverse.cfb import yahoo_cfb_scoreboard
df = yahoo_cfb_scoreboard(season=2024, week=1)

yahoo_cfb_team_season_stats(season: 'int' = 2024, *, league_structure: 'str' = 'ncaaf.struct.div.1', count: 'int' = 200, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB team season stats (modern; one wide row per team).

Wraps the shangrila leagueStatsByTeam query (all stat groups in one call, pivoted wide with one column per statId).

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | 2024 | Season year (2013-present). Defaults to 2024. |
| league_structure | str | 'ncaaf.struct.div.1' | Yahoo league-structure id (division filter). Defaults to "ncaaf.struct.div.1" (FBS). |
| count | int | 200 | Maximum number of teams to request. Defaults to 200. |
| return_parsed | bool | True | If True (default) flatten to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A wide polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False. Includes a self-describing season column.

Example

from sportsdataverse.cfb import yahoo_cfb_team_season_stats
df = yahoo_cfb_team_season_stats(season=2024)

yahoo_cfb_team_season_stats_legacy(season: 'int' = 2024, category: 'str' = 'Passing', sort_stat: 'str' = 'PASSING_YARDS', *, league_structure: 'str' = 'ncaaf.struct.div.1', count: 'int' = 200, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB legacy per-category team stats (one wide row per team).

Wraps the legacy seasonTeamStatsFootball{Category} query (one stat category per call), pivoted wide with one column per statId.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | 2024 | Season year (2013-present). Defaults to 2024. |
| category | str | 'Passing' | Stat category, one of {"Passing", "Rushing", "Receiving", "Defense", "Kicking", "Punting", "Returns", "Kickoffs", "Offense"}. Defaults to "Passing". |
| sort_stat | str | 'PASSING_YARDS' | Required FootballStatId to sort by. Defaults to "PASSING_YARDS". |
| league_structure | str | 'ncaaf.struct.div.1' | Yahoo league-structure id (division filter). Defaults to "ncaaf.struct.div.1" (FBS). |
| count | int | 200 | Maximum number of teams to request. Defaults to 200. |
| return_parsed | bool | True | If True (default) flatten to a DataFrame; if False return the raw JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A wide polars DataFrame (default), a pandas DataFrame when return_as_pandas=True, or the raw JSON dict when return_parsed=False. Includes self-describing season and category columns.

Example

from sportsdataverse.cfb import yahoo_cfb_team_season_stats_legacy
df = yahoo_cfb_team_season_stats_legacy(
    season=2024, category="Rushing", sort_stat="RUSHING_YARDS"
)
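
Because category must be one of the nine documented values, it can help to check it before making a request. The helper below is a hypothetical caller-side sketch: the name `normalize_category` is not part of sportsdataverse, and how the library itself reacts to an unknown category is not stated here.

```python
# Categories listed in the parameter table for the legacy wrapper.
VALID_CATEGORIES = (
    "Passing", "Rushing", "Receiving", "Defense", "Kicking",
    "Punting", "Returns", "Kickoffs", "Offense",
)


def normalize_category(category: str) -> str:
    """Return the canonical spelling of a category, or raise ValueError."""
    # Hypothetical helper: case-insensitive lookup against the documented set.
    lookup = {name.lower(): name for name in VALID_CATEGORIES}
    try:
        return lookup[category.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown category {category!r}; expected one of {VALID_CATEGORIES}"
        ) from None
```

The normalized value could then be passed as category=... together with a sort_stat that belongs to the same category, such as "RUSHING_YARDS" for "Rushing".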

yahoo_cfb_teams(season: 'int', week: 'int' = 1, *, return_parsed: 'bool' = True, return_as_pandas: 'bool' = False, **kwargs: 'Any') -> "Union[pl.DataFrame, 'pd.DataFrame', Dict[str, Any]]"

Yahoo CFB team directory (one row per team).

Yahoo has no standalone teams resource (the documented sports.league.teams resource returns a 404 without authentication). Instead, the editorial scoreboard payload is "fat": one call embeds the full ~186-team directory under service.scoreboard.teams, keyed by the dotted ncaaf.t.<id> team id. This wrapper pulls that map for the requested (season, week) and projects it to the directory columns. It is the Yahoo side of sportsdataverse.cfb.cfb_teams_crosswalk.

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| season | int | | Season year (required; the scoreboard is fetched to obtain the embedded teams map). |
| week | int | 1 | Schedule week used to fetch the scoreboard. Defaults to 1. The embedded directory is the full league list regardless of week. |
| return_parsed | bool | True | If True (default) flatten the teams map to a DataFrame; if False return the raw scoreboard JSON dict. |
| return_as_pandas | bool | False | If True return a pandas DataFrame; otherwise polars. Ignored when return_parsed=False. |

Returns

A polars DataFrame (default) with one row per team, a pandas DataFrame when return_as_pandas=True, or the raw scoreboard JSON dict when return_parsed=False. The DataFrame columns are team_id, abbreviation, display_name, full_name, location, nickname, conference, conference_abbreviation, conference_id, division, division_id, seatgeek_id.

Example

from sportsdataverse.cfb import yahoo_cfb_teams
teams = yahoo_cfb_teams(season=2024)
abbr = dict(zip(teams["team_id"], teams["abbreviation"]))
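
When working with the raw dict (return_parsed=False), the teams map sits under service.scoreboard.teams and is keyed by ids of the form ncaaf.t.<id>. The sketch below shows one way to walk to that map and split the numeric suffix from the key. It is an illustration under assumptions: the helper names `teams_map` and `team_number`, the sample payload, and the `abbreviation` field inside each raw entry are hypothetical, and the real payload may nest the service key under further wrappers.

```python
def team_number(team_id: str) -> int:
    """Extract the numeric suffix from a dotted id such as 'ncaaf.t.123'."""
    prefix, _, suffix = team_id.rpartition(".")
    if prefix != "ncaaf.t" or not suffix.isdigit():
        raise ValueError(f"Unexpected team id: {team_id!r}")
    return int(suffix)


def teams_map(raw: dict) -> dict:
    """Walk to service.scoreboard.teams, returning {} if any level is missing."""
    node = raw
    for key in ("service", "scoreboard", "teams"):
        if not isinstance(node, dict):
            return {}
        node = node.get(key, {})
    return node if isinstance(node, dict) else {}


# Hypothetical payload fragment; entry fields are assumed, not Yahoo's schema.
raw = {
    "service": {
        "scoreboard": {
            "teams": {
                "ncaaf.t.7": {"abbreviation": "AAA"},
                "ncaaf.t.12": {"abbreviation": "BBB"},
            }
        }
    }
}

by_number = {
    team_number(tid): team["abbreviation"]
    for tid, team in teams_map(raw).items()
}
```

Returning an empty dict for a missing level keeps the comprehension safe when the scoreboard comes back without an embedded directory.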