nfl_team_epa <- read_csv("http://bcdanl.github.io/data/nfl_team_epa.csv")Data Storytelling Team Project - Football
Data
The following lists data frames about National Football League (NFL) for the seasons from 2014-15 through 2023-2024:
nfl_team_epa: Team’s mean expected points added (EPA) when the team was on offense and when the team was on defense- For the details about EPA, please refer to the Football Metrics section below in the webpage.
nfl_field_goals: Play-by-play statistics at situations when field goals were attempted during the gamenfl_passers: weakly EPA and completion percentage over expected (CPOE) among players who passed more than 44 times within a weeknfl_player_stat: Player statisticsnfl_receivers: Total EPA of players whose positions are either “WR”, “TE”, or “RB”, and top 10 players in terms of total EPA
| Position | Full Name | Main Role | Skills Required |
|---|---|---|---|
| WR | Wide Receiver | Catch passes and gain yards or score TDs | Speed, agility, reliable hands |
| TE | Tight End | Block defenders and catch passes | Strength, versatility, reliable hands |
| RB | Running Back | Run the ball, catch passes, block | Speed, vision, agility |
| QB | Quarterback | Lead the offense and throw passes | Decision-making, accuracy, arm strength, leadership |
NFL Team EPA (nfl_team_epa)
nfl_team_epa: Team’s mean EPA when the team was on offense and when the team was on defense
season Starting year of the season (2014 if 2014-15 season)
team Team abbreviation
off_epa Offensive EPA
def_epa Defensive EPA
NFL Field Goals (nfl_field_goals)
nfl_field_goals: Play-by-play statistics at situations when field goals were attempted during the game- A data frame with 10047 observations on the 23 variables.
nfl_field_goals <- read_csv("http://bcdanl.github.io/data/nfl_field_goals.csv")fg_distanceaccurately reflects the total distance of a field goal attempt:- The total addition of 17 yards comes from 10 yards (end zone) + 7 yards (holder’s position).
NFL Passer’s EPA and COPE (nfl_passers)
nfl_passers: weakly mean EPA and completion percentage over CPOE among players who passed more than 44 times within a week- A data frame with 1098 rows and 22 variables:
nfl_passers <- read_csv("http://bcdanl.github.io/data/nfl_passers.csv")NFL Player Statistics (nfl_players_stat)
nfl_players_stat: Player statistics
nfl_players_stat <- read_csv("http://bcdanl.github.io/data/nfl_players_stat.csv")yards rushing_yards + receiving_yards
rushing_yards Yards gained when rushing with the ball (incl. scrambles and kneel downs). Also includes yards gained after obtaining a lateral on a play that started with a rushing attempt.
receiving_yards Yards gained after a pass reception. Includes yards gained after receiving a lateral on a play that started as a pass play.
touches carries + receptions
carries The number of official rush attempts (incl. scrambles and kneel downs). Rushes after a lateral reception don’t count as carry.
receptions The number of pass receptions. Lateral receptions officially don’t count as reception.
tds rushing_tds + receiving_tds
rushing_tds The number of rushing touchdowns (incl. scrambles). Also includes touchdowns after obtaining a lateral on a play that started with a rushing attempt.
receiving_tds The number of touchdowns following a pass reception. Also includes touchdowns after receiving a lateral on a play that started as a pass play.
NFL Receivers (nfl_receivers)
nfl_receivers: Total EPA of players whose positions are either “WR”, “TE”, or “RB”, and top 10 players in terms of total EPA
nfl_receivers <- read_csv("http://bcdanl.github.io/data/nfl_receivers.csv")epa_rank: Ranking in terms of tot_epa (The lower tot_epa, the higher EPA)
epa_rank_within_position: Ranking in terms of EPA within the group of the same position
n_received: the number of passes a player received
tot_epa: Total EPA within a season
Football Metrics
Expected Points
Expected Points Added (EPA) is a football analytics metric that measures the value of a play in terms of its impact on the team’s expected scoring. It quantifies how much a single play increases or decreases a team’s chances of scoring compared to the situation before the play.
How EPA Works
Every play in football occurs within a specific context (down, distance, field position, time remaining, and score). Historical data is used to calculate the expected points (EP) a team can expect to score from their current situation. EPA is the difference between the expected points after the play and before the play.
Formula:
\[ EPA = EP (\text{after the play}) - EP (\text{before the play}) \]
Key Insights:
- Positive EPA: The play improved the team’s scoring chances.
- Example: A 20-yard pass on 3rd and 8 increases the likelihood of scoring.
- Negative EPA: The play reduced the team’s scoring chances.
- Example: A sack or an interception harms the team’s scoring potential.
Why EPA Is Important
- Contextual: Accounts for situational factors, making it more informative than raw stats like yards gained.
- Play Evaluation: Helps determine the effectiveness of specific plays or players.
- Strategic Decisions: Assists coaches and analysts in evaluating decisions like when to go for it on 4th down.
Applications
- Offensive EPA: Evaluates how well a team’s offense increases scoring opportunities.
- Defensive EPA: Measures how effectively a defense reduces the opposing team’s scoring potential.
- Player Performance: Used to assess quarterbacks, running backs, wide receivers, and defenders by their contribution to scoring or preventing points.
Completion Percentage Over Expected (CPOE)
Completion Percentage Over Expected (CPOE) is a football analytics metric that evaluates a quarterback’s passing performance by comparing their actual completion percentage to the expected completion percentage based on the difficulty of their pass attempts.
How CPOE Works
- Actual Completion Percentage (COMP%):
- The percentage of passes a quarterback completes.
- Expected Completion Percentage (xCOMP%):
- Calculated based on factors like:
- Distance of the throw (air yards)
- Angle of the throw
- Receiver separation
- Defensive pressure
- Game situation (e.g., down, distance, and field position)
- Derived from historical data on similar passes.
- CPOE Formula: \[ CPOE = COMP - xCOMP \]
Interpretation of CPOE
- Positive CPOE: Indicates the quarterback is completing more passes than expected, showcasing accuracy and skill.
- Negative CPOE: Indicates the quarterback is completing fewer passes than expected, potentially highlighting issues with accuracy or decision-making.
Why CPOE Is Useful
- Isolates Skill: It accounts for the difficulty of throws, focusing on the quarterback’s performance rather than the system or play design.
- Complementary Metric: Often paired with EPA/play to provide a comprehensive evaluation of a quarterback’s impact.
- Game Context: Helps differentiate between quarterbacks who excel in challenging situations versus those whose stats are inflated by easy throws.