library(tidyverse)
nfl_team_epa <- read_csv("http://bcdanl.github.io/data/nfl_team_epa.csv")
Data Storytelling Team Project - Football
Data
The following data frames cover National Football League (NFL) seasons from 2014-15 through 2023-24:
- nfl_team_epa: Team’s mean expected points added (EPA) when the team was on offense and when the team was on defense
  - For details about EPA, see the Football Metrics section below.
- nfl_field_goals: Play-by-play statistics for situations in which a field goal was attempted during the game
- nfl_passers: Weekly mean EPA and completion percentage over expected (CPOE) among players who passed more than 44 times within a week
- nfl_players_stat: Player statistics
- nfl_receivers: Total EPA of players whose position is “WR”, “TE”, or “RB”, and the top 10 players in terms of total EPA
Position | Full Name | Main Role | Skills Required |
---|---|---|---|
WR | Wide Receiver | Catch passes and gain yards or score TDs | Speed, agility, reliable hands |
TE | Tight End | Block defenders and catch passes | Strength, versatility, reliable hands |
RB | Running Back | Run the ball, catch passes, block | Speed, vision, agility |
QB | Quarterback | Lead the offense and throw passes | Decision-making, accuracy, arm strength, leadership |
NFL Team EPA (nfl_team_epa)
nfl_team_epa: Team’s mean EPA when the team was on offense and when the team was on defense
season <dbl> | team <chr> | off_epa <dbl> | def_epa <dbl> |
---|---|---|---|
2014 | ARI | -0.0245 | -0.0501 |
2014 | ATL | 0.0069 | 0.0748 |
2014 | BAL | 0.0669 | -0.0321 |
2014 | BUF | -0.0743 | -0.1024 |
2014 | CAR | -0.0067 | 0.0045 |
2014 | CHI | -0.0498 | 0.0789 |
2014 | CIN | -0.0176 | -0.0107 |
2014 | CLE | -0.0710 | -0.0350 |
2014 | DAL | 0.1174 | 0.0030 |
2014 | DEN | 0.0851 | -0.0569 |
- season: Starting year of the season (2014 for the 2014-15 season)
- team: Team abbreviation
- off_epa: Offensive EPA
- def_epa: Defensive EPA
NFL Field Goals (nfl_field_goals)
nfl_field_goals: Play-by-play statistics for situations in which a field goal was attempted during the game
- A data frame with 10047 observations on 23 variables.
nfl_field_goals <- read_csv("http://bcdanl.github.io/data/nfl_field_goals.csv")
game_date <date> | time <time> | down <dbl> | yrdln <chr> | ydstogo <dbl> | yardline_100 <dbl> | fg_distance <dbl> | posteam <chr> | defteam <chr> | field_goal_result <chr> |
---|---|---|---|---|---|---|---|---|---|
2014-09-07 | 08:12:00 | 4 | CHI 32 | 6 | 32 | 49 | BUF | CHI | made |
2014-09-07 | 09:39:00 | 4 | BUF 23 | 11 | 23 | 40 | CHI | BUF | made |
2014-09-07 | 04:07:00 | 4 | CHI 15 | 5 | 15 | 32 | BUF | CHI | made |
2014-09-07 | 00:35:00 | 4 | BUF 19 | 1 | 19 | 36 | CHI | BUF | made |
2014-09-07 | 09:51:00 | 2 | CHI 9 | 9 | 9 | 26 | BUF | CHI | made |
2014-09-07 | 00:05:00 | 3 | TB 10 | 9 | 10 | 27 | CAR | TB | made |
2014-09-07 | 09:45:00 | 4 | TB 30 | 17 | 30 | 47 | CAR | TB | missed |
2014-09-07 | 00:28:00 | 4 | TB 15 | 2 | 15 | 32 | CAR | TB | made |
2014-09-07 | 10:30:00 | 4 | BAL 31 | 2 | 31 | 48 | CIN | BAL | made |
2014-09-07 | 01:28:00 | 4 | BAL 4 | 4 | 4 | 21 | CIN | BAL | made |
- fg_distance: The total distance of a field goal attempt, which is 17 yards longer than the distance to the goal line (yardline_100).
  - The additional 17 yards come from 10 yards (end zone) + 7 yards (holder’s position).
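The 17-yard adjustment can be checked against the sample rows in the table above. The sketch below is only illustrative: it copies five rows by hand into a small data frame (the name `fg` and the column `expected_distance` are made up for this example) and assumes that `yardline_100` is the distance from the line of scrimmage to the goal line.

```r
# First five rows of nfl_field_goals, copied by hand from the table above
fg <- data.frame(
  yardline_100 = c(32, 23, 15, 19, 9),
  fg_distance  = c(49, 40, 32, 36, 26)
)

# Assumed relationship: distance to the goal line
# + 10 yards (end zone) + 7 yards (holder's position)
fg$expected_distance <- fg$yardline_100 + 10 + 7

# TRUE when every sampled attempt matches the assumed relationship
all(fg$fg_distance == fg$expected_distance)
```

With the full data loaded, the same comparison could be run on `nfl_field_goals` directly to see whether any attempts deviate from the rule.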
NFL Passer’s EPA and CPOE (nfl_passers)
nfl_passers: Weekly mean EPA and completion percentage over expected (CPOE) among players who passed more than 44 times within a week
- A data frame with 1098 rows and 22 variables.
nfl_passers <- read_csv("http://bcdanl.github.io/data/nfl_passers.csv")
season <dbl> | week <dbl> | passer <chr> | epa <dbl> | cpoe <dbl> | n_passes <dbl> | team <chr> | position <chr> | jersey_number <dbl> | full_name <chr> |
---|---|---|---|---|---|---|---|---|---|
2014 | 1 | A.Luck | 0.1148 | 1.1793 | 56 | IND | QB | 12 | Andrew Luck |
2014 | 1 | C.Henne | -0.2598 | -6.7311 | 46 | JAX | QB | 7 | Chad Henne |
2014 | 1 | J.Cutler | -0.1325 | 2.8643 | 51 | CHI | QB | 6 | Jay Cutler |
2014 | 1 | J.Flacco | -0.0749 | -11.8895 | 65 | BLT | QB | 5 | Joe Flacco |
2014 | 1 | N.Foles | -0.2225 | -1.0866 | 49 | PHI | QB | 9 | Nick Foles |
2014 | 1 | T.Brady | -0.2110 | -8.6263 | 60 | NE | QB | 12 | Tom Brady |
2014 | 2 | A.Rodgers | 0.2271 | 2.7334 | 47 | GB | QB | 12 | Aaron Rodgers |
2014 | 2 | A.Smith | 0.2550 | 0.2252 | 44 | KC | QB | 11 | Alex Smith |
2014 | 2 | M.Ryan | -0.3360 | -8.0864 | 46 | ATL | QB | 2 | Matt Ryan |
2014 | 2 | M.Stafford | -0.0220 | -6.2701 | 52 | DET | QB | 9 | Matthew Stafford |
NFL Player Statistics (nfl_players_stat)
nfl_players_stat: Player statistics
nfl_players_stat <- read_csv("http://bcdanl.github.io/data/nfl_players_stat.csv")
season <dbl> | player_id <chr> | player_name <chr> | recent_team <chr> | yards <dbl> | rushing_yards <dbl> | receiving_yards <dbl> | touches <dbl> | carries <dbl> | receptions <dbl> |
---|---|---|---|---|---|---|---|---|---|
2014 | 00-0028009 | D.Murray | DAL | 2261 | 1845 | 416 | 449 | 392 | 57 |
2014 | 00-0030496 | L.Bell | PIT | 2215 | 1361 | 854 | 373 | 290 | 83 |
2014 | 00-0026184 | M.Forte | CHI | 1846 | 1038 | 808 | 368 | 266 | 102 |
2014 | 00-0027793 | A.Brown | PIT | 1711 | 13 | 1698 | 133 | 4 | 129 |
2014 | 00-0025399 | M.Lynch | SEA | 1673 | 1306 | 367 | 317 | 280 | 37 |
2014 | 00-0027874 | D.Thomas | DEN | 1619 | 0 | 1619 | 111 | 0 | 111 |
2014 | 00-0027944 | J.Jones | ATL | 1594 | 1 | 1593 | 105 | 1 | 104 |
2014 | 00-0026796 | NA | HOU | 1573 | 1246 | 327 | 298 | 260 | 38 |
2014 | 00-0030485 | E.Lacy | GB | 1566 | 1139 | 427 | 288 | 246 | 42 |
2014 | 00-0026373 | NA | BAL | 1529 | 1266 | 263 | 279 | 235 | 44 |
- yards: rushing_yards + receiving_yards
- rushing_yards: Yards gained when rushing with the ball (incl. scrambles and kneel downs). Also includes yards gained after obtaining a lateral on a play that started with a rushing attempt.
- receiving_yards: Yards gained after a pass reception. Includes yards gained after receiving a lateral on a play that started as a pass play.
- touches: carries + receptions
- carries: The number of official rush attempts (incl. scrambles and kneel downs). Rushes after a lateral reception do not count as carries.
- receptions: The number of pass receptions. Lateral receptions do not officially count as receptions.
- tds: rushing_tds + receiving_tds
- rushing_tds: The number of rushing touchdowns (incl. scrambles). Also includes touchdowns after obtaining a lateral on a play that started with a rushing attempt.
- receiving_tds: The number of touchdowns following a pass reception. Also includes touchdowns after receiving a lateral on a play that started as a pass play.
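The two sums, yards and touches, can be reproduced from their components. The following is a minimal sketch using four rows copied by hand from the nfl_players_stat table above; the data frame name `players` is made up for this example, and the code is base R so it runs without extra packages.

```r
# Four rows of nfl_players_stat, copied by hand from the table above
players <- data.frame(
  player_name     = c("D.Murray", "L.Bell", "M.Forte", "A.Brown"),
  rushing_yards   = c(1845, 1361, 1038, 13),
  receiving_yards = c(416, 854, 808, 1698),
  carries         = c(392, 290, 266, 4),
  receptions      = c(57, 83, 102, 129)
)

# yards = rushing_yards + receiving_yards
players$yards <- players$rushing_yards + players$receiving_yards

# touches = carries + receptions
players$touches <- players$carries + players$receptions

players[, c("player_name", "yards", "touches")]
```

The computed values agree with the yards and touches columns listed in the table for these four players.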
NFL Receivers (nfl_receivers)
nfl_receivers: Total EPA of players whose position is “WR”, “TE”, or “RB”, and the top 10 players in terms of total EPA
nfl_receivers <- read_csv("http://bcdanl.github.io/data/nfl_receivers.csv")
season <dbl> | receiver <chr> | position <chr> | epa_rank <dbl> | epa_rank_within_position <dbl> | n_received <dbl> | tot_epa <dbl> |
---|---|---|---|---|---|---|
2014 | J.Nelson | WR | 1 | 1 | 182 | 109.155604 |
2014 | A.Brown | WR | 2 | 2 | 208 | 98.263152 |
2014 | R.Cobb | WR | 3 | 3 | 153 | 91.941996 |
2014 | E.Sanders | WR | 4 | 4 | 169 | 81.666310 |
2014 | D.Bryant | WR | 5 | 5 | 156 | 81.495877 |
2014 | J.Edelman | WR | 6 | 6 | 184 | 75.916749 |
2014 | O.Beckham | WR | 7 | 7 | 141 | 66.936396 |
2014 | J.Jones | WR | 8 | 8 | 171 | 64.560002 |
2014 | R.Gronkowski | TE | 9 | 1 | 172 | 63.931129 |
2014 | T.Hilton | WR | 10 | 9 | 166 | 63.020969 |
- epa_rank: Ranking in terms of tot_epa (the lower the epa_rank, the higher the tot_epa)
- epa_rank_within_position: Ranking in terms of tot_epa within the group of players at the same position
- n_received: The number of passes a player received
- tot_epa: Total EPA within a season
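One way the two ranking columns could be derived from tot_epa is sketched below, using the ten sample rows from the table above. This is an assumption about how the ranks were built, not the data set's documented method: it ranks in descending order of tot_epa with base R's `rank()` and `ave()`, and breaking ties with `ties.method = "min"` is a choice made only for this example.

```r
# Ten rows of nfl_receivers, copied by hand from the table above
receivers <- data.frame(
  receiver = c("J.Nelson", "A.Brown", "R.Cobb", "E.Sanders", "D.Bryant",
               "J.Edelman", "O.Beckham", "J.Jones", "R.Gronkowski", "T.Hilton"),
  position = c("WR", "WR", "WR", "WR", "WR", "WR", "WR", "WR", "TE", "WR"),
  tot_epa  = c(109.155604, 98.263152, 91.941996, 81.666310, 81.495877,
               75.916749, 66.936396, 64.560002, 63.931129, 63.020969)
)

# Overall rank: negate tot_epa so the largest value gets rank 1
receivers$epa_rank <- rank(-receivers$tot_epa, ties.method = "min")

# Rank within each position group, again with the largest tot_epa first
receivers$epa_rank_within_position <- ave(
  -receivers$tot_epa,
  receivers$position,
  FUN = function(x) rank(x, ties.method = "min")
)

receivers[, c("receiver", "position", "epa_rank", "epa_rank_within_position")]
```

On these rows the result matches the table: R.Gronkowski is 9th overall but 1st among tight ends, while T.Hilton is 10th overall and 9th among wide receivers.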
Football Metrics
Expected Points Added (EPA)
Expected Points Added (EPA) is a football analytics metric that measures the value of a play in terms of its impact on the team’s expected scoring. It quantifies how much a single play increases or decreases a team’s chances of scoring compared to the situation before the play.
How EPA Works
Every play in football occurs within a specific context (down, distance, field position, time remaining, and score). Historical data is used to calculate the expected points (EP) a team can expect to score from their current situation. EPA is the difference between the expected points after the play and before the play.
Formula:
$$\text{EPA} = \text{EP}_{\text{after}} - \text{EP}_{\text{before}}$$
Key Insights:
- Positive EPA: The play improved the team’s scoring chances.
  - Example: A 20-yard pass on 3rd and 8 increases the likelihood of scoring.
- Negative EPA: The play reduced the team’s scoring chances.
  - Example: A sack or an interception harms the team’s scoring potential.
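The arithmetic behind positive and negative EPA can be sketched with a couple of made-up plays. The expected-points values below are hypothetical numbers chosen only for illustration; they do not come from any EP model or from the data sets on this page, and the names `plays`, `ep_before`, and `ep_after` are likewise invented for the example.

```r
# Hypothetical expected points before and after each play (illustrative only)
plays <- data.frame(
  play      = c("20-yard pass on 3rd and 8", "Sack on 2nd and 5"),
  ep_before = c(1.0, 2.0),
  ep_after  = c(2.5, 0.5)
)

# EPA is expected points after the play minus expected points before it
plays$epa <- plays$ep_after - plays$ep_before

# Label each play by the sign of its EPA (-1, 0, 1 mapped to positions 1, 2, 3)
plays$effect <- c("reduced", "unchanged", "improved")[sign(plays$epa) + 2]

plays
```

Under these assumed values the completed pass is worth +1.5 expected points and the sack is worth -1.5.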
Why EPA Is Important
- Contextual: Accounts for situational factors, making it more informative than raw stats like yards gained.
- Play Evaluation: Helps determine the effectiveness of specific plays or players.
- Strategic Decisions: Assists coaches and analysts in evaluating decisions like when to go for it on 4th down.
Applications
- Offensive EPA: Evaluates how well a team’s offense increases scoring opportunities.
- Defensive EPA: Measures how effectively a defense reduces the opposing team’s scoring potential.
- Player Performance: Used to assess quarterbacks, running backs, wide receivers, and defenders by their contribution to scoring or preventing points.
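As a rough illustration of the offensive and defensive applications, the sketch below picks out teams that look strong on both sides of the ball in the nfl_team_epa sample rows shown earlier. It assumes that def_epa is recorded from the opposing offense's point of view, so a negative value means the defense held opponents below expectation; the text does not state this sign convention, so treat it as an assumption. The data frame name `teams` is made up for the example.

```r
# Ten rows of nfl_team_epa (2014 season), copied by hand from the table above
teams <- data.frame(
  team    = c("ARI", "ATL", "BAL", "BUF", "CAR", "CHI", "CIN", "CLE", "DAL", "DEN"),
  off_epa = c(-0.0245, 0.0069, 0.0669, -0.0743, -0.0067,
              -0.0498, -0.0176, -0.0710, 0.1174, 0.0851),
  def_epa = c(-0.0501, 0.0748, -0.0321, -0.1024, 0.0045,
              0.0789, -0.0107, -0.0350, 0.0030, -0.0569)
)

# ASSUMPTION: negative def_epa = opponents lost expected points on average,
# i.e. a good defense; positive off_epa = a good offense
strong_both <- teams[teams$off_epa > 0 & teams$def_epa < 0, ]

strong_both$team
```

Among these ten rows, only BAL and DEN combine a positive off_epa with a negative def_epa.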
Completion Percentage Over Expected (CPOE)
Completion Percentage Over Expected (CPOE) is a football analytics metric that evaluates a quarterback’s passing performance by comparing their actual completion percentage to the expected completion percentage based on the difficulty of their pass attempts.
How CPOE Works
- Actual Completion Percentage (COMP%):
  - The percentage of passes a quarterback completes.
- Expected Completion Percentage (xCOMP%):
  - Calculated based on factors like:
    - Distance of the throw (air yards)
    - Angle of the throw
    - Receiver separation
    - Defensive pressure
    - Game situation (e.g., down, distance, and field position)
  - Derived from historical data on similar passes.
- CPOE Formula:
  $$\text{CPOE} = \text{COMP\%} - \text{xCOMP\%}$$
Interpretation of CPOE
- Positive CPOE: Indicates the quarterback is completing more passes than expected, showcasing accuracy and skill.
- Negative CPOE: Indicates the quarterback is completing fewer passes than expected, potentially highlighting issues with accuracy or decision-making.
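A small worked example may help make the sign of CPOE concrete. The eight pass attempts below are entirely hypothetical: the completion outcomes and the per-pass completion probabilities (`xcomp`) are made-up numbers standing in for the output of an expected-completion model, not values from nfl_passers.

```r
# Hypothetical pass attempts: complete = 1 if caught, 0 otherwise;
# xcomp = made-up model probability that the pass is completed
passes <- data.frame(
  complete = c(1, 1, 0, 1, 0, 1, 1, 0),
  xcomp    = c(0.75, 0.50, 0.25, 0.75, 0.50, 0.75, 0.50, 0.50)
)

# Actual completion percentage: 5 of 8 passes completed
comp_pct <- 100 * mean(passes$complete)   # 62.5

# Expected completion percentage: average of the per-pass probabilities
xcomp_pct <- 100 * mean(passes$xcomp)     # 56.25

# CPOE is the actual percentage minus the expected percentage
cpoe <- comp_pct - xcomp_pct              # 6.25

cpoe
```

Here the quarterback completed 62.5% of passes against an expected 56.25%, giving a CPOE of +6.25 percentage points, which is on the same scale as the cpoe column in nfl_passers.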
Why CPOE Is Useful
- Isolates Skill: It accounts for the difficulty of throws, focusing on the quarterback’s performance rather than the system or play design.
- Complementary Metric: Often paired with EPA/play to provide a comprehensive evaluation of a quarterback’s impact.
- Game Context: Helps differentiate between quarterbacks who excel in challenging situations versus those whose stats are inflated by easy throws.
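To illustrate pairing CPOE with EPA, the sketch below sorts six quarterback-weeks from the nfl_passers sample rows into four groups by the sign of each metric. The group labels and the zero cut-offs are choices made for this example only, not an established classification.

```r
# Six rows of nfl_passers, copied by hand from the table shown earlier
passers <- data.frame(
  passer = c("A.Luck", "C.Henne", "J.Cutler", "J.Flacco", "A.Rodgers", "A.Smith"),
  epa    = c(0.1148, -0.2598, -0.1325, -0.0749, 0.2271, 0.2550),
  cpoe   = c(1.1793, -6.7311, 2.8643, -11.8895, 2.7334, 0.2252)
)

# Illustrative labels based on whether each metric is above zero
passers$profile <- ifelse(
  passers$epa > 0 & passers$cpoe > 0, "positive EPA, positive CPOE",
  ifelse(passers$epa > 0, "positive EPA, negative CPOE",
  ifelse(passers$cpoe > 0, "negative EPA, positive CPOE",
         "negative EPA, negative CPOE"))
)

passers[, c("passer", "epa", "cpoe", "profile")]
```

In this sample, J.Cutler completed more passes than expected yet still posted a negative mean EPA, which shows why the two metrics are read together rather than in isolation.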