Showing posts with label chi-square. Show all posts

Thursday, November 28, 2019

Penalty Yards Awarded for Defensive Pass Interference in the NFL are Unjustified

I watch way too much college football, but I have a limited interest in the NFL. I will watch NFL games featuring Lamar Jackson, Deshaun Watson, the Jaguars with Fournette and Josh Allen, and the Chiefs' offense, when those games are actually available in my viewing area. Blackouts are one of myriad reasons a person might develop a distaste for the NFL (e.g., stadium costs, handling of violence against women, ticket costs, nonguaranteed contracts, obstructing the publishing of concussion problems, treatment of retired players, and Roger Goodell). In full disclosure, I loathe how defensive pass interference (DPI) penalties are enforced in the NFL, the topic of this post. Regardless, I am specifically curious whether the yardage granted by an enforced DPI call in the NFL is statistically justifiable.

Longtime POTH readers know I am an aspiring defensive back. Hence, I will typically be biased against many DPI calls, but I recognize that DPI does genuinely occur. My concern about DPI enforcement in the NFL is that it is a spot foul. The offensive team is awarded a first down at the yard line where the DPI was committed. It thus assumes that the receiver would have caught the pass were it not for the DPI. A spot foul 10 or 15 yards downfield seems reasonable to me, but 30 or 40 yards seems like far too much field position to simply gift the offense on what may or may not have been a catch had there been no DPI. In other words, it is unfair to award the offense, say, 40 yards because of DPI given that any one of several events could have led to an incompletion without the DPI. My thesis is that NFL DPI penalty yardage becomes increasingly unjustifiable as the spot of the foul gets farther from the line of scrimmage. But I don't know that, which is why I'm exploring the matter.

For reference, pass interference is defined as "any act by a player more than one yard beyond the line of scrimmage that significantly hinders an eligible player’s opportunity to catch the ball."

I needed play-by-play data. It had to include depth of target, the distance from the line of scrimmage to the yard line where the pass is caught or comes closest to the targeted receiver. I found nothing in the open-source arena but ArmchairAnalysis does provide a sample of their thorough NFL charting data, which I used. Specifically, it is a sample of about 4013 plays from two weeks in the 2019 NFL season. Of those, about 2430 are passing plays. I removed 82 throwaways, 157 sacks, and 11 spiked balls, because these events preclude a pass to a receiver.  This left 2180 passing plays eligible for analysis. 

First, I compared the completion % of the 2273 passing plays that were not sacks, which is 65%, to the NFL average % for all other weeks in 2019 (through week 12), which is 63.8%.  This way we can assess if this sample of passes is somehow dissimilar from passes in the remainder of the season. It was not, χ² = 1.49, p = 0.22, 95% CI [0.63, 0.67]. 
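For readers who want to retrace this kind of check, here is a minimal sketch of a one-sample test of a proportion against a reference rate, plus a simple Wald interval. The completion count is an assumption: the post reports only the rounded 65%, so the count is rebuilt from that figure and the statistic will land near, not exactly on, the value reported above. The function names are illustrative, and the original analysis may have used a different procedure (e.g., a continuity correction).

```python
import math


def one_sample_proportion_test(successes, n, p0):
    """Pearson chi-square goodness-of-fit test (1 df) of a proportion against p0."""
    observed = [successes, n - successes]
    expected = [n * p0, n * (1 - p0)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def wald_interval(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


n_passes = 2273                        # passing plays that were not sacks (from the post)
completions = round(0.65 * n_passes)   # ASSUMPTION: rebuilt from the rounded 65%
league_rate = 0.638                    # completion rate for the other weeks (from the post)

chi2, p_value = one_sample_proportion_test(completions, n_passes, league_rate)
ci_low, ci_high = wald_interval(completions, n_passes)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Because the input is a rounded percentage, small differences from the reported χ² are expected; the conclusion (no detectable difference from the rest of the season) does not depend on them.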


Figure 1. Top panel shows distribution of completions (magenta) and incompletions (brown) by depth of target. Bottom panel shows completion % by depth of target (dotted black) and estimated probability of completion by depth of target (green).


Figure 1 shows the raw completion percentage by depth of target (dotted black) and the estimated probability of a completion when accounting for random variance due to defense and targeted receiver (green).1 The farther a targeted receiver is from the line of scrimmage, the less likely the pass is to be completed. Indeed, this provides some support for my thesis that awarding a first down at the spot of the DPI is increasingly unjustifiable when the depth of target is farther and farther from the line of scrimmage. This is because passes targeted farther down the field are simply less likely to be completed.

Figure 2. Expected yards per pass attempt by depth of target (dotted black) and the expected yards when controlling for random variance due to defense and offense units (orange).


Another way to frame the issue is in terms of yards per attempt (Y/A). That is, how many receiving yards are expected on a pass to a given depth of target. Y/A is a widely used measure of passing efficiency. Figure 2 shows the Y/A by depth of target (dotted black) and the expected Y/A when accounting for random variance due to defense and offense (orange).2 This provides additional support for my thesis that awarding a spot foul for DPI is increasingly unwarranted when the DPI is farther from the line of scrimmage. For example, a target 32 yards downfield is expected to gain only 15.9 yards. This might seem odd because 15.9 is less than 32, but Y/A accounts for the probability of the pass being completed. Again, passes targeted farther downfield are less likely to be caught; thus, a spot foul is less justifiable.
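The 32-yard example can be turned into a quick back-of-the-envelope calculation. The sketch below compares the yardage a spot foul would award with the modeled expectation and backs out the completion rate that expectation implies. The second function rests on a loud assumption that is not in the model: that a completed pass gains exactly the depth of target, with no yards after the catch. Function names are illustrative.

```python
def spot_foul_surplus(depth_of_target, expected_yards):
    """Yards awarded by a spot foul beyond what the pass was expected to gain."""
    return depth_of_target - expected_yards


def implied_completion_rate(depth_of_target, expected_yards):
    """Completion rate implied by expected yards.

    ASSUMPTION: a completion gains exactly the depth of target (no yards
    after the catch). Any yards after the catch would push the implied
    completion rate lower, so treat the result as an upper bound.
    """
    return expected_yards / depth_of_target


depth = 32       # depth of target in yards (example from the post)
expected = 15.9  # modeled expected yards at that depth (from the post)

surplus = spot_foul_surplus(depth, expected)
implied = implied_completion_rate(depth, expected)
print(f"Spot foul awards {surplus:.1f} yards more than expected")
print(f"Implied completion rate is at most {implied:.1%}")
```

Under that assumption, a spot foul at 32 yards hands the offense roughly twice the yardage the pass was expected to produce.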

Summarily, this exploratory post yields evidence suggesting that the penalty yards awarded for DPI in the NFL are unwarranted. Although some random variance was accounted for in the models, the major shortcoming is that other factors that affect completion percentage and receiving yards were not accounted for in the analysis. This includes factors such as QB pressure, pass coverage, field position, score differential, and others. Nevertheless, the results demonstrate that, by being a spot foul, the penalty yards awarded to the offense following a DPI (with an uncaught pass) in the NFL are incommensurate with the yardage that would be expected given the depth of target. We here at POTH have no delusions that the enforcement of penalty yardage for DPI will be subject to change. Likewise, we are not anarchists; we respect the game and know that parameters are needed to standardize competition. However, we do feel it necessary to present evidence that directly contradicts any notion of rules designed to ensure a fair game that is decided on the field, by the players.









1
Computed using a GLMM specifying a binomial distribution. Depth of target is a fixed effect, with defense and targeted receiver as random effects. QB and offense were considered as random effects but were essentially null and excluded from the model. The model explained about 12% of the variance in completion %. Depth of target was significant, changing the log odds of completion by -0.059 for each additional yard from the line of scrimmage.
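A log-odds coefficient is easier to read once converted to an odds multiplier. The sketch below does that conversion for the -0.059 slope reported in this footnote. The baseline completion probability used in the last step is hypothetical and chosen only for illustration; the model's intercept is not reported, so no actual fitted probability can be recovered here.

```python
import math

LOG_ODDS_PER_YARD = -0.059  # depth-of-target coefficient from the footnote


def odds_multiplier(extra_yards, slope=LOG_ODDS_PER_YARD):
    """Factor by which the odds of a completion change over extra_yards of depth."""
    return math.exp(slope * extra_yards)


def shift_probability(base_probability, extra_yards, slope=LOG_ODDS_PER_YARD):
    """Apply the odds multiplier to a baseline probability."""
    base_odds = base_probability / (1 - base_probability)
    new_odds = base_odds * odds_multiplier(extra_yards, slope)
    return new_odds / (1 + new_odds)


per_yard = odds_multiplier(1)    # odds change for one additional yard
per_ten = odds_multiplier(10)    # odds change for ten additional yards

# HYPOTHETICAL baseline: a 70% completion chance at some shallower depth.
ten_yards_deeper = shift_probability(0.70, 10)
print(f"Odds multiplier per yard: {per_yard:.3f}")
print(f"Odds multiplier per 10 yards: {per_ten:.3f}")
print(f"70% baseline, 10 yards deeper: {ten_yards_deeper:.1%}")
```

In words, each extra yard of depth trims the odds of a completion by about 6%, and ten extra yards cut the odds nearly in half.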

2
Computed using an LMM. Simple linear regression had R² = 0.108 and a smooth regression line had R² = 0.11, so I used a linear model for simplicity. Depth of target is a fixed effect, with defense and offense as random effects, each with a random intercept for depth of target. The model explained about 18.4% of the variance in Y/A. Depth of target was significant, increasing the Y/A by about 1.03 yards for every three yards of depth of target.


NFL Stats provided by ArmchairAnalysis.com

Wednesday, January 16, 2019

Icing the Kicker in NCAA Football 2005-18

In gridiron football, the icing the kicker phenomenon is thought to occur when the defending team calls a TO just before the ball is snapped on a FG attempt (FGA) that could tie, win, or otherwise sway the outcome of the game in favor of the kicking team. The motivation for calling the TO is that it could somehow disturb, or ‘ice’ the kicker in a way that he will be more likely to miss the FGA. 

Other authors have endeavored to examine icing the kicker. Some have reported that, in the NFL, calling a TO before a FG does not reduce the likelihood of making a FGA, whether controlling for FGA length or not. Other studies suggest there is indeed an effect that reduces the likelihood of making longer NFL FGAs but is absent on shorter FGAs, when controlling for FGA length and other factors. At the collegiate level, it appears that icing the kicker may be effective on longer FGs; specifically, those longer than 45 yards. However, this study had a small sample of iced kicks.

Table 1. Descriptive Statistics for NCAA FGAs 2005-18
(uFG% = FG% on unblocked attempts; the five rightmost columns count attempted field goals that were made, blocked, by the home team, in the last 2 minutes, or ≤15s after a TO)

Quarter     FG%     uFG%    Attempted   Made     Blocked   Home     Last 2min   ≤15s after TO
1st         0.727   0.753   7051        5123     248       3559     1129        468
2nd         0.702   0.732   11473       8049     475       5957     4413        3264
3rd         0.740   0.766   6629        4906     224       3417     1033        425
4th & OT    0.715   0.747   7176        5129     308       3764     1566        1754
TOTAL       0.718   0.747   32329       23207    1255      16697    8141        5911

We here at POTH sought to reexamine icing the kicker at the collegiate level using a much larger data set. This includes 32,329 FGAs from NCAA Division I FBS vs FBS and FBS vs FCS games from 2005 through mid-November 2018. Table 1 has the breakdowns of some data we’ll refer to throughout. The last 2 minutes refers to FGAs during the last two minutes of quarters 1 through 4 and any FGA occurring in OT. 

Let us start with blocked FGAs, though. Notably, as seen in Table 1, blocked FGAs were more likely to occur in the 2nd quarter and 4th quarter and OT (χ² = 12.3, p = 0.007)—the situations in games most relevant to icing the kicker. Longer FGAs were more likely to be blocked regardless of the quarter (p < 0.001). FGAs were also more likely to be blocked in the last 2 minutes of quarters and OT, but especially in the last 2 minutes of the 4th quarter and OT (p = 0.06). For these reasons, we shall include in our analyses only unblocked FGAs. This leaves 31,074 FGAs for analysis.
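The blocked-kick comparison across quarters can be retraced from the Attempted and Blocked columns of Table 1. Below is a minimal sketch of a Pearson chi-square test of independence on the resulting 4 x 2 table (blocked vs. not blocked, by quarter). It uses only the standard library, with a closed-form p-value that is valid for 3 degrees of freedom only. Whether the original analysis was run exactly this way is an assumption, although the statistic it produces agrees with the one reported above.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# (attempted, blocked) per quarter, copied from Table 1.
quarters = {
    "1st": (7051, 248),
    "2nd": (11473, 475),
    "3rd": (6629, 224),
    "4th & OT": (7176, 308),
}

# Each row is [blocked, not blocked].
table = [[blocked, attempted - blocked] for attempted, blocked in quarters.values()]

chi2, dof = chi_square_independence(table)
p_value = chi2_sf_3df(chi2)
print(f"chi-square({dof}) = {chi2:.1f}, p = {p_value:.3f}")
```

The same helper works for any of the count tables in this post, provided the p-value function is swapped for one that matches the degrees of freedom.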

Table 2. Proportional Statistics for NCAA FGAs 2005-18
(all columns after % Blocked give the proportion of field goals made)

Quarter     % Blocked   Home Team   Away Team   Last 2min   Before Last 2min   ≤15s after TO   No TO Before
1st         0.035       0.743       0.710       0.731       0.751              0.726           0.752
2nd         0.041       0.714       0.688       0.677       0.746              0.680           0.742
3rd         0.034       0.755       0.724       0.743       0.765              0.701           0.769
4th & OT    0.043       0.728       0.700       0.676       0.754              0.694           0.749
OVERALL     0.039       0.757       0.736       0.725       0.754              0.721           0.752

About 74.6% of (unblocked) FGAs are made. Figure 1 shows that FG% declines as the length of the FGA increases. There is some variation in FG% between quarters, with 3rd-quarter FGAs being most successful. Only differences between the 3rd and 2nd (p = 0.03) and the 4th and 3rd (p = 0.03) are significant when we account for length of the FGA, which is, by far, the most significant predictor of FG success. Longer FGAs are less successful at all points in the game. 
Figure 1. Likelihood of Making a FGA, by Length (using binomial smooth)
FGAs by the home team (75.6%) are about 2.8% more likely to be made than FGAs by the road team (73.5%) (χ² = 17.9, p < 0.001). When controlling for FGA length and quarter, home FGAs are 6.6% more likely to be made (p < 0.001). However, this advantage of home FG% is relatively constant at all FGA lengths. That is, home-team FG kickers tend to be slightly more successful than road-team kickers on FGAs of any length, and at any point in the game. 

What about the FG% in the last 2 minutes of quarters, when icing the kicker usually occurs? Table 2 shows that it clearly drops in the 4th quarter and OT (and in the 2nd quarter too). This drop in FG% in the last two minutes is, however, diminished when controlling for FGA length, quarter, and home/away (p = 0.52). It should be noted that FGAs in the last 2 minutes of the 2nd and 4th quarters are 1-2 yards longer than FGAs at other times in the game (ps < 0.002).

How do the stakes of the game affect FG%? The opportunity to tie the game seems to have a general effect of increasing the likelihood of making a FG (p = 0.05). Otherwise, though, there is no effect of stakes on FG% when controlling for length, quarter, home/away, and being in the last 2 minutes or not.

FGAs 15 seconds or less after a TO are made 72.1% of the time whereas other FGAs are made 75.2% of the time (χ² = 24, p < 0.001). Now, this is just if any TO is called; that is, by the offense, the defense, or some other TO that was not attributed to either team in the data. Really, we have a variable that indicates whether the TO was called by the offense, the defense, was unattributed, or if no TO was called. If we were to continue the analysis as we have been doing it, we would examine a four-way interaction between quarter, last 2 minutes or not, stakes, and who called the TO before or not. Four-way interactions are messy. And three of those variables have four levels. We should do something else.
Figure 2a. FG% by TO Type
Figure 2b. FGA Length by TO Type
Let us narrow our focus to FGAs in the last 2 minutes of the 4th quarter and OT where the offense can either tie the game or take the lead with a FG, which leaves 2,173 FGAs for analysis. In the two figures we see that iced FGAs (i.e., those after a defense TO) [a] are the least successful, at ~70%, but [b] are also, on average, the longest FGAs in this game situation. Thus, when we model the likelihood of making a FG while accounting for length and home/away, there is no effect of icing the kicker (p = 0.24). Likewise, icing the kicker has no statistically detectable effect of decreasing the likelihood of making longer FGAs (p = 0.25). However, the estimated marginal probabilities in the figure below suggest that the likelihood of making iced FGAs declines slightly more at longer distances, although, again, this is not statistically significant.
Figure 3. Estimated probabilities of FG% by length and who called TO in last 2 minutes of 4th & OT

Whereas we have used the raw yardage value for FGA length, the one previous study of icing the kicker in NCAA football split length into ‘bins’: distances of 18-25 yards, 26-35 yards, 36-45 yards, and >45 yards. The author of the previous study used only data from 2017-18 and found that of 38 iced FGAs in the last 2 minutes of the 4th quarter and OT, only 26% were made. If I examine only data from 2017-18, I find these same numbers (38 iced FGAs, 10 made, 28 missed). Below, using all data, I show the FG% for each of these length-bins by who called the TO, for the sake of comparison across studies. The quantities of FGAs are shown parenthetically. Longer iced FGAs appear to be made at lower rates.
Figure 4. Proportion of FGAs Made by TO Type by Yardage Bins used in Dalen (2018)
Summarily, the present report examined icing the kicker in NCAA football. This study used a sizable data set, which should enhance the generalizability of the findings. However, the primary analysis indicated there was no effect of icing the kicker. Additional examination suggested that there might be an effect of icing the kicker on FGAs longer than 45 yards, but such a conclusion is limited by there being fewer FGAs attempted from these lengths (i.e., a smaller sample) and by the variability of success at increasing lengths. Likewise, other potentially influential factors such as meteorological conditions, team FG kicking/defending quality, and on-field activity were not accounted for in the analysis. NCAA football coaches should continue utilizing icing the kicker so they may endure the rancor of punditry, boosters, delusional fans, etc., when their teams lose games on last-second field goals.

Sunday, March 19, 2017

Comparison of WNBA and NBA 2-Point Field Goals

WNBA and NBA Association-wide, season-level data were compared in a previous post. For the season of each that was analyzed, disparities in FG% were evident. Initially, the NBA appeared to be more successful shooting, 44.9% to 42.5%. However, when excluding dunks, the WNBA was slightly but significantly more successful, 42.5% to ~40%. Having recently acquired WNBA play-by-play (PBP) data for the 2016 season, we can analyze FG% more granularly.

All WNBA data were extracted from the regular season 2016 PBP which is available upon request. I will dump this data along with the regular season PBP data from the 2014 and 2015 seasons in a future post. The NBA shot type and assisted shot type data for the 2015-16 season was culled from Basketball Reference.


Table 1a. WNBA & NBA 2016 counts

Type      Total       NBA      WNBA
All 2FG   Made        75549    9815
          Attempted   153768   20676
Shots     Made        35680    4769
          Attempted   89487    12318
Layups    Made        30439    5045
          Attempted   53920    8356
Dunks     Made        9430     1
          Attempted   10361    2

Table 1a contains counts, and Table 1b proportions, of 2FGAs segmented by shots, dunks, and layups between the Associations for the regular seasons ending in 2016. The class of ‘shots’ includes not only jump shots but also others identified in the PBP as floaters, hooks, and runners. Table 1b also contains Chi-square test statistics demonstrating that the NBA is slightly more successful shooting 2-point shots. The WNBA is significantly more successful with layups. The Chi-square was not performed for dunks because only 2 were attempted by WNBA players.

Table 1b. 2pt-FG% by Shot Types

Type     NBA     WNBA    χ²       p
All      0.491   0.475   6.961    0.008
Shots    0.399   0.387   2.623    0.105
Layups   0.565   0.604   12.230   <0.001
Dunks    0.910   0.500

So, these findings provide a more nuanced perspective on the finding from the previous post that the WNBA is more successful shooting non-dunk shots. The two posts did employ data from different seasons. Nonetheless, given the present data, the two Associations ostensibly shoot 2-point shots (i.e., excluding dunks and layups) with similar success; the p-value approaching conventional significance levels is likely a result of the large quantities. That is, a 1 percentage-point advantage to the NBA may approach statistical significance, but I suspect it is ecologically meaningless.
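The proportions in Table 1b follow directly from the counts in Table 1a. As a sanity check, here is a small sketch that recomputes each FG% and the NBA-minus-WNBA gap in percentage points, and confirms that the three shot categories add up to the All 2FG totals. The data layout and variable names are illustrative choices.

```python
# (made, attempted) pairs copied from Table 1a.
COUNTS = {
    "Shots":  {"NBA": (35680, 89487), "WNBA": (4769, 12318)},
    "Layups": {"NBA": (30439, 53920), "WNBA": (5045, 8356)},
    "Dunks":  {"NBA": (9430, 10361),  "WNBA": (1, 2)},
}
ALL_2FG = {"NBA": (75549, 153768), "WNBA": (9815, 20676)}


def fg_pct(made, attempted):
    """Field-goal proportion: makes divided by attempts."""
    return made / attempted


# Consistency check: the category counts should sum to the All 2FG row.
for league, (made_total, attempted_total) in ALL_2FG.items():
    assert sum(COUNTS[kind][league][0] for kind in COUNTS) == made_total
    assert sum(COUNTS[kind][league][1] for kind in COUNTS) == attempted_total

proportions = {
    kind: {league: fg_pct(*pair) for league, pair in leagues.items()}
    for kind, leagues in COUNTS.items()
}
proportions["All"] = {league: fg_pct(*pair) for league, pair in ALL_2FG.items()}

# NBA minus WNBA, in percentage points.
gaps = {kind: 100 * (p["NBA"] - p["WNBA"]) for kind, p in proportions.items()}

for kind in ("All", "Shots", "Layups", "Dunks"):
    p = proportions[kind]
    print(f"{kind:7s} NBA {p['NBA']:.3f}  WNBA {p['WNBA']:.3f}  gap {gaps[kind]:+.1f} pts")
```

On these counts the gap on shots is a little over one percentage point in the NBA's favor, while the gap on layups runs close to four points the other way.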

Alternatively, the WNBA was significantly more successful than the NBA shooting layups. My initial notional hypothesis was that because males are predisposed to heightened athleticism, there are more contested or blocked layup attempts in the NBA. Ironically (given the impetus for pursuing this line of research), although blocked layup attempts can be extracted from the WNBA PBP, I am unable to locate blocked layup attempt data for the NBA (short of scraping). Likewise, contested shots attempted from within 5 ft of the basket are available for the NBA, but shot contesting is not recorded in the WNBA PBP.

Table 2. Assists on Dunks + Layups

Dunks + Layups        NBA     WNBA    χ²      p
Made & Assisted       15536   2749    6.596   0.010
Attempted             64281   8358
Proportion Assisted   0.242   0.329

Table 2 contains the proportion of dunks and layups, combined, that were assisted. The WNBA assists on significantly more of their successful layups (and dunks) than does the NBA. So, this might also explain why WNBA players are more successful executing layups. Because the WNBA assists on a greater proportion of layups, there may be more floor spacing or player movement such that defenders are less frequently positioned to defend layup attempts. Indeed, this notion is interrelated to there being greater athleticism in the NBA, as well as greater size, and thus less floor space in the NBA. Lastly, WNBA players may execute successful layups in certain scenarios whereas NBA players would likely execute successful dunks such as on uncontested fast breaks.

If athleticism were the sole determinant in explaining WNBA layup FG% superiority, there would be little recourse for defensive strategists other than playing larger or quicker players. However, if it is the result of floor spacing or player movement, I suspect WNBA coaches transiently employ zone defenses to narrow passing and driving lanes to reduce opponents’ layup success.

Summarily, this report indicates that WNBA and NBA players shoot with similar accuracy on 2-point FGs that are not dunks or layups. Also, the WNBA is more successful on layups than the NBA. This author posited three reasons why this may be: (a) the greater athleticism and size of NBA players results in more contested layup attempts and passes; (b) relatedly, the WNBA assists on a higher proportion of its layups, which may be the result of more floor space or player movement, but is also related to point (a); and (c) NBA players may execute dunks in many scenarios where most WNBA players would have to execute layups, also related to point (a).

Monday, May 30, 2016

More Kick and Punt Return TDs are Surrendered Early in the NFL Season


I was recently reading about football as I often do. For the uninformed reader, my preference is for the collegiate game but I do possess a certain fondness for the professional game. It fascinates me, the extent to which many professionals prepare for and intellectually categorize the intricacies of such a violent game. The machinations of pro football that occur away from the gridiron, on days other than Sunday and during the offseason are intriguing as well. This, notwithstanding the machinations of pro pigskin that I detest.

Anyhow, I was reading Take Your Eye Off the Ball, authored by Pat Kirwan. I enjoyed his anatomization of team management activities in the League. In brief, it is a quality read for the novice football fan interested in learning about Xs and Os, team management operations, or both.

In his discussion of special teams play, Kirwan asserts that we “often see a flurry of punts and kickoffs returned for touchdowns [during] the first three weeks of the season.” He attributes this effect to poor management of practice time, roster turnover and the limitations of a 53-man roster, and coverage teams often being comprised of inexperienced players.

I was interested in testing Kirwan’s assertion. Data were culled from the Football Reference Play Finder for seasons 2009-15. I divided the 16-game NFL regular season into quadrants (team-games 1-4, 5-8, and so on) and computed six variables for each quadrant:
  • punt return (PR) and kickoff return (KR) TDs; 
  • PRs and KRs that were not TDs; 
  • and punts and kickoffs with no return (NR).

Table 1. Observed and Expected Values for NFL Kickoff Outcomes 2009-15

                     OBSERVED                EXPECTED
GAMES    Kickoffs    KRTD   KR     NR        KRTD   KR       NR
1-4      4406        18     2171   2217      16.8   2489.4   1899.8
5-8      4467        21     2480   1966      17.0   2523.9   1926.1
9-12     4382        14     2620   1748      16.7   2475.9   1889.5
13-16    4343        14     2672   1657      16.5   2453.8   1872.6
TOTAL    17598       67     9943   7588

Data for KRs and PRs appear in Tables 1 and 2, respectively. The values included are the quantities of TDs, returns, and NRs that were observed and those that would be expected given the proportions that emerged. This arrangement suits the Chi-square test of independence used in the analysis. Though excluded for visibility, do note that the expected-column totals equal the observed-column totals.

Table 2. Observed and Expected Values for NFL Punt Outcomes 2009-15

                  OBSERVED                EXPECTED
GAMES    Punts    PRTD   PR     NR        PRTD   PR       NR
1-4      4137     30     1833   2274      24.3   1807.9   2304.8
5-8      4193     19     1874   2300      24.6   1832.4   2336.0
9-12     4295     27     1892   2376      25.2   1876.9   2392.8
13-16    4233     23     1768   2442      24.9   1849.8   2358.3
TOTAL    16858    99     7367   9392

KR TDs are observed more than would be expected during the 1-4 and 5-8 team-games of the season and less than would be expected in the latter quadrants, χ²(6, N = 17,598) = 160.34, p < .001, φC = .067. Given the effect size (φC), we conclude that there is a small but significant disparity in the distribution of KR TDs throughout the season. Although we observed differences in PR TD distributions, the distributional disparities through the season appear minimal, χ²(6, N = 16,858) = 11.98, p = .062, φC = .019. However, it is worth noting that PR TDs during the first four games of the season are probably occurring more than would be expected, supporting Kirwan’s hypothesis.
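The test statistics above can be retraced from the observed counts in Tables 1 and 2. Below is a minimal standard-library sketch of the Chi-square test of independence with Cramér's V as the effect size (φC). It also returns the contribution of each outcome column (TD, return, no return) to the statistic, so a reader can see where the departures from expectation are concentrated. The closed-form p-value is valid only for 6 degrees of freedom, and the function names are illustrative rather than taken from the original analysis.

```python
import math


def chi_square_independence(observed):
    """Return chi-square, df, Cramér's V, and each column's share of the statistic."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    col_contrib = [0.0] * len(col_totals)
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            col_contrib[j] += (obs - expected) ** 2 / expected
    chi2 = sum(col_contrib)
    n_rows, n_cols = len(observed), len(observed[0])
    dof = (n_rows - 1) * (n_cols - 1)
    cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, cramers_v, col_contrib


def chi2_sf_6df(x):
    """Upper-tail probability of a chi-square variable with exactly 6 df."""
    half = x / 2
    return math.exp(-half) * (1 + half + half ** 2 / 2)


# Rows are team-games 1-4, 5-8, 9-12, 13-16; columns are TD, return, no return.
kickoffs = [[18, 2171, 2217], [21, 2480, 1966], [14, 2620, 1748], [14, 2672, 1657]]
punts = [[30, 1833, 2274], [19, 1874, 2300], [27, 1892, 2376], [23, 1768, 2442]]

kick_chi2, kick_dof, kick_v, kick_cols = chi_square_independence(kickoffs)
punt_chi2, punt_dof, punt_v, punt_cols = chi_square_independence(punts)

for label, chi2, dof, v, cols in (
    ("Kickoffs", kick_chi2, kick_dof, kick_v, kick_cols),
    ("Punts", punt_chi2, punt_dof, punt_v, punt_cols),
):
    by_column = ", ".join(f"{c:.2f}" for c in cols)
    print(f"{label}: chi-square({dof}) = {chi2:.2f}, p = {chi2_sf_6df(chi2):.3g}, "
          f"V = {v:.3f}, contributions by column (TD, return, NR) = {by_column}")
```

Because the test is run on the full 4 x 3 table, the statistic reflects shifts in every outcome column at once, so the per-column breakdown is worth reading alongside the headline value.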

Chi-square tests indicate that Kirwan is accurate in his assertions. NFL teams surrender more KR and PR TDs early in the season. However, this analysis merely compares the distribution of observed PR/KR TDs with what would be expected given those observations. That is, we are unable to test the influence of poor management of practice time, roster turnover, inexperience, etc. The present analysis does provide a foundation for future studies of the foregoing variables.