Pads of the Hands: field position

Sunday, January 21, 2018

Examination of Success Thresholds in College Football

Anyone familiarizing themselves with gridiron football analytics will quickly become acquainted with success rate. Success is widely defined as an offense gaining 40-50% of the yards to go on 1st down, 60-70% on 2nd down, and 100% on 3rd and 4th downs; preventing gains of those percentages defines success for defenses. Counting all the successful plays in a drive, game, or season and dividing by the total number of plays for that period yields the success rate. Personally, I am more interested in whether an activity was productive, unproductive, or counterproductive, but I’ll save that for another post. However, curious as to how the thresholds for success may have been established and whether there are nuances to current definitions, I examine them here.
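
As a minimal sketch of the definition above, the snippet below scores plays against per-down thresholds and computes a success rate. The function names and the four-play drive are hypothetical, and the thresholds are the lower bounds quoted above (40%, 60%, 100%); other sources use different cutoffs.

```python
# Share of yards-to-go an offense must gain for a play to count as a success,
# keyed by down. These are the lower bounds quoted in the text (an assumption;
# swap in other values as needed).
THRESHOLDS = {1: 0.40, 2: 0.60, 3: 1.00, 4: 1.00}


def is_successful(down, yards_to_go, yards_gained, thresholds=THRESHOLDS):
    """Return True if the play gained the required share of yards to go."""
    return yards_gained >= thresholds[down] * yards_to_go


def success_rate(plays, thresholds=THRESHOLDS):
    """plays: iterable of (down, yards_to_go, yards_gained) tuples."""
    plays = list(plays)
    if not plays:
        return 0.0
    successes = sum(
        is_successful(down, to_go, gained, thresholds)
        for down, to_go, gained in plays
    )
    return successes / len(plays)


# Hypothetical four-play sequence, for illustration only.
drive = [
    (1, 10, 4),   # gained 40% of 10: success
    (2, 6, 3),    # gained 50% of 6, needs 60%: failure
    (3, 3, 3),    # converted: success
    (1, 10, -2),  # lost yardage: failure
]
print(success_rate(drive))  # 0.5
```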

I, like others before me, suspect that success rate is derived from traditional football notions of ‘staying ahead of the chains’ or ‘setting up for third and short’. Popularized by Football Outsiders, success by our definition (like many gridiron analytic concepts) can be traced at least to the mid-1980s, when it was outlined on p. 69 of The Hidden Game of Football. The authors used 40%, 60%, and 100% to benchmark ‘wins’ and ‘failures’, as well as a derived qualitative measure of success that awards more credit (i.e., >1 point) for big plays and penalizes turnovers and lost yardage (i.e., negative points).

How do we determine what is a successful play? The question is not how success is defined by X yards gained or lost on a given play, per se, but how X yards on a given play portends future success when aggregated over many, many plays in a similar context. Given this notion, let us define success as a play occurring on a drive that ends in a score, either a TD or FG.

Let us define success another way, too: a first down occurring on or after a given play on a drive (or series, in this case, really). For example, take a 2nd and 8; if a first down is obtained on that play or a subsequent play in the drive, that 2nd and 8 would be considered as having occurred on a successful drive (or series). Alternatively, imagine a 1st and 10 which is, say, the fifth play of a drive and occurs after at least one first down has been obtained on the drive; if no first down is obtained on that 1st and 10 or any play later in the drive, that 1st and 10 would be considered as having occurred on an unsuccessful drive (or series).
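
The second definition amounts to a backward scan over each drive. The sketch below is one way to label plays under that definition; the function name and the example drive are hypothetical and not taken from the data set.

```python
def label_future_first_down(gained_first_down):
    """Label each play of one drive.

    gained_first_down: list of booleans in play order, True when that play
    earned a first down.
    Returns a list of booleans, True when a first down occurs on that play
    or on any later play in the same drive.
    """
    labels = []
    seen = False
    # Walk the drive from the last play to the first, remembering whether
    # a first down has been seen at or after the current play.
    for flag in reversed(gained_first_down):
        seen = seen or flag
        labels.append(seen)
    labels.reverse()
    return labels


# Hypothetical drive: a first down on the third play, then three plays
# without another one before the drive ends.
drive = [False, False, True, False, False, False]
print(label_future_first_down(drive))
# [True, True, True, False, False, False]
```

The last three plays are labeled unsuccessful even though the drive earned a first down earlier, which matches the 1st-and-10 example above.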

For data, I have all pass and rush plays from games played by Division 1 college football teams from 2005-13: 995,895 plays in all. For each play, I included an indicator of whether the play occurred on a scoring drive and whether there was another first down on that drive. As these are binary variables, indicating yes or no, logistic regression is suitable. As the predictor variable we will use yards gained on a play divided by the yards to go on the play. This way we can say that gaining X% of the yards to go on a given down is the threshold of success. Since we’re using college data, we’ll compare against the thresholds used by Football Study Hall: 50%, 70%, and 100% on 1st, 2nd, and 3rd and 4th downs, respectively.

Using logistic regression and ROC curves, we identify, for each down, the threshold for the proportion of yards gained that correctly predicts the most plays on successful drives while minimizing the number of plays on unsuccessful drives wrongly predicted as successful (in our data set). This becomes our threshold of success. Figure 1 shows the success thresholds from these analyses for scoring drives in purple, drives with another first down in green, and the commonly applied success thresholds in orange.
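
The post does not say which ROC criterion picked the cutoff, so the sketch below assumes Youden's J (true-positive rate minus false-positive rate), a common choice. With a single predictor and a positive slope, the fitted logistic probability is a monotone transform of the yards-gained ratio, so the search can run directly on the ratio. The function name and the toy plays are hypothetical.

```python
def youden_threshold(ratios, outcomes):
    """Return (threshold, J) maximizing TPR - FPR for the rule ratio >= threshold.

    ratios:   yards gained / yards to go for each play
    outcomes: 1 if the play occurred on a successful drive, else 0
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both successful and unsuccessful plays")

    best_threshold, best_j = None, float("-inf")
    # Every observed ratio is a candidate cutoff on the ROC curve.
    for candidate in sorted(set(ratios)):
        true_pos = sum(1 for r, y in zip(ratios, outcomes) if r >= candidate and y == 1)
        false_pos = sum(1 for r, y in zip(ratios, outcomes) if r >= candidate and y == 0)
        j = true_pos / positives - false_pos / negatives
        if j > best_j:
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Made-up first-down plays, for illustration only.
ratios = [-0.2, 0.0, 0.1, 0.3, 0.4, 0.5, 0.7, 1.0, 1.5]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1]
threshold, j = youden_threshold(ratios, outcomes)
print(threshold)  # 0.5
```

This brute-force scan is quadratic in the number of distinct ratios, which is fine for a sketch; on roughly a million plays a sorted cumulative count would be the practical route.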

Figure 1. Thresholds for Success

That the thresholds for 3rd and 4th down are essentially identical for scoring and first downs is unsurprising because scoring requires gaining at least the yards to go. The disparity in thresholds on second down is also intuitive: it suggests that gaining a greater portion of the yards to go on 2nd down portends a more successful drive. The lower threshold for scoring drives on 1st down is interesting, however. It may be that obtaining 40% of the yards to go on first down typically sets up a 2nd and 6, with the offense in neither a definitive rush nor a definitive pass situation. This, in turn, could conceivably lead to future success and explain the disparity here compared to the commonly used threshold.

I was also curious how field position affects success. Let us focus only on first downs, for convenience. I computed whether each play was a success based on the threshold for scoring drives described above; we’ll call this the fixed threshold. A mixture model was used to segment the field into 8 segments. Several logistic regression models were blended to generate thresholds for each segment, which we’ll call blended thresholds.ⁱ This is shown in Figure 2. Yard line 1-9 is closest to the defense’s end zone. The bottom row of panels shows successful plays based on the fixed threshold, and the top row shows those based on the blended threshold. On the X-axis are Yes or No to indicate whether a play actually occurred on a scoring drive or not. Green indicates a play was predicted to occur on a non-scoring drive and orange indicates a play was predicted to occur on a scoring drive. We can see the fixed threshold emerged because it accurately predicts so many plays on unsuccessful drives in opponents’ territory.

Figure 2. Comparing Fixed and Blended Success Thresholds by Field Position on First Down

In summary, this report showed that, at least in college football, success thresholds are relatively constant whether success is defined as a drive ending in a score or as a first down occurring on or after a given play. Secondarily, the report provides evidence that statistically derived success thresholds vary by field position, at least on first down. Thus, future work should examine how adjusting thresholds by field position affects the valuation of player and team performance when using success rates.

ⁱTo do this, I averaged the thresholds from three logistic regressions. For each field position segment I obtained thresholds from three logistic regressions on the following subsets of the data: [a] plays in that field position segment, [b] all plays in that segment plus all plays from field positions closer to the defense's end zone, and [c] all plays in that segment plus all plays from field positions farther from the defense's end zone.
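
The blending in the footnote can be sketched as follows. This is an illustration under assumptions, not the original analysis: `fit_threshold` is a hypothetical callable standing in for "fit a logistic regression and read a cutoff off its ROC curve", and segments are assumed to be ordered from closest to the defense's end zone to farthest.

```python
def blended_thresholds(plays_by_segment, fit_threshold):
    """Average three thresholds per field position segment.

    plays_by_segment: list of lists of plays, ordered from the segment
        closest to the defense's end zone to the farthest (an assumption).
    fit_threshold: callable taking a list of plays and returning a threshold
        (hypothetical stand-in for the logistic regression + ROC step).
    """
    blended = []
    for i, segment in enumerate(plays_by_segment):
        closer = [p for seg in plays_by_segment[:i] for p in seg]
        farther = [p for seg in plays_by_segment[i + 1:] for p in seg]
        subsets = [
            segment,            # [a] this segment only
            segment + closer,   # [b] this segment plus everything closer
            segment + farther,  # [c] this segment plus everything farther
        ]
        blended.append(sum(fit_threshold(s) for s in subsets) / len(subsets))
    return blended


# Toy check with a plain average standing in for the model fit.
def mean(values):
    return sum(values) / len(values)


print(blended_thresholds([[1.0], [2.0], [3.0]], mean))
```

For the segments at either end of the field, one of subsets [b] or [c] is identical to [a], so the end segments lean more heavily on their own plays.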

Sunday, July 17, 2016

Field Position Part II

In a previous post I discussed how INTs and INT return yardage influenced starting field position (SFP). I will extend that discussion to include each of the other events that directly result in SFP: turnover-fumble returns, kick and punt returns, and missed field goals by opponents. As an aspiring defensive back, I of course took great care discussing interceptions. I will devote little discussion here to fumble recoveries and missed field goals. I will harp on kick-off returns but refrain from discussing punt returns at any depth.

Let me first state that my play-by-play (PBP) data differs slightly from the official record. I excluded yardage gained on returns for TDs in the analysis because a TD precludes SFP. Excluded also was return yardage gained prior to a turnover-fumble.

Concerning INTs, I emphasized that ending opponents’ possessions is most salient and that INT return yardage is a somewhat superfluous stat. INT return yards may be useful to compare playmaking abilities between DBs, although statisticians, teams, and observers might be better served knowing the SFP that resulted from an interception. This notion is definitely applicable for fumble returns where, again, the ending of opponents’ possession is most salient.

Likewise, it is also relevant for rating punt returners. For instance, a player fair catching a punt at his own 9-yard line would be recorded as a fairly unremarkable zero yards (i.e., it is counted in his average PRY). However, the fair catch was probably initiated in the presence of proximal defenders who could have disrupted the impetus of the punted ball at say, the 2-yard line had the returner declined to fair catch. Thus, by fair catching—despite accruing zero yards—the returner in the example would improve his team’s SFP by 7 yards (of course, the defense downing the ball is hypothetical).

The foregoing notion of field position in lieu of yardage is applicable to kick returns as well. For example, let us review the 2014 NFL-leading kick returners by average yards per return. I have Bruce Ellington of the 49ers at 24 returns for 25.9 yards per return; he ranks about ninth in KR yards. However, Ellington gives his offensive teammates an average starting FP at the ~23-yard line, 18th on my list of qualifying players. It may be poor decision making on his part, poor block execution on the part of his teammates, or that he generally fields kickoffs from superior kickers, but we must acknowledge that Ellington’s average catch-spot (CS) on KRs was nearly 3 yards into the endzone, ranking third-deepest on my list of qualifying players.¹

Although this post is about SFP, the above anecdotes underscore the entanglement of variables involved in appraising performances with yardage accrued. However, Ellington still gained those yards. If we are comparing players (or even coverage units), perhaps Ellington does rank ninth in KR yards. However, football is about team success and, on a given drive, a team is increasingly inclined to succeed the closer it begins to its opponent’s endzone. Then again, Ellington’s team did start 3 yards closer to the endzone than it would have had he taken more touchbacks.

Moving on, for all teams in the 2014-15 NFL season, I obtained all non-TD turnover-fumble returns, interceptions, kick and punt returns, and field goals missed by opponents using the Pro-Football Reference PBP search tool. Opponents’ missed FGs include blocks but exclude blocks returned for TDs. For all plays except opponents’ missed field goals, I extracted [a] the spot of the INT, fumble recovery, or catch and [b] the spot at which the player was downed following the return. Computed with those values were [c] return yards or 20 for a touchback and [d] the SFP of the player’s offensive teammates. SFP was scaled such that teams’ own goal lines equaled zero and opponents’ goal lines equaled 100; greater values indicate better SFP.
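
A minimal sketch of steps [a] through [d], assuming spots are already on the 0-to-100 scale described above and that a catch in one's own end zone is recorded as a negative spot (that convention is my assumption for illustration). The function name and the sample return are hypothetical; the return is loosely rounded from the Ellington averages.

```python
TOUCHBACK_VALUE = 20  # touchbacks are credited 20 yards, per the text above


def return_summary(event_spot, downed_spot, touchback=False):
    """Compute return yards and starting field position (SFP) for one event.

    event_spot:  where the catch, INT, or recovery occurred (own goal line = 0,
                 opponent's goal line = 100; negative = in own end zone)
    downed_spot: where the returner was downed, on the same scale
    """
    if touchback:
        return {"return_yards": TOUCHBACK_VALUE, "sfp": TOUCHBACK_VALUE}
    return {"return_yards": downed_spot - event_spot, "sfp": downed_spot}


# Hypothetical kickoff fielded 3 yards deep and returned to the 23.
print(return_summary(-3, 23))                # {'return_yards': 26, 'sfp': 23}
print(return_summary(-3, 0, touchback=True))  # {'return_yards': 20, 'sfp': 20}
```

The example shows how a 26-yard return can still leave the offense only 3 yards better off than a touchback.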

Table 1. Counts, Average SFP, and Average Return Yards for Events Resulting in SFP, NFL 2014-15

TEAM	TOTAL EVENT COUNTS						AVERAGE STARTING FIELD POSITION BY EVENT							AVERAGE RETURN YARDS BY EVENT
	KR	PR	FR	INT	oMFG		SFP	KR	PR	FR	INT	oMFG		KR	PR	FR	INT
KAN	68	76	5	5	5		29.3	25.4	28.7	22.4	44.0	23.6		25.4	8.6	0.0	15.2
CIN	78	74	5	19	5		30.3	24.7	30.3	27.4	50.8	26.0		24.9	8.4	0.0	9.2
NWE	68	64	7	16	5		30.6	22.7	32.0	36.9	51.8	29.6		22.1	7.5	0.4	11.7
DAL	75	66	12	16	2		28.9	21.1	26.9	35.5	43.3	21.0		22.3	7.3	2.6	9.3
TAM	81	63	11	11	7		26.6	20.7	25.3	30.1	48.8	32.6		21.6	7.2	1.3	6.9
IND	79	88	13	11	4		28.7	22.5	28.4	30.3	43.2	24.0		24.3	7.0	2.6	8.7
BAL	68	73	12	10	6		28.7	23.3	30.4	32.9	48.2	27.8		22.9	6.9	4.3	9.1
JAX	92	74	12	5	4		25.7	22.1	21.5	24.3	51.0	31.8		21.9	6.4	0.8	13.0
PHI	83	85	16	9	5		30.0	22.9	28.8	33.8	41.6	23.4		20.9	6.2	6.3	6.2
MIN	73	74	4	11	6		27.3	25.0	25.5	15.3	49.0	30.3		21.9	6.2	0.0	8.2
STL	79	74	11	10	1		28.0	22.1	28.8	32.4	42.1	35.0		22.9	5.9	3.3	10.8
BUF	71	86	8	18	7		30.2	21.3	27.0	25.5	60.4	26.9		20.6	5.9	4.4	19.2
OAK	94	81	4	9	5		24.2	19.9	24.7	26.0	61.9	24.2		21.4	5.7	5.3	8.4
CHI	97	49	8	13	6		25.9	21.1	27.0	15.9	43.6	23.8		20.5	5.6	2.4	10.9
SDG	81	66	8	6	2		26.4	21.3	26.5	24.1	31.7	33.5		21.1	5.5	0.0	12.0
SFO	70	74	5	21	0		27.8	22.6	28.3	16.2	48.0	-		22.9	5.5	0.0	18.8
ATL	89	55	7	15	4		26.5	22.4	25.7	32.3	40.5	27.5		22.5	5.4	5.6	6.9
MIA	82	57	10	11	6		31.1	24.2	25.5	21.3	50.6	23.5		23.9	5.3	0.0	17.1
DEN	75	84	5	16	5		28.9	22.6	28.2	26.0	54.4	25.2		21.4	5.3	0.4	10.8
TEN	89	72	6	11	5		25.8	23.2	24.8	38.8	52.7	31.0		22.5	5.2	7.2	10.8
ARI	76	77	5	15	4		26.9	19.6	24.8	20.8	51.3	32.5		20.2	5.1	1.8	10.3
NYJ	85	79	7	6	5		27.8	22.5	26.6	31.4	35.0	24.0		22.1	5.1	0.3	9.0
PIT	86	66	10	7	2		25.7	20.7	24.9	32.5	48.9	29.5		21.1	5.0	3.9	18.1
NYG	87	74	9	16	2		28.2	20.7	23.7	31.6	62.1	24.0		21.1	4.9	1.5	16.6
GNB	79	60	7	15	1		28.5	20.1	27.0	37.4	54.7	29.0		20.3	4.8	0.0	15.2
CAR	83	69	13	10	4		27.7	21.8	25.5	32.5	45.2	25.8		21.0	4.5	2.8	19.0
SEA	62	81	9	11	2		30.5	22.4	27.7	29.0	58.2	32.5		21.4	4.2	0.0	14.2
CLE	72	83	7	18	3		26.8	22.6	24.9	27.7	54.1	34.0		22.8	4.2	4.9	14.4
WAS	85	80	9	6	3		25.1	21.4	22.7	29.0	45.5	18.0		20.8	4.0	2.1	5.0
HOU	71	82	10	16	2		27.7	20.4	23.5	31.9	60.6	26.5		20.7	3.8	8.7	16.6
DET	70	81	7	18	4		29.9	21.1	29.4	32.9	59.2	27.3		21.8	3.8	1.8	18.7
NOR	86	62	6	12	0		25.5	22.2	22.0	18.5	42.9	-		22.3	3.0	0.0	12.5

League	Event Counts						Average Field Position by Event							Average Return Yards by Event
AVG	79	73	8	12	4	AVG	27.9	22.0	26.5	29.1	50.5	27.2	AVG	22.0	5.6	2.3	12.3
SD	9	10	3	4	2	SD	1.8	1.4	2.5	6.4	7.6	4.2	SD	1.3	1.3	2.4	4.2

Table 1 contains 2014-15 event counts, NFL team average SFP and return yards for each event, and League averages thereof. KRY and PRY are computed with touchbacks equal to 20 yards and no return equal to zero yards. Neither New Orleans’ nor San Francisco’s opponents missed FGs, apparently. There is nothing particularly noteworthy in the table, otherwise.

I also can tell you several things. INTs have the largest impact on the next-SFP when statistically controlling for the initial play spot, the spot at which an INT, fumble recovery, or kick/punt catch occurred, and the yardage gained on the return.² I can also tell you that for all NFL teams, the majority of SFP yardage is derived from either KR yards or PR yards. Table 2 provides some insight into why this is.

**Table 2.** Characteristics of NFL Teams Based on Majority of SFP
	Majority of Team SFP From
VARIABLE	KR	PR
Teams Count	11	21
avg SFP	27	28
avg SFP Unproductive Drives	24	24
avg KR-SFP	22	22
avg Unproductive Drive Yards	17	16
avg Punt Yards	45	45
Opp avg Punt Return Yards	9	9
avg Def. SFP After Unproductive Drive	24	23
Opp avg Unproductive Drive Yards	16	16
Opp avg Punt Yards	45	45
avg Punt Return Yards	5	6
% All Drives Turnovers	14%	11%
Opp % All Drives Turnovers	12%	12%
% All Drives End w/ Score	32%	35%
Opp % All Drives End w/ Score	39%	32%
win%	35%	58%

NOTE: Unproductive drives are defined as those that end without a score.
Scoring drives are those that ended in TDs or FGs.

In Table 2 we see that the two types of teams perform similarly in most situations. Notably, teams whose majority of SFP is derived from KRs commit TOs slightly more frequently. As an aside, this might suggest that while essentially random, a modicum of TOs may be attributable to offensive ineptitude (albeit in a single-season sample). Those teams’ opponents also end drives by scoring considerably more frequently (23% more) than opponents of teams whose majority of SFP is derived from PRs. The PR-teams score slightly more frequently.

Most striking in Table 2, though, is the disparity in win percentage. The KR-teams can be expected to win 5.6 of 16 games whereas PR-teams can be expected to win 9.3. Thus, I conclude that, despite the indelible impact of Devin Hester or the ’84 Seahawks’ 3-4 monster, ultimately, SFP is largely the result of an ungenerous defense supplemented by relatively consistent and careful offensive play.
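
The expected-win figures follow from multiplying each group's win percentage in Table 2 by the 16-game regular season. A quick check (the function name is just for illustration):

```python
GAMES_PER_SEASON = 16  # NFL regular-season games per team in 2014


def expected_wins(win_pct, games=GAMES_PER_SEASON):
    """Expected wins over a season, rounded to one decimal place."""
    return round(win_pct * games, 1)


print(expected_wins(0.35))  # 5.6  (KR-teams)
print(expected_wins(0.58))  # 9.3  (PR-teams)
```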

In summary, the impact of various events on starting field position was examined using data from the 2014-15 NFL season. Although INTs are the most impactful event on SFP when statistically controlling for event spot and return yardage, the majority of SFP is derived from either KR or PR yards. Likewise, winning teams garner most of their SFP from PR yards. I concluded that this effect is likely due to defensive stops and consistent, careful offensive play.

¹ Minimum 1 KR per game scheduled.

² To accomplish this, SFP was regressed onto play start spot, event spot, and yards gained. The residuals were saved. An ANOVA was performed with those residuals as the dependent variable and event type as the independent variable. A significant effect of event type was found, F(4, 5641) = 17.422, p < .001. Roughly, planned post hoc comparisons indicate the effect of event on SFP could be ranked as INT > FUM > PR > MFG > KR.
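
The second step of that footnote, a one-way ANOVA on saved residuals grouped by event type, can be sketched in plain Python. This takes the residuals as given and skips the regression step; the function name and the toy numbers are made up and do not reproduce the reported result. The reported degrees of freedom, F(4, 5641), correspond to k - 1 = 4 for five event types and N - k = 5641.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: dict mapping a group name (e.g. event type) to a list of values
            (e.g. regression residuals for events of that type).
    """
    all_values = [v for values in groups.values() for v in values]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n

    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Between-group variation: group size times squared mean deviation.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Within-group variation: squared deviations from the group mean.
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy residuals for three made-up groups.
toy = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [6, 7, 8]}
print(one_way_anova_f(toy))  # (21.0, 2, 6)
```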

Saturday, January 16, 2016

Field Position Part 1: Interception Return Yards

In a previous post I mentioned my intent to explore the effects of defense and special teams on field position. This is part one of that investigation. In another post I offered a method of valuating the average yardage of an interception (for the intercepting team). At present, I will discuss a different method of calculating and valuating INT yards using play-by-play (PBP) data.

Yes, collegiate and professional statistical records include yardage gained on interception returns just as passing or rushing yards are included. However, we track passing and rushing yards as a measure of progress and productivity—a measure of player or team performance. We track interceptions because each signifies an exchange of possession. The interception itself is the measure of player or team capacity—not the yardage gained on returns. Interception return yards are unexpected gratuity.

Chris Harris picks off and returns a Kyle Orton pass.

For instance, consider two interceptions from 2014. A Tony Romo pass was intercepted by the NY Giants’ Prince Amukamara, and a pass by the Buffalo Bills’ Kyle Orton was intercepted by Denver Bronco Chris Harris. Amukamara and Harris were each credited with 38 INT return yards. Amukamara’s came halfway through the second quarter of a tie game and Harris’ with 5 minutes remaining in the third quarter, his team leading 21-3. Both players’ offenses scored on their following drives. So what differentiates Amukamara’s INT return from that of Harris? Field position.

Amukamara intercepted Romo’s pass at the NYG 35-yard line and returned the ball to the Dallas 27. Harris intercepted Orton’s pass at the Bronco 2-yard line and returned it to the Bronco 40. Indeed, Harris’ INT may be more valuable because he ended the possession of an opponent in scoring position.[1] But, Amukamara’s 38 INT return yards put his offensive teammates in field goal position before they lined up.

An offense gaining no more than 2.3 yards on every play of every game would be disbanded. A DB intercepting one pass per game and being downed at the spot of each INT would receive a max contract and an eponymous island. No writers would denigrate him for failing to gain yardage after the INT. Any coach or fan would prefer an INT returned to the 50 over one returned to his own 3, of course; but surely every coach or fan would prefer an INT to the opponent having possession.

To me this means that, although there is value in knowing the yardage gained from the spot of an interception to the spot an interceptor is downed, ultimately, it is more meaningful knowing the field position produced by that gained yardage. That is, both players in our example should be credited 38 INT return yards but the values in the game at the point of each player being downed were more accurately described as 73 or 40. This is particularly true if we desire statistics that reflect happenings on the field.
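
The 73 and 40 above come from putting both returns on a common scale measured from the intercepting team's own goal line. A small sketch of that conversion, with hypothetical function names:

```python
def to_own_scale(yard_line, own_side):
    """Convert a yard line to distance from the intercepting team's own goal line.

    own_side=True means the ball is on the intercepting team's side of midfield;
    own_side=False means it is on the opponent's side.
    """
    return yard_line if own_side else 100 - yard_line


def int_return(int_spot, downed_spot):
    """Both spots on the own-goal-line scale. Returns (return yards, field position)."""
    return downed_spot - int_spot, downed_spot


# Amukamara: intercepted at the NYG 35, downed at the Dallas 27.
amukamara = int_return(to_own_scale(35, True), to_own_scale(27, False))
# Harris: intercepted at the Denver 2, downed at the Denver 40.
harris = int_return(to_own_scale(2, True), to_own_scale(40, True))

print(amukamara)  # (38, 73)
print(harris)     # (38, 40)
```

Both returns cover 38 yards, but the resulting field positions differ by 33 yards.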

I ran a Pro-Football Reference Game Play search for all interceptions in the 2014 regular season, excluding pick-sixes and interceptions with lost fumbles on the return. Pick-sixes were excluded because a touchdown precludes an offense driving and, thus, pick-sixes are uninvolved in the tabulation of average starting field position. Four returns with fumbles lost by the interceptor and recovered by the intercepted team were excluded because possession was regained.

Extracted from that data were the (a) spot of the interception and (b) spot of being downed following the INT return. Computed with those values were the (c) interception yards from the spot of the INT to the spot of being downed, or 20 for touchbacks, and (d) starting field position for the interceptor’s offensive unit, measured from the team’s own goal line to b, the spot of being downed. All spot-yardage values were scaled from 1 to 99 with 1 being the intercepting team’s own goal line and 99 being their opponents’ goal lines.

Table 1 contains various interception statistics for 2014 NFL teams, including League averages. The value we are interested in is the Mean INT FP column. Teams are sorted by mean interception spot. Interestingly, I was forced to revisit my earlier debate over the relative value of Amukamara’s and Harris’ interceptions. It appears that over the course of the 2014-15 season, the Giants’ defense endowed their offense with the greatest field position advantage from interceptions, but the average spot of their 16 interceptions was nearly midfield. Compare this to the average spot of the 16 Dallas Cowboys interceptions, their own 30, where opponents are within field goal range. Intuitively, field position following an interception increased with the interception spot (N = 393, r = .81, p < .001). Field position following an interception also increased with interception return yards (r = .41, p < .001).

**Table 1.** Interception Yards and Interception Field Position (Yards) Following Interceptions for 2014-15 NFL Teams
TEAM	Mean FP	Non-TD INT	Mean INT Spot	Non-TD INT Yards	Mean nTD INT Yards	INT FP Yards	Mean INT FP
SDG	26.4	6	18.8	72	12.0	185	30.8
CAR	27.7	10	23.4	190	19.0	424	42.4
NYJ	27.8	6	24.7	54	9.0	202	33.7
KAN	29.3	5	27.0	76	15.2	211	42.2
CHI	25.9	13	30.1	142	10.9	533	41.0
PIT	25.7	7	30.1	127	18.1	338	48.3
DAL	28.9	16	30.3	149	9.3	634	39.6
NWE	25.5	12	30.9	140	11.7	511	42.6
STL	28.0	10	31.3	108	10.8	421	42.1
ATL	26.5	15	32.1	104	6.9	586	39.1
MIA	31.1	11	33.1	188	17.1	552	50.2
SEA	27.8	21	33.5	298	14.2	1001	47.7
IND	28.7	11	34.5	96	8.7	475	43.2
SFO	30.5	11	34.6	207	18.8	588	53.5
PHI	30.0	9	35.3	56	6.2	374	41.6
JAX	25.7	5	35.8	65	13.0	244	48.8
CLE	26.8	18	37.9	260	14.4	943	52.4
ARI	26.9	15	38.3	154	10.3	728	48.5
BAL	28.7	10	39.1	91	9.1	482	48.2
DET	29.9	18	39.3	336	18.7	1043	57.9
NOR	30.6	16	39.3	200	12.5	829	51.8
GNB	28.5	15	39.5	228	15.2	820	54.7
CIN	30.3	19	40.4	174	9.2	941	49.5
WAS	25.1	6	40.5	30	5.0	273	45.5
MIN	27.3	11	40.8	90	8.2	539	49.0
BUF	30.2	18	41.2	346	19.2	1088	60.4
TEN	25.8	11	41.7	119	10.8	578	52.5
TAM	26.6	11	41.9	76	6.9	537	48.8
DEN	28.9	16	42.1	173	10.8	846	52.9
HOU	27.7	16	44.0	265	16.6	969	60.6
NYG	28.2	16	45.5	266	16.6	994	62.1
OAK	24.2	9	53.4	76	8.4	557	61.9

LEAGUE	27.9	12.3	36.8	154.9	12.6	607.7	49.5

[1] Pro-Football Reference’s Expected Points model tells us that Amukamara’s INT was worth -3.78 EPA and Harris’ -1.6 EPA, but Amukamara’s INT yielded a net EP of -3.56 and Harris’, -5.91. Amukamara’s INT is worth a greater EPA value probably due to the resultant field position, but Harris’ INT has a greater net EP value because his opponent was near the endzone he was defending.