Sunday, February 19, 2017

2016 WNBA FG% Distribution by Shot Location

So this is a first for POTH: I am posting twice in one day, or twice in one uninterrupted span of wakefulness. However, it is a diminutive post. Below is a shot chart for the 2016 WNBA regular season with field goal percentage. Greener means higher shooting percentage, navy-er means lower percentage. 

Rather unexpectedly, I was able to amass some data quickly, and it happened that the shot locations were included. I got excited as I often do when there are discoveries at 00:18. Of course, the chart would be more informative if the hexagons were sized according to the quantity of shots therein (e.g., here). But it's late and my belly aches.

Chart 1: Distribution of FG% by Shot Location in the WNBA, 2016

A previous post revealed some differences in WNBA compared to NBA league-wide aggregate statistics. I am now able to address these and other topics.

FBS vs FCS Score Differentials Equated to FBS vs FBS Score Differentials

Every matchup is unique. Either team could win. Although related to the outcomes of the other matchups of either team, the outcome of any one matchup is somewhat independent of the others. This notion underlies the nature of competition, the allure of sports betting, and the precedent for retold stories of unlikely winners. For football, because of its small sample size relative to other games, this notion underlies the complexity of quantifying many activities on the gridiron and is, to some extent, the topic of this post.

A recent undertaking at work portends a new analytic technique: observed-score linking and equating. I will undoubtedly seek guidance from our expert colleagues, but, of course, I prefer to be informed before that day is upon us. Linking and equating have distinct definitions, applications, and procedures but I will refer to these casually as equating. Equating allows us to generate uniform score-ranges between sections or items belonging to different versions of a single assessment, two unique assessments, or an old and a new version. 


More practically, consider the ACT, for example. Let us imagine that ACT Inc (the ACT developer) develops 20 versions of the ACT Reading Section. ACT Inc needs the scores for each version to be equivalent so that a 36 is always a 36. Of the imaginary 20 versions, let us focus on Versions 6 and 12, or V6 and V12, for short. So, to test these versions, ACT Inc has 200 freshmen in college complete both versions. Say, 100 freshmen completed V12 in the first test session and V6 in the second whereas the other 100 freshmen completed V6 in the first and V12 in the second session. Afterwards, ACT Inc realizes that the average score for V6 is 18.5 and the average for V12 is 20.5, whoops. However, the average for all tests completed in the first session is 20.4 and for all tests completed in the second session is 20.3, so ACT Inc knows that the disparity in Version-scores is not due to the sequence of test administration. Likewise, because the same freshmen completed both versions, the 2-point disparity in Version-scores is not due to differences in the test-takers. ACT Inc must conclude that the disparity is due to differences between V6 and V12. Then, ACT Inc could use equating procedures to develop uniform scores to ensure little Johnny sets realistic standards for his future based on an ACT Reading Version 12 score of 30 instead of the inflated 36 it would have been without equating.
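To make the arithmetic concrete, here is a minimal sketch of mean equating, about the simplest equating method there is: shift a score by the difference between the two version means. This is only an illustration under my own assumptions; the function name is made up, and it is not a claim about how ACT Inc equates anything. Note that it only reproduces the 2-point gap in the averages, not the illustrative 30-versus-36 gap in the example.

```python
def mean_equate(score, mean_from, mean_to):
    """Shift a score from one version's scale to another's by the difference in means."""
    return score - mean_from + mean_to


# Averages from the imaginary ACT example above.
v6_mean, v12_mean = 18.5, 20.5

# Session averages (20.4 vs 20.3) differ by about a tenth of a point,
# so the order of administration explains almost none of the 2-point gap.
session_gap = abs(20.4 - 20.3)

# A V12 score of 30 expressed on the V6 scale under mean equating.
v12_score_on_v6_scale = mean_equate(30, v12_mean, v6_mean)
print(round(session_gap, 1), v12_score_on_v6_scale)
```

Real equating procedures usually adjust more than the mean (the spread or the whole distribution), which is why the adjustment can differ along the score range.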


Here, I use equating to generate equivalency score differentials for interdivisional college football games. That is, a 35-point win (or, +35 score differential) by an FBS team over an FCS team, for instance, is equivalent to what differential in an FBS versus FBS matchup? Let us relate this to the above example. This analysis would get restrictively complex if we sought to equate scores between all FBS and FCS teams: ACT Reading V6 and V12 would be tantamount to FBS Teams 1, 2, 3, …, 128! However, for FBS and FCS programs alike, most matchups each season are versus FBS and FCS foes, respectively. Just as many FBS teams face a smattering of inferior opponents with FCS status, many FCS teams face a few inferior opponents with DII or NAIA memberships. So, we can consider two types (or versions) of games: [i] intradivisional games and [ii] interdivisional games. Intradivisional games are FBS vs FBS or FCS vs FCS whereas interdivisional games are FBS vs FCS or FCS vs non-DI. Thus, if the distributions for score differentials of FBS-FBS and FCS-FCS games are similar, and the same is true for FBS-FCS and FCS-non-DI, we can generate FBS-FCS scores that equate to FBS-FBS scores.

Chart 1: Distributions of Score Differentials

First, I obtained all Division I NCAA football game scores for 2012-2016 from this vast resource hosted by Kenneth Massey, which includes 8,349 games in which either an FBS or FCS team played. Second, I specified whether the home team won each game because home advantages are well-documented (here, here, here, but cf. here). Third, I specified one of four classifications for each game, the first two of which are intradivisional and the second two, interdivisional:

•    FBS vs FBS,
•    FCS vs FCS,
•    FBS vs FCS, or
•    FCS vs non-DI teams.

Fourth, I removed all games in which neither team played an interdivisional game in that season, leaving 6,397 games for the analysis. For example, in 2012, neither UCLA nor USC played an FCS team, so the UCLA vs USC game was excluded from the analysis. However, the 2012 USC versus Washington game was included because Washington played an FCS team (Portland St.). The data were prepared in this manner because I only want to analyze score differentials of teams that played both types of games. That is, although it is only one FCS game, we know about FBS-FBS and FBS-FCS games that involve ’12 Washington whereas we only know about FBS-FBS games that involve ’12 USC.
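For the curious, the filtering step can be sketched roughly as follows. This is a minimal sketch and not the code I ran: the `Game` record, the classification labels, and the function name are all hypothetical, and the home/away assignments in the toy data are placeholders rather than facts about those games.

```python
from collections import namedtuple

# Hypothetical record layout: one row per game.
Game = namedtuple("Game", "season home away kind")

# Labels for the two interdivisional classifications (names are my own).
INTERDIVISIONAL = {"FBS-FCS", "FCS-nonDI"}


def keep_linked_games(games):
    """Keep games in which at least one team played an interdivisional game that season."""
    linked = set()
    for g in games:
        if g.kind in INTERDIVISIONAL:
            linked.add((g.season, g.home))
            linked.add((g.season, g.away))
    return [
        g for g in games
        if (g.season, g.home) in linked or (g.season, g.away) in linked
    ]


# Toy data mirroring the 2012 example; home/away order is illustrative only.
games = [
    Game(2012, "UCLA", "USC", "FBS-FBS"),
    Game(2012, "USC", "Washington", "FBS-FBS"),
    Game(2012, "Washington", "Portland St.", "FBS-FCS"),
]
kept = keep_linked_games(games)
print([(g.home, g.away) for g in kept])
```

In the toy data, UCLA vs USC drops out because neither team has an interdivisional game that season, while USC vs Washington stays because Washington does.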

Chart 2: D1 Teams Ranked by Win% and Mean Score Diff.
Fifth, I prepared Chart 1. It shows the distributions of score differentials for the four categories of games. Chart 1 demonstrates that FBS vs FBS (green) score differentials are distributed almost identically to FCS vs FCS (brown) score differentials. Likewise, the score differentials are similarly distributed for the interdivisional games, but with some distinct dissimilarities. I attribute the dissimilarity in interdivisional distributions to the similar talent levels of lesser-FBS/better-FCS teams and lesser-FCS/non-DI teams while, concurrently, more better-FBS teams play FCS opponents (green) than better-FCS teams play non-DI opponents (orange). Hence, there are more 35-point blowouts in FBS-FCS games. This is evident in the ad hoc chart below, which was the sixth thing I did.

Anyhow, because the distributions for FBS vs FBS and FCS vs FCS are nonetheless similar, we will consider in the analysis only home field advantage and whether a game was intra- or inter-divisional (i.e., we will ignore whether a team was FCS or FBS). I do this for simplicity—mostly for me, but maybe also for you. 

Seventh, the equating procedure was performed using a nonequivalent-groups design with one anchor, a home team win. Here, the anchor informs the equating procedure that differences in these games might be due to home-field advantage. The influence of including home team victory is evident in Chart 3. The black line represents the intradivisional score differential and the other lines are the corresponding interdivisional scores with or without home advantage. Some descriptive statistics appear in the table below. A table with unadjusted and adjusted score differentials and SEs appears at the close of the post.
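To give a feel for what equating does to a score differential, here is a minimal sketch of plain equipercentile equating: find the percentile rank of a differential among interdivisional games, then read off the intradivisional differential at that same rank. To be clear, this is a simpler technique swapped in for illustration. It ignores the home-win anchor and the nonequivalent-groups design used in the actual analysis, and the function names and toy data are my own.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(x, sorted_scores):
    """Mid-percentile rank of x (0 to 1) within an ascending list of scores."""
    below = bisect_left(sorted_scores, x)
    at_or_below = bisect_right(sorted_scores, x)
    return (below + 0.5 * (at_or_below - below)) / len(sorted_scores)


def quantile(sorted_scores, p):
    """Quantile at proportion p (0 to 1) using linear interpolation."""
    pos = p * (len(sorted_scores) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_scores) - 1)
    frac = pos - lo
    return sorted_scores[lo] + frac * (sorted_scores[hi] - sorted_scores[lo])


def equipercentile(x, from_scores, to_scores):
    """Map x from the 'from' scale to the 'to' scale by matching percentile ranks."""
    return quantile(sorted(to_scores), percentile_rank(x, sorted(from_scores)))


# Toy differentials (invented): the 'inter' scale is simply double the 'intra' scale.
intra = list(range(1, 101))
inter = [2 * d for d in intra]
print(equipercentile(100, inter, intra))
```

With these toy numbers, a 100-point differential on the doubled scale lands at roughly 50 on the original scale, which is the behavior one would hope for.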

Table 1. Descriptive Statistics for NCAA 1 D1 Intra- & Inter-Division Games, 2012-16
                     mean     sd    skew   kurt   min   max      n
Intradivisional     17.49   13.5    0.96   3.53     1    78   5493
Interdivisional     30.52  19.69    0.36   2.29     1    86    904
Intra- Home Wins     0.54    0.5   -0.15   1.02     0     1   5493
Inter- Home Wins     0.87   0.34   -2.16   5.67     0     1    904

Chart 3: Equated Interdivisional Score Differentials
Controlling for home advantage (the green line) produces equated scores which are more sound, in my estimation. Notice how the green line equals the black line in the bottom left corner. The green line diverges at the 7-point differential. So, with this equating procedure, if an FBS team wins by 7 or fewer points over an FCS team, it is the same differential as an FBS-FBS victory. To this author, this validly reflects in the score differential the competitiveness of an FBS-FCS game decided by one touchdown or less. Without adjusting for home winning, there are inflated point differentials in this range. Also, compared to the orange and the black lines, there is less of a difference between the green and black lines as the score differential increases (if such a feat were meaningful, Baylor). Likewise, Iowa St. is not additionally penalized for succumbing to a last-second field-goal whereas the orange line equates a 7-point FBS-FCS victory to 16 FBS-FBS points and a field-goal lead at 00:00 in the 4th quarter to 5 points.

Now, there are of course shortcomings to this study, primarily one. Recall in the verbose example I provided earlier that the same 200 college freshmen completed both V6 and V12 of the ACT Reading sections. By doing so, we could be relatively certain that any disparity in V6 and V12 averages was not due to the test takers. In the analysis, however, I included only games involving at least one team that played in intra- and inter-division games in the season. Thus, this analysis rests on the potentially fallible assumption that all intra- or inter-divisional opponents to these teams are identical, which is patently untrue. This is why we considered the distribution of different classifications of games in Chart 1.


Summarily, an equating procedure was used to generate score-differential equivalencies for FBS-FCS games to FBS-FBS games. This author concluded that adjusting for well-documented home field advantages provided more valid equivalencies. Secondarily, an ad hoc analysis demonstrated that upper echelon FBS teams more frequently play FCS opponents than upper echelon FCS teams play non-DI teams.



FBS Scr Diff | Adjusted Est. | Adjusted SE | Unadjusted Est. | Unadjusted SE
1 0.974 0.2 1.358 0.175
2 1.743 0.285 2.672 0.21
3 2.81 0.208 5.106 0.779
4 3.762 0.756 7.546 0.79
5 5.054 0.896 9.956 1.156
6 6.127 0.8 12.3 1.166
7 7.359 0.891 15.918 1.072
8 10.068 1.461 19.907 1.127
9 10.968 1.578 20.872 0.771
10 13.186 1.415 22.593 1.161
11 14.47 1.015 24.394 0.867
12 15.391 1.017 25.328 0.954
13 16.529 1.173 26.464 1.014
14 18.299 1.438 28.122 1.024
15 20.713 1.223 30.582 0.942
16 21.174 1.141 31.068 0.82
17 23.077 1.324 32.161 0.962
18 24.61 1.284 34.025 0.9
19 26.413 1.334 35.059 1.109
20 27.782 1.262 36.855 1.117
21 30.308 1.197 38.215 0.688
22 31.551 0.981 39.434 0.943
23 32.615 1.131 40.542 1.061
24 34.24 1.375 41.97 1.075
25 37.279 1.458 44.231 1.228
26 38.21 1.093 45.077 1.165
27 39.011 1.182 45.894 1.214
28 41.563 1.188 47.865 1.147
29 42.478 1.111 49.033 1.238
30 44.187 1.182 49.589 1.415
31 45.563 1.132 51.843 1.423
32 47.876 1.165 53.61 1.266
33 48.85 1.181 54.631 1.102
34 50.254 1.454 55.331 0.775
35 52.735 1.534 56.108 0.747
36 54.72 1.304 56.876 1.063
37 55.346 1.134 58.072 1.234
38 56.116 1.085 59.217 1.359
39 57.228 1.251 61.627 1.452
40 58.761 1.543 62.474 1.195
41 59.231 1.681 62.889 1.011
42 61.718 1.732 63.532 1.122
43 62.719 1.574 64.919 1.179
44 62.99 1.426 65.649 1.247
45 63.557 1.384 66.142 1.177
46 65.6 1.338 66.812 1.522
47 65.788 1.319 67.265 1.673
48 65.991 1.261 68.845 1.777
49 66.467 1.093 69.941 1.934
50 67.188 1.348 71.796 2.045
51 68.502 1.659 72.619 2.149
52 69.592 1.949 73.615 1.891
53 70.118 2.188 74.109 1.831
54 70.359 2.166 74.335 1.861
55 72.129 2.096 75.157 1.77
56 73.833 2.102 76.604 1.884
57 74.595 1.891 77.468 1.617
58 75.772 1.74 77.78 1.724
59 77.016 1.535 78.302 2.118
60 77.7 1.329 78.892 2.151
61 77.876 1.252 79.221 2.14
62 78.095 1.368 79.633 2.24
63 78.405 1.623 80.209 2.572
64 78.877 1.726 81.62 2.747
65 79.142 1.818 81.785 2.716
66 79.672 1.996 82.114 2.65
67 80.336 2.151 83.525 2.608
68 81.733 2.146 83.772 2.547
69 82.131 2.244 84.019 2.447
70 83.794 2.309 84.43 2.217
71 84.192 2.288 85.677 1.879
72 84.325 2.255 85.759 1.872
73 85.59 2.218 85.924 1.778
74 85.855 2.275 86.089 1.748
75 85.988 2.341 86.171 1.739
76 86.116 2.343 86.253 1.714
77 86.244 2.251 86.335 1.633
78 86.372 2.17 86.418 1.613

Wednesday, February 1, 2017

FBS Rivalry Games Decline Parallel to an Increase in Bowls and FCS Opponents

Figure 1. Margin of Victory, 1990-2016
I discussed college football program rivalries in a previous post. Preparing a forthcoming post elucidated a decline in rivalry games occurring parallel to changes in the college football environment. Some operational definitions appear at the close of the post. All data were culled from Sports Reference.

There was the implementation of the BCS in 1998 to resolve that annual, irrevocable controversy about the national champion. The BCS spawned myriad controversies and intensified that which it was implemented to resolve. There were also concerns that teams were maximizing margins of victory to exploit inner workings of the BCS algorithm until it was remodeled. However, any broad impact in FBS vs FBS games is not readily deducible from Figure 1. The gradual increase in margin of victory over FCS teams, however, is likely attributable to an overall increase of FCS opponents and, thus also, an increase in higher-quality FBS teams playing FCS teams, as seen in Figure 2 (in purple).

We also see in Figure 2 that, although it may be a spurious epochal fluctuation, there were a few years from ~2005-10 when the proportion of FBS vs FBS games between two top-25 ranked opponents dipped considerably (orange). Indeed, when schedules are set years in advance, the game-day ranking of either team is unknowable. But, we might also note that teams were playing gradually fewer games versus top-25 ranked opponents (brown). However, there were more FBS teams as time progressed and a lower proportion of those teams could be ranked at a given time as there were still only 25 ranked teams in a given season.


Figure 2. Attributes of FBS Games, 1990-2016
In 2006 and after, FBS programs were permitted to schedule one additional regular season game in any season. Prior to this ruling, programs could schedule one additional game in select seasons. It was around 2006 when the quantity of bowl games began to increase as well. Enter the Playoff in 2014. The concept of a playoff was, itself, accompanied by some controversy. Critics of the CFP cite issues similar to those raised about the BCS. Alas, this author is too indolent to cite anything in this paragraph.

Figure 3 indicates that programs and their teams were pressured to respond to these and other environmental changes. Accompanying the 2006 rule-change permitting one additional game was another statute allowing one win over an FCS team to count toward bowl eligibility in each season. Previously, since at least 1997, one FCS win “every four years” could count toward bowl eligibility. Thus, programs scheduling an FCS foe essentially guaranteed their teams one of the six wins (about 16.7%) necessary to secure an official invitation from one of an ever-increasing quantity of certified licensed bowl games.


Figure 3. FBS Bowl Team Attributes, 1990-2016
The green line in Fig 3 corresponds to the proportion of bowl teams with 6 or fewer scheduled wins. Annual vicissitudes are evident but only in 2013 or -14 does a trend beginning in about 1998 appear to ossify. The orange and brown lines add some context; until about the inauguration of the BCS the two generally overlap. Although the proportion of bowl teams scheduling multiple FCS games appears relatively constant 1990-2016, we see that even before the 2006 ruling, increasing proportions of bowl teams were distending their win totals with wins over FCS teams, though this may or may not have increased the likelihood they received bowl invites. Lest there is no mention of monetary implications, what was the cost to rivalries?

Summarily, as seen in Figure 4, programs are scheduling more games versus FCS opponents, but the increase has been gradual, while simultaneously, more teams played in bowls…because there were more bowls. However, we see that, on average, teams are playing fewer games versus rivals: the share fell from ~15% to ~10% of games from 1990 to 2016. Realistically, this equates to a drop from ~1.8 to ~1.3 rival games per team over that span (out of ~12 to ~13 scheduled games).
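The conversion from shares to games per team is simple multiplication, sketched below. I am assuming the ~15% share pairs with a ~12-game schedule (1990) and the ~10% share with a ~13-game schedule (2016); the function name is just for illustration.

```python
def rival_games_per_team(rival_share, scheduled_games):
    """Approximate rival games per team from the rival share of a team's schedule."""
    return rival_share * scheduled_games


# Assumed pairings: ~15% of ~12 games (1990) and ~10% of ~13 games (2016).
games_1990 = round(rival_games_per_team(0.15, 12), 1)
games_2016 = round(rival_games_per_team(0.10, 13), 1)
print(games_1990, games_2016)
```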

Some Operational Definitions 
To examine how the foregoing adaptations are associated with college football rivalries, at the FBS level, I examined the variables described below. The rivalry data are described elsewhere and expounded upon below.  I define scheduled games as games that are not bowl games but include conference championship games as these, when not de facto, are scheduled per se. 
Figure 4.


Minimalists
  • Set the minimum number of scheduled-game wins for teams playing in bowls at 6 for each season (although that minimum was 5 in several seasons)
  • Computed for each season the proportion of teams in bowls with 6 or fewer scheduled-game wins
  • Subtracted the quantity of wins versus FCS opponents from each team's scheduled-game win total in each season and computed the proportion of teams playing in bowls with < 6 scheduled-game wins versus FBS opponents
  • Computed the proportion of bowl teams that played ≥1 FCS opponent
  • Computed the proportion of bowl teams with ≥2 wins over FCS opponents
    • Indeed, this value is meaningless with regard to bowl eligibility.
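The first few "Minimalists" computations can be sketched as below. This is a rough illustration rather than the code behind the figures: the function name and the tuple layout are hypothetical, and the four bowl teams in the toy data are invented.

```python
def minimalist_shares(bowl_teams):
    """Return two proportions for one season's bowl teams.

    bowl_teams: list of (scheduled_wins, fcs_wins) tuples, one per bowl team.
    First value: share of bowl teams with 6 or fewer scheduled-game wins.
    Second value: share with fewer than 6 wins once FCS wins are subtracted.
    """
    n = len(bowl_teams)
    at_most_six = sum(1 for wins, _ in bowl_teams if wins <= 6) / n
    under_six_vs_fbs = sum(1 for wins, fcs in bowl_teams if wins - fcs < 6) / n
    return at_most_six, under_six_vs_fbs


# Invented season: four bowl teams as (scheduled wins, wins over FCS teams).
toy_season = [(6, 1), (7, 1), (9, 0), (6, 0)]
shares = minimalist_shares(toy_season)
print(shares)
```

In the toy season, two of the four teams finished with 6 or fewer wins, and one of them would have fallen below 6 without its FCS win.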
Team Games vs Rivals
  • Computed the proportion of scheduled games versus rival(s) for each FBS team in each season and then computed the average for FBS teams for each season.
  • I would have included a version of this value adjusted by subtracting FCS games from the denominator, i.e., rivalry games / (scheduled games – FCS games), but it essentially overlaps with the main variable.
  • I included bowl games in these totals because imagine the sensation of a Michigan-Ohio St. playoff game. Non-FBS teams were excluded from the computation of this variable because of the following bullet.
  • I used a data set for rivalries generated for a previous posting. In that post, I set the rivalry inclusion criterion at ≥50 rivalry games as major college football programs. These data pertained to games from 1891 to 2015. To apply a similar stringency to the present data, a threshold of rivalry games was set for each season proportional to 50 / (2015 – 1891), which equals ~.403. For instance, the 1999 rivalry inclusion criterion was set at ~43.5 games = (1999 – 1891) × .403 = 108 × .403. So any rivalry with fewer than ~43.5 games by 1999 was excluded in the computation of that season.
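The season-specific rivalry threshold described in the last bullet can be written as a short function. This is a minimal sketch; the function and parameter names are my own, while the constants (50 games, 1891, 2015) come from the bullet above.

```python
def rivalry_threshold(season, first_year=1891, last_year=2015, min_games=50):
    """Minimum rivalry games required by a given season, scaled from 50 games by 2015."""
    games_per_year = min_games / (last_year - first_year)  # 50 / 124, about .403
    return (season - first_year) * games_per_year


threshold_1999 = rivalry_threshold(1999)
print(round(threshold_1999, 1))
```

By construction, the threshold reaches the original 50-game criterion in 2015 and shrinks for earlier seasons.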
FBS Games with Two Ranked Teams
  • The quantity of scheduled games pitting two FBS-ranked teams divided by the quantity of scheduled games with at least one FBS-ranked team.
  • Indeed, this could have been computed as games with two ranked teams divided by total games. But, there were 107 FBS teams in 1990 and 128 in 2016. However, there were 25 ranked teams in each week of those seasons or 23.4% and 19.5% of total teams, respectively. Likewise, teams played, on average, about 12 games from 1990-2005 but 13.5 games from 2006-2016. Thus, as the eras progressed, there were more games to be scheduled and thus a higher likelihood of facing an unranked team. It seemed to me most objective to derive the value in this manner as it standardizes across seasons, to some extent.
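Both pieces of this definition are easy to sketch: the share of FBS teams that can be ranked in a season, and the two-ranked proportion with "games involving at least one ranked team" as the denominator. The function names and the toy games are invented for illustration.

```python
def ranked_share(n_fbs_teams, n_ranked=25):
    """Share of FBS teams that can hold a top-25 ranking at one time."""
    return n_ranked / n_fbs_teams


def two_ranked_proportion(games):
    """Games with both teams ranked divided by games with at least one ranked team.

    games: list of (home_is_ranked, away_is_ranked) boolean pairs.
    """
    both = sum(1 for home, away in games if home and away)
    at_least_one = sum(1 for home, away in games if home or away)
    return both / at_least_one


share_1990 = ranked_share(107)
share_2016 = ranked_share(128)

# Invented slate: one ranked-vs-ranked game, two with one ranked team, one with none.
toy_games = [(True, True), (True, False), (False, True), (False, False)]
print(round(share_1990, 3), round(share_2016, 3), two_ranked_proportion(toy_games))
```

The game with no ranked team never enters the denominator, which is what keeps the value from drifting simply because more unranked teams joined the FBS.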
Games with One Ranked- and One non-FBS Team
  • The quantity of scheduled games with one ranked FBS team pitted against a non-FBS team divided by the quantity of scheduled games with at least one FBS-ranked team.
  • Indeed, this could have been computed with total games in the denominator but the results are similar.