# Valuing Goalie Shootout Performance (Again)

• Goalies are not interchangeable with respect to the shootout, i.e. there is skill involved in goalie performance.
• An extra percentage point in shootout save percentage is worth about 0.002 standings points per game. This comes from some admittedly sketchy calculations using long-term NHL performance, and I don’t think it is necessarily very accurate.

I’m bringing this up because a couple of other articles have been written about this recently: one by Tom Tango and one much longer one by Michael Lopez. One of the comments over there, from Eric T., mentioned wanting a better sense of the practical significance of the differences in skill, given that Lopez offers an estimate that the difference between the best and worst goalies is worth about 3 standings points per year.

That’s something I was trying to do in the previous post, and the comment prompted me to try to redo it. I made some simple assumptions that align with the ones Lopez made in his followup post:

• Each shot has equal probability of being saved (i.e. shooter quality doesn’t matter, only goalie quality). This probably reduces the volatility in my estimates, but since a goalie should end up facing a representative sample of shooters, I’m not too concerned.
• The goalie’s team has an equal probability of converting each shot. This, again, probably reduces the variance, but it makes modelling things much simpler, and I think it makes it easier to isolate the effect that goalie performance has on shootout winning percentage.

Given these assumptions, we can compute an exact probability that one team wins given team 1’s per-shot scoring probability $p_1$ and team 2’s $p_2$ (each is one minus the opposing goalie’s save percentage). If you don’t care about the math, skip ahead to the table. Let’s call $P_{i,j}$ the probability that team $i$ scores $j$ times in the first three rounds of the shootout:

$P_{i,j} = {3 \choose j} p_i^j(1-p_i)^{3-j}$

$P(\text{Team 1 Wins } | \text{ } p_1, p_2) = \sum_{j=1}^3 \sum_{k=0}^{j-1} P_{1,j} \cdot P_{2,k} + \left ( \sum_{j=0}^3 P_{1,j}\cdot P_{2,j} \right ) \frac{p_1(1-p_2)} {1-(p_1p_2+(1-p_1)(1-p_2))}$

The first term on the right side is just the sum of the probabilities of the ways that team 1 can win in the first three rounds, e.g. 2 goals for and 1 allowed or 3 goals for and none allowed. The second term covers the ways they can win if the first three rounds end in a tie: the probability of that tie multiplied by the probability of winning the sudden-death rounds, which can be expressed easily as the sum of a geometric series.
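For anyone who would rather read code than sums, here is a minimal Python sketch of the same calculation. The function names and the league-average conversion rate of one third are my own assumptions for illustration; the post doesn't state the value actually used. Note that the tie term here counts every tied score after three rounds, including 0-0, so that the two teams' win probabilities add up to one.

```python
from math import comb


def goals_pmf(p, rounds=3):
    """Binomial probabilities of scoring 0..rounds goals at per-shot rate p."""
    return [comb(rounds, j) * p**j * (1 - p) ** (rounds - j) for j in range(rounds + 1)]


def win_probability(p1, p2, rounds=3):
    """Probability that team 1 wins the shootout; requires 0 < p1, p2 < 1."""
    pmf1, pmf2 = goals_pmf(p1, rounds), goals_pmf(p2, rounds)
    # Team 1 outscores team 2 over the first `rounds` rounds.
    win_early = sum(pmf1[j] * pmf2[k] for j in range(1, rounds + 1) for k in range(j))
    # Both teams have the same number of goals after `rounds` rounds (0-0 included).
    tie = sum(a * b for a, b in zip(pmf1, pmf2))
    # Sudden death: closed form of the geometric series over repeated tied rounds.
    sudden_death = p1 * (1 - p2) / (1 - (p1 * p2 + (1 - p1) * (1 - p2)))
    return win_early + tie * sudden_death


# Assumed league-average conversion rate; hypothetical, not taken from the post.
LEAGUE_AVG_CONVERSION = 1 / 3


def goalie_win_pct(points_above_avg):
    """Winning percentage for a goalie this many points above average,
    backed by average shooters and facing an average goalie."""
    opponent_rate = LEAGUE_AVG_CONVERSION - points_above_avg / 100
    return 100 * win_probability(LEAGUE_AVG_CONVERSION, opponent_rate)


if __name__ == "__main__":
    for x in (-6, 0, 6):
        print(x, round(goalie_win_pct(x), 2))
```

A better goalie enters the calculation as a lower conversion rate for the opposing shooters, which is why `goalie_win_pct` subtracts the goalie's edge from the opponent's rate and leaves his own team's rate at the league average.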

Ultimately, we don’t really care about the formula so much as the results, so here’s a table and a plot showing the winning percentage of goalies who are a given number of percentage points below or above league average when facing a league average goalie:

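| Percentage Points Above/Below League Average | Winning Percentage (%) |
|---|---|
| -20 | 26.12 |
| -19 | 27.14 |
| -18 | 28.18 |
| -17 | 29.24 |
| -16 | 30.31 |
| -15 | 31.41 |
| -14 | 32.52 |
| -13 | 33.66 |
| -12 | 34.81 |
| -11 | 35.98 |
| -10 | 37.17 |
| -9 | 38.37 |
| -8 | 39.60 |
| -7 | 40.84 |
| -6 | 42.10 |
| -5 | 43.37 |
| -4 | 44.67 |
| -3 | 45.98 |
| -2 | 47.30 |
| -1 | 48.64 |
| 0 | 50.00 |
| 1 | 51.37 |
| 2 | 52.76 |
| 3 | 54.16 |
| 4 | 55.58 |
| 5 | 57.01 |
| 6 | 58.45 |
| 7 | 59.91 |
| 8 | 61.38 |
| 9 | 62.86 |
| 10 | 64.35 |
| 11 | 65.85 |
| 12 | 67.37 |
| 13 | 68.89 |
| 14 | 70.42 |
| 15 | 71.96 |
| 16 | 73.51 |
| 17 | 75.06 |
| 18 | 76.62 |
| 19 | 78.19 |
| 20 | 79.76 |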

We would expect most goalies to be close to league average, so if we borrow Tom Tango’s results (see the link above), we figure the most and least talented goalies are going to be roughly 6 percentage points away from the mean. The difference between +6 and -6 percentage points is about 16 points of winning percentage in the table above (58.45 versus 42.10), meaning the best goalies are likely to win sixteen more shootouts per hundred than the worst goalies, assuming both play average competition.

Multiplying this by 13.2%, the past frequency of shootouts, gives an estimated benefit of only about 0.02 standings points per game from switching from the worst shootout goalie to the best. For a goalie making 50 starts, that’s only about 1 point added to the team, and that’s assuming the maximum possible impact.

Similarly, moving up this curve by one percentage point appears to be worth about 1.35 wins per hundred; multiplying that by 13.2% gives a value of 0.0018 standings points per game. That is almost exactly what I got when I did this empirically in the prior post, which leads me to believe that the earlier estimate is a lot stronger than I initially thought.
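The arithmetic behind those two figures is short enough to spell out. This is just a sketch using numbers quoted above (the table entries for +6 and -6, the 13.2% shootout frequency, and the 1.35 wins per hundred); the variable names are mine. It relies on a shootout win being worth one standings point more than a shootout loss.

```python
SHOOTOUT_FREQUENCY = 0.132  # share of games reaching a shootout, as quoted above

# Winning percentages from the table for goalies 6 points above and below average.
best_win_rate = 0.5845
worst_win_rate = 0.4210

# Extra shootout wins per shootout when swapping the worst goalie for the best.
win_rate_gap = best_win_rate - worst_win_rate

# One extra standings point per extra shootout win, scaled by shootout frequency.
points_per_game_best_vs_worst = win_rate_gap * SHOOTOUT_FREQUENCY
points_over_50_starts = 50 * points_per_game_best_vs_worst

# Value of a single percentage point of shootout save percentage.
wins_per_shootout_per_point = 0.0135
points_per_game_per_point = wins_per_shootout_per_point * SHOOTOUT_FREQUENCY

if __name__ == "__main__":
    print(round(points_per_game_best_vs_worst, 4))
    print(round(points_over_50_starts, 2))
    print(round(points_per_game_per_point, 4))
```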

There are obviously a lot of assumptions in play here, including the assumptions going into my probabilities and Tango’s estimates of true performance, and I’m open to the idea that one or another of those is suppressing the importance of this skill. Overall, though, I’m largely inclined to hew to my prior conclusion: for a difference in shootout performance to be enough to make one goalie better overall than another, it has to be a fairly substantial one, and the difference in actual save percentage has to be correspondingly fairly small.

# Do Low Stakes Hockey Games Go To Overtime More Often?

Sean McIndoe wrote another piece this week about NHL overtime and the Bettman point (the 3rd point awarded for a game that is tied at the end of regulation—depending on your preferred interpretation, it’s either the point for the loser or the second point for the winner), and it raises some interesting questions. I agree with one part of his conclusion (the loser point is silly), but not with his proposed solution—I think a 10 or 15 minute overtime followed by a tie is ideal, and would rather get rid of the shootout altogether. (There may be a post in the future about different systems and their advantages/disadvantages.)

At one point, McIndoe is discussing how the Bettman point affects game dynamics, namely that it makes teams more likely to play for a tie:

> So that’s exactly what teams have learned to do. From 1983-84 until the 1998-99 season, 18.4 percent of games went to overtime. Since the loser point was introduced, that number has up to 23.5 percent. That’s far too big a jump to be a coincidence. More likely, it’s the result of an intentional, leaguewide strategy: Whenever possible, make sure the game gets to overtime.
>
> In fact, if history holds, this is the time of year when we’ll start to see even more three-point games. After all, the more important standings become, the more likely teams will be to try to maximize the number of points available. And sure enough, this has been the third straight season in which three-point games have increased every month. In each of the last three full seasons, three-point games have mysteriously peaked in March.

So, McIndoe is arguing that teams are effectively playing for overtime later in the season because they feel a more acute need for points. For what it’s worth, my analysis finds that the trend he cites is statistically significant, based on a simple correlation between the fraction of games ending in ties and the relative month of the season. If one assumes the effect is linear, then with each month the season goes on, a game becomes 0.5 percentage points more likely to go to overtime. (As an aside, I suspect a lot of the year-over-year trend is explained by a decrease in scoring over time, but that’s also a topic for another post.)

I’m somewhat unconvinced of this, given that later in the year some teams are tanking for draft position (and would rather just take the loss), while teams in playoff contention want to deprive rivals of the extra point. (Moreover, teams may also become more sensitive to playoff tiebreakers, the first of which is regulation and overtime wins.) If I had to guess, I would imagine that the increase in ties comes from sloppier play caused by injuries and fatigue, but that’s something I’d like to investigate and hopefully will in the future.

Still, McIndoe’s idea is interesting, as it (along with his discussion of standings inflation, in which injecting more points into the standings makes everyone likelier to keep their jobs) suggests to me that there could be some element of collusion in hockey play, in that under some circumstances both teams will strategically maximize the likelihood of a game going to overtime. He believes that both teams will want the points in a playoff race. If this quasi-collusive mechanism is actually in place, where else might we see it?

My idea to test this is to look at interconference matchups. Why? This will hopefully be clear from looking at the considerations when a team wins in regulation instead of OT or a shootout:

1. The other team gets one point instead of zero. Because the two teams are in different conferences, this has no effect on whether either team makes the playoffs, or their seeding in their own conference. The only way it matters is if a team suspects it would want home ice advantage in a matchup against the team it is playing…in the Stanley Cup Finals, which is so unlikely that a) it won’t play into a team’s plans and b) even if it did, would affect very few games. So, from this perspective there’s no incentive to win a 2 point game rather than a 3 point game.
2. Regulation and overtime wins are a tiebreaker. However, points are much more important than the tiebreaker, so a decision that increases the probability of getting points will presumably dominate considerations about needing the regulation win. Between 1 and 2, we suspect that one team benefits when an interconference game goes to overtime, and the other is not hurt by the result.
3. The two teams could be competing for draft position. If both teams are playing to lose, we would suspect this would be similar to a scenario in which both teams are playing to win, though that’s a supposition I can test some other time.

So, it seems to me that if this incentive issue exists, we might see it in interconference games. Our hypothesis, then, is that interconference games produce more three-point games than intraconference games.

Using data from Hockey Reference, I looked at the results of every regular season game since 1999, when overtime losses began getting teams a point, counting the number of games that went to overtime. (During the time they were possible, I included ties in this category.) I also looked at the stats restricted to games since 2005, when ties were abolished, and I didn’t see any meaningful differences in the results.

As it turns out, 24.0% of interconference games have gone to OT since losers started getting a point, compared with…23.3% of intraconference games. That difference isn’t statistically significant (p = 0.44); I haven’t done power calculations, but since our sample of interconference games has N > 3000, I’m not too worried about power. Moreover, given the point estimate (raw difference) of 0.7%, we are looking at such a small effect even if it were significant that I wouldn’t put much stock in it. (The corresponding figures for the shootout era are 24.6% and 23.1%, with a p-value of 0.22, so still not significant.)
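For readers who want to run this kind of comparison themselves, here is a minimal sketch of a pooled two-proportion z-test in plain Python. This is one common way to compare two rates and not necessarily the test behind the p-values above. The game counts in the example are made-up placeholders chosen only to match the quoted percentages (they are not the real Hockey Reference totals), so the p-value it prints will not match mine.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # HYPOTHETICAL counts: 24.0% of 3000 interconference games and
    # 23.3% of 10000 intraconference games going to overtime.
    z, p = two_proportion_z_test(720, 3000, 2330, 10000)
    print(f"z = {z:.2f}, p = {p:.2f}")
```

Plugging in the actual overtime and game counts for each group is all that is needed to redo the comparison for any of the splits discussed here.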

My idea was that we would see more overtime games, not more shootout games, as it’s unclear how the incentives align for teams to prefer the shootout, but I looked at the numbers anyway. Since 2005, 14.2% of interconference games have gone to the skills competition, compared to 13.0% of intraconference games. Not to repeat myself too much, but that’s still not significant (p = 0.23). Finally, even if we look at shootouts as a fraction of games that do go to overtime, we see no substantive difference—57.6% for interconference games, 56.3% for intraconference games, p = 0.69.

So, what do we conclude from all of these null results? Well, not much, at least directly—such is the problem with null results, especially when we are testing an inference from another hypothesis. It suggests that NHL teams aren’t repeatedly and blatantly colluding to maximize points, and it also suggests that if you watch an interconference game you’ll get to see the players trying just as hard, so that’s good, if neither novel nor what we set out to examine. More to the point, my read is that this does throw some doubt on McIndoe’s claims about a deliberate increase in ties over the course of the season, as it shows that in another circumstance where teams have an incentive to play for a tie, there’s no evidence that they are doing so. However, I’d like to do several different analyses that ideally address this question more directly before stating that firmly.

Or, to borrow the words of a statistician I’ve worked with: “We don’t actually know anything, but we’ve tried to quantify all the stuff we don’t know.”