Here’s a short post to answer a straightforward question: do pitchers who throw a wider variety of pitches work more slowly? If it’s not clear, the idea is that a pitcher who throws several pitches frequently will take longer because the catcher has to spend more time calling the pitch, perhaps with a corresponding increase in how often the pitcher shakes off the catcher.
To make a quick pass at this, I pulled FanGraphs data on how often each pitcher threw fastballs, sliders, curveballs, changeups, cutters, splitters, and knucklers, using data from 2009–13 on all pitchers with at least 200 innings. (See the data here. There are well-documented issues with the categorizations, but for a small question like this they are good enough.) The statistic used for how quickly the pitcher worked was the appropriately named Pace, which measures the number of seconds between pitches thrown.
To easily test the hypothesis, we need a single number to measure how even the pitcher’s pitch mix is, which we believe to be linked to the complexity of the decision they need to make. There are many ways to do this, but I decided to go with the Herfindahl-Hirschman Index, which is usually used to measure market concentration in economics. It’s computed by squaring the percentage share of each pitch and summing the squares, so higher values mean a more concentrated repertoire. (The theoretical max is 10,000, for a pitcher who throws one pitch 100% of the time.) As an example, Mariano Rivera threw 88.9% cutters and 11.1% fastballs over the time period we’re examining, so his HHI was 88.9² + 11.1² ≈ 8026. David Price threw 66.7% fastballs, 5.8% sliders, 6.6% cutters, 10.6% curveballs, and 10.4% changeups, leading to an HHI of 4746. (See additional discussion below.) If you’re curious, the most and least concentrated repertoires split by role are in tables at the bottom of the post.
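For anyone who wants to reproduce the arithmetic, here is a minimal sketch in Python. The function name `hhi` and the dictionary layout are my own illustrative choices, not anything FanGraphs provides; the percentages are the ones quoted above.

```python
def hhi(shares_pct):
    """Herfindahl-Hirschman Index from percentage shares (0-100 scale).

    Squares each share and sums the squares, so a one-pitch pitcher
    scores 10,000 and an even mix scores much lower.
    """
    return sum(share ** 2 for share in shares_pct.values())


# Pitch mixes quoted in the post, as percentages of all pitches thrown.
rivera = {"CT": 88.9, "FB": 11.1}
price = {"FB": 66.7, "SL": 5.8, "CT": 6.6, "CB": 10.6, "CH": 10.4}

for name, mix in [("Rivera", rivera), ("Price", price)]:
    print(f"{name}: HHI = {hhi(mix):.1f}")
```

Because the published percentages are rounded to one decimal place, a value computed this way can differ by a point or so from one computed on unrounded data.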
As an aside, two people on those leader/trailer lists interest me most. The first is Yu Darvish, who’s surrounded by junkballers; it’s pretty cool that he has such amazing stuff and still throws 4.5 pitches with some regularity. The second is Bartolo Colon, who has, according to this metric, less variety in his pitch selection over the last five years than the two knuckleballers in the sample. He’s somehow a junkballer but with only one pitch, which is a pretty #Mets thing to be.
Back to business: after computing HHIs, I split the sample into 99 relievers and 208 starters, defined as pitchers who had at least 80% of their innings come in the respective role. I enforced the starter/reliever split because a) relievers have substantially less pitch diversity (unweighted mean HHI of 4928 vs. 4154 for starters, highly significant) and b) they pitch substantially slower, possibly due to pitching more with men on base and in higher leverage situations (unweighted mean Pace of 23.75 vs. 21.24, a 12% difference that’s also highly significant).
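The 80% rule can be sketched as a small helper. This is only an illustration of the rule as described: the function name, the argument names, and the assumption that innings are true decimals (not baseball’s .1/.2 notation for thirds) are mine, and the innings totals in the example are made up.

```python
def classify_role(start_ip, relief_ip, threshold=0.8):
    """Label a pitcher by where most of his innings came from.

    Returns "starter" or "reliever" when at least `threshold` of the
    innings came in that role, and None for pitchers in between, who
    would be left out of both groups.
    """
    total = start_ip + relief_ip
    if total <= 0:
        return None
    if start_ip / total >= threshold:
        return "starter"
    if relief_ip / total >= threshold:
        return "reliever"
    return None


# Hypothetical innings totals (innings as a starter, innings in relief).
examples = {
    "mostly starts": (180.0, 20.0),
    "mostly relief": (10.0, 190.0),
    "swingman": (120.0, 80.0),
}
for label, (start_ip, relief_ip) in examples.items():
    print(label, "->", classify_role(start_ip, relief_ip))
```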
So, how does this HHI match up with pitching pace for these two groups? Pretty poorly. The correlation for starters is -0.11, which is the direction we’d expect but a very small correlation (and one that’s not statistically significant at the 0.1 level, to the limited extent that statistical significance matters here). For relievers, it’s actually 0.11, which runs against our expectation but is also statistically and practically no different from 0. Overall, there doesn’t seem to be any real link, but if you want to gaze at the entrails, I’ve put scatterplots at the bottom as well.
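As a rough check on those significance statements, here is a sketch that computes a Pearson correlation and an approximate two-sided p-value. The helper names are mine, and the p-value uses a normal approximation to the usual t statistic, which is an assumption that is reasonable at these sample sizes but not exact. The inputs at the bottom are the rounded correlations and group sizes from the post, not the underlying data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for the null of zero correlation.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and treats t as standard
    normal (an approximation, not the exact t distribution).
    """
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return math.erfc(abs(t) / math.sqrt(2))


# Rounded correlations and sample sizes reported in the post.
for role, r, n in [("starters", -0.11, 208), ("relievers", 0.11, 99)]:
    print(f"{role}: r = {r:+.2f}, n = {n}, approx. p = {approx_p_value(r, n):.2f}")
```

Both approximate p-values come out above 0.1, which agrees with the description above.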
One important note: a couple weeks back, Chris Teeter at Beyond the Box Score took a crack at the same question, though using a slightly different method. Unsurprisingly, he found the same thing. If I’d seen the article before I’d had this mostly typed up, I might not have gone through with it, but as it stands, it’s always nice to find corroboration for a result.
Least concentrated repertoires, relievers (lowest HHI)

Rank | Name | FB% | SL% | CT% | CB% | CH% | SF% | KN% | HHI |
---|---|---|---|---|---|---|---|---|---|
1 | Sean Marshall | 25.6 | 18.3 | 17.7 | 38.0 | 0.5 | 0.0 | 0.0 | 2748 |
2 | Brandon Lyon | 43.8 | 18.3 | 14.8 | 18.7 | 4.4 | 0.0 | 0.0 | 2841 |
3 | D.J. Carrasco | 32.5 | 11.2 | 39.6 | 14.8 | 2.0 | 0.0 | 0.0 | 2973 |
4 | Alfredo Aceves | 46.5 | 0.0 | 17.9 | 19.8 | 13.5 | 2.3 | 0.0 | 3062 |
5 | Logan Ondrusek | 41.5 | 2.0 | 30.7 | 20.0 | 0.0 | 5.8 | 0.0 | 3102 |

Most concentrated repertoires, relievers (highest HHI)

Rank | Name | FB% | SL% | CT% | CB% | CH% | SF% | KN% | HHI |
---|---|---|---|---|---|---|---|---|---|
1 | Kenley Jansen | 91.4 | 7.8 | 0.0 | 0.2 | 0.6 | 0.0 | 0.0 | 8415 |
2 | Mariano Rivera | 11.1 | 0.0 | 88.9 | 0.0 | 0.0 | 0.0 | 0.0 | 8026 |
3 | Ronald Belisario | 85.4 | 12.7 | 0.0 | 0.0 | 0.0 | 1.9 | 0.0 | 7458 |
4 | Matt Thornton | 84.1 | 12.5 | 3.3 | 0.0 | 0.1 | 0.0 | 0.0 | 7240 |
5 | Ernesto Frieri | 82.9 | 5.6 | 0.0 | 10.4 | 1.1 | 0.0 | 0.0 | 7013 |

Least concentrated repertoires, starters (lowest HHI)

Rank | Name | FB% | SL% | CT% | CB% | CH% | SF% | KN% | HHI |
---|---|---|---|---|---|---|---|---|---|
1 | Shaun Marcum | 36.6 | 9.3 | 17.6 | 12.4 | 24.1 | 0.0 | 0.0 | 2470 |
2 | Freddy Garcia | 35.4 | 26.6 | 0.0 | 7.9 | 13.0 | 17.1 | 0.0 | 2485 |
3 | Bronson Arroyo | 42.6 | 20.6 | 5.1 | 14.2 | 17.6 | 0.0 | 0.0 | 2777 |
4 | Yu Darvish | 42.6 | 23.3 | 16.5 | 11.2 | 1.2 | 5.1 | 0.0 | 2783 |
5 | Mike Leake | 43.5 | 11.8 | 23.4 | 9.9 | 11.6 | 0.0 | 0.0 | 2812 |

Most concentrated repertoires, starters (highest HHI)

Rank | Name | FB% | SL% | CT% | CB% | CH% | SF% | KN% | HHI |
---|---|---|---|---|---|---|---|---|---|
1 | Bartolo Colon | 86.2 | 9.1 | 0.2 | 0.0 | 4.6 | 0.0 | 0.0 | 7534 |
2 | Tim Wakefield | 10.5 | 0.0 | 0.0 | 3.7 | 0.0 | 0.0 | 85.8 | 7486 |
3 | R.A. Dickey | 16.8 | 0.0 | 0.0 | 0.2 | 1.5 | 0.0 | 81.5 | 6927 |
4 | Justin Masterson | 78.4 | 20.3 | 0.0 | 0.0 | 1.3 | 0.0 | 0.0 | 6560 |
5 | Aaron Cook | 79.7 | 9.7 | 2.8 | 7.6 | 0.4 | 0.0 | 0.0 | 6512 |

Boring methodological footnote: There’s one primary conceptual problem with using HHI, and that’s that in certain situations it gives a counterintuitive result for this application. For instance, under our line of reasoning we would think that, ceteris paribus, a pitcher who throws a fastball 90% of the time and a change 10% of the time would have an easier decision to make than one who throws a fastball 90% of the time and a change and slider 5% each. However, the HHI barely separates the two (8200 vs. 8150), because splitting a rarely thrown pitch in two hardly moves the index. That makes sense in the context of market concentration, but not in this scenario. (The same issue holds for the Gini coefficient, for that matter.) There’s a very high correlation between HHI and the frequency of a pitcher’s most common pitch, though, and using the latter doesn’t change any of the conclusions of the post.