
Wear Down, Chicago Bears?

I watched the NFC Championship game the weekend before last via a moderately sketchy British stream. It used the Joe Buck/Troy Aikman feed, but whenever that went to commercials they had their own British commentary team whose level of insight, I think it’s fair to say, was probably a notch below what you’d get if you picked three thoughtful-looking guys at random out of an American sports bar. (To be fair, that’s arguably true of most of the American NFL studio crews as well.)

When discussing Marshawn Lynch, one of them brought out the old chestnut that big running backs wear down the defense and thus are likely to get big chunks of yardage toward the end of games, citing Jerome Bettis as an example of this. This is accepted as conventional wisdom when discussing football strategy, but I’ve never actually seen proof of this one way or another, and I couldn’t find any analysis of this before typing up this post.

The hypothesis I want to examine is that bigger running backs are more successful late in games than smaller running backs. All of those terms are tricky to define, so here’s what I’m going with:

  • Bigger running backs are determined by weight, BMI, or both. I’m using Pro Football Reference data for this, which has some limitations in that it’s not dynamic, but I haven’t heard of any source that has any dynamic information on player size.
  • Late in games is the simplest thing to define: fourth quarter and overtime.
  • More successful is going to be measured in terms of yards per carry. This is going to be compared to the YPC in the first three quarters to account for the baseline differences between big and small backs. The correlation between BMI and YPC is -0.29, which is highly significant (p = 0.0001). The low R squared (about 0.1) says that BMI explains about 10% of the variation in YPC, which isn’t a lot but does suggest a meaningful connection. There’s a plot below of BMI vs. YPC with the trend line added; it looks like a roughly monotonic effect to me, meaning that getting bigger will, on average, hurt YPC. (Assuming, of course, that the player is big enough to actually be an NFL back.)
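
For concreteness, here’s the mechanical part of that check as a sketch; the numbers are invented toy values, not the PFR data, so only the method carries over:

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation; no libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy values, NOT the PFR data: heavier backs with slightly lower YPC.
bmi = [28.0, 29.5, 30.1, 31.2, 32.4, 33.0, 34.2]
ypc = [4.6, 4.4, 4.5, 4.1, 4.2, 3.9, 3.8]

r = pearson_r(bmi, ypc)
r_squared = r ** 2  # share of YPC variance associated with BMI
# t statistic for testing r != 0 (compare to a t table with n - 2 df)
t = r * math.sqrt((len(bmi) - 2) / (1 - r * r))
```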

BMI & YPC

My data set consisted of career-level data split into 4th quarter/OT and 1st-3rd quarters, which I subset to only include carries occurring while the game was within 14 points (a cut popular with writers like Bill Barnwell—see about halfway down this post, for example) to attempt to remove huge blowouts, which may affect data integrity. My timeframe was 1999 to the present, which is when PFR has play-by-play data in its database. I then subset the list of running backs to only those with at least 50 carries in the first three quarters and in the fourth quarter and overtime (166 in all). (I looked at different carry cutoffs, and they don’t change any of my conclusions.)
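
A sketch of that subsetting logic, with made-up field names and toy plays (the real work was done against the PFR database):

```python
# Hypothetical play-by-play rows; field names are invented for illustration.
plays = [
    {"rb": "A", "quarter": 1, "margin": 3,  "yards": 5},
    {"rb": "A", "quarter": 4, "margin": 10, "yards": 2},
    {"rb": "A", "quarter": 4, "margin": 21, "yards": 9},  # blowout: dropped
    {"rb": "B", "quarter": 5, "margin": 0,  "yards": 4},  # treat OT as Q5
]

def close_game(play):
    """Keep only carries with the game within 14 points."""
    return abs(play["margin"]) <= 14

early = [p for p in plays if close_game(p) and p["quarter"] <= 3]
late = [p for p in plays if close_game(p) and p["quarter"] >= 4]

def ypc(rows):
    return sum(r["yards"] for r in rows) / len(rows)
```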

Before I dive into my conclusions, I want to preemptively bring up a big issue with this analysis: it relies entirely on aggregate-level data. That pairs up carries from different games, or even different years, which raises two problems immediately. The first is that we’re not directly testing the hypothesis; I think it is closer in spirit to interpret it as “if a big running back gets lots of carries early on, his (or his team’s) YPC will increase in the fourth quarter,” which can only be examined with game-level data. I’m not entirely sure what metrics to look at, as there are a lot of confounds, but it’s going in the bucket of ideas for future research.

The second is that, beyond having to look at this potential effect indirectly, we might actually have biases altering the perceived effect, as when a player runs ineffectively in the first part of the game, he will probably get fewer carries at the end—partially because he is probably running against a good defense, and partially because his team is likely to be behind and thus passing more. This means that it’s likely that more of the fourth quarter carries come when a runner is having a good day, possibly biasing our data.

Finally, it’s possible that the way that big running backs wear the defense down is that they soften it up so that other running backs do better in the fourth quarter. This is going to be impossible to detect with aggregate data, and if this effect is actually present it will bias against finding a result using aggregate data, as it will be a lurking variable inflating the fourth quarter totals for smaller running backs.

Now, I’m not sure that either of these issues will necessarily ruin any results I get with the aggregate data, but they are caveats to be mentioned. I am planning on redoing some of this analysis with play-by-play level data, but those data are rather messy and I’m a little scared of small sample sizes that come with looking at one quarter at a time, so I think presenting results using aggregated data still adds something to the conversation.

Enough equivocating, let’s get to some numbers. Below is a plot of fourth quarter YPC versus early game YPC; the line is the identity, meaning that points above the line are better in the fourth. The unweighted mean of the difference (Q4 YPC – Q1–3 YPC) is -0.14, with the median equal to -0.15, so by the regular measures a typical running back is less effective in the 4th quarter (on aggregate in moderately close games). (A paired t-test shows this difference is significant, with p < 0.01.)
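
The paired comparison above can be reproduced with the standard library alone; the YPC pairs here are toy values, not the 166 actual backs:

```python
import math

def paired_t(before, after):
    """t statistic for a paired t-test: mean of the differences vs. zero."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

early = [4.5, 4.2, 4.8, 4.0, 4.6, 4.3]  # Q1-3 YPC (toy)
q4 = [4.3, 4.1, 4.5, 3.9, 4.4, 4.2]     # Q4/OT YPC (toy)
t = paired_t(early, q4)
# Compare |t| to a t table with n - 1 = 5 df (about 4.03 at the two-sided 1% level).
```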

Q1-3 & Q4

A couple of individual observations jump out here, and if you’re curious, here’s who they are:

  • The guy in the top right, who’s very consistent and very good? Jamaal Charles. His YPC increases by about 0.01 yards in the fourth quarter, the second-smallest change (in magnitude) in the data (Chester Taylor has a drop of about 0.001 yards).
  • The outlier in the bottom right, meaning a major dropoff, is Darren Sproles, who has the highest early game YPC of any back in the sample.
  • The outlier in the top center with a major increase is Jerious Norwood.
  • The back on the left with the lowest early game YPC in our sample is Mike Cloud, whom I had never heard of. He’s the only guy below 3 YPC for the first three quarters.

A simple linear model gives us a best fit line of (Predicted Q4 YPC) = 1.78 + 0.54 * (Prior Quarters YPC), with an R squared of 0.12. That’s less predictive than I thought it would be, which suggests that there’s a lot of chance in these data and/or there is a lurking factor explaining the divergence. (It’s also possible this isn’t actually a linear effect.)
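
A least-squares line like that is easy to recompute by hand. The points below are toy values; the post’s actual fit was (Q4 YPC) ≈ 1.78 + 0.54 × (prior-quarters YPC):

```python
def ols(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

early = [3.5, 4.0, 4.5, 5.0]  # toy prior-quarters YPC
q4 = [3.7, 3.9, 4.2, 4.4]     # toy fourth-quarter YPC
a, b = ols(early, q4)         # slope < 1: regression toward the mean
```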

However, that lurking variable doesn’t appear to be running back size. Below is a plot showing running back BMI vs. (Q4 YPC – Q1–3 YPC); there doesn’t seem to be a real relationship. The plot below it shows the difference against fourth quarter carries (the horizontal line is the average value of -0.13), which somewhat suggests that the effect shrinks as sample size increases, though these data are non-normal, so it’s not an easy thing to immediately assess.

BMI & Diff

Carries & Diff

That intuition is borne out if we look at the correlation between the two, with an estimate of 0.02 that is not close to significant (p = 0.78). Using weight and height instead of BMI gives us larger apparent effects, but they’re still not significant (r = 0.08 with p = 0.29 for weight, r = 0.10 with p = 0.21 for height). Throwing these variables into the regression to predict Q4 YPC based on previous YPC also doesn’t have any effect that’s close to significant, though I don’t think much of that because I don’t think much of that model to begin with.

Our talking head, though, mentioned Lynch and Bettis by name. Do we see anything for them? Unsurprisingly, we don’t—Bettis has a net improvement of 0.35 YPC, with Lynch actually falling off by 0.46 YPC, though both of these are within one standard deviation of the average effect, so they don’t really mean much.

On a more general scale, it doesn’t seem like a change in YPC in the fourth quarter can be attributed to running back size. My hunch is that this is accurate, and that “big running backs make it easier to run later in the game” is one of those things that people repeat because it sounds reasonable. However, given all of the data issues I outlined earlier, I can’t conclude that with any confidence, and all we can say for sure is that it doesn’t show up in an obvious manner (though at some point I’d love to pick at the play by play data). At the very least, though, I think that’s reason for skepticism next time some ex-jock on TV mentions this.

A Reason Bill Simmons is Bad At Gambling

For those unaware, Bill Simmons, aka the Sports Guy, is the editor-in-chief of Grantland, ESPN’s more literary (or perhaps intelligent, if you prefer) offshoot. He’s hired a lot of really excellent writers (Jonah Keri and Zach Lowe, just to name two), but he continues to publish long, rambling football columns with limited empirical support. I find this somewhat frustrating given that the chief Grantland NFL writer, Bill Barnwell, is probably the most prominent data-oriented football writer around, but you take the good with the bad.

Simmons writes a column with NFL picks each week during the season, and has a pretty so-so track record for picking against the spread, as detailed in the first footnote to this article here. Simmons has also written a number of lengthy columns attempting to construct a system for gambling on the playoffs, and hasn’t done too great in this regard either. I’ve been meaning to mine some of these for a post for a while now, and since he’s written two such posts this year already (wild card and divisional round), I figured the time was right to look at some of his assertions.

The one I keyed on was this one, from two weeks ago:

SUGGESTION NO. 6: “Before you pick a team, just make sure Marty Schottenheimer, Herm Edwards, Wade Phillips, Norv Turner, Andy Reid, Anyone Named Mike, Anyone Described As Andy Reid’s Pupil and Anyone With the Last Name Mora” Isn’t Coaching Them.

I made this tweak in 2010 and feel good about it — especially when the “Anyone Named Mike” rule miraculously covers the Always Shaky Mike McCarthy and Mike “You Know What?” McCoy (both involved this weekend!) as well as Mike Smith, Mike “The Sideline Karma Gods Put A Curse On Me” Tomlin, Mike Munchak and the recently fired Mike Shanahan. We’re also covered if Mike Shula, Mike Martz, Mike Mularkey, Mike Tice or Mike Sherman ever make comebacks. I’m not saying you bet against the Mikes — just be psychotically careful with them. As for Andy Reid … we’ll get to him in a second.

That was written before the playoffs—after Round 1, he said he thought he might make it an ironclad rule (with “Reid’s name…[in] 18-point font,” no less).

Now, these coaches certainly have a reputation for performing poorly under pressure and making poor decisions regarding timeouts, challenges, etc., but do they actually perform worse against the spread? I set out to find this out, using the always-helpful pro-football-reference database of historical gambling lines to get historical ATS performance for each coach he mentions. (One caveat here: the data only list closing lines, so I can’t evaluate how the coaches did compared to opening spreads, nor how much the line moved, which could in theory be useful to evaluate these ideas as well.) The table below lists the results:

Playoff Performance Against the Spread by Select Coaches
Coach Win Loss Named By Simmons Notes
Childress 2 1 No Andy Reid Coaching Tree
Ditka 6 6 No Named Mike
Edwards 3 3 Yes
Frazier 0 1 No Andy Reid Coaching Tree
Holmgren 13 9 No Named Mike
John Harbaugh 9 4 No Andy Reid Coaching Tree
Martz 2 5 Yes Named Mike
McCarthy 6 4 Yes Named Mike
Mora Jr. 1 1 Yes
Mora Sr. 0 6 Yes
Phillips 1 5 Yes
Reid 11 8 Yes
Schottenheimer 4 13 Yes
Shanahan 7 6 Yes Named Mike
Sherman 2 4 Yes Named Mike
Smith 1 4 Yes Named Mike
Tice 1 1 Yes Named Mike
Tomlin 5 3 Yes Named Mike
Turner 6 2 Yes

A few notes: first, I’ve omitted pushes from these numbers, as PFR only lists two (both for Mike Holmgren). Second, the Reid coaching tree includes the three NFL coaches who served as assistants under Reid who coached an NFL playoff game before this postseason. Whether or not you think of them as Reid’s pupils is subjective, but it seems to me that doing it any other way is going to either turn into circular reasoning or cherry-picking. Third, my list of coaches named Mike is all NFL coaches referred to as Mike by Wikipedia who coached at least one playoff game, with the exception of Mike Holovak, who coached in the AFL in the 1960s and who thus a) seems old enough not to be relevant to this heuristic and b) is old enough that there isn’t point spread data for his playoff game on PFR, anyhow.

So, obviously some of these guys have had some poor performances against the spread: standouts include Jim Mora, Sr. at 0-6 and Marty Schottenheimer at 4-13, though the latter isn’t actually statistically significantly different from a .500 winning percentage (p = 0.052). More surprising, given Simmons’s emphasis on him, is the fact that Reid is actually over .500 lifetime in the playoffs against the spread. (That’s the point estimate, anyway; it’s not statistically significantly better, however.) This seems to me to be something you would want to check before making it part of your gambling platform, but that disconnect probably explains both why I don’t gamble on football and why Simmons seems to be poor at it. (Not that his rule has necessarily done him wrong, but drawing big conclusions on limited or contradictory evidence seems like a good way to lose a lot of money.)
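
As a check on those records, here’s an exact two-sided binomial test (my reconstruction; the post’s p = 0.052 for Schottenheimer looks like it came from a continuity-corrected proportion test, which gives a slightly different number than the exact version below):

```python
from math import comb

def binom_two_sided_p(wins, losses):
    """Exact two-sided binomial test of a win-loss record against a fair coin."""
    n = wins + losses
    k = min(wins, losses)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_marty = binom_two_sided_p(4, 13)  # Schottenheimer, 4-13 ATS
p_mora = binom_two_sided_p(0, 6)    # Mora Sr., 0-6 ATS
```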

Are there any broader trends we can pick up? Looking at Simmons’s suggestion, I can think of a few different sets we might want to look at:

  1. Every coach he lists by name.
  2. Every coach he lists by name, plus the Reid coaching tree.
  3. Every coach he lists by name, plus the unnamed Mikes.
  4. Every coach he lists by name, plus the Reid coaching tree and the unnamed Mikes.

A table with those results is below.

Combined Against the Spread Results for Different Groups of Coaches Cited By Simmons
Set of Coaches Number of Coaches in Set Wins Losses Winning Percentage (%) p-Value
Named 14 50 65 43.48 0.19
Named + Reid 17 61 71 46.21 0.43
Named + Mikes 16 69 80 46.31 0.41
All 19 80 86 48.19 0.70

As a refresher, the p-value is the probability that we would observe a result as or more extreme than the observed result if there were no true effect, i.e. if the selected coaches were actually average against the spread. (Here’s the Wikipedia article.) Since none of these are significant even at the 0.1 level (generally the lowest bar for treating a result as meaningful), we wouldn’t conclude that any of Simmons’s postulated sets are actually worse than average ATS in the playoffs. It is true that these groups have done worse than average, but the margins aren’t huge and the samples are small, so without a lot more evidence I’m inclined to think that there isn’t any effect here. These coaches might not have been very successful in the playoffs, but any effect seems to be built into the lines.
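
Those p-values appear to come from a two-sided test against a .500 record; a normal approximation with continuity correction (my assumption about the method) reproduces the table from the win-loss totals:

```python
import math

def ats_p_value(wins, losses):
    """Two-sided p-value vs. a .500 ATS record: normal approximation with
    continuity correction, which appears to match the table's figures."""
    n = wins + losses
    z = (abs(wins - n / 2) - 0.5) / math.sqrt(n / 4)
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

# Records straight from the table above.
records = {"Named": (50, 65), "Named + Reid": (61, 71),
           "Named + Mikes": (69, 80), "All": (80, 86)}
p_values = {name: ats_p_value(w, l) for name, (w, l) in records.items()}
```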

Did Simmons actually follow his own suggestion this postseason? Well, he picked against Reid, for Mike McCoy (first postseason game), and against Mike McCarthy in the wild card round, going 1-0-2, with the one win being in the game he went against his own rule. For the divisional round, he’s gone against Ron Rivera (first postseason game, in the Reid coaching tree) and against Mike McCoy, sticking with his metric. Both of those games are today, so as I type we don’t know the results, but whatever they are, I bet they have next to nothing to do with Rivera’s relationship to Reid or McCoy’s given name.

Man U and Second Halves

During today’s Aston Villa-Manchester United match, Iain Dowie (the color commentator) mentioned that United’s form is improving and that they are historically a stronger team in the second half of the season, meaning that they may be able to put this season’s troubles behind them and make a run at either the title or a Champions League spot. I didn’t get a chance to record the exact statement, but I decided to check up on it regardless.

I pulled data from the last ten completed Premier League seasons (via statto.com) to evaluate whether there’s any evidence that this is the case. What I chose to focus on was simply the number of first half and second half points for United, with first half and second half defined by number of games played (first 19 vs. last 19). One obvious problem with looking at this so simply is strength of schedule considerations. However, the Premier League, by virtue of playing a double round robin, is pretty close to having a balanced schedule—there is a small amount of difference in the teams one might play, and there are issues involving home and away, rest, and matches in other competitions, but I expect that’s random from year to year.

So, going ahead with this, has Man U actually produced better results in the second half of the season? Well, in the last 10 seasons (2003-04 – 2012-13), they had more points in the second half 4 times, and they did worse in the second half the other 6. (Full results are in the table at the bottom of the post.) The differences here aren’t huge—only a couple of points—but not only is there no statistically significant effect, there isn’t even a hint of an effect. Iain Dowie thus appears to be blowing smoke and gets to be the most recent commentator to aggravate me by spouting facts without support. (The aggravation in this case is compounded by the fact that this “fact” was wrong.)

I’ll close with two oddities in the data. The first is that there are 20 teams that have been in the Premiership for at least 5 of the last 10 years, and exactly one has a significant result at the 5% level for the difference between first half and second half. (Award yourself a cookie if you guessed Birmingham City.) This seems like a textbook example of multiplicity to me.

The second, for the next time you want to throw a real stumper at someone, is that there is one team in the last 16 years (all I could easily pull data for) that had the same goal difference and number of points in the two halves of the season. That team is 2002-03 Birmingham City; I have to imagine that finishing 13th with 48 points and a -8 goal difference is about as dull as a season can get, though they did win both their Derby matches (good for them, no good for this Villa supporter).

Manchester United Results by Half, 2003—2012
Year First Half Points Second Half Points Total Points First Half Goal Difference Second Half Goal Difference Total Goal Difference
2003 46 29 75 25 4 29
2004 37 40 77 17 15 32
2005 41 42 83 20 18 38
2006 47 42 89 31 25 56
2007 45 42 87 27 31 58
2008 41 49 90 22 22 44
2009 40 45 85 22 36 58
2010 41 39 80 23 18 41
2011 45 44 89 32 24 56
2012 46 43 89 20 23 43
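
The sign-test arithmetic can be checked directly from the table above:

```python
from math import comb

# Points by half, straight from the table (2003-04 through 2012-13).
first = [46, 37, 41, 47, 45, 41, 40, 41, 45, 46]
second = [29, 40, 42, 42, 42, 49, 45, 39, 44, 43]

better = sum(s > f for f, s in zip(first, second))  # second-half improvements
net = sum(second) - sum(first)                      # net second-half points

# Two-sided sign test: how unusual is 4-of-10 under a fair coin? Not at all.
tail = sum(comb(10, k) for k in range(min(better, 10 - better) + 1)) / 2 ** 10
p = min(1.0, 2 * tail)
```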

Break Points Bad

As a sentimental Roger Federer fan, the last few years have been a little rough, as it’s hard to sustain much hope watching him run into the Nadal/Djokovic buzzsaw again and again (with help from Murray, Tsonga, Del Potro, et al., of course). Though it’s become clear in the last year or so that the wizardry isn’t there anymore, the “struggles”* he’s dealt with since early 2008 are pretty frequently linked to an inability to win the big points.

*Those six years of “struggles,” by the way, arguably surpass the entire career of someone like Andy Roddick. Food for thought.

Tennis may be the sport with the most discourse about “momentum,” “nerves,” “mental strength,” etc. This is in some sense reasonable, as it’s the most prominent sport that leaves an athlete out there by himself with no additional help–even a golfer gets a caddy. Still, there’s an awful lot of rhetoric floating around there about “clutch” players that is rarely, if ever, backed up. (These posts are exceptions, and related to what I do below, though I have some misgivings about their chosen methods.)

The idea of a “clutch” player is that they should raise their game when it counts. In tennis, one easy way of looking at that is to look at break points. So, who steps their game up when playing break points?

Using data that the ATP provides, I was able to pull year-end summary stats for top men’s players from 1991 to the present, which I then aggregated to get career level stats for every man included in the data. Each list only includes some arbitrary number of players, rather than everyone on tour—this causes some complications, which I’ll address later.

I then computed the fraction of break points won and divided by the fraction of non-break point points won for both service points and return points, then averaged the two ratios. This figure gives you the approximate factor by which a player ups his game on a break point. Let’s call it clutch ratio, or CR for short.

This is a weird metric, and one that took me some iteration to come up with. I settled on this as a way to incorporate both service and return “clutchness” into one number. It’s split and then averaged to counter the fact that most people in our sample (the top players) will be playing more break points as a returner than a server.
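
To make the construction concrete, here’s the ratio for one hypothetical stat line (all field names and numbers are invented; the ATP summaries have analogous fields):

```python
# One hypothetical player-season, split into break points and everything else.
stats = {
    "serve_bp_won": 66, "serve_bp": 100,     # break points faced on serve
    "serve_oth_won": 600, "serve_oth": 1000, # all other service points
    "return_bp_won": 42, "return_bp": 100,   # break points while returning
    "return_oth_won": 400, "return_oth": 1000,
}

def clutch_ratio(s):
    """Average of (BP win rate / non-BP win rate) on serve and on return."""
    serve = (s["serve_bp_won"] / s["serve_bp"]) / (s["serve_oth_won"] / s["serve_oth"])
    ret = (s["return_bp_won"] / s["return_bp"]) / (s["return_oth_won"] / s["return_oth"])
    return (serve + ret) / 2

cr = clutch_ratio(stats)  # > 1 means the player lifts his game on break points
```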

The first interesting thing we see is that the average value of this stat is just a little more than one—roughly 1.015 (i.e. the average player is about 1.5% better in clutch situations), with a reasonably symmetric distribution if you look at the histogram. (As the chart below demonstrates, this hasn’t changed much over time, and indeed the correlation with time is near 0 and insignificant. And I have no idea what happened in 2004 such that everyone somehow did worse that year.) This average value, to me, suggests that we are dealing at least to some extent with adverse selection issues having to do with looking at more successful players. (This could be controlled for with more granular data, so if you know where I can find those, please holler.)

Histogram

Distribution by Year

Still, CR, even if it doesn’t perfectly capture clutch (as it focuses on only one issue, only captures the top players and lacks granularity), does at least stab at the question of who raises their game. First, though, I want to specify some things we might expect to see if a) clutch play exists and b) this is a good way to measure it:

  • This should be somewhat consistent throughout a career, i.e. a clutch player one year should be clutch again the next. This is pretty self-explanatory, but just to make clear: if a player’s improvement isn’t sustained, he isn’t “clutch,” he’s lucky. The absence of this consistency is one of the reasons the consensus among baseball folk is that there’s no variation in clutch hitting.
  • We’d like to see some connection between success and clutchness, or between having a reputation for being clutch and having a high CR. This is tricky and I want to be careful of circularity, but it would be quite puzzling if the clutchest players we found were journeymen like, I dunno, Igor Andreev, Fabrice Santoro, and Ivo Karlovic.
  • As players get older, they get more clutch. This is preeeeeeeeeeetty much pure speculation, but if clutch is a matter of calming down/experience/whatever, that would be one way for it to manifest.

We can tackle these in reverse order. First, there appears to be no improvement year-over-year in a player’s clutch ratio. If we limit to seasons with at least 50 matches played, the probability that a player had a higher clutch ratio in year t+1 than he did in year t is…47.6%. So, no year-to-year improvement, and actually a little decrease in clutch play. That’s fine; it just means clutch is not a skill someone develops. (The flip side is that it could be that younger players are more confident, though I’m highly skeptical of that. Still, the problem with evaluating these intangibles is that their narratives are really easily flipped.)

Now, the relationship between success and CR. Let’s first go with a reductive measure of success: what fraction of games a player won. Looking at either a season basis (50 match minimum, 1006 observations) or career basis (200 match minimum, 152 observations), we see tiny, insignificant correlations between these two figures. Are these huge datasets? No, but the total absence of any effect suggests there’s really no link here between player quality and clutch, assuming my chosen metrics are coherent. (I would have liked to try this with year end rankings, but I couldn’t find them in a convenient format.)

What if we take a more qualitative approach and just look at the most and least clutch players, as well as some well-regarded players? The tables below show some results in that direction.

Best Clutch Ratios
Name Clutch Ratio
1 Jo-Wilfried Tsonga 1.08
2 Kenneth Carlsen 1.07
3 Alexander Volkov 1.06
4 Goran Ivanisevic 1.05
5 Juan Martin Del Potro 1.05
6 Robin Soderling 1.05
7 Jan-Michael Gambill 1.04
8 Nicolas Kiefer 1.04
9 Paul Haarhuis 1.04
10 Fabio Fognini 1.04
Worst Clutch Ratios
Name Clutch Ratio
1 Mariano Zabaleta 0.97
2 Andrea Gaudenzi 0.97
3 Robby Ginepri 0.98
4 Juan Carlos Ferrero 0.98
5 Jonas Bjorkman 0.98
6 Juan Ignacio Chela 0.98
7 Gaston Gaudio 0.98
8 Arnaud Clement 0.98
9 Thomas Enqvist 0.99
10 Younes El Aynaoui 0.99

See any pattern to this? I’ll cop to not recognizing many of the names, but if there’s a pattern I can see, it’s that a number of the guys at the top of the list are real big hitters (I would put Tsonga, Soderling, Del Potro, and Ivanisevic in that bucket, at least). Otherwise, it’s not clear that we’re seeing the guys you would expect to be the most clutch players (journeyman Volkov at #3?), nor do I see anything meaningful in the list of least clutch players.

Unfortunately, I didn’t have a really strong prior about who should be at the top of these lists, except perhaps the most successful players—who, as we’ve already established, aren’t the most clutch. The only list of clutch players I could find was a BleacherReport article that used as its “methodology” their performance in majors and deciding sets, and their list doesn’t match with these at all.

Since these lists are missing a lot of big names, I’ve put a few of them in the list below.

Clutch Ratios of Notable Names
Overall Rank (of 152) Name Clutch Ratio
18 Pete Sampras 1.03
20 Rafael Nadal 1.03
21 Novak Djokovic 1.03
26 Tomas Berdych 1.03
71 Andy Roddick 1.01
74 Andre Agassi 1.01
92 Lleyton Hewitt 1.01
122 Marat Safin 1.00
128 Roger Federer 1.00

In terms of relative rankings, I guess this makes some sense—Nadal and Djokovic are renowned for being battlers, Safin is a headcase, and Federer is “weak in big points,” they say. Still, these are very small differences, and while over a career 1-2% adds up, I think it’s foolish to conclude anything from this list.

Our results thus far give us some odd ideas about who’s clutch, which is a cause for concern, but we haven’t tested the most important aspect of our theory: that this metric should be consistent year over year. To check this, I took every pair of consecutive years in which a player played at least 50 matches and looked at the clutch ratios in years 1 and 2. We would expect there to be some correlation here if, in fact, this stat captures something intrinsic about a player.

As it turns out, we get a correlation of 0.038 here, which is both small and insignificant. Thus, this metric suggests that players are not intrinsically better or worse in break point situations (or at least, it’s not visible in the data as a whole).
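
Mechanically, the consecutive-season pairing works like this (players and numbers below are invented; the real data gave r = 0.038):

```python
import math

# Hypothetical (player, year, clutch ratio) rows; all values invented.
seasons = [
    ("A", 2001, 1.03), ("A", 2002, 0.99), ("A", 2003, 1.02),
    ("B", 2001, 0.98), ("B", 2002, 1.01),
    ("C", 2002, 1.00), ("C", 2004, 1.04),  # gap year: no pair formed
]

by_player_year = {(p, y): cr for p, y, cr in seasons}
pairs = [(cr, by_player_year[(p, y + 1)])
         for (p, y), cr in by_player_year.items()
         if (p, y + 1) in by_player_year]

def pearson(pairs):
    """Pearson correlation between year-t and year-t+1 clutch ratios."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson(pairs)
```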

What conclusions can we draw from this? Here we run into a common issue with concepts like clutch that are difficult to quantify—when you get no result, is the reason that nothing’s there or that the metric is crappy? In this case, while I don’t think the metric is outstanding, I don’t see any major issues with it other than a lack of granularity. Thus, I’m inclined to believe that in the grand scheme of things, players don’t really step their games up on break point.

Does this mean that clutch isn’t a thing in tennis? Well, no. There are a lot of other possible clutch metrics, some of which are going to be supremely handicapped by sample size issues (Grand Slam performance, e.g.). All told, I certainly won’t write off the idea that clutch is a thing in tennis, but I would want to see significantly more granular data before I formed an opinion one way or another.

Tim McCarver and Going the Other Way

During the Tigers-Red Sox game last night, Tim McCarver said he thought it was a little odd that the Tigers would bring in lefty Drew Smyly to face David Ortiz while also leaving the shift on, since lefties are more likely to go to the opposite field against a left-handed pitcher. (At the very least, I know he said this last part. Memory is a tricky thing, and I’m now not sure whether he said this about Ortiz or someone else, possibly Alex Avila.) Being Tim McCarver, he didn’t say why this might be true, nor did he cite a source for this information, putting this firmly in the realm of obnoxious hypotheses.

The first question is whether or not this is true. For that, there are these handy aggregated spray charts, courtesy of Brooks Baseball.

David Ortiz Aggregated Spray Charts by Pitcher Handedness, 2007–2013

Alex Avila Aggregated Spray Charts by Pitcher Handedness, 2007–2013

Based on these data, I have to say it seems like McCarver’s assertion is true: they are slightly more likely to go to left against a left-handed pitcher. I don’t have enough information to say whether the differences are statistically significant (I’d guess they are, given the number of balls these guys have put into play in the last 5-7 years) or practically significant (I kinda doubt it). Regardless of the answer, though, the fact remains that the appropriate thing to do is to bring in the lefty and shift slightly less drastically, so who knows why McCarver brought this up to begin with. After all, Ortiz hits drastically worse against lefties (his OPS against lefties is 24% smaller than his lifetime rate, via baseball-reference), as does Avila (36%).

There’s also the question of why this might be true, and in fairness to McCarver, there are some pretty plausible mechanisms for what he was saying. One is that a breaking pitch from a left-hander is more likely to be on the outer part of the plate for a left-handed batter than a similar pitch from a right-handed batter, and outside pitches are more likely to get hit the other way. Another is that left-handed batters can’t pick up a pitch as easily against a left-handed pitcher, so they are more likely to make late contact, which is in turn more likely to go to the opposite field. I can’t necessarily confirm either of these mechanisms empirically, though looking at Brooks splits for Avila and Ortiz suggests that the fraction of outside pitches they see against left-handers is about 3 percentage points larger than the fraction against righties.

So, what McCarver said was true (though not terribly helpful), and there are seemingly good reasons for it to be true. I still posted something, though, because this is a great example of something that pisses me off about sports commentators–a tendency to toss out suppositions and not bother with supporting or explaining them. (Another good example of this is Hawk Harrelson.) That tendency, along with their love of throwing out hypotheses that are totally unfalsifiable (McCarver asserting that the pitching coach coming out to the mound is valuable, e.g.), is one of the things I plan to deal with pretty regularly in this space.

(Happy first post, everyone.)