Category Archives: Musings

Does the Call Need to Stand?

I have a post over at the Hardball Times today about a way to improve replay review. Check it out!

Notes from SaberSeminar

I was fortunate enough to be at SaberSeminar this past weekend, held at Boston University and organized by (among others) Dan Brooks, the titular Brooks of PitchF/X site BrooksBaseball. I took some notes throughout the weekend, and I’ve typed them up below, broken into smaller observations. (All of the reflection was done on the bus home, so any mistakes are my own due to typing with a fried brain.)

One other thing to tell is that I presented some research I did on shifts in the strike zone (it actually came out of this article about high socks), and I’m going to be writing that up as an article soon, though it may appear at a different site. All in all, quite an enjoyable weekend even after factoring in the scattered criticisms below.

It was my first time at a baseball conference and the first time I’d been at any conference in quite some time, and it actually struck me as pretty similar to an indie music festival. The crowd wasn’t huge (a couple hundred people), and they all knew most of the speakers, who mostly stuck to greatest hits sort of things. (Most of what I saw presented wasn’t novel, especially by the more prominent folk.) That’s not to take away from the sessions—it was still interesting to meet and hear people that I had only read, and it was still great to be around a group where everyone was interested in the same sort of stuff and you could bring up things like SIERA and wOBA without much risk of confusion.

They had a panel discussion featuring three of the Red Sox baseball operations interns, and I was reminded of how skeezy some aspects of that system are. The moderator talked about how there were fewer MLB intern slots than there used to be because the feds cracked down on illegal internships, which he framed as a bad thing. I found that a bit horrifying; it seems odd to me that a team with a payroll of hundreds of millions would cut staff rather than pay a semi-reasonable wage to their junior people. (Even if it makes economic sense, it seems like a bad way to treat people.)

The interns, for their part, didn’t offer a whole lot of insight into things. (Not that I blame them; it’s hard to be insightful during a panel discussion.) They twice dodged the question of how many hours they work, only saying “a lot.” (It’s possible I’m being too harsh and they actually don’t know; because I’m billed out by the hour at work, I have to keep very accurate time logs, which means I know how much I’ve worked every week since I’ve started, but I may be an outlier in that regard.) One of the failings of the panel was that it didn’t include anyone who had been an intern and washed out (either quit or wasn’t offered a job), which would have been more informative and more interesting. (I understand not wanting to irk any of the teams, but I don’t think this is too inflammatory.) This is the same problem I ran into in college a lot, where most of the advice I got about whether to get a Ph.D. came from people who had not only loved grad school but also met with astounding success afterward.

I also thought about the fact that they are hiring recent grads of extremely expensive schools (Columbia, Yale, and Georgetown, in this case) to be extremely underpaid interns; I wonder how much of their labor pool is indirectly disqualified simply due to a lack of connections or a need to make money for family or student loan reasons. It’s very puzzling to me that teams, despite being flush with cash, hire people similarly to high-prestige, low-money companies like magazines rather than high-prestige, rich firms like banks and tech companies. I’d love to see more discussion from people who know more about the industry than I do.

One mostly unstated theme that kept occurring to me throughout the weekend was the issue of class, opportunity, and privilege, which popped up in a number of different ways:

Baseball Prospectus’s Russell Carleton discussed how teams can (or should) help their players develop into better adults by focusing on their financial, practical, nutritional, and mental well-being. While he focused mostly on the positive effects it would have on a player’s career and thus a team’s investment (fewer distractions and a better makeup will help talent win out), it seems to me that it’s a good thing in its own right to improve the life skills of the wash outs—especially the ones that skipped college and/or come from poorer backgrounds. Fewer guys in society who behave like Dirk Hayhurst’s teammates is probably a good thing.
Tom Tippett, a senior analyst for the Red Sox, talked about how he always appreciates players that are diamonds in the rough, i.e. guys who went undrafted, played in the independent leagues, etc. The thing is, though, that equality of access doesn’t exist, and it’s an interesting thought experiment to think of how many guys get cut before they figure this out. In particular, I wonder how many guys are able to get to a good college or (building on Carleton’s point) hang around the minors longer (thus increasing the opportunity that they make the big leagues) because they have better “makeup” that really comes from growing up with a few more advantages.
There was a demo of TrackMan, which is a portable radar system that can be used to evaluate pitch speed and rotation. One of the guys I was talking to pointed out that they’d sold as many of the systems as they could to clubs and agencies and were hoping to move on to selling it to amateurs. I don’t know what one costs, but the idea of buying a portable radar system for your high school pitcher seems like a caricature of what a rich family gunning for a scholarship would do (analogous to all of the academic tutoring and test prep that a lot of people I know did).
Relatedly, a number of guys talked about the various high school showcases that pit the best high school talent in a region against each other, and one casually mentioned how much money they bring in. Again, seems like the sort of thing that serves to extract cash from hyperzealous parents and limit the opportunities for kids of less means, but I don’t know enough about the system to comment.
Internship opportunities for big league teams, which I discussed above.
I’ve been of the conviction for a while that pro sports would be equally or more enjoyable and substantially less ethically problematic if the teams were run as non-profits in the same general manner as art museums and whatnot (the Green Bay Packers are something like this already), and while I won’t go into that further here, if that were the case I think it’d be easier for a lot of the explicit privilege issues to be brought up in the game. While teams are still nominally concerned with profits, it’s a lot easier for them to sidestep problems that in a better world they could help address.

There were a number of talks with a more medical and scientific focus, and they provided good examples of how hard it is to apply these things rigorously to baseball (or any other real world application). There are lots of studies with very small N (“N=4” appeared on one slide describing research that had been published), and they are presumably subject to the same sorts of issues that all public health and social science papers are. While I’m sure lots of teams (in all sports) would love to bring in scientists to help them with things like sleep and vision, I imagine there’s a lot of stuff that falls apart between the lab and the field (if it even exists at all).

I should mention that the first talk of the conference, by a UC-Riverside professor who focuses on vision, did have experimental evidence of how improved vision helps college player performance, but it’s still tiny samples and only college students, and thus to be taken with a grain of salt.

There was an interesting panel featuring Matt Swartz, Ben Baumer, and Vince Gennaro discussing the relationship between winning and teams’ making money that prompted at least a couple article ideas for the future. Gennaro said he thought that the way teams spend money might change a bit after the addition of the second wild card, as there is now much greater variety across playoff teams in terms of how valuable the postseason slot is—the first seed became more valuable and the wild card slots substantially less. I have some thoughts on that, but will leave them for future articles.

Astros’ GM Jeff Luhnow gave a pleasant enough if relatively fact-free talk, the main focus of which was the importance of convincing the uniformed personnel of the importance of the sabermetric principles that buck conventional wisdom. He used the example of the shift and how it took the Astros three years to actually get people on board with it; obviously, if the players and the manager don’t like it, it won’t work as well as it would otherwise. I honestly wouldn’t be surprised if this becomes much less of an issue in 10 or so years, when the reasoning will have permeated a bit more through the baseball establishment and managers and young players will be a lot more open to things.

Vince Gennaro gave a very similar talk, and one thing he brought up was that you need to strike a balance between sticking to general principles (about shifting, pitcher workload, etc.) and making exceptions where warranted. Given that Luhnow talked about how he had made too many exceptions about when to shift last year, the point was hammered home, though it’s a vague enough point that it’s hard to really implement. (I was also reminded of this recent article about Ruben Amaro and exceptions.)

While Red Sox GM Ben Cherington didn’t discuss anything much more novel than what Luhnow covered, he was a lot more personable and down-to-earth while doing so. I imagine some of that is personality and a lot of it is the result of being the GM of the defending World Series champs and talking in his own backyard instead of presiding over three years of horrible teams and a lot of criticism from around baseball.

The last question asked of Luhnow was a minute-long ramble that wasn’t really a question and basically turned into “haha, you screwed up the Brady Aiken situation,” and Luhnow looked pretty peeved afterward, prompting Dan Brooks to tell the crowd not to be jerks to the presenters. There were a lot of bad questions all weekend, especially to Cherington, Red Sox manager John Farrell, and Luhnow, who clearly couldn’t say anything about their teams to us that was any more interesting than what they tell the media after a game. That didn’t stop people from bugging Farrell about his bullpen, though.

There was a bit more offensive humor in some of the talks than I would have expected—one professor made a joke about George W. Bush being brain damaged, and another managed to have a slide showing him wearing a t shirt that said “Drunk Bitches Love Me,” a slide with a cartoon captioned “I Will Fucking Cut You, Bitch,” and threw in a couple fat jokes for good measure. As anyone who knows me will attest, I have a reasonably sharp sense of humor, but throwing around jokes with misogynistic overtones at a conference that I would estimate was 90–95% men made me cringe.

More bullets from John Farrell’s talk:

When asked (I believe by accomplished sabermetrician Mitchel Lichtman, aka MGL) about his handling of the bullpen on Friday night, he discussed how “it’s about 162 games” (specifically about pulling Koji Uehara after an easy inning). While that’s certainly important, I wonder how often it’s just used as a crutch to justify poor decisions. As Grantland’s Bill Barnwell has written before, you basically only ever see nebulous qualitative concerns like Uehara’s overwork invoked to defend conservative decisions, never more aggressive ones.
Farrell thinks that the spree of Tommy John surgeries is more due to guys overthrowing (to impress scouts) than it is due to the sheer volume of pitches. The surgeon that spoke later in the conference disagreed.
He mentioned that it’s easier to use less conventional strategies once opponents start doing it, because it seems more normal to the players. One obvious consequence of this is that really conspicuous tactical advances like shifting are going to be relatively short-lived advantages.
He thinks that introducing replay and the corresponding promotion of several umpires has led to an expanded strike zone and that’s part of the continued downturn in baseball. That seems like a testable hypothesis, and something I’ll probably look at soon.
Farrell mentioned that figuring out how guys will react when they fail as baseball players for the first time is an important part of helping players move through the system, which I guess is a big part of what Russell Carleton later talked about. It reminded me a lot of what people say about elite colleges and places like Stuyvesant High School, where a lot of people have to adjust from being in the 99^th percentile of their peers at their prior institution to being median or below at their new one.

Russell Carleton gets major points for treating data as a plural noun rather than a singular one; he was the only one I noticed doing that all conference. I’d be curious to see what the usage rates are depending on background, with my guess being that people with more academic experience use “are” more than people who mostly use data in a private sector setting. (Yes, I’m a pedant about some of these things.)

Two White Sox notes from people’s presentations:

Apparently Tyler Flowers is fifth in the bigs in saving runs by catch framing. Some of that is surely because of his workload, but it’s still a surprise.
Not surprising, but something I’d forgotten: Erik Johnson was BP’s #1 prospect for the White Sox at the beginning of the year. Sigh.

Another one of those themes that kept popping up to me during the weekend was the idea of how teams preserve their edges, especially the ones they derive through quantitative and sabermetric means. (I’m reminded of the Red Queen hypothesis, which is an evolutionary biology idea I learned about through quiz bowl that applies pretty well to baseball analysis. Basically, you have to keep advancing in absolute terms to stay in the same place relatively, because if you are complacent people will catch up to you naturally.) A few places that this issue manifested itself:

MGL said that, for general short term forecasting, any of the major projection public systems (Oliver, Steamer, PECOTA, and ZiPS) will do. He actually suggested that teams were probably wasting their time trying to come up with a better general forecasting system and that understanding volatility and more specific systems is probably more important. Jared Cross, developer of Steamer, disagreed a little bit; his view was that the small gap between current projection systems and perfect estimates meant that a seemingly marginal improvement would actually mean a lot because of how competitive things are.
The sort of innovations Carleton discussed are the sort of thing that would pretty quickly spread throughout baseball, making any one team;s advantage relatively fleeting. Of course, if it spreads then it’s likely to increase the overall quality of the talent pool, which would lead to either better games or more teams, which is probably good. (It’s not necessarily good for the current players; any progress in player evaluation and development is unlikely to actually help players, given that the total number of jobs isn’t increasing; if one player does better, another loses his job.)
Tom Tippett said that if he had his druthers the public wouldn’t have access to PitchF/X data, because it allows teams who don’t have good analysts to borrow heavily from the public and thus decreases the advantage that analytically-inclined organizations hold. While I think that’s true from his self-interested perspective, I think it’s a bit short-sighted overall, and I wish he’d answered with a “good for the game” perspective. When you think about what’s best for fans, I think defending closed systems is probably harder; one of the things I like about baseball is how freely available the data are, and to the extent that becomes less true I think it’s a sad thing.
On that note, I’m a bit amused by what people think of as being “trade secrets”; there were lots of teams bouncing around and a few presentations by companies that are using highly proprietary data analysis methods that they are trying to sell people on. Again, it’s hard for me to evaluate how meaningful that stuff is (though people love kicking around the figure that pro teams are five years ahead of the public in their understanding of things), but even if it does represent a competitive advantage it’s still pretty funny to step back and think about the secrecy that’s applied to sports.

Several different people brought up StatCast, which is the new data collection system MLB Advanced Media is going to roll out some time soon; it will provide a huge amount of data on how fast players move and how quickly they react that will allow for analysis that’s a bit more along the lines of what the SportVU cameras do in basketball. (See the videos in the above link for examples.) There’s still no sense of whether or not it will be made public (and in what form it might be made public), but people were uniformly excited about it.

The projections folk were united in the belief that it would have a huge effect on projecting defense, to the point where MGL thinks defense will go from being the hardest component of the sport to analyze and predict to the easiest. There was a bit more divergence about what it might do for pitching and batting analysis, as well as about when it would come out—one speaker quoted MLB and said it would be ready to go by the beginning of next year, whereas Dave Cameron pointed out that test data hadn’t been released to the teams yet despite what was originally promised and thus thought it was highly unlikely that the data would be ready for teams by next year, much less ready for public consumption.

Dan Brooks jokingly introduced a hitting metric he called “GIP,” for Google Images Performance. It was prompted by the fact that a Google search for “miguel cabrera hitting” or “david ortiz hitting” finds pictures of them hitting home runs, whereas a query for “jose molina hitting” gets mostly pictures of him behind the plate.

More notes from Tippett’s talk:

He started by mentioning that the hardest part of his job is deciding when a player’s underperformance is real and not just noise; I don’t think I’ve ever seen a rigorous evaluation of how to do that using Bayes’ rule and the costs of Type I and Type II errors (booting a good player and keeping a bad one, respectively), but I would love to read one. (How projection systems react to new performance data is in principle just Bayesian reasoning regardless.)
He non-snarkily talked about the “momentum” that the Red Sox had, and nobody asked about it. All that probably means is that when he’s talking outside the office he’s perfectly willing to let a bit of narrative creep in.
He talked about their decision to cut Grady Sizemore and how it was related to certain incentives in his contract that would have vested. I casually wonder how cavalier teams are allowed to be in explicitly making decisions based on players’ contract incentives, given that there have been talks about how players might opt to file grievances about these things in the past. (The one that comes to mind is Brett Myers, who would have had an option vest if he finished a certain number of games for the White Sox a couple years back. The commentary I read suggested that conspicuously changing his usage pattern would have been grounds for a grievance, but I don’t know how that applies to situations like Sizemore’s.)

What’s the Point of DIPS, Anyway?

Leave a reply

In the last piece I wrote, I mentioned that I have some concerns about the way that people tend to think about defense independent pitching statistics (DIPS), especially FIP. (Refresher: Fielding Independent Pitching is a metric commonly used as an ERA estimator based on a pitcher’s walk, strikeout, and HR numbers.) I’m writing this piece in part as a way for me to sort some of my thoughts on the complexities of defense and park adjustments, not necessarily to make a single point (and none of these thoughts are terribly original).

All of this analysis starts with this equation, which is no less foundational for being almost tautological: Runs Allowed = Fielding Independent Pitching + Fielding Dependent Pitching. (Quick aside: Fielding Independent Pitching refers both to a concept and a metric; in this article, I’m mostly going to be talking about the concept.) In other words, there are certain ways of preventing runs that don’t rely on getting substantial aid from the defense (strike outs, for instance), and certain ways that do (allowing soft contact on balls in play).

In general, most baseball analysts tend to focus on the fielding independent part of the equation. There are a number of good reasons for this, the primary two being that it’s much simpler to assess and more consistent than its counterpart. There’s probably also a belief that, because it’s more clearly intrinsic to the pitcher, it’s more worthwhile to understand the FI portion of pitching. There are pitchers for whom we shy away from using the FI stats (like knuckleballers), but if you look at the sort of posts you see on FanGraphs, they’ll mostly be talking about performance in those terms.

That’s not always (or necessarily ever) a problem, but it often omits an essential portion of context. To see how, look at these three overlapping ways of framing the question “how good has this pitcher been?”:

1) If their spot on their team were given to an arbitrary (replacement-level or average) pitcher, how much better or worse would the team be?

2) If we took this pitcher and put them on a hypothetically average team (average in terms of defense and park, at least), how much better or worse would that team be?

3) If we took this pitcher and put them on a specific other team, how much better or worse would that team be?

Roughly speaking, #2 is how I think of FanGraphs’ pitcher WAR. #1 is Baseball Reference’s WAR. I don’t know of anywhere that specifically computes #3, but in theory that’s what you should get out of a projection system like Baseball Prospectus’s PECOTA or the ZiPS numbers found at FanGraphs’. (In practice, my understanding is that the projections aren’t necessarily nuanced enough to work that out precisely.)

The thing, though, is that pitchers don’t work with an average park and defense behind them. You should expect a fly ball pitcher to post better numbers with the Royals and their good outfield defense and a ground ball pitcher to do worse in front the butchers playing in the Cleveland infield. From a team’s perspective, though, a run saved is a run saved, and who cares whether it’s credited to the defense, the pitcher, or split between the two? If Jarrod Dyson catches the balls sent his way, it’s good to have a pitcher who’s liable to have balls hit to him. In a nutshell, a player’s value to his team (or another team) is derived from the FIP and the FDP, and focusing on the FIP misses some of that. Put your players in the best position for them to succeed, as the philosophy often attributed to Earl Weaver goes.

There are a number of other ways to frame this issue, which, though I’ve been talking in terms of pitching, clearly extends beyond that into nearly all of the skills baseball players demonstrate. Those other frames are all basically a restatement of that last paragraph, so I’ll try to avoid belaboring the point, but I’ll add one more example. Let’s say you have two batters who are the same except for 5% of their at-bats, which are fly balls to left field for batter A and to right field for batter B. By construction, they are players of identical quality, but player B is going to be worth more in Cleveland, where those fly balls are much more likely to go out of the park. Simply looking at his wRC+ won’t give you that information. (My limited knowledge of fantasy baseball suggests to me that fantasy players, because they use raw stats, are more attuned to this.)

Doing more nuanced contextual analysis of the sort I’m advocating is quite tricky and is beyond my (or most people’s) ability to do quickly with the numbers we currently have available. I’d still love, though, to see more of it, with two things in particular crossing my mind.

One is in transaction analysis. I read a few pieces discussing the big Samardzija trade, for instance, and in none did they mention (even in passing) how his stuff is likely to play in Oakland given their defense and park situation. This isn’t an ideal example because it’s a trade with a lot of other interesting aspects to it, but in general, it’s something I wish I saw a bit more of—considering the amount of value a team is going out of a player after adjusting for park and defense factors. The standard way of doing this is to adjust things from his raw numbers to a neutral context, but bringing things one step further, though challenging, should add another layer of nuance. (I will say that in my experience you see such analyses a bit more with free agency analyses, especially of pitchers.)

The second is basically expanding what we think of as being park and defensive adjustments. This is likely impossible to do precisely without more data, but I’d love to see batted ball data used to get a bit more granular in the adjustments; for instance, dead pull hitters should be adjusted differently from guys who use the whole field. This isn’t anything new—it’s in the FanGraphs page explaining park factors—but it’s something that occasionally gets swept under the rug.

One last note, as this post gets ever less specific: I wonder how big the opportunity is for teams to optimize their lineups and rotations based on factors such as these—left-handed power hitters go against the Yankees, ground ball hitters against the Indians, etc. We already see this to some extent, but I’d be curious to see what the impact is. (If you can quantify how big an edge you’re getting on a batter-by-batter basis—a big if—you could run some simulations to quantify the gain from all these adjustments. It’s a complex optimization problem, but I doubt it’s impossible to estimate.)

One thing I haven’t seen that I’d love for someone to try is for teams with roughly interchangeable fourth, fifth, and sixth starters to juggle their pitching assignments each time through the order to get the best possible matchups with respect to park, opponent, and defense. Ground ball pitchers pitch at Comiskey, for instance, and fly ball pitchers start on days when your best outfield is out there. I don’t know how big the impact is, so I don’t want to linger on this point too much, but it seems odd that in the era of shifting we don’t discuss day-to-day adjustments very much.

And that’s all that I’m talking about with this. Defense- and park-adjusted statistics are incredibly valuable tools, but they don’t get you all the way there, and that’s an important thing to keep in mind when you start doing nuanced analyses.

Leaf-ed Behind by Analytics

Leave a reply

As you may have heard, there’s been a whole hullabaloo recently in the hockey world about the Toronto Maple Leafs. Specifically, they had a good run last year and in the beginning of this season that the more numerically-inclined NHL people believed was due to an unsustainably high shooting percentage that covered up their very weak possession metrics. Accordingly, the stats folk predicted substantial regression, which was met with derision by many Leafs fans and most of the team’s brass. The Leafs have played very poorly since that hot streak and have been eliminated from the playoffs; just a few weeks back, they had an 84% chance of making it. (See Sports Club Standings for the fancy chart.)

Unsurprisingly, this has lead to much saying of “I told you so” by the stats folk and a lot of grumblings about the many flaws of the current Leafs administration. Deadspin has a great write up of the whole situation, but one part in particular stood out. This is a quotation from the Leafs’ general manager, Dave Nonis:

“We’re constantly trying to find solid uses for [our analytics budget,” Nonis said. “The last six, seven years, we’ve had a significant dollar amount in our budget for analytics and most of those years we didn’t use it. We couldn’t find a system or a group we felt we could rely on to help us make reasonable decisions.”

[…]

“People run with these stats like they’re something we should pay attention to and make decisions on, and as of right now, very few of them are worth anything to us,” he said at one point during the panel, blaming media and fans for overhyping the analytics currently available.

This represents a mind-boggling lack of imagination on their part. Let’s say they honestly don’t think there’s a good system currently out there that could help them—that’s entirely irrelevant. They should drop the cash and try to build a system from scratch if they don’t like what’s out there.

There are four factors that determine how good the analysis of a given problem is going to be: 1) the analysts’ knowledge of the problem, 2) their knowledge of the tools needed to solve the problem (basically, stats and critical thinking), 3) the availability of the data, and 4) the amount of time the analysts have to work on the problem. People who know about both hockey and data are available in spades; I imagine you can find a few people in every university statistics department and financial firm in Canada that could rise to the task, to name only two places these people might cluster. (They might not know about hockey stats, but the “advanced” hockey stats aren’t terribly complex, so I have faith that anyone who knows both stats and hockey can figure out their metrics.)

For #3: the data aren’t great for hockey, but they exist and will get better with a minimal investment in infrastructure. Analysts’ having sufficient time is the most important factor in progress, though, and the hardest one to substitute; conveniently, time is an easy thing for the team to buy (via salary, which they even get a discount on because of the non-monetary benefits of working in hockey). If they post some jobs at a decent salary, they basically have their pick of statistically-oriented hockey fans. If a team gets a couple of smart people and has them working 40-60 hours a week thinking about hockey and bouncing ideas off of each other, they’re going to get some worthwhile stuff no matter what.

Let’s say that budget is $200,000 per year, or a fraction of the minimum player salary. At that level, one good idea from the wonks and they’ve paid for themselves many times over. Even if they don’t find a grand unified theory of hockey, they can help with more discrete analyses and provide a slightly different perspective on decisions, and they’re so low cost that it’s hard to see how they’d hurt a team. (After all, if the team thinks the new ideas are garbage it can ignore them—it’s what was happening in the first place, so no harm done.) The only way Toronto’s decision makes sense is if they think that analytics not only are currently useless but can’t become useful in the next decade or so, and it’s hard to believe that anyone really thinks that way. (The alternative is that they’re scared that the analysts would con the current brass into a faulty decision, but given their skepticism that seems like an unlikely problem.)

Is this perspective a bit self-serving? Yeah, to the extent that I like sports and data and I’d like to work for a team eventually. Regardless, it seems to me that the only ways to justify the Leafs’ attitude are penny-pinching and the belief that non-traditional stats are useless, and if either of those is the case, something has gone very wrong in Toronto.

Brackets, Preferences, and the Limits of Data

Leave a reply

As you may have heard, it’s March Madness time. If I had to guess, I’d wager that more people make specific, empirically testable predictions this week than any other week of the year. They may be derived without regard to the quality of the teams (the mascot bracket, e.g.), or they might be fairly advanced projections based on as much relevant data as are easily available (Nate Silver’s bracket, for one), but either way we’re talking about probably billions of predictions. (At 63 picks per bracket, we “only” need about 16 million brackets to get to a billion picks, and that doesn’t count all the gambling.)

What compels people to do all of this? Some people do it to win money; if you’re in a small pool, it’s actually feasible that you could win a little scratch. Other people do it because it’s part of their job (Nate Silver, again), or because there might be additional extrinsic benefits (I’d throw the President in that category). This is really a trick question, though: people do it to have fun. More precisely, and to borrow the language of introductory economics, they maximize utility.

The intuitive definition of utility can be viewed as pretty circular (it both explains and is defined by people’s decisions), but it’s useful as a way of encapsulating the notion that people do things for reasons that can’t really be quantified. The notion of unquantifiability, especially unquantifiable preferences, is something people sometimes overlook when discussing the best uses of data. Yelp can tell you which restaurant has the best ratings, but if you hate the food the rating doesn’t do you much good.*

One of the things I don’t like about the proliferation of places letting you simulate the bracket and encouraging you to use that analysis is that it disregards utility. They presume that your interests are either to get the most games correct or (for some of the more sophisticated ones) to win your pool. What that’s missing is that some of us have strongly ingrained preferences that dictate our utility, and that that’s okay. My ideal, when selecting a bracket, is to make it so I have as high a probability as possible of rooting for the winner of a game.

For instance, I don’t think I’ve picked Duke to make it past the Sweet Sixteen in the last 10 or more years. If they get upset before then, my joy in seeing them lose well outweighs the damage to my bracket, especially since most people will have them advancing farther than I do. On the other hand, if I pick them to lose in the first round**, it will just make the sting worse when they win. I’m hedging my emotions, pure and simple.***

This is an extreme example of my rule of thumb when picking teams that I have strong preferences for, which is to have teams I really like/dislike go one round more/less than I would predict to be likely. This reduces the probability that my heart will be abandoned by my bracket. As a pretty passive NCAA fan, I don’t apply this to too many teams besides Duke (and occasionally Illinois, where I’m from) on an annual basis, but I will happily use it with a specific player (Aaron Craft, on the negative side) or team (Wichita State, on the positive side) that is temporarily more charming or loathsome than normal. (This general approach applies to fantasy, as well: I’ve played in a half dozen or so fantasy football leagues over the years, and I’ve yet to have a Packer on my team.)

However, with the way the bracket is structured, this doesn’t necessarily torpedo your chances. Duke has a reasonable shot of doing well, and it’s not super likely that a 12th seeded midmajor is going to make a run, but my preferred scenarios are not so unlikely that they’re not worth submitting to whichever bracket challenge I’m participating in. This lengthens how long my bracket will be viable enough that I’ll still care about it and thus increase the amount of time I will enjoy watching the tournament. (At least, I tell myself that. My picks have crashed and burned in the Sweet Sixteen the last couple of years.)

Another wrinkle to this, of course, is that for games I have little or no prior preference in, simply making the pick makes me root for the team I selected. If it’s, say, Washington against Nebraska, I will happily pick the team in the bracket I think is more likely to win and then pull hard for the team. (I’m not immune to wanting my predictions to be valid.) So, the weaker my preferences are, the more I hew toward the pure prediction strategy. Is this capricious? Maybe, but so is sport in general.

I try not to be too normative in my assessments of sports fandom (though I’m skeptical of people who have multiple highly differing brackets), and if your competitive impulses overwhelm your disdain for Duke, that’s just fine. But if you’re like me, pick based on utility. By definition, it’ll be more fun.

* To be fair, my restaurant preferences aren’t unquantifiable, and the same is true for many other tastes. My point is that following everyone else’s numbers won’t necessarily yield you the best strategy for you.

** Meaning the round of 64. I’m not happy with the NCAA for making the decision that led to this footnote.

*** Incidentally, this is one reason I’m a poor poker player. I don’t enjoy playing in the optimal manner enough to actually do it. Thankfully, I recognize this well enough to not play for real stakes, which amusingly makes me play even less optimally from a winnings perspective.

Justice, Unobstructed

Leave a reply

There’s already been a large amount of figurative ink spillage about this, but I wanted to throw in a couple of thoughts. The first is that I’m of the opinion the call is unimpeachably accurate, though one can probably make a reasonable argument that a no-call would also have been correct. The rest of the thoughts are more about the reactions to the call.

An awful lot of people have been criticizing Saltalamacchia for throwing to third. While it (obviously) didn’t turn out well, I don’t think it’s quite as unambiguous as others do. Craig was safe by a pretty small margin, and if Salty had had a bit of a quicker release I think he could have had him. Middlebrooks probably should have caught it, also. Rob Neyer has a more thorough breakdown of all of this.

While the process was not perfect, though, I’m fairly certain that if Craig had been out at third or Middlebrooks had caught it, nobody would have said anything about the throw. I know that we can’t (and shouldn’t) ignore results entirely, because that’s why they play the games, but I’m pretty sure nobody would have ever said anything about that throw if Middlebrooks makes the catch, and that’s a shame for Saltalamacchia, who’s the goat in this scenario.

Also in the process/results bucket: from a baseball standpoint, Molina should have clobbered Saltalamacchia, which I haven’t seen anyone point out. (I say from a baseball standpoint because I can see broader philosophical objections to home plate collisions, even if they’re legal.) Sliding, he’s guaranteed to be out, and there’s a slight possibility that Craig is out and the inning is over. (There’s also a veeeeeeeeery slight possibility that the ball gets thrown away and Craig scores on obstruction. Baseball’s weird.) If he does the full charge into Salty, he scores with a dropped ball, and either way he definitely prevents a throw to third, functionally guaranteeing that Craig gets in safe. He got really lucky, but it’s still a baserunning error.

Finally, a bit of philosophical musing. There’s a healthy undercurrent of people saying “let the players decide the game, not the umps,” though less in this case than in other games. Not to put too fine a point on it, but that’s a crock of shit. For one, not making a call has just as much of an effect as making a call. For another, that philosophy rewards teams for going a little over the line with the knowledge that the penalty can’t match the crime, which usually degrades the quality of play and is unfair to the rule-abiding team. This leads to things like the holding on the Ravens intentional safety in the Super Bowl, endless moving screens in basketball, and defenders’ mugging forwards in the box on restarts in soccer because they know the ref won’t call the PK. It’s unsightly and unfair, and we shouldn’t encourage it.

The only time I can think of that the rules should maybe be called differently at crucial times is when the rules are intended to govern a part of the game that’s not really related to who wins and loses. The best example of this is the Pine Tar Game, where the rule was so clearly unrelated to Brett’s home run that it was moronic to alter a game outcome because of it. Other examples are things like time wasting and decorum calls in tennis (though those are hazier), the Jim Schwartz rule, and potentially broader safety rules like the pushing penalty in last week’s Pats-Jets game. If there’s no competitive advantage derived, then maybe don’t call the foul.

All told, it’s pretty hard to say that the Red Sox didn’t derive a competitive advantage, so I’m damn glad Joyce and DeMuth made the call. Maybe the NBA refs can take a hint.

That's a Clown Hypothesis, Bro!

Sports analysis and commentary, mostly empirically-based.

Category Archives: Musings

Does the Call Need to Stand?

Notes from SaberSeminar

What’s the Point of DIPS, Anyway?

Leaf-ed Behind by Analytics

Brackets, Preferences, and the Limits of Data

Justice, Unobstructed