For an article I’m working on (look for it at The Hardball Times at a date TBD), I had cause to analyze the data from the Fans Scouting Report run by Tom Tango. While FanGraphs hosts the data from 2009 onwards, I couldn’t find a clean version of the 2003–2008 dataset, so I pulled the data off Tango’s site and did some annoying but largely insubstantial cleaning so that it can be combined with the data available on FanGraphs.

For anyone who wants to read the code or use the data, I’ve posted both on my GitHub. If you click around, you may notice that’s all I have up there, but my intention is to post code and data for articles from now on, and ideally to go back and fill in some of my old posts as well.

Finally, if you haven’t read my most recent piece, it went up about five weeks ago at THT; it’s a slightly out-there proposal to rearrange baseball’s schedule and alignment to improve the quality of the regular season. Read it here.


