You really have to feel for Cubs’ pitcher Jeff Samardzija. He’s been pitching great so far this season, with an ERA (as of this writing) of 1.62 and 51 strikeouts in 61 innings pitched over 9 starts. But thanks to essentially no offensive support and crappy defense behind him, he’s stuck with an 0-4 record.
This, along with a few other unusual cases (e.g. Chris Sale in 2013, who, by the way, was ROBBED of the All-Star Game MVP that year, and Felix Hernandez in 2010), has led a number of fans and writers to argue that the pitcher’s “win” stat is irrelevant, or at least vastly overrated. First, they argue, it is too dependent on factors outside the pitcher’s control – like run support and defense. Second, they note that the assignment of a Win is often arbitrary, as it can depend on the whim of the official scorer. Other stats, such as WHIP (Walks plus Hits per Inning Pitched) and ERA+, are, they say, far better at showing the quality of a pitcher.
That might be true. But those stats, being derived from odd and occasionally arcane formulae, can have similar problems. If we accept that these new, advanced metrics can accurately describe the “quality” of a pitcher, how does the “old school” stat of Wins compare?
Fortunately, there’s a way to tell. But you have to step away from baseball stats for a while, and dive into the mathematical field of Statistics….
There’s a number called the “Coefficient of Correlation” (or “Correlation Coefficient”, if you want to use one less syllable). Normally given by the letter “r”, it’s a way to help you decide if there is any (linear) correlation between two variables. A perfect correlation gives r=1 (or r=-1 for an inverse relation where one variable decreases as the other increases); when there’s no correlation at all, r=0. It’s very easy to calculate values of r when every spreadsheet program worthy of the name has the function built in. So it’s just a matter of choosing a data set.
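For anyone who’d rather see the arithmetic than trust the spreadsheet, here’s a rough sketch of the usual (Pearson) formula for r in Python. The function name `pearson_r` and the sample numbers are made up for illustration; they aren’t pitching data. It should give the same kind of answer as a spreadsheet’s built-in correlation function, such as Excel’s CORREL.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up numbers, not real pitching data.
perfect_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # y rises in lockstep with x
perfect_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y falls in lockstep with x
```

The first pair is a perfect direct relation and the second a perfect inverse one, so they land at the two ends of the scale; real-world data falls somewhere in between.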
I went to Baseball Reference, and got complete pitching stats for 2013. A big enough data set for anyone – especially when you’re going to let Excel do all the number crunching. However, I did decide to limit my data set. I agree with the critics that the assignment of a Win can be both arbitrary and highly dependent on outside circumstances – but mostly for relief pitchers. So much of relief pitching depends on the game situation, and it has become so highly specialized these days, that I don’t think any of the usual metrics have any real merit there.
I decided to limit my set to pitchers with 20 or more starts. I figured that would eliminate relievers and spot starters, leaving only full-time starting pitchers. After all, starters are the ones who will be racking up the wins, and they are the only ones for whom a Won-Loss record is worth noting.
As for comparison stats, I decided to go with WHIP (Walks plus Hits per Inning Pitched – a good, easy-to-understand metric), ERA+ (a modified version of Earned Run Average that takes into account the league average and park factors), WAR, and WAA (Wins Above Replacement and Wins Above Average; two of the new advanced metrics that combine all manner of stats in one arcane formula that supposedly reduces everything into one convenient number).
Now to crunch the numbers….
At first glance, it looks like there’s a pretty good correlation between Wins and our chosen stats. The data points make nice ovals with a clear axis. But just how good are the correlations? The values for r fall in the middle of the range. None are zero, so there is some correlation, but they aren’t close enough to 1 (or -1) for the correlation to be mathematically obvious. The r value for WHIP vs. Wins is negative, since a lower WHIP is better and correlates with more wins.
Of course, mathematicians have developed a method for determining how likely it is that a correlation is real rather than a fluke of the sample. I’m not going to get into the math here, but it is highly dependent on how many data points you have. With over 120 in my data set, that’s a heck of a lot of data. There are plenty of tables and even a calculator online.
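To give a feel for why the number of data points matters so much, here’s a rough Python sketch of one common shortcut, Fisher’s z-transformation. It is not necessarily what the online tables and calculators use (those are often built on a t-test), and the r of 0.5 below is a hypothetical value picked purely for illustration, not one of the results from this data set. The function name `approx_p_value` is also made up.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-sided p-value for 'no correlation at all',
    using Fisher's z-transformation (a large-sample approximation)."""
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    if n < 4:
        raise ValueError("need at least 4 data points")
    # atanh(r) is roughly normal with standard error 1/sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # Chance of seeing a |z| this large if the true correlation were zero.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical r = 0.5, chosen only for illustration.
p_small_sample = approx_p_value(0.5, 10)    # same r, only 10 pitchers
p_large_sample = approx_p_value(0.5, 120)   # same r, 120 pitchers

print(f"n=10:  p is about {p_small_sample:.3f}")
print(f"n=120: p is about {p_large_sample:.2e}")
```

The smaller the p-value, the less likely it is that the correlation is just luck. The same middling r that could easily be a fluke with 10 pitchers becomes very hard to dismiss with 120.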
What we get when we check for our data set of 120+ pitchers is that our r values indicate an extremely high likelihood of a correlation between wins and our other stats. With WAR and WAA, it’s somewhat to be expected, since “wins” might be taken into account somewhere in their calculation.
Now I know very little about statistics and their methodology. I will leave a more detailed analysis to those better able to do it (for one thing, this correlation could and should be done with a wider variety of measures of a pitcher’s effectiveness). But I’m convinced that the “win” stat does have validity, especially for a starting pitcher. After all, “wins” are how teams are organized in the standings. True, there’s not going to be a big difference between a pitcher with a 21-5 record and one with a 19-7 record. But both are likely to be significantly better than one with a 14-12 record.
So don’t ditch the “win”. Keep it; it’s a good measure of overall quality for a starting pitcher. Just remember that like any other stat, it has its limits.