The Sabermetrics Revolution
Will Leitch at New York magazine has a piece on the Cardinals’ hacking scandal that includes some very strange, to my thinking, notions about how the sabermetrics revolution affects, or should affect, our enjoyment of the game:
Over the past 30 years, the Moneyball revolution and the dominance of statistics and analytic thought have altered the way we watch baseball to the point that you have to sort of remind yourself to have a good time at a regular-season game. If advanced baseball theory has taught us anything, it is that one game absolutely does not matter. Every game, every inning, every out, every pitch, falls into what the world of baseball considers “small sample size” — just another bit of data thrown into the yawning pit of Analysis. Performances that used to play as heroic are now dismissed as freak accidents. And those peanuts, the Cracker Jacks, the seventh-inning stretch, all that crap you used to do with your dad, they’re just quaint accoutrements to a data set, the result of this one game merely a deviation destined to regress to the norm over time.
I find this attitude mysterious. What is new about modern advanced statistics? Leitch seems to be arguing that the concept of small sample size is new. This is ridiculous. It’s not as if Bill James was the first person to realize that a few games are not necessarily representative of, well, anything. In the 1860s they called this the “glorious uncertainty” of the game. The catchphrase was used so often that it was subsequently abandoned as clichéd.
The Moneyball revolution isn’t that we now use statistics to analyze baseball. We have been doing that since before the Civil War. The revolution is a realization about the traditional statistics–the ones printed on the back of the bubblegum cards we stuck in the spokes of our bicycles as we rode past the white picket fences in Hometown, USA. It is the realization that they suck. The point of advanced stats is to replace the old, sucky stats with better, more meaningful stats.
Batting Average and On Base Percentage are both fractions. The discussion is over what the numerator and the denominator should be. This is true even of more exotic modern stats such as BABIP. This ought to induce an existential crisis in no one. There are some new concepts, but seriously: you can’t enjoy a summer afternoon at the ballpark because of this? Really?
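To make the "they are all just fractions" point concrete, here is a minimal sketch in Python. The formulas are the standard definitions as I understand them; the function names and the season line are invented purely for illustration.

```python
def batting_average(h, ab):
    """Hits divided by at bats."""
    return h / ab


def on_base_percentage(h, bb, hbp, ab, sf):
    """Times on base (hits, walks, hit by pitch) over the chances counted."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: home runs and strikeouts removed."""
    return (h - hr) / (ab - k - hr + sf)


# Hypothetical season line, not a real player.
line = dict(ab=500, h=140, bb=60, hbp=5, sf=5, hr=20, k=100)

print(f"BA    {batting_average(line['h'], line['ab']):.3f}")
print(f"OBP   {on_base_percentage(line['h'], line['bb'], line['hbp'], line['ab'], line['sf']):.3f}")
print(f"BABIP {babip(line['h'], line['hr'], line['ab'], line['k'], line['sf']):.3f}")
```

All three functions have the same shape. The only thing that changes from one stat to the next is what gets counted on top and what gets counted on the bottom.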
Of course Leitch is far from the first person to claim this. Something is going on. I don’t disagree with that. I just think that Leitch et al. are far off the mark on identifying what it is. I think there are two changes, one permanent and one temporary.
The permanent change is that stats are now available in real time. When I was a kid, the scoreboard showed a player’s batting average as of that morning. Now it shows the batting average as of that at bat. We used to turn to the Sunday paper to find the averages of everyone on the local team, plus the league leaders. Nowadays I can go online and find a huge range of stats for every player, updated daily. I can see how it would be easy for someone to be overwhelmed, convinced that reality is found in these stats, not on the field.
The temporary change is that these new stats are, well, new. They won’t be to the next generation. This too is nothing new. We are in a transitional period, but it will stabilize, with a new set of standard stats. People of my generation have this notion that the traditional stats– Batting Average and Earned Run Average and Runs Batted In–are timeless and obvious. Neither is true. They aren’t timeless. They are the product of a process of discernment that occurred in history and spanned decades. They only seem timeless because these decades were long ago: the half century or so beginning about 1860. They aren’t obvious. They only seem that way because they were on the back of those bubble gum cards we were putting in our bike spokes. Run them by someone who doesn’t follow baseball, and you will get blank stares. Or take a look at traditional cricket stats, which are even older than baseball stats. They will be utter gibberish to the typical baseball fan.
Here is my prediction. The next generation will grow up with better mental filters, and won’t find the availability of all this information so overwhelming. And they will find OPS and WAR, or perhaps the successors to OPS and WAR, perfectly familiar and obvious.
In the meantime, take a ten year old kid to the ballpark. If you don’t have one of your own, I’m sure you can borrow one. Enjoy the game. Ignore the scoreboard. Sing Take Me Out to the Ballgame. It’s fun.
Yay! Thank you for this!
Stats are fun for playing around on a computer and putting together a fantasy team. But the stats are intended to analyze and understand what’s happening on the diamond: A .303 OBP so far for the season is all well and good, but another way of looking at reality is, the batter either makes it on base this time, or he doesn’t. And that other way of looking at it is in most ways a whole lot more meaningful.
To me, Leitch’s piece is another pebble in a huge gravel pit labelled “Math is soul-killing”. The entire genre seems to me, a math guy, like it is based on a category error.
Though to be a bit more fair, there are always going to be things that the math doesn’t model, that the model doesn’t capture. Right now, I don’t know of any math that captures the emotional effect a player can have on a team. Though I think that effect can be both significant and overrated. Ya gotta have heart, but a .475 slugging percentage is also pretty handy.
I think there’s some John Henry vs the Steam Engine in there, too. People don’t hate the new stats in themselves so much as they see the move to Big Data-ism robbing the game of elements of heroism. Moneyball pushes older heroes out the door faster, or seems to deny that a manager’s Sixth Sense is real, or seems to make it less likely that we’ll see players struggling against and overcoming “the odds.” Instead, we’re more likely to see a statistical-probabilistic answer or supposed answer to some traditional questions relied on.
Personally, I’ve come to like seeing how some of the new tactics play out, and anticipate periods of re-adjustment or adjustments to the adjustments to the adjustments playing out in the future, but it still rubs me the wrong way when, say, a pitcher pitching a great game is taken out in the 8th inning because he’s hit his number…
This. Also the stats do suggest that certain things people long believed are wrong (lineups don’t matter, bunting is usually stupid) while other things people obsessed over are near insignificant (pitcher wins, RBIs).
That’s not a message anyone receives well in any context, including others more commonly discussed here (e.g. cutting taxes cuts revenues). In the context of recreation, dismissing someone’s beliefs (and being right to do so) makes the whole thing less fun.
It isn’t that lineups don’t matter… it is that they don’t really matter in the way people thought they did. I think the #1 spot gets 50-60 more PAs per season than the #9 spot so making egregiously bad lineups (like all those teams that put shitty hitting speedsters in the top spot) can cost a team in the long run.
It’s a few runs (less than one win) per season generally.
Great piece.
As someone who has more or less bought in fully to the analytics revolution, the one difficulty I have with “new” stats is accessibility. If I know a guy has 24 HRs and 118 RBIs coming into the game and then he hits a 3-run shot in the third, I know he’s now got 25 and 121. Even BA, OBP, and other rate stats… I may not know them exactly but if the guy came in hitting .262 and he goes 3/4, then he’s probably somewhere between .268 and .272 at the end of the day.
WAR? OPS+? wOBA? I have zero idea how they are calculated and would never be able to figure them out on my own and, to me, that feels like a bit of a loss.
On the whole, these new stats are a huge plus for the games themselves. But they aren’t without downsides.
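The "somewhere between .268 and .272" estimate can be checked with a rough sketch like the one below. The comment gives only the starting average and the 3-for-4 day, so the at-bat totals tried here are assumptions, and the function name is made up for illustration.

```python
def updated_average(avg, at_bats, hits_today, at_bats_today):
    """Fold one game's hits and at bats into a season batting average."""
    # Recover the season hit total from the rounded average (an approximation).
    hits = round(avg * at_bats)
    return (hits + hits_today) / (at_bats + at_bats_today)


# Assumed season at-bat totals; the original comment does not give one.
for at_bats in (200, 300, 500):
    new_avg = updated_average(0.262, at_bats, 3, 4)
    print(f"{at_bats} AB before the game: {new_avg:.3f} after going 3 for 4")
```

The result depends on how deep into the season the hitter is: the more at bats already banked, the less a single game moves the average. That is part of why the estimate can be done in your head from the stands.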
Recognizing your first bit, I’ll just note that OBP is easier to calculate on the fly than BA. wOBA seems similar to AVG (the math is messy, but you know how it moves).
WAR, or anything that involves defense, really is near impossible to track live. Of course, if what you want is a stat to help you follow along live, that’s WPA and LI.
@nevermoor
But you can’t figure out WPA and LI while sitting in the stands with a scorecard in your hand. If they flash it up on the big screen, that’d help. Or if you had Tango’s book handy (I think it had all the charts for expected winning percentage?), maybe.
At the same time (what the author criticizes), watching the game in this manner is what makes baseball great. We don’t need advanced metrics to know that Mike Trout is a better ballplayer than Bartolo Colon (though they can help us understand just how much better and why). And yet, when Trout digs into the box and looks out at Bartolo, we have no clue what the outcome is going to be and can just settle in for the battle.
In a sport with a 162 game season is there any stat you can really track ‘live’ from the stands? Maybe total home runs and stolen bases?
WPA and LI both, though @kazzy is right that you need an Internet connection to do so.
But that’s the consensus opinion, right? If you’re on the Internet at the ballpark ur doin it wrong?
(And in contrast to NFL games where, when I was growing up, people would routinely have radios and Sony Watchmen in the stands)
I don’t get the “ur doin it wrong” concept, so long as you aren’t interfering with the enjoyment of the people around you. I have little patience for declarations about how “real fans” act, which always match the self-image of the person making the declaration. I personally don’t feel the need for internet access while at a game, but what of it?
@kolohe
Not a lot. But it isn’t just the ability to calculate. It is the accessibility.
If you tell me a guy hit 24 HRs, I know that 24 times he hit the ball over the fence (assuming no inside the park jobs). Now, that is about ALL that tells me… but it tells me something very specific and very precise.
Now if you tell me a guy had 7.0 WAR for the year, I know he had a great year… but that is about it. I don’t know whether he did it with offense or defense or base running or power or plate discipline or what.
The WAR stat is more useful in just about every way when it comes to analyzing the game. And yet, it is a harder stat to comprehend because it requires a computer and a formula that pretty much no fan knows. You can see a guy hit a HR or strike a guy out or steal a base or clear the bases. You can’t see him accumulate WAR in any sort of meaningful way.
With home runs you have picked the lowest hanging fruit. How many fans actually know how to calculate batting average? Sure, it is just hits divided by at bats. But that just pushes the complications down a level. How many people really know what constitutes a hit and what constitutes an at bat? Even the more knowledgeable are likely to be a bit weak in the marginal cases. (Quick, without looking it up: man on first, the batter hits a sharp ground ball that splits the first and second basemen; but it hits the runner. How does this affect the batter’s BA?) (How about if the batter is given first on catcher’s interference?)
There also is a bunch of non-obvious ideology built in here. Man on second, no outs; the batter bunts the ball, is thrown out at first, but advances the runner. This is a sacrifice hit, and does not affect the batter’s BA. Same situation, but this time the batter hits a ground ball to the right side; he again is thrown out, but advances the runner. This is, for BA purposes, simply an out, and the batter’s BA goes down. I actually think that this is the right way, because I don’t believe that the vast majority of batters can intentionally direct the ball in any particular direction on a full swing. But there certainly is an argument to be made the other way.
This is before we even get into the question of what is and is not an error, with the answer being that an error is what the official scorer sitting in the press box thinks is an error.
Again, I am very pro-analytics. I’m just saying they are not without their downsides.
If someone asks me what HRs or BA or ERA or even OPS (which I think sucks!) are, I can tell them precisely. I can’t do that with WAR. Not the same way, at least. That’s all.
I think Kazzy has the nub of it – the formulas of the new stats are just too complex to know what they are (even the acronyms don’t really tell you anything). This leads to a certain confusion that is unsettling to the old poetry of the game: #1 Speedy OBP, #2 Contact Lefty, #3 Best Hitter, #4 HR Hitter, etc. This is where I’d normally insert a “but,” but, there’s no but. The game has moved on.
What I like about the current era is that while the book is being re-written, there is an asymmetrical adherence to said book. Some managers don’t really get “the book” so, ironically, they follow it to the letter. Other managers seem to get “the book” and enjoy pantsing the managers that coach the book without knowing why the book is “the book”.
As a Cubs fan from youth, I’ve enjoyed watching Joe Maddon for the first time. He is sometimes called unconventional, but actually he’s the measure of convention. His line-ups are simply designed to get his best hitters the most at-bats, with a declining order in (secondary) preference of OBP – it turns out that runners = runs and runs=wins. Yawn.
What Maddon does after this is what makes it interesting; since this is an evolving game, I don’t pretend to understand all of Joe’s methods – but one has stood out and really impressed: Anthony Rizzo stealing bases in front of (rookie) Kris Bryant. See, “the book” dictates that you should never steal with Rizzo (who is slow, but young) in front of Bryant (who is young, but hits). So, since the sabermetric book says no, the opposing pitchers/managers would focus a little bit more on the rookie since they could safely assume Rizzo would remain planted at 1st. Until Rizzo started sprinting for 2nd. 17 times in the first half. At a rate 400% higher than his previous years. And for a while, he was never thrown out. Two things were in play (that I can see): as pitchers focused on Bryant, their times to Home Plate were increasing… so, to protect Bryant, Maddon would send Rizzo against the book – because the book assumes that all things being equal, you don’t send Rizzo… but the opposing teams were taking the book for granted and forgetting the “all things being equal” part and saw their times to Home increase. Really, it was artistically beautiful… but perplexing to Cubbie fans – both old timers (Rizzo shouldn’t steal, it’s not his role) and even new timers (Rizzo shouldn’t steal – he’s slow, and never give the team an out).
So? Well, I guess I’d say that I agree with both Richard and Kazzy and that the Rizzo/Maddon escapades simply illustrate how the game sort’a doesn’t make sense to *anyone* right now… but that doesn’t mean there isn’t some new poetry to discover. What I cannot foresee is whether Baseball is really all that compelling a game to master Time-to-Home as a pre-requisite for appreciating it.
Kids like the hot dog and the popcorn. I like the dog and the peanuts still in the shell. I like the beer, too… although $18 for a 12 oz pour is a little ridiculous.
I used to keep score, way back when. I don’t any more… because I don’t need to do so. I doubt I could even mark up the scorecard properly any more.
Doesn’t cost me any enjoyment in the game.
And the organ music and the chanting and the kissie-face cam and the cathedral-like look of the park with the immaculate grass. And the suspense of not knowing what’ll happen next, and second-guessing the manager and reading the players’ body language and arguing with your friend about whether that pitch was just inside or just outside.
I went to the Mets game on Sunday. We maybe spent half of the game in our seats, with the other half of the game spent walking around the stadium (it was hot and we have an easily bored 7 year old). It was a wonderful time – I taught my daughter how to keep score, the Mets won, the food was waaaaaaay better than it was in the old days (I had sweet potato fries with a basil dipping sauce, a hot dog smothered in fried pastrami, and a $9 microbrew).
The hero of the game was Kirk Nieuwenhuis, who entered the game with traditional stats of .091 BA (he was 6 for 66 on the season), 3 RBIs, and 0 HR, despite having spent the majority of the season thus far on major league rosters. His sabermetrics stats included a no-less embarrassing .230 OPS and a negative WAR.
By the end of the game, he had a .143 BA, 7 RBIs and 3 HR. He also had a .564 OPS and had brought his WAR to -0.0.
In terms of enjoying the game, though, the new and old numbers both told the same story – a dude who was having a really tough year had an absolutely monster game, and that was the only thing that mattered.
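For the curious, the before and after batting averages can be sanity-checked with a few lines of Python. This is only a sketch: the comment gives the 6-for-66 starting line, the three home runs, and the .143 finish, but not the day's at bats, so the code searches for game lines that fit. The function name and the cap of six at bats are assumptions.

```python
def consistent_days(hits, at_bats, reported_avg, min_hits=3, max_at_bats=6):
    """List (hits_today, at_bats_today) pairs that yield the reported average."""
    matches = []
    for ab_today in range(1, max_at_bats + 1):
        # At least min_hits hits, since three home runs are three hits.
        for hits_today in range(min_hits, ab_today + 1):
            avg = (hits + hits_today) / (at_bats + ab_today)
            if round(avg, 3) == reported_avg:
                matches.append((hits_today, ab_today))
    return matches


print(round(6 / 66, 3))                  # average coming into the game
print(consistent_days(6, 66, 0.143))     # game lines that fit the .143 finish
```

Under those assumptions the only line that fits is four hits in four at bats. That is an inference from the arithmetic, not something the comment states.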