Playoff teams and records

There are two things wrong with this post. One, it's way too early to say anything about the playoffs. Two, all publishing conventions scream not to publish posts on Saturdays. But after seeing one too many people saying something akin to the following:

I decided I had to act fast, so here we are. Let's get this over with.

Out of the 156 teams to have made the playoffs since 1995, when Major League Baseball first introduced the wild card, 60 of them had a losing record against teams that had a winning percentage above .530 during the regular season.

Let me repeat that in a slightly different way: 38.5% of playoff teams in the last 19 years had a record below .500 against playoff-caliber opponents that season.
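For anyone who wants to check the arithmetic, here is a minimal sketch. The 156 and 60 come straight from the counts above. The per-season breakdown (eight playoff teams a year from 1995 through 2011, ten a year in 2012 and 2013) is my assumption about how the 156 adds up, and the variable names are just for illustration.

```python
# Counts taken from the post.
playoff_teams = 156
losing_vs_good_teams = 60

# Assumed breakdown: 8 playoff teams per season for 1995-2011 (17 seasons),
# then 10 per season for 2012-2013 (2 seasons).
seasons = 17 + 2
assumed_total = 8 * 17 + 10 * 2

share = losing_vs_good_teams / playoff_teams

print(f"seasons covered: {seasons}")
print(f"assumed playoff teams: {assumed_total}")
print(f"share below .500 vs. good teams: {share:.1%}")
```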

All of you should be giving a collective "no duh" right now. This is a fundamental aspect of baseball: one team wins and one team loses, and when you include a couple of extra good teams that just missed the playoffs, you inevitably end up with some teams above .500 and some below .500 against playoff-caliber opponents.

And yet most people, if asked blindly, would probably say a below-.500 record against playoff-caliber opponents is a near death blow to playoff hopes. Just four games into the series against the Braves, some have already pegged the Nats' 1-3 record as a key indicator that they'll miss the playoffs in 2014.

Yes, 38.5% does not mean the odds are in a team's favor if it finishes below .500 against playoff-caliber teams, nor do those 60 teams account for every team that finished below .500 against such opponents; the ones that missed the playoffs aren't counted here. Still, that's a hefty percentage of teams.

"But Jamessss," you say, incredibly whiny, in my imagination. "Just because they make the playoffs doesn't mean they'll do well. Surely being below .500 against playoff-caliber teams means they won't succeed in the playoffs."

To which I answer, "Shut up, of course I have some data on that too, and stop calling me Shirley."

Here we see the aforementioned 156 playoff teams with the number of wins they had in the playoffs compared to their win percentage against playoff-caliber teams that year.


Well, that's just a gigantic mess with no visible correlation. But that isn't very rigorous, and you, dear reader, clearly demand more. So we press on and calculate the root mean square error (RMSE) using the formula for the best-fit line displayed on the graph above. This tells us how close the playoff win total predicted from a team's win percentage comes to that team's actual playoff win total. For this data the RMSE is 3.7 wins. In other words, our prediction typically misses the actual total by about 3.7 wins in either direction.
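If you want to see how that number comes about, here is a rough sketch of the calculation: fit a least-squares line, predict each team's playoff wins from it, and take the square root of the average squared miss. The five data points below are made up purely to show the mechanics; they are not the 156 real teams, and the function names are my own.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean square error of the line's predictions against the actual values."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up illustration data, NOT the real data set:
# win percentage against playoff-caliber teams, and playoff wins that year.
win_pct = [0.420, 0.480, 0.510, 0.550, 0.600]
playoff_wins = [11, 0, 3, 1, 7]

slope, intercept = fit_line(win_pct, playoff_wins)
print(f"RMSE: {rmse(win_pct, playoff_wins, slope, intercept):.1f} wins")
```

Squaring the misses before averaging means a few big whiffs count for more than a lot of small ones, which is why RMSE is a common way to size up how far off a fit tends to be.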

To better illustrate this I grouped teams into buckets by their win percentage against playoff-caliber opponents, with each bucket covering .050 points of win percentage. Here's a graph of each bucket's average number of playoff wins compared to its average win percentage against playoff-caliber teams.
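The bucketing itself can be sketched in a few lines. I'm assuming here that the bucket edges sit on multiples of .050 (.450, .500, .550 and so on); the post doesn't say exactly where the edges fall, and the function name and sample numbers are invented for the example.

```python
from collections import defaultdict


def bucket_averages(win_pcts, playoff_wins, width_thousandths=50):
    """Group teams into win-percentage buckets and average each bucket.

    Returns a list of (average win pct, average playoff wins, team count),
    ordered from the lowest bucket to the highest.
    """
    buckets = defaultdict(list)
    for pct, wins in zip(win_pcts, playoff_wins):
        # Work in whole thousandths so .550 lands cleanly in the .550 bucket
        # instead of being thrown off by floating-point division.
        key = round(pct * 1000) // width_thousandths
        buckets[key].append((pct, wins))

    averages = []
    for key in sorted(buckets):
        members = buckets[key]
        avg_pct = sum(p for p, _ in members) / len(members)
        avg_wins = sum(w for _, w in members) / len(members)
        averages.append((avg_pct, avg_wins, len(members)))
    return averages


# Made-up illustration data, NOT the real data set.
sample_pcts = [0.420, 0.480, 0.510, 0.540, 0.550, 0.600]
sample_wins = [11, 0, 3, 5, 1, 7]

for avg_pct, avg_wins, count in bucket_averages(sample_pcts, sample_wins):
    print(f"{avg_pct:.3f}  {avg_wins:.1f} playoff wins  ({count} teams)")
```

Keep in mind that averaging within a bucket smooths over how widely the individual teams in it are scattered, so a graph of bucket averages will tend to look tidier than a graph of every team.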


Here we now see a much nicer linear relationship, with a good correlation coefficient. The formula for this best-fit line has an RMSE of 3.8 wins, essentially in line with our previous estimate.

So now we have a question to ponder: how helpful is it to be able to predict a playoff team's total playoff wins in a season to within about 3.7 wins? That's practically an entire series, and it is in fact more than the three wins a team needs to take the best-of-five Division Series.

So we can see there is a bit of a correlation between these two factors, but not enough of one to be particularly helpful or worth worrying about. And especially not in April, when we don't even know which teams will end up winning 86+ games by the end of the year. So if you would like to freak out over every little occurrence in every single game, please do so offline. Or maybe talk to your doctor about some anti-anxiety medication.

