The Crucible
"Say, didn't you almost get traded for Mark Teixeira that one time?"

As warmer weather brings spring training updates to cast off memories of a particularly harsh winter and help ramp up for another year of baseball (baseball? Baseball!), I’m looking forward to getting out to the ballpark and experiencing one of my favorite spectator sports. In between debating the finer points of barbecuing and chili-making, I have been thinking about how difficult it is for a baseball player to even reach the majors to hear the roar of the crowd at RBiA. Most of us can recall a time in the spring (many) years ago when we tossed around baseballs pretending to be the next Nolan Ryan or Kirby Puckett. At the time, we had little idea of how hard it actually is to become a major league player; I am not sure we even fully realize it today.
Baseball executives and scouts need to understand and evaluate not only the talent required to get to the majors, but also how major league players compare to one another. For us to understand why they make the moves they do, we need to understand the underlying principles that guide the practices of a smart baseball front office.
Those familiar with sabermetric statistics in baseball have probably heard this oft-espoused tenet: a team should not overpay for below-average or replacement-level players. This started as a reaction to baseball teams often overpaying for virtues like home runs or a pitcher’s win-loss record that do not correlate very well with actual contribution; it was a matter of using better tools to measure actual contribution. However, the trend develops further once one looks at the economics of baseball and realizes that the greater quantity of replacement-level players makes them inherently less valuable.
So the question then becomes: what exactly is a replacement-level player? How good or bad are they in relation to a league-average player? What about in relation to elite players? How many more replacement-level players are there in relation to average players?
It was while trying to answer these questions that I charted the occurrence of yearly wOBAs (weighted on-base average) from 2009 and 2010 and ran into an interesting problem. The basic distribution changes as the minimum plate appearances (PA) are increased. At a minimum of 50 PA (or about 10 games), the distribution looks exponential or like the far end of a normal curve.
This helps prove the point that the sabermetric community has been making about players between replacement level and average -- there are many of them, and you should almost never have to overpay for them. (There are times when certain temporary restrictions due to position or fit for team may come into play, but this is speaking in generalities.) As a corollary, it also means that if a player you expected to be at least replacement level is giving you worse than expected performance, it should be a relatively easy matter to go out and acquire someone who can replace them. It’s not just that there are more replacement-level players available out there -- there are exponentially more.
However, as the minimum PA is increased, an interesting thing happens. Looking at the .gif below, the exponential distribution seen with the low PA disappears and becomes a distribution approximating a normal (bell) curve. This happens because, despite the decreases at the high end and middle of the chart, there is a disproportionate decrease in data points at the low end of the chart.

At first, this may seem a bit esoteric, but it’s quite an impressive example of how good Major League Baseball is at getting playing time to those players who are performing. Players who are performing at the low end of the spectrum get rotated out as various other options are tried.
Another benefit is that being able to visualize the distribution can help define at what level a replacement-level player performs. Defining replacement level is a bit of a difficult proposition, and there are a few different approaches; the most common approach is to use the quick and convenient 20 runs below average. With average being a .330-.335 wOBA over this time frame, the 20 runs below average estimate ends up looking like a .290 wOBA over 500 PA.
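The arithmetic behind that estimate can be sketched in a few lines. This is only an illustration: it uses the standard relationship between wOBA and runs above average, and it assumes a wOBA scale of 1.15, a commonly quoted value that is not given in this article and that varies somewhat by season. The function name is made up for the example.

```python
def woba_at_runs_below_average(league_woba, runs_below, pa, woba_scale=1.15):
    """Return the wOBA that is `runs_below` runs below average over `pa` PA.

    Rearranges runs above average = (wOBA - league wOBA) / scale * PA.
    woba_scale=1.15 is an assumed value, not a figure from the article.
    """
    return league_woba - runs_below * woba_scale / pa


# League average of .330 to .335, 20 runs below average, 500 PA
low = woba_at_runs_below_average(0.330, 20, 500)
high = woba_at_runs_below_average(0.335, 20, 500)
print(f"{low:.3f} to {high:.3f}")
```

Under that assumed scale, 20 runs over 500 PA is worth about .046 of wOBA, which puts the estimate in the mid-to-high .280s, or roughly .290 when rounded. A larger scale would push the figure a few points lower.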
A better method to evaluate where replacement level is would be to look at the chart and find the region where the distribution starts to deviate from the exponential curve. Where that occurs is important because it means that even though there are theoretically more players who are available and can play at that level, there are simply not enough spots for MLB to be able to get them playing time. Based on the data shown for 2009 and 2010, this would fall roughly around a wOBA of .290, the same as the estimate. The difference is that this lends more precision and gives a better understanding of where that number comes from.
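One way to make "where the distribution starts to deviate from the exponential curve" concrete is sketched below. It is not the method used to draw the charts here, just one plausible approach: fit a straight line to the log of the bin counts on the high-wOBA side, extrapolate it downward, and report the first bin where the observed count falls well short of the extrapolation. The bin counts, the `fit_from` cutoff, and the 50% tolerance are all made-up illustration values, not the 2009-2010 data.

```python
import math


def fit_exponential_tail(centers, counts, fit_from):
    """Least-squares fit of log(count) = a + b * wOBA on bins at or above fit_from."""
    pts = [(x, math.log(c)) for x, c in zip(centers, counts) if x >= fit_from and c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def deviation_point(centers, counts, fit_from, tolerance=0.5):
    """Walk down from fit_from; return the first bin center whose observed
    count is below tolerance * the extrapolated exponential count."""
    a, b = fit_exponential_tail(centers, counts, fit_from)
    for x, c in sorted(zip(centers, counts), reverse=True):
        if x >= fit_from:
            continue  # bins used for the fit are skipped
        expected = math.exp(a + b * x)
        if c < tolerance * expected:
            return x
    return None  # no bin falls short of the extrapolated curve


# Hypothetical bin centers (wOBA) and player counts, for illustration only
centers = [0.25, 0.27, 0.29, 0.31, 0.33, 0.35, 0.37, 0.39]
counts = [20, 45, 70, 80, 40, 20, 10, 5]
print(deviation_point(centers, counts, fit_from=0.30))
```

With real data, the answer will depend on the bin width, the range used for the fit, and the tolerance, so it is worth trying a few settings before reading much into any single cutoff.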
The players who accumulated 500 PA despite having such a low wOBA in the .gif above often have other skills -- like playing a difficult defensive position -- that make up for the decreased hitting performance. When looking at the total picture including offense and defense using WAR (based on wOBA for offense and UZR for defense), the picture becomes even clearer. WAR easily denotes a replacement-level player (0 WAR); with a low minimum PA of 50, the distribution is exponential and resembles the talent distribution; with a high minimum PA of 500, the effectiveness of major league baseball at weeding out poor players becomes evident as the distribution becomes relatively normal.
To make it to baseball’s highest level is a challenge in itself. Every step towards that minimum threshold to be a replacement-level player removes dozens of elite athletes from the pool of available players. Yet it is at this level that most of us are first really introduced to the players and, thus, it is from this level that we set the baseline in determining who is good, who is great, and who is "bad." Making it to the highest level of the sport only begins the next sorting process, the hardest one yet, and the difficulty of performing in this crucible completes the selection process from millions of children down to a very few elite players ... who then become the icons for the next generation of hopeful children.
