Forum > Sabermetric stat to substitute for batting average/OBP?

I know how sabermetricians prefer to use stats like slugging % and OBP over BA and RBIs, but is there any kind of sabermetric stat out there that indicates what a player's batting average or OBP should be based on the quality of balls put in play? Now, when I say "quality of balls put in play" I mean whether they are line-drives (LD), ground balls (GB), or fly balls (FB). If I assign a numeric value to each, I hypothesize that I can determine how often a batter SHOULD get on base. In the spirit of FIP ERA, I think this could be a potentially workable formula to try and eliminate the defensive "luck" based element of the game.

I'm sure that someone out there with a genius superior to mine has either developed a formula similar to this or experimented with it and determined that such a calculation is mathematically impossible. Anyway here is what I came up with to try and adjust batting average and OBP.


1. Take batter John Doe's stats from these categories: at-bats (AB), line-drives (LD), strikeouts (SO), walks/HBP (BB), home runs (HR), ground balls (GB), fly balls (FB), and sacrifice hits (SAC).
For example, let's say John has AB: 500 - LD: 225 - SO: 25 - BB: 50 - HR: 20 - GB: 125 - FB: 105 - SAC: 10

2. I'd like to cite the idiom that "a walk is as good as a hit" and add back in the BB and SAC to AB to arrive at plate appearances. (500+50+10=560)

3. Then take the totals from LD, SO, BB, HR, GB, FB, and SAC and divide each by the number of plate appearances. This tells us how often a plate appearance from John Doe results in each outcome by giving us a percentage. (LD%: 225/560=.402 - SO%: 25/560=.045 - BB%: 50/560=.089 - HR%: 20/560=.036 - GB%: 125/560=.223 - FB%: 105/560=.188 - SAC%: 10/560=.018)

4. Then take those quotients and multiply each by a unique constant that reflects the average league outcome, which could be determined by how often the average player gets a hit on a LD, GB, or FB. Let's say you take the play-by-play box score analysis which specifies whether each ball hit into play was a line-drive, ground ball, or fly ball. For this purpose a line-drive triple to the gap is treated the same as a single as long as both are referred to as "line drives". A bloop single is treated the same as a fly ball, and a bunt is treated as a ground ball. Determine what percentage of each results in a hit. Let's say 70% of the time a line drive was a hit, 15% of the time a ground ball was a hit, and 10% of the time a fly ball was a hit. Then you take John Doe's percentages from step three and multiply them by the probabilities that they would result in the batter getting on base. (LD%: .402*.70=.281 - SO%: .045*0=0 - BB%: .089*1.00=.089 - HR%: .036*1.00=.036 - GB%: .223*.15=.033 - FB%: .188*.10=.019 - SAC%: .018*0=0)

5. Lastly, add up the final products, and you get the total percentage of time that the batter should reach base, given the probabilities based on league averages of the quality of balls put in play. (LD% = .281 - SO% = 0 - BB% = .089 - HR% = .036 - GB% = .033 - FB% = .019 - SAC% = 0), so .281+0+.089+.036+.033+.019+0=.458.
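
The five steps above can be sketched in a few lines of Python. This is only an illustration under the post's own assumptions: the reach-base probabilities (.70, .15, .10) are the placeholder guesses from step 4, not measured league values, and the names REACH_PROB and expected_obp are made up for this example. Plate appearances are taken as the sum of all outcome counts, which equals AB + BB + SAC for John Doe's line.

```python
# Placeholder chances of reaching base per outcome, copied from step 4.
# These are guesses for illustration, NOT real league averages.
REACH_PROB = {
    "LD": 0.70,   # line drive
    "GB": 0.15,   # ground ball
    "FB": 0.10,   # fly ball
    "HR": 1.00,   # home run always reaches base
    "BB": 1.00,   # walk/HBP always reaches base
    "SO": 0.00,   # strikeout never reaches base
    "SAC": 0.00,  # sacrifice treated as an out
}


def expected_obp(counts, reach_prob=REACH_PROB):
    """Return the expected on-base rate from a dict of outcome counts."""
    plate_appearances = sum(counts.values())
    if plate_appearances == 0:
        raise ValueError("no plate appearances")
    # Steps 3-5 combined: weight each outcome count by its chance of
    # reaching base, then divide by total plate appearances.
    expected_times_on_base = sum(n * reach_prob[k] for k, n in counts.items())
    return expected_times_on_base / plate_appearances


# John Doe's made-up line from step 1 (LD+SO+HR+GB+FB = 500 AB).
john_doe = {"LD": 225, "SO": 25, "BB": 50, "HR": 20,
            "GB": 125, "FB": 105, "SAC": 10}

print(round(expected_obp(john_doe), 3))
```

Summing the weighted counts first and dividing once avoids the small rounding drift that comes from rounding each percentage to three places before multiplying.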

Obviously the result here isn't meaningful, as the probabilities were just my estimates. But if the real league-wide rates can be identified, I think this could be a decent formula. Any thoughts?
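
As a rough sketch of how those league constants might be identified, assuming someone has a list of balls in play already classified by type along with whether each one fell for a hit, the hit rate per type is just hits divided by balls in play. The function name hit_rates, the record layout, and the sample data below are all hypothetical.

```python
from collections import Counter


def hit_rates(events):
    """Estimate the hit rate for each batted-ball type.

    events: iterable of (batted_ball_type, was_hit) pairs, e.g. ("LD", True).
    Home runs are assumed to be excluded already, since the formula in the
    post counts them separately.
    """
    balls = Counter()
    hits = Counter()
    for ball_type, was_hit in events:
        balls[ball_type] += 1
        if was_hit:
            hits[ball_type] += 1
    # Only types that actually appear are reported, so no division by zero.
    return {t: hits[t] / balls[t] for t in balls}


# Tiny made-up sample, for illustration only (not real play-by-play data).
sample = [
    ("LD", True), ("LD", True), ("LD", False),
    ("GB", False), ("GB", True), ("GB", False), ("GB", False),
    ("FB", False), ("FB", False),
]

rates = hit_rates(sample)
print(rates)
```

With a full season of league-wide records in place of the sample, the resulting rates would replace the guessed .70/.15/.10 constants.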

May 31, 2011 at 1:57 PM | Unregistered Commenter David G.

David G,

Are you sure you're not Joe?

Based on length of post subject and length of post, I thought it was a given.

May 31, 2011 at 3:24 PM | Unregistered Commenter Scooby Dude

Yes. I assure you that I am not Joe. I guess I can see why you would make that assumption. I am, however, genuinely curious to know whether or not a variation of this formula could be a substantial approach to evaluating hitters. After all, this is a sabermetrics-friendly board, right?

May 31, 2011 at 3:49 PM | Unregistered Commenter David G.

That's funny Scooby...before I got halfway thru his post, I scrolled down expecting the author to be Joe. Not because of the content or subject but because of the post length and numbered bulletpoints.

Interesting idea tho. You would think that there would be a stat for hitters similar to FIP ERA...I mean there is a stat for EVERYTHING in baseball, right?

May 31, 2011 at 3:58 PM | Unregistered Commenter Adam

Well, Adam, that was kind of what I was thinking. It seems like a fielding-independent batting average or OBP would be appropriate. And for FIP ERA there is a constant (usually between 3.10 and 3.20) that is determined by league averages. You would think that a similar standard could be established on the offensive side. Are line-drive, ground ball, and fly ball percentages actually kept as an official stat by any resource like the Elias Sports Bureau?

That really is the only difference between the FIP ERA and this proposal. The FIP ERA completely disregards the quality of hits and leaves statisticians to lump the majority of at-bats into BABIP, which can be categorically misleading. Of course all of the key factors in FIP ERA (HR, BB, and SO) are reflected here in a similar fashion. But instead of giving them values outright, it treats them like OBP (i.e. you have a 100% chance of reaching base on a HR or BB, and a 0% chance of reaching base on a strikeout). Therefore the primary factors that affect a player's adjusted BA/OBP are those that measure how often and how well-hit a ball in play actually is.
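
For comparison, here is a minimal sketch of the commonly cited FIP formula, which weights HR, BB (plus HBP), and SO directly and then adds the league constant. The default constant of 3.15 is an assumption picked from the 3.10 to 3.20 range mentioned above, and the pitcher line is made up.

```python
def fip(hr, bb, hbp, so, ip, constant=3.15):
    """Fielding Independent Pitching, using the commonly cited weights.

    FIP = (13*HR + 3*(BB + HBP) - 2*SO) / IP + constant

    The constant is set per league and season so that league FIP matches
    league ERA; 3.15 here is only an assumed placeholder.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Made-up pitcher line, for illustration only.
example = fip(hr=20, bb=50, hbp=5, so=180, ip=200.0)
print(round(example, 3))
```

Note the contrast with the proposal in this thread: FIP assigns fixed run weights to three outcomes and ignores balls in play entirely, while the adjusted OBP idea assigns each outcome, including each batted-ball type, a probability of reaching base.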

May 31, 2011 at 5:08 PM | Unregistered Commenter David G.

"Are line-drive, ground ball, and fly ball percentages actually kept as an official stat by any resource like the Elias Sports Bureau?"

Yes. FanGraphs has batted ball types, using the database of Baseball Info Solutions.

"The FIP ERA completely disregards the quality of hits"

That's because there isn't much (if anything) to indicate that hit "quality" among types (GB, FB, LD) significantly varies over significant sample sizes.

One thing you can do to get close to what you are looking to do here is to calculate a player's xBABIP and adjust his wOBA accordingly (e.g. this piece on Hamilton last year).

May 31, 2011 at 11:18 PM | Unregistered Commenter Rangers100

I think what you're looking for is wOBAr. It can be found at StatCorner.

It was developed by Matthew at LookoutLanding and its debut article can be found here:

http://www.lookoutlanding.com/2010/3/8/1362878/trar-and-wobar

Note that you need a sufficiently large sample size for it to mean much.

June 1, 2011 at 6:54 PM | Registered Commenter Prashanth Francis