
Tuesday, March 14, 2006

BOOK REVIEW: THE FIELDING BIBLE BY JOHN DEWAN
The last fifteen or so years have brought some exciting developments in the analysis of fielding statistics. The computer revolution has allowed researchers to track the destination of every hit ball, as well as its fate (hit or out?), allowing fans to get some objective glimpse into one of the most ineffable of the game's questions: who is a better fielder than whom?

Unfortunately, for many years, all fans have been given is the end data -- you can look up a player's zone rating, or catch a reference to his UZR on a message board, but you can't really do anything with the data yourself to see whether these end figures make any sense to you, the way you can with batting stats. That's because play-by-play data costs several thousand dollars to buy, and would take several man-hours to code and analyze, and 99.9% of us don't have the means to make that happen.

John Dewan's The Fielding Bible goes some way toward filling that gap. Though not as groundbreaking as its advertising would lead you to believe (in fact, it's not groundbreaking at all), Dewan's book is invaluable as a source of information and data, and includes enough verbiage and analysis to satisfy number-phobic readers.

Its opening salvo is an article by Bill James comparing Adam Everett's defense to that of Derek Jeter, whose poor defensive statistics have been a polarizing issue for years. This is an excellent article by James, as in the process of comparing the two he gives a cursory introduction to all of the statistical methods used in the rest of the book, and complements them with his own analysis of how the two shortstops look on tape (or, rather, DVD). The conclusion that Everett is a far better defender than Jeter will startle no one who's been following the discussion, but James provides a useful and clear summary to those just joining in.

Dewan follows with an introduction to the primary defensive statistical rating of the book, the Plus/Minus System. Essentially, every batted ball is tracked (by Baseball Info Solutions, of which Dewan was a founder after the purchase of his former company, STATS, Inc.) and classified by a number of factors (where it's hit, how hard it's hit, and for outfielders, how far it is hit), allowing the BIS analysts to state how frequently a ball hit to X spot becomes an out. Players are then credited and charged for every out they make -- or fail to make.

For instance, assume there's a spot where a batted ball is turned into an out 26% of the time. If the fielder records an out, he is credited +.74 plays; if he does not, he is charged a -.26. Add this up for every batted ball over the course of a season, and you have a defender's Plus/Minus.
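
To make the bookkeeping concrete, here is a minimal sketch of that tally. It is an illustration of the arithmetic described above, not BIS's actual code; the function name and the sample out probabilities (other than the 26% example) are made up for the demonstration.

```python
def plus_minus(plays):
    """Sum Plus/Minus credit over (out_probability, was_out) pairs.

    out_probability is how often a ball like this one becomes an out;
    was_out says whether the fielder actually made the play.
    """
    total = 0.0
    for out_prob, was_out in plays:
        if was_out:
            total += 1.0 - out_prob  # credit: the share of the play he "added"
        else:
            total -= out_prob        # charge: the share of an out he gave away
    return total


# Hypothetical batted balls; only the 26% figure comes from the example above.
sample = [(0.26, True), (0.26, False), (0.90, True), (0.50, False)]
print(round(plus_minus(sample), 2))
```

Note that an easy play made (90% out rate) earns only +.10, while a tough one (26%) earns +.74, so a fielder can't pad his total on routine balls.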

Dewan has also directed his collective to ascertain, for every position aside from the middle infielders, the probability that any given ball becomes an extra-base hit; this way, they can state how many total bases were prevented by a defender. They call this "Enhanced Plus/Minus."

As an illustration, the best-ranked center fielder for 2005 was Aaron Rowand, who had a Plus/Minus of +15 and an Enhanced Plus/Minus of +30. This means that, by this method, Rowand prevented 15 more hits than the average CF, and prevented 30 more total bases than the average CF -- or, 15 hits and 15 extra bases.

Now, the book's cover tells you that this is groundbreaking, but it just ain't. This is highly similar to how Mitchel Lichtman produces Ultimate Zone Rating (though MGL uses run values, which we'll touch on imminently) and nearly identical to David Pinto's Probabilistic Model of Range (though PMR does not provide a figure for extra bases prevented, just hits prevented). Also, as far as I can tell, the Plus/Minus System is exactly the same as the original "Ultimate Zone Rating" introduced by STATS in their 2000 Scoreboard, though that iteration of UZR, like PMR, did not include any reference to extra bases. It's also reminiscent of the defensive work of Mike Gimbel, though Gimbel used actual extra-base data instead of probable, and had more extensive park adjustments (something else we will touch on below).

Dewan summarizes his system in the introduction (though, as a wonk about these things, I'd have liked more detail), and includes a small section about how, in the future, he'd like to convert the Plus/Minus figures to runs. He then states that a hit prevented is worth nearly half-a-run, so for middle infielders you can divide their Plus/Minus in half to derive runs prevented, and for positions with Enhanced figures, you can multiply by .2 or something.

This is demonstrably false, and it's a little frustrating to see this in print, especially as people have been working on this for a while. Not only does an additional play made by a defender prevent a hit, it also creates an out. So a shortstop making a play has prevented a single (.47 runs) and recorded an out (roughly .28), so that's really .75 runs for the play.
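
As a quick sketch of that correction, using only the .47 and .28 figures quoted above (the function names are mine, and real run values shift a bit from year to year):

```python
SINGLE_RUNS = 0.47  # run value of the single that was prevented
OUT_RUNS = 0.28     # rough run value of the out that was recorded instead


def runs_per_play(hit_value=SINGLE_RUNS, out_value=OUT_RUNS):
    """A play made both takes away a hit and adds an out."""
    return hit_value + out_value


def infield_runs_saved(plus_minus):
    """Convert a middle infielder's Plus/Minus to runs under these assumptions."""
    return plus_minus * runs_per_play()


print(f"{runs_per_play():.2f} runs per play")
print(f"+10 shortstop: {infield_runs_saved(10):.1f} runs")
```

So a +10 shortstop comes out around 7.5 runs, not the 5 or so you'd get from dividing his Plus/Minus in half.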

(In a comment over on Rally Monkey's review of this book, Tango Tiger walks us through that, so if you don't believe me, check that out. MGL also touches on it in the UZR link I have above.)

In fact, people were doing this a while ago; there is another defensive landmark I didn't mention above: Dale Stephenson's work with Sherri Nichols' Defensive Average data.

This was going on on rec.sport.baseball about a decade ago. Stephenson used his data to derive how many hits a defender saved and how many extra bases he saved. For instance, in 1991, Brett Butler saved 7 hits and 14 extra bases. Stephenson then converted this, rather easily, to a runs saved figure.

This is a lot easier than it seems. The run values for different kinds of hits are relatively stable, and are usually around:
1B:  .47
2B:  .78
3B: 1.09
HR: 1.40
If you look at that closely, you discover that a hit is worth .47, and each extra base is worth .31.
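
If you'd rather check that than squint at it, here is a small sketch that rebuilds the table from just those two numbers. The function name is my own; the .47 and .31 come straight from the list above.

```python
SINGLE = 0.47      # run value of a single
EXTRA_BASE = 0.31  # run value of each base beyond the first


def hit_run_value(total_bases):
    """Run value of a hit worth total_bases bases (1 = single, 4 = homer)."""
    return SINGLE + EXTRA_BASE * (total_bases - 1)


for name, bases in [("1B", 1), ("2B", 2), ("3B", 3), ("HR", 4)]:
    print(f"{name}: {hit_run_value(bases):.2f}")
```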

Stephenson actually used .42 for a single, along with .31 for each extra base, so in that Butler example you get:
7 hits prevented at .42 runs each: 2.94 runs
14 extra bases prevented at .31 runs each: 4.34 runs
7 outs recorded at .30 runs each: 2.10 runs
For a total of: 9.38 runs prevented
If you follow that link above, you'll see that Stephenson actually gave Butler +9.24 runs, which is almost certainly a result of rounding off somewhere in the process. No big deal; if you really want to argue that a guy is a better defender than another because a method gives him an extra .14 runs, you're putting way too much faith in this stuff.

So, given that Stephenson's work is ten years old, it's a little frustrating to have this data in The Fielding Bible staring its authors in the face, and somehow everyone just missing it. That's a minor complaint, really. Because when I look at that Stephenson work, I think, "This is great! I sure wish someone was presenting the data like this today!"

Well, now, someone is. Aaron Rowand prevented 15 hits and 15 extra bases, and using the updated run values, we can estimate that (by this data, anyway), he prevented 15.9 runs. That's superb.
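
For anyone who wants to run the conversion on other players, here is a rough sketch of the arithmetic. The function and parameter names are mine, the defaults are the .47/.31/.28 values used in this review, and none of this is anything the book itself provides.

```python
def runs_prevented(hits_saved, extra_bases_saved,
                   single_value=0.47, extra_base_value=0.31, out_value=0.28):
    """Estimate runs prevented from hits saved and extra bases saved.

    Each hit saved counts twice: once for the single taken away and
    once for the out recorded in its place.
    """
    hit_runs = hits_saved * single_value
    extra_base_runs = extra_bases_saved * extra_base_value
    out_runs = hits_saved * out_value
    return hit_runs + extra_base_runs + out_runs


# Rowand, 2005: Plus/Minus +15, Enhanced Plus/Minus +30,
# so the extra bases are the difference between the two.
rowand_hits = 15
rowand_extra_bases = 30 - rowand_hits
print(f"Rowand: {runs_prevented(rowand_hits, rowand_extra_bases):.2f}")

# Butler, 1991, with Stephenson's .42 single and .30 out values.
print(f"Butler: {runs_prevented(7, 14, single_value=0.42, out_value=0.30):.2f}")
```

For infielders without Enhanced figures, pass 0 for the extra bases and you're back to .75 runs per play.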

And this data is more revealing than just plain ol' Zone Rating or PMR, which let us derive a hits saved figure but leave the bases saved a mystery. The Plus/Minus System, like UZR, Mike Gimbel, and Defensive Average, not only gives us a means to measure a defender's "OBP Against," but also his "SLG Against." Honestly, this book gets my seal of approval just for displaying this data, though had they done the easy run conversions it would have saved the rest of us a lot of time.

(Of course, I haven't even touched on the reliability of the Plus/Minus figures. There are a number of things that probably should be adjusted for that haven't been, amongst them the handedness of the pitcher, game situation [first basemen have less range with a man on first base], and ballpark. Such adjustments take place in UZR, and I think it's likely the better for it. I don't think such lacks make the Plus/Minus System unreliable; its methodology seems to be fairly sound, and I think the breaks tend to even out over time, so looking at a player's Plus/Minus over a number of years, as this book allows us to do for regular players, should give us some insight into their relative merits.)

***

Anyway, after the Introduction, we get a one-year register, where we can view detailed information on anyone who played defense in 2005. Not only can we see how many hits and bases were prevented, but we can see how well a defender did going to his right or left (something of a misnomer; it's really how well they did going to the right or left of where someone in their position normally lines up, so we can't really tell if Orlando Hudson is great going to his left because of his talent or because he cheats that way, or both), and any number of marvelous things (how defenders rate on double plays, bunts, holding runners, etc.). This information will be presented again later in the book for regular players, and will show us figures for 2003 through 2005.

After the three-year register comes a sixty-page color section that's the hottest thing I've received in the mail since I swiped the Victoria's Secret Catalog from the disposed mail bin. For each team, we get a diagram of the field, and we can see at a glance how many hits were allowed all over the field.

For instance, last year the Angels allowed 178 hits to left-center and 171 to right-center; major league averages were 149 and 144, respectively. Were parks a factor? The Angels were able to get 153 hits to the LF-CF gap and 164 to RF-CF, so maybe -- but clearly the Jurassic Outfield was a bit slow at getting to the tweeners.

Combining information from all sections can reveal some interesting findings. For instance, the Angels were above-average at preventing hits to the 3B/SS hole, allowing only 91 when the major league average was 115. Park? Maybe (one year is too small a sample), as Angel opponents only allowed 97 hits through that hole. Perhaps the Angels have thicker infield grass? We can't really tell from one year of data.

Also, oddly, our main shortstop was substandard going to his right (The OC was -8 plays going that way, and all shortstops of the LAA combined to be -3), and our third basemen as a group were +5 to their left, adding up to a modest +2 between the two positions. So how did the team fare so well in that area? I honestly don't know, but my guess would be that there just weren't a lot of balls hit there against the Angels.

A strength of the book is that all the data is there, if you want to try to figure that sort of thing out.

After the three-year register I reference above, we get to a section giving commentary on all regular defenders.

Dewan got help with this section, and credits his collaborators. Though there is no way for one man to have written every comment, the downside is that the quality of comments is somewhat inconsistent. Some are in-depth and put the Plus/Minus numbers we've seen in some context; others make you wonder if anyone ever saw the defender play. For instance, about Steve Finley, it is said, "[he] is still solid," which answers a question that has oft kept me up at night: "How would I evaluate Steve Finley's defense, assuming I had the observational powers of Doug Eddings?"

Another odd comment comes in regard to Legs Figgins: "Some scouts feel that the Angels like to move him around because he can be exposed defensively by just playing one position every day." Now, I'm not blaming the book for this, since they're reporting something, but that strikes me as batty; we've seen that Figgins gets better at a position the more consistently he plays there. This happened at third base, and his solid defense at second in 2005 (when he had all of Spring Training to prepare) was head and shoulders above his defense there in 2004 (it had been a while since he played there, and it showed).

Despite a few shortcomings in individual comments, the subjective evaluations are a good complement to all the cold data that make up the rest of the book, and this section should prove to be a worthwhile resource, and a quick and reliable guide for fans who don't like reading through charts.

Also, there are some interesting numbers in the commentary section. For each defender listed, we get home/away splits for 2005 (last year Manny Ramirez was -13 plays at home, but only -1 on the road), as well as splits for defense behind right-handed and left-handed pitchers. That last choice strikes me as odd; I'd be more interested in seeing results based on righty and lefty batters. I also wish this data were presented for more years than 2005; if Khalil Greene was 14 plays worse [10.5 runs] at home than on the road in 2005, does that say something about the characteristics of the park, or is it just random?

Anyway, after the comments section, we get a few articles: Bill James presents a Relative Range Factor to try to make sense of traditional fielding statistics, Dewan presents an improved version of Zone Rating (I'll spare you the detail, but players' performances in and out of their "zone" are presented, which is very nice), and in following sections we get data on infielders fielding bunts and turning double plays, and outfielders holding and/or throwing out opposing baserunners. Again, it's somewhat frustrating that the last category isn't adjusted for anything; as you all know, there are some batted balls and situations where the slowest runner will advance on the strongest-armed outfielder, and others where the weakest arm will hold a speedy sir; such factors include where the ball is hit, the number of outs, and whether or not a baserunner was in motion with the pitch. Numerous people have looked at this to some degree, including recently John Walsh in the Hardball Times, and people like Dan Fox and James Click have looked at baserunning in such a way that doing the "flipside" for outfielders holding runners should be a fairly simple task for those with the resources of BIS. The outfield arm ratings in The Fielding Bible, while not meaningless, are identical to those found in STATS' inaugural Scoreboard that came out more than fifteen years ago. Bill James rounds out the articles talking about Defensive Misplays, which he is trying to objectively record.

The Fielding Bible isn't perfect, but to the author's credit, it doesn't pretend to be. It positions itself as a first step, and while its information is a step already made, the presentation of this information is strong enough to make this book unique and eminently useful. Hopefully, Dewan and BIS will continue to publish this book, make its results available during the season, and start to integrate run conversions, park adjustments, and other factors into their ratings. With more and more people able to look at the data, draw their own conclusions, and toy with the numbers, our enjoyment and understanding of the game should be enriched. The Fielding Bible is a good step toward making that a reality.

Comments:
David Gassko also has a review on the Hardball Times. Excellent. It's like we're in Saber School and we all get the same assignment for Book Reports.
 
Wait -- you mean I'm not being graded on this?
 
Excellent review, and I agree with all of it.

-- Tangotiger
 
It does not approach me. There are other variants?
 