Friday, January 07, 2005

       G   AB    H  2B  3B  HR  SO  BB  SB  CS  AVG  OBP  SLG  OPS+   EqA  ZR(RF)

2004 156  612  206  39   2  39  74  52  15   3  337  391  598   154  .322  .871(5)
Car. 156  588  191  36   5  34  75  58  19  10  325  390  589   147  .315  .886
Pro. 144  546  179  33   2  34  68  64  19   8  327  401  585   149

Was Vlad really the MVP of the league last year?

As I see it, there are/were about a dozen players in the running. Let's take a look at them all as measured by several different metrics that are out there. I like looking at a panoply of different measures because no one of them is perfect or tells us everything.

(Key: What we have here are Clay Davenport's Pitching Runs Above Average or Equivalent Runs Above Average from Baseball Prospectus; Keith Woolner's Value Over Replacement [Position], also from BPro; the player's ERA+ or OPS+ as reported by Baseball Reference; and Lee Sinins' Runs Saved or Runs Created Above Average from The Hardball Times Baseball Annual.)

For summary statistics, including offense and defense, we have Win Shares as reported by The Hardball Times and Davenport's Wins Above Replacement Player 1 (WARP1) from BPro. After Vlad, they are all listed alphabetically.

            PRAA        ERA+  RSAA
Player       RAA  VORP  O/E+  RCAA  WS  WARP1

Vlad          51  88.5   154    56  29    9.2
CGuillen      31  70.5   141    36  23    8.9
Hafner        44  70.9   158    51  22    6.6
Ichiro!       39  80.9   135    56  27    8.6
Matsui        36  57.5   139    44  30    6.3
Mora          44  73.6   149    54  25    6.4
Ortiz         38  71.3   145    46  25    6.4
Ramirez       43  68.6   152    49  27    6.7
A-Rod         34  62.3   133    35  30    8.6
Santana       48  88.8   182    54  27   11.4
Sheffield     38  63.4   143    40  31    7.3
Tejada        28  73.0   126    23  30    9.9

Okay, let's pare this down.

First of all, we can throw out the Yankees immediately. The only reason I have them on here is because of the Win Shares. Y'see, there's this kind of glitch in that system ... if a team wins more games than you would expect from their runs scored and allowed, those extra wins get distributed evenly to the team's players. The Yankees won a lot more games last year than they "should have," so all their players benefit. In theory, that seems fine, but in practice, I think it's silly. Sheffield and Matsui were not legit MVP candidates, and A-Rod was very good (fueled by excellent defense) but not quite there, either. They gone.

Sticking with the AL East, Manny Ramirez is a wonderful hitter, but he can't field at all and is a disaster on the basepaths. Meanwhile, David Ortiz doesn't field at all, nor does Travis Hafner. Melvin Mora: good bat, bad glove. Good hitters all, but see ya.

Ichiro and Guillen can go, too. Ichiro wasn't as good as Vlad, and though Carlos Guillen was at least as good as Miguel Tejada on a per game basis, he got hurt and didn't finish the season. That counts. So if neither one of these guys is the tops in the league at their position, how could they be MVP? Goodbye.

So, back to the final three:

Player       RAA  VORP  O/E+  RCAA  WS  WARP1

Vlad          51  88.5   154    56  29    9.2
Santana       48  88.8   182    54  27   11.4
Tejada        28  73.0   126    23  30    9.9

Well, things don't look so great for our hero in the summary stats. He's second in Win Shares and third in WARP1. But let's take a closer look.

First of all, it's obvious that Vlad created about as many runs beyond average as Santana prevented. But remember, Vlad does things with his glove, and Santana doesn't really.

Of course, it's in the glove that Tejada gains on Vlad. As a plain ol' hitter, Guerrero cleans Tejada's clock, whipping him by nearly 30 points of OPS+ and somewhere between 20 and 30 runs. The question becomes: does Tejada's being a good defensive shortstop in 2004 overcome this difference?

The problem with evaluating defense is that even most of the good work done out there is pretty clunky. Bill James with Win Shares and Clay Davenport have done decent work, but there's a real limit on what can be done without play-by-play information. It's guesswork, and sometimes it's flat-out wrong.

The best publicly available defensive metric that deals in runs is Mitchel Lichtman's Ultimate Zone Rating. The problem is that full data stopped being public this season, as MGL is now employed by the Cardinals. (Well, another problem is that it isn't perfect, either, but I think it's a lot closer in most cases than the other systems I've mentioned.) However, he does chime in at BTF every now and then with figures for a select few. He did so in this thread.

He finds Tejada to have been +20 defensively last season (Davenport says +17). He also says Vlad was -1 (Davenport says +1), so in both cases UZR is actually very close to what Davenport uses.

MGL also has slightly different offensive numbers. As it was on a public forum, I will quote directly from that linked thread: "Tejada was +20 UZR plus +23 in batting lwts, or +43. Vlad was -1 UZR (0 in arm) and 43 in batting lwts, or +42."

Given the accord between UZR and Davenport, I am comfortable in saying Tejada was around +20 and Vlad around average. This also comports with zone rating, which has Vlad in the middle of the pack amongst AL right fielders and Tejada closer to the top amongst shortstops. Win Shares, as best as I can tell, has maybe a 10-run lead for Tejada.

Now, MGL does some things with positional adjustments I don't fully support, so let's back off and summarize what we (think we) know:

Vlad was about 20-30 runs better with the bat than Miguel Tejada.
Miguel Tejada was about 10-20 runs better as a shortstop than Vlad was as a right fielder.

Now, intuitively, we all pretty much know a shortstop's defense is more crucial than a right fielder's defense, if only because the shortstop handles more chances. Yes, the right fielder's chances have more extra-base potential than a shortstop's, but in general the shortstop has been considered more important.

If this is true, to what degree is it true? Over the course of a season, is an average defensive shortstop worth 2 more runs than an average right fielder, or 20 more runs?

Traditionally, what sabermetricians have done is use the offensive difference between positions as a proxy for this when making positional adjustments. I reject this entirely. Well, not entirely ... it has applications, but doesn't really address our question.

As MGL alludes to in the BTF thread to which I linked above, Tangotiger has done some work on this. Over 150 games, the difference between an average shortstop and an average right fielder is about 10 runs.

So the average shortstop starts out at +10 against an average right fielder. Tejada gets an additional 20 for being above average for a shortstop, so he's up to +30.

Now Vlad, we decided, was an average defender last year, so he's still at zero.

When we bring offense into the equation, Vlad picks up. He was probably about +50 last year -- MGL had him a bit lower, and Sinins has him a bit higher, so let's stay around Davenport's number (+51), which is in the middle. So we have Vlad +50, Tejada +30.

Tejada was around +25 or so in offense (MGL says +23, Davenport says +28, Sinins says +23). So that gets Tejada up to +55.

Using MGL's numbers strictly, but with Tango's positional adjustments, Tejada leads 53-42. Using Davenport's numbers with those positional adjustments, it's 55-52 for Tejada. Using Sinins' offense with the defensive average and the same positional adjustments, it's Vlad 56-53.
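If you want to check my arithmetic, here is a quick sketch in Python that just adds up the figures quoted above: bat plus glove plus the 10-run shortstop adjustment. The dictionary layout and function names are mine, made up for illustration, and for the Sinins row I am assuming "the defensive average" means the round numbers we settled on (+20 for Tejada, 0 for Vlad).

```python
# Runs above average for 2004, as quoted in this post.
# Positional adjustment: an average SS starts +10 against an average RF.
POSITION_ADJ = {"SS": 10, "RF": 0}
POSITION = {"Tejada": "SS", "Vlad": "RF"}

# Each system's (bat, glove) runs. The Sinins glove values are an
# assumption: the rounded consensus (+20 and 0), since Sinins supplies
# offense only.
SYSTEMS = {
    "MGL":       {"Tejada": (23, 20), "Vlad": (43, -1)},
    "Davenport": {"Tejada": (28, 17), "Vlad": (51, 1)},
    "Sinins":    {"Tejada": (23, 20), "Vlad": (56, 0)},
}


def total_runs(player, bat, glove):
    """Bat + glove + positional adjustment for one player."""
    return bat + glove + POSITION_ADJ[POSITION[player]]


def compare(systems=SYSTEMS):
    """Return {system: {player: total runs}} for every system."""
    return {
        name: {p: total_runs(p, bat, glove) for p, (bat, glove) in players.items()}
        for name, players in systems.items()
    }


if __name__ == "__main__":
    for name, totals in compare().items():
        gap = totals["Tejada"] - totals["Vlad"]
        print(f"{name:10s} Tejada {totals['Tejada']:+d}  Vlad {totals['Vlad']:+d}  gap {gap:+d}")
```

Depending on whose numbers you plug in, the gap swings from Tejada by 11 to Vlad by 3, and that spread is bigger than the edge any one system gives either player.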

Now maybe it's just me, but if you have one guy at +55 and one guy at +50, that's a toss-up.

Look, it's foolish and dangerous to pretend we know things we don't. There has been a lot of great baseball research in the last twenty years, and as I'm sure you can tell, I'm quite fond of it. But I think the idea that we have everything pegged down to five runs -- especially when such a big part of the equation is defense, about which our measures are far less than perfect -- is a bit silly.

Obviously, you can make the argument that Tejada was better/more valuable/whatever than Vlad last year. I can't tell you you're wrong.

But you can also make the argument that Vlad was better.

Coming back to Vlad vs. Johan Santana for a moment ... Johan saved about 50 runs beyond average and Vlad was +50 for offense, defense, and position, so there you go. Another toss-up. And while it's true that a run saved is worth ever so slightly more than a run created, the effect is tiny, and the difference between Vlad and Santana in terms of runs is sufficiently negligible that picking one as being better as the God's truth is out of the question.

The way I see it, there are three guys who are too close to call, all wrapped up within about five runs of each other. Now, I haven't considered the fact that Vlad and Santana played for division winners when Tejada did not; that's because I don't think it matters. That sort of thing is a last-ditch tiebreaker for me; I couldn't care less about the quality of the rest of a team in determining the most valuable player. Well, we do have pretty much a tie here, so perhaps that should be a factor. If there were ever a case, this would be it.

Sometimes you look at an MVP award and you can say without a doubt: "That's a mistake." Sosa over McGwire, Tejada over A-Rod, Juan Gonzalez a couple of times, and so on. And then sometimes you look at an MVP award and you say, "I'd argue for this guy, but it's pretty damn close either way." At worst, Vlad's win falls into that category. He was deserving, he earned it, and he'll be a bargain for the Angels even if he drops off a bit in 2005.

What my ballot would have looked like, had I gotten my act together in time for the Internet Baseball Awards:

1. Vlad
2. Santana
3. Tejada
4. Carlos Guillen
5. A-Rod
6. Ichiro!
7. Mora
8. Ramirez
9. Sheffield
10. Matsui

