Hafner, Hope and Small Samples
The brightest light in what has been a dark beginning to the Tribe's season has been the performance of Travis Hafner. As has been well chronicled, Hafner was a major question mark coming into this season following a regression in 2007, a dismal 2008, and off-season shoulder surgery. The question of Hafner is made many times larger by the $57M the Indians are on the hook for him through 2012. A repeat of last year would seemingly assure the Indians of the largest contractual albatross in club history. A return to pre-2007 levels would alternatively be a huge boost to what should be a good offensive lineup even in Pronk's absence.
As we are pretty much all aware, Hafner has begun the season with 3 HRs, a double and two singles in his first 24 plate appearances (spanning 5 games). This is great. But does it give us any indication if Pronk is back? One of the things I found most striking about Jay's interview with Antonetti was the repeated references to the need to get a large enough sample to assess performance and the difficulties of performance analysis in the context of small samples (hello, bullpen). This is a fairly well addressed topic in sabermetric research, with several studies examining how many plate appearances are necessary for current season performance to be reliably predictive of total season performance (i.e., how many ABs do you need before something like SLG% or OBP% becomes significant?). 24 plate appearances is about an order of magnitude too small to predict what Hafner will do the rest of the season.
But it doesn't mean we can't ask questions about his performance thus far. As it turns out, my real life profession regularly involves me asking questions strongly constrained by sample size (morphological variability in the fossil record). And while sample size is a major obstacle, it is possible to ask simple questions even given the limited information available to us.
So that's what I want to do, ask a simple question about Hafner 2009. And my question is this - what is the likelihood of observing Hafner's 2009 performance across 24 plate appearances in the 2008, 2007, or 2006 Hafner seasons?
To do this, I did a few things. First, I coded Hafner's plate appearances over the past three years based on result: how many outs, walks, singles, doubles, triples and HRs (for simplicity's sake, I'm excluding sacrifices, HBP, etc). I then created a distribution of performance based on 24 plate appearances for each of Hafner's past three seasons by randomly drawing 24 plate appearances out of the entire sample of his season and repeating this process 10,000 times. For example, in 2006 Hafner had 563 plate appearances, 100 of which were BBs, 66 of which were singles....42 of which were HRs. From those, I randomly drew 24 plate appearances and calculated the OPS across them, which gives one simulated value to compare against Hafner's actual 24 plate appearances this season. Repeating the draw 10,000 times for each season builds the distribution.
For those worried about my sanity, this only takes about 5 minutes and 10 lines of code. This is essentially a "bootstrap" test for those with a little stats background.
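For anyone who wants to try the resampling themselves, here is a minimal sketch in Python. It is not the code used for this post. The season line in `example_season` is made up for illustration (it is not Hafner's real data), the function names are my own, and I am assuming the 24 plate appearances are drawn without replacement and that the observed OPS is about 1.217.

```python
import random

# Bases earned per outcome; outs and walks earn none.
BASES = {"out": 0, "bb": 0, "1b": 1, "2b": 2, "3b": 3, "hr": 4}


def ops(pas):
    """OPS for a list of outcome labels, ignoring sacrifices, HBP, etc."""
    walks = pas.count("bb")
    at_bats = len(pas) - walks
    hits = sum(1 for pa in pas if BASES[pa] > 0)
    total_bases = sum(BASES[pa] for pa in pas)
    obp = (hits + walks) / len(pas)
    slg = total_bases / at_bats if at_bats else 0.0
    return obp + slg


def share_of_trials_at_or_above(season_counts, observed_ops,
                                n_pa=24, trials=10_000, seed=0):
    """Fraction of random n_pa-sized draws whose OPS reaches observed_ops."""
    rng = random.Random(seed)
    # Expand the aggregate counts into one label per plate appearance.
    season = [outcome for outcome, count in season_counts.items()
              for _ in range(count)]
    # Assumption: draw without replacement, like shuffling and taking 24.
    reached = sum(ops(rng.sample(season, n_pa)) >= observed_ops
                  for _ in range(trials))
    return reached / trials


# Hypothetical season line for illustration only, NOT Hafner's real numbers.
example_season = {"out": 330, "bb": 80, "1b": 70, "2b": 25, "3b": 1, "hr": 24}
share = share_of_trials_at_or_above(example_season, observed_ops=1.217)
print(f"{share:.1%} of simulated 24-PA samples reach an OPS of 1.217")
```

Swapping in a real season's counts and the observed OPS gives the tail share directly. Fixing `seed` just makes a run repeatable.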
As a reminder, I'm interested in what the likelihood is of observing Hafner's 2009 performance in 24 plate appearances based on his 2008, 2007 or 2006 performance.
Here are the results:
Compared against his 2008 performance (for which he put up a stellar .628 OPS in 233 PAs), Hafner's 2009 performance was better than all but 111 of the 10,000 trials. In other words, his numbers thus far would put him at roughly the 99th percentile of expectations based on his 2008 performance (see distribution below). With only about 1.1% of trials doing as well or better, we can make a pretty strong claim that Hafner's performance is better than you would expect based on his 2008 numbers.
Compared against his 2007 numbers, Hafner's 2009 totals exceed the simulated trials in 9,065 of the 10,000 trials. Being conservative, although this is pretty far onto the right half of the distribution (see below) we can't say that his performance thus far has been different than what you would expect from his 2007 numbers.
Finally, compared against the 2006 season, Hafner's 2009 totals exceed the simulated trials in 6,570 of the 10,000 trials. This is pretty much smack in the middle ground of the simulated distribution.
So what does this mean? It provides some evidence, even given the limited data available to us, that Pronk is back. This in no way suggests he is going to continue on this path. This in no way predicts what he will do the remainder of the season. But it suggests his performance thus far is pretty inconsistent with his performance last season, which, since he was terrible, is reason for hope.
Comments
This is awesome. I wish I had any clue at all how to do things like this. Will you teach me?
I'm *always* in the driver's seat, cugino -- Chuck
It’s actually very easy. Basically:
- enter your data into something (you can do this in excel)
- randomize the order of your data
- draw 24 things and calculate whatever metric you are comparing (OPS in this case)
- compare your randomly simulated data (what you’ve just calculated) with your observed data
- save that information
- repeat a lot of times
this is the best thing i’ve seen since my sister-in-law showed me a peep in the microwave yesterday.
by Brick. on Apr 13, 2009 5:25 PM EDT
I should add – there are a few assumptions with this approach. The first is that each plate appearance is independent. This probably isn’t actually true, but it is difficult to get around and I’m guessing it doesn’t have a major impact on the final result. It would be possible to consider Hafner’s performance in 24 consecutive plate appearances, but that would necessitate me entering the exact sequence of Hafner’s plate appearances, which I have no desire to do. The second is that the level of competition Hafner has faced is equivalent to that of previous seasons. This is potentially a more substantive problem, but with the limited data available to us in 2009, it can’t really be corrected.
yes – to do 24 consecutive I would have had to enter data for each season in sequence and not just as aggregate data
You don’t have an intern for that sort of thing?
Though I look right at home, I still feel like an exile
by Manhattan Tribe Fan on Apr 13, 2009 5:37 PM EDT
I can’t get around this. These aren’t 24 randomly selected at-bats. I think you did great work here but it would be significantly more meaningful the other way.
by jakesinger777 on Apr 13, 2009 6:49 PM EDT
Here’s one paper on topic. I still think it’s not going to make a huge difference in Adam’s study though. It would broaden the distribution curves a bit, but I doubt enough to change his conclusion.
In 2006, Pronk had 541 combinations of 24 consecutive PA. Of these, 198 (or about 36.6%) produce an OPS of more than 1.217.
In the next 5 2009 PAs, of course, Pronk’s OPS dropped to 0.985. Changing the 2006 data to streaks of 29 PAs shows that 2006 Pronk had an OPS of more than 0.985 in 323 of 536 samples (60.3%).
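The consecutive-PA count can be sketched the same way, given a season's plate appearances in game order. This is a rough illustration and not the code behind the numbers above. The sequence here is a shuffled, made-up season (not Hafner's actual 2006 log), and `window_share_above` is a name I invented.

```python
import random

# Bases earned per outcome; outs and walks earn none.
BASES = {"out": 0, "bb": 0, "1b": 1, "2b": 2, "3b": 3, "hr": 4}


def ops(pas):
    """OPS for a list of outcome labels, ignoring sacrifices, HBP, etc."""
    walks = pas.count("bb")
    at_bats = len(pas) - walks
    hits = sum(1 for pa in pas if BASES[pa] > 0)
    total_bases = sum(BASES[pa] for pa in pas)
    return (hits + walks) / len(pas) + (total_bases / at_bats if at_bats else 0.0)


def window_share_above(sequence, window, threshold):
    """Count consecutive-PA windows whose OPS exceeds threshold.

    Returns (windows above threshold, total number of windows).
    """
    total = len(sequence) - window + 1
    above = sum(ops(sequence[i:i + window]) > threshold for i in range(total))
    return above, total


# Hypothetical 564-PA season in a made-up order, NOT a real game log.
counts = {"out": 350, "bb": 90, "1b": 70, "2b": 25, "3b": 1, "hr": 28}
sequence = [outcome for outcome, n in counts.items() for _ in range(n)]
random.Random(0).shuffle(sequence)

above, total = window_share_above(sequence, window=24, threshold=1.217)
print(f"{above} of {total} windows of 24 consecutive PAs beat 1.217")
```

A sequence of 564 plate appearances yields 541 windows of 24 and 536 windows of 29. A real game log would preserve hot and cold streaks, which a shuffled sequence by construction does not.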
Quick question (for my sake): is it clear based on my description what I did and how it addresses my question?
I thought so.
Though I look right at home, I still feel like an exile
by Manhattan Tribe Fan on Apr 13, 2009 5:36 PM EDT
what are you, some kind of nerd?
by Cap'n Snegiryov on Apr 13, 2009 5:46 PM EDT
And I just noticed Brick’s fanshot. Hope. And remember, even though we’ve only won 1 game, we’re only 4 games under .500. As I used to repeat mantra-like as an undergrad, getting behind early just gives you more time to make it up.
Me: lazily cherry pick an encouraging quote from an article I was reading and link it.
You: do substantive analysis, compile and present it.
good community = quality + depth
LGT = good community
by APV on Apr 13, 2009 5:51 PM EDT
along those same lines: web 2.0 + indians fans = LGT > any other analysis you can find in the MSM
kind of funny how this went up the same day as jay’s fanshot of the morning journal article—that is, seeing your work thrown up against that “analysis.” you produced something in five minutes that the LMJ guy couldn’t even dream of doing—no wonder the interwebs have the mainstream writers going into a full blown epistemological crisis. nice job btw.
by Cap'n Snegiryov on Apr 13, 2009 6:08 PM EDT
Good stuff! I’m going to show this to my stats class tomorrow — we’re talking about probability distributions right now.
well…you can tell them there are a few logical alternatives to what I did. Instead of creating a distribution based on 10,000 random draws of 24 (which is the simplest approach computationally), I could have created distributions based on every possible combination of 24 plate appearances (arguably more accurate, but in practice probably not different at all). I also could have looked at every possible combination of 24 consecutive plate appearances (but see above for the data difficulties).
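For the "every possible combination" alternative, one way to sketch it is to enumerate every possible mix of outcomes in 24 plate appearances and weight each mix by the number of ways it can be drawn from the season. This is only an illustration of that idea, under the same simplifications (no sacrifices or HBP). The season tuple is made up, and the function names are mine.

```python
from math import comb

# Outcome order used throughout: out, bb, 1b, 2b, 3b, hr.
BASES = (0, 0, 1, 2, 3, 4)


def ops_from_counts(counts):
    """OPS for a tuple of outcome counts in the order given above."""
    n = sum(counts)
    walks = counts[1]
    at_bats = n - walks
    hits = sum(counts[2:])
    total_bases = sum(b * k for b, k in zip(BASES, counts))
    return (hits + walks) / n + (total_bases / at_bats if at_bats else 0.0)


def compositions(total, caps):
    """Yield every tuple of counts summing to total with counts[i] <= caps[i]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for k in range(min(total, caps[0]) + 1):
        for rest in compositions(total - k, caps[1:]):
            yield (k,) + rest


def exact_tail(season, observed_ops, n_pa=24):
    """Exact share of all n_pa-sized subsets whose OPS reaches observed_ops."""
    denom = comb(sum(season), n_pa)
    tail = 0
    for counts in compositions(n_pa, season):
        if ops_from_counts(counts) >= observed_ops:
            ways = 1
            for available, taken in zip(season, counts):
                ways *= comb(available, taken)  # ways to pick this many of each
            tail += ways
    return tail / denom


# Hypothetical season line for illustration only, NOT Hafner's real numbers.
example_season = (330, 80, 70, 25, 1, 24)
print(f"exact tail share: {exact_tail(example_season, 1.217):.4f}")
```

With 10,000 random draws the simulated share should land close to this exact value, which is why the two approaches are unlikely to differ in practice.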
you’re just going to have to redo it anyway when he goes yard twice off of Greinke tonight.
by Brick. on Apr 13, 2009 6:03 PM EDT
Seriously, and this brings up another point. Watching the past couple games (I guess 3, since Pronk didn’t play in one of them), it occurred to me. I had total confidence in Pronk coming up in that situation yesterday and I couldn’t remember how long it’d been since I felt that way. Then it hit me. I never actually lost that utter faith in the man, it was more like each time he failed anew over the past 2 years a small piece of it died within me, but never completely.
by jakesinger777 on Apr 13, 2009 6:07 PM EDT
I’m not so sold yet. He looked like he was wincing and taking a lot of time in between pitches in the ABs I’ve seen. He could just grimace and wince because he’s a bad dude, but I’m still fearful he’s hurting. Also, he got a day off Saturday – I don’t know if it was to protect him from Halladay, basic rest, or something more ominous.
I agree. He was atrocious, but I always was thinking “This is the at-bat where he destroys the ball and turns it around.” It was the opposite of Casey Blake, who even when he was doing well, I thought: “This guy is going to strike out looking.”
by OddlyGaussian on Apr 13, 2009 6:20 PM EDT
This raises a real issue, though, which is the ‘sensitivity’ of this analysis to sample size. For example, an 0-4 at this point is going to make a significant difference in what Hafner’s performance looks like at the moment. And I’m also not totally convinced on Hafner. I watched him in 4 straight games the penultimate week of spring training and came away decidedly unimpressed. But I’m hopeful.
I’m going to blame it on Scott Lewis’s elbow.
Though I look right at home, I still feel like an exile
by Manhattan Tribe Fan on Apr 13, 2009 6:56 PM EDT
A good community knows how to use the word “penultimate.”
Though I look right at home, I still feel like an exile
by Manhattan Tribe Fan on Apr 13, 2009 6:56 PM EDT
This is good news, I think. Though I was made to understand there would be no math.
B-Man would be proud, Adam.
Il faut d'abord durer.
by CU Adam on Apr 13, 2009 7:26 PM EDT
All the time I spent in that classroom was either detention or some theology class being held there for some reason. There’s a reason I became a lawyer. I wish this sort of stuff made any kind of intuitive sense to me. You explain it well and simply, though, and that is very appreciated.
Il faut d'abord durer.
A college buddy of mine said he once drew a full-body Chief Wahoo over an entire one-page B-man quiz. B-man graded each problem individually and gave him a C-minus.
by fleerdon on Apr 14, 2009 9:41 AM EDT
I was horrified to receive his first quiz after I misinterpreted his study tip of “Know Definitions” to be “No Definitions”.
by The DiaTriber on Apr 14, 2009 9:56 AM EDT
I wanted to repost this here since APV did an excellent job in his analysis and this is another way of answering the “Is Pronk back?” question.
Well, a quick look at hit tracker would say that his line drive HR on 4/10/09 against TOR was harder hit, if you go by speed off the bat (115.8), than the one he hit yesterday (113.1), but the one yesterday went farther: 421 ft true distance versus 379 ft true distance.
He hit one of his five HRs from last season harder (116.2) on 4/4/08 in Oakland.
None of his 24 HRs in 2007 were harder hit than either of the two listed above that he has hit this year, and going back to 2006 he had 7 HRs hit at 113.0 and above and only 2 HRs hit harder than the line drive HR on 4/10/09 listed above. 2006 was the year he hit 42 HRs.
All in all, Wedge is pretty spot on in regards to Hafner’s power showing already in the early month of the season. It’s pretty impressive that he has hit those two most recent HRs as hard as he has.
