
Minor League Equivalencies
A conversation on the subject between
> DG: David Grabiner
> GH: Gary Huckabay
>GH> Why should we? Minor league statistics do a fine job of predicting
>GH> how well someone will hit in the majors. About as good a job as major
>GH> league statistics. If you wait until someone PROVES they can play in
>GH> the majors before you play them, you're losing a lot of productive
>GH> player-years, and you're not going to be a long-term contender.
I wrote something similar myself, and decided to check it out. I should
now revise the claim: MLE's are just as good as major-league statistics
in predicting future performance, with the exception of strikeouts and
walks.
>DG> I can confirm this. Bill James did MLE's for 30 rookies in the 1985
>DG> Baseball Abstract, and Frobel was the only player whose MLE was
>DG> inconsistent with his actual performance; his MLE was a .282 average,
>DG> and he hit .203. No other MLE was off by more than 48 points.
(numbers corrected slightly from previous posting)
I checked Bill James's sample; he looked at every rookie who had an MLE
based on at least 200 AB in 1983 and then had at least 200 AB in the
majors in 1984. The average change in batting average was 25 points.
For comparison, I looked at a sample of 30 players who had 200 AB in two
consecutive seasons. I chose the 1988 and 1989 seasons because the 1990
STATS handbook was the reference I had at the time, and looked at the
first 30 players in alphabetical order who met these standards. The
average change in batting average was 24 points. There were a similar
number of major errors.
In the MLE sample, Doug Frobel lost 79 points of batting average, Ken
Phelps lost 48 (but hit with more power and walks), and Bobby Meacham
gained 42. In the major-league sample, Wally Backman lost 72 points,
Jerry Browne gained 70, Greg Brock gained 53, and Kevin Bass gained 45.
I defined a major error in the approximations as a number which was at
least two standard deviations away from the predicted value, excluding
very small values such as a prediction of one homer for a player who
actually hit four. (In such cases, the statistical assumptions break
down.) A perfect prediction method would still make about one major
error for every 20 predictions, just because of normal statistical
fluctuation. If a player's abilities did not change from one year to
the next, using one year to predict the next year would make about one
major error for every six predictions, because both seasons are subject
to that fluctuation.
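The two baseline rates above can be checked with a short calculation.
This is only a sketch, and it assumes something the post does not spell
out: that each season's statistic is normally distributed around the
player's true ability with the same standard deviation, and that the
two seasons are independent. The function name two_sided_tail is made
up for this example.

```python
import math


def two_sided_tail(z):
    """Return P(|Z| > z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2))


# Perfect prediction: the forecast equals true ability, so the observed
# season misses it by more than two standard deviations with this chance.
perfect = two_sided_tail(2.0)

# One season predicting the next with unchanged ability (assumption:
# independent seasons with equal spread). The difference of two seasons
# has a standard deviation sqrt(2) times that of one season, so a gap of
# two single-season SDs is only sqrt(2) SDs of the difference.
year_to_year = two_sided_tail(2.0 / math.sqrt(2))

print(f"perfect method: {perfect:.3f} (about 1 in {1 / perfect:.0f})")
print(f"year to year:   {year_to_year:.3f} (about 1 in {1 / year_to_year:.0f})")
```

Under those assumptions the rates come out near 0.046 and 0.157, which
matches the rounded "one in 20" and "one in six" figures.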
There were 21 major errors in the major-league data, out of 138
meaningful projections (average, doubles, walks, and strikeouts for
everyone, and homers for 18 of the 30 players). There was also one
error in a projection of a small number; Marty Barrett hit one homer in
1988 and 11 in 1989.
There were 29 major errors in the MLE data, out of 137 meaningful
projections, and one error in a projection which wasn't likely to be
meaningful; Juan Samuel projected to hit 8 triples and hit 19.
However, 11 of 30 MLE projections for walks, and 9 of 30 for strikeouts,
were off; that's 1/3 of those 60 projections. (6 of 30 major-league walk
predictions and 5 of 30 strikeout predictions were off, about what
would be expected.) The errors were not consistently in either
direction. This suggests that MLE's are not good for projecting
strikeouts and walks, probably because of differences in minor-league
pitching.
Everything else projects just as well from MLE's as from major-league
data.
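The counts quoted above can be turned into error rates for comparison
with the one-in-six baseline. The sketch below uses only the totals
given in this post. It assumes that all 60 walk and strikeout
projections in each sample are counted among the meaningful ones; the
post says so for the major-league sample but not explicitly for the
MLE sample. The names rates, major_league and mle are made up for this
example.

```python
# Counts quoted in the post, as (major errors, projections).
major_league = {"all": (21, 138), "walks": (6, 30), "strikeouts": (5, 30)}
mle = {"all": (29, 137), "walks": (11, 30), "strikeouts": (9, 30)}


def rates(sample):
    """Return major-error rates overall, for walks plus strikeouts,
    and for every other category combined."""
    errors, total = sample["all"]
    bb_k_errors = sample["walks"][0] + sample["strikeouts"][0]
    bb_k_total = sample["walks"][1] + sample["strikeouts"][1]
    return {
        "overall": errors / total,
        "walks_strikeouts": bb_k_errors / bb_k_total,
        # Assumption: every walk and strikeout projection is included
        # in the sample's count of meaningful projections.
        "everything_else": (errors - bb_k_errors) / (total - bb_k_total),
    }


for name, sample in (("major-league", major_league), ("MLE", mle)):
    summary = ", ".join(f"{k} {v:.3f}" for k, v in rates(sample).items())
    print(f"{name}: {summary}")
```

Under that assumption, the rate outside walks and strikeouts is about
0.12 for the MLE sample (9 of 77) and about 0.13 for the major-league
sample (10 of 78), both below one in six.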
As for the inconsistency of Doug Frobel's MLE with his performance (and
of Ken Phelps's, whose season was as good as his MLE overall, but with
more walks, fewer strikeouts, fewer doubles and a lower average), the
major-league sample shows the same thing: Backman, Brock, and Browne
each had two consecutive seasons which don't look like they belong to
the same player.
As an interesting side note, Buechele was responsible for two of the
major errors in the major-league sample; he fell from 65 walks to 36,
and increased from 79 strikeouts to 107. It could happen again.