ERA++

by Kevin Harlow
June 5, 2003


Purpose
In this paper I will derive ERA++, or "ERA plus plus", an extension of ERA+ that accounts for differences in the distribution of ERA from league to league.

Background
ERA has historically been one of the measures used to evaluate a pitcher. One problem with ERA is that its reference level is not constant across seasons: a 3.50 ERA was great in 1998 but would have been below average in 1968. Also, some parks, such as Coors Field, are conducive to high run scoring, while others, such as Dodger Stadium, make runs harder to score. ERA+ adjusts for both effects:

(ERA+) = (PF*LgERA)/(ERA)

Where:
PF = Park Factor/100
LgERA = League average ERA
(ERA+) = Ratio of a league average pitcher's ERA in the pitcher's park to the pitcher's ERA
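
As a minimal sketch of the ERA+ formula above, the calculation might look like this in Python. The function name and the sample inputs are hypothetical and chosen only for illustration; the result is a ratio, so 1.0 corresponds to a league average pitcher.

```python
def era_plus(era, league_era, park_factor):
    """Return ERA+ as a ratio (1.0 = league average).

    park_factor is on the 100-based scale, so PF = park_factor / 100.
    """
    if era <= 0:
        raise ValueError("ERA must be positive")
    pf = park_factor / 100.0
    return pf * league_era / era


# Hypothetical inputs, for illustration only.
print(round(era_plus(era=3.00, league_era=4.50, park_factor=100), 3))  # 1.5
```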

Theory
(ERA+) is a simple yet effective means of comparing pitchers regardless of the era or park in which they pitched. However, it does not account for differences in the distribution of ERA within each league.

If you assume that, within each league-season, the park adjusted ERAs approximately follow a normal distribution, then you can express a pitcher's ERA as

ERA = (PF*LgERA) - z * Stdev(PF*LgERA)

Where:
Stdev(PF*LgERA) = Standard deviation of park adjusted league ERAs
z = number of standard deviations ERA is below park adjusted league average (written as alpha in the equations that follow)

z = [(PF*LgERA)-ERA] / [Stdev(PF*LgERA)]

If you look up z in a standard normal table you can find the cumulative probability, that is, the share of pitchers in the given league and park who would be expected to post a higher (worse) ERA. For example, a pitcher whose ERA is two standard deviations below the league average ERA would be at about the 97.7th percentile. These z-scores or percentiles could be used directly to rank pitchers or pitcher seasons, just as ERA+ is.
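
The table lookup can also be done in code. The following is a rough sketch that assumes the normal approximation described above and uses the standard library's statistics.NormalDist (Python 3.8 or later); the function names and sample numbers are hypothetical.

```python
from statistics import NormalDist


def era_z_score(era, park_adj_league_era, park_adj_stdev):
    """z = [(PF*LgERA) - ERA] / Stdev(PF*LgERA); positive means better than average."""
    if park_adj_stdev <= 0:
        raise ValueError("standard deviation must be positive")
    return (park_adj_league_era - era) / park_adj_stdev


def era_percentile(z):
    """Standard normal cumulative probability at z."""
    return NormalDist().cdf(z)


# Hypothetical inputs: an ERA two standard deviations below the adjusted average.
z = era_z_score(era=2.50, park_adj_league_era=4.50, park_adj_stdev=1.00)
print(z, round(era_percentile(z), 4))  # 2.0 0.9772
```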

Dividing the expression for ERA through by (PF*LgERA) gives

ERA/(PF*LgERA) = 1 - alpha*Stdev(PF*LgERA)/(PF*LgERA)

Which may then be simplified, using the definitions of (ERA+) and the coefficient of variation (COV), to

1/(ERA+) = 1 - alpha * COV(PF*LgERA)

Where: COV = coefficient of variation = standard deviation / average

Which when inverted gives:

(ERA+) = 1 / [1-alpha * COV(PF*LgERA)]

Solving for alpha gives two equivalent forms:

alpha = [(ERA+)-1] / [COV(PF*LgERA)*(ERA+)]

alpha = [1/COV(PF*LgERA)] * [1-1/(ERA+)]
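
As a quick numerical check, the sketch below evaluates both forms of alpha and compares them with the number of standard deviations computed directly from ERA. All inputs are hypothetical, and the function names are illustrative only.

```python
def alpha_form_1(era_plus, cov):
    """alpha = [(ERA+) - 1] / [COV * (ERA+)]"""
    return (era_plus - 1.0) / (cov * era_plus)


def alpha_form_2(era_plus, cov):
    """alpha = [1 / COV] * [1 - 1/(ERA+)]"""
    return (1.0 / cov) * (1.0 - 1.0 / era_plus)


# Hypothetical league: park adjusted average 4.50, standard deviation 1.00.
park_adj_league_era = 4.50
park_adj_stdev = 1.00
era = 3.00

cov = park_adj_stdev / park_adj_league_era        # coefficient of variation
era_plus = park_adj_league_era / era              # ERA+ as a ratio
direct = (park_adj_league_era - era) / park_adj_stdev  # standard deviations below average

print(round(alpha_form_1(era_plus, cov), 6),
      round(alpha_form_2(era_plus, cov), 6),
      round(direct, 6))  # 1.5 1.5 1.5
```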

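Holding alpha fixed and substituting it into the (ERA+) expression, with a reference coefficient of variation RefCOV in place of the league's own COV, gives:

(ERA++) = 1 / [1-RefCOV(PF*LgERA)/COV(PF*LgERA) * (1-1/(ERA+))]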
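
A minimal sketch of the ERA++ formula follows. The function name, the sample values, and the choice to raise an error when the denominator is not positive are assumptions of this sketch, not part of the derivation. When the reference COV equals the league COV, the result reduces to ERA+.

```python
def era_plus_plus(era_plus, league_cov, ref_cov):
    """ERA++ = 1 / [1 - (RefCOV / COV) * (1 - 1/(ERA+))].

    era_plus is a ratio (1.0 = league average); both COV values are
    standard deviation / average of park adjusted league ERAs.
    """
    if era_plus <= 0 or league_cov <= 0 or ref_cov <= 0:
        raise ValueError("era_plus and both COV values must be positive")
    denom = 1.0 - (ref_cov / league_cov) * (1.0 - 1.0 / era_plus)
    if denom <= 0:
        # Assumption of this sketch: reject cases where the formula breaks down.
        raise ValueError("ERA++ is undefined for these inputs")
    return 1.0 / denom


# Hypothetical values: a league with a wider ERA spread than the reference.
print(round(era_plus_plus(era_plus=1.5, league_cov=0.25, ref_cov=0.20), 4))
# Same COV in league and reference: ERA++ equals ERA+.
print(round(era_plus_plus(era_plus=1.5, league_cov=0.20, ref_cov=0.20), 4))
```

In this example the pitcher's 1.5 ERA+ was earned in a league with a wider spread than the reference, so the first ERA++ value comes out lower than 1.5.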

