6. Percentage Expectancy

The basic fallacy of the Elo System is that a probability function can be applied as a rating formula without compromising its original meaning.  Syntactically, the Percentage Expectancy Curve may be identical to the logistic function or to the normal distribution.  Semantically, it is worlds apart.  The fallacy appears to have escaped notice because it is thrown together with probability concepts which make sense in the context of rating systems.  Among these is the concept of percentage score against a given opposition as a sample of long-term percentage score, which may be called, for lack of a better term, percentage expectancy.

Percentage expectancy may be thought of as a probability in the trivial sense of a distribution of percentage scores for a sampling of results between two players.  Relative performance in this sense is clearly a random variable.  This observation is the starting point for conventional inferences about sampling error and was claimed by Elo as the basic axiom of his system.  The relation that emerges as the Percentage Expectancy Curve is not, however, a distribution of percentage scores, and it makes little sense to speak of the probability associated with a pair of ratings. Percentage expectancy in this sense is not a random variable at all, but a function of the ratings (an inverse function, to be precise, since ratings are usually calculated from percentage scores).

A principle implicated in the fallacious interpretation of percentage expectancy is transitivity of probabilities. (For an amusing counterexample see "Nontransitive Dice and Other Paradoxes" [G].)  It is a commonplace observation in chess and other games that results are not transitive.  If x defeats y, and y defeats z, it does not follow that x defeats z, even though the latter result may be expected.  This general expectation of transitivity leads erroneously to a probability interpretation, wherein

          P(x defeats z) 

may be inferred from 

          P(x defeats y) and P(y defeats z).  
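The failure of transitivity can be checked by direct enumeration. The sketch below uses Efron's dice, a standard nontransitive set chosen here purely for illustration (the dice and the helper name p_beats are not taken from the text). Each die beats the next in the cycle two times out of three, so no ordering of "strength" is consistent with all the pairwise results.

```python
from fractions import Fraction
from itertools import product

# Efron's dice: an illustrative nontransitive set (an assumption of this
# sketch, not an example drawn from the text itself).
DICE = {
    "A": (4, 4, 4, 4, 0, 0),
    "B": (3, 3, 3, 3, 3, 3),
    "C": (6, 6, 2, 2, 2, 2),
    "D": (5, 5, 5, 1, 1, 1),
}

def p_beats(x, y):
    """Exact probability that die x shows a higher face than die y."""
    wins = sum(1 for a, b in product(DICE[x], DICE[y]) if a > b)
    return Fraction(wins, len(DICE[x]) * len(DICE[y]))

if __name__ == "__main__":
    # A beats B, B beats C, C beats D, and yet D beats A.
    for x, y in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
        print(f"P({x} beats {y}) = {p_beats(x, y)}")
```

Knowing P(A beats B), P(B beats C) and P(C beats D) therefore tells us nothing reliable about P(A beats D).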

Elo begins the development of his ratio system by citing an obscure reference to the effect that the odds of x to score over z are

          (Pxy / Pyx) (Pyz / Pzy)  =  Pxz / Pzx

where Pxy is the probability of x scoring over y, etc. [E1].  A little thought reveals that there is no logical necessity in this calculation. One implication is that if results between x and y are equal, their results against z will be equal, which would be a rare form of justice.  The postulate is not a probability function; nor does it reveal any transitive quality inherent in the nature of chess performance.  Interpreted as a mere convention, as the basis perhaps for a rating statistic, it is more promising.  Elo's argument in broad outline may even be said to parallel the ratio systems developed here.  The primary difference, aside from difficulties arising from a logarithmic treatment, is one of interpretation.  
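The arithmetic of the odds postulate can be made concrete with a short sketch. It assumes, for illustration only, that Pyx = 1 - Pxy (draws are ignored or folded into the score); the function names odds and chained_probability are hypothetical and do not come from Elo.

```python
from fractions import Fraction

def odds(p):
    """Odds p / (1 - p) for a scoring probability p (undefined at p = 1)."""
    return p / (1 - p)

def chained_probability(p_xy, p_yz):
    """Pxz implied by the postulate (Pxy/Pyx)(Pyz/Pzy) = Pxz/Pzx.

    Assumes Pyx = 1 - Pxy and Pzy = 1 - Pyz, an assumption of this sketch.
    """
    o = odds(p_xy) * odds(p_yz)
    return o / (1 + o)

if __name__ == "__main__":
    # If x and y score equally against each other, the postulate forces
    # x to score against z exactly as y does.
    p_yz = Fraction(3, 4)
    print(chained_probability(Fraction(1, 2), p_yz))
```

With Pxy = 1/2 the odds factor for x over y is 1, so the postulate returns Pxz = Pyz whatever the value of Pyz. This is the "rare form of justice" noted above: it is built into the formula, not observed in results.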

The term percentage expectancy nevertheless has its uses in rating theory.  Despite its specious probability connotations, it generally denotes nothing more than the expectation that ratings will remain fairly constant, an observation that is useful for extrapolations.  For a given pair of ratings the percentage expectancy is the hypothetical result that produces no change in the ratings. It is calculated in any rating system as the inverse of its basic formula, solving for percentage score. Determining overall percentage expectancy against several opponents is problematic.  The Elo System has been aptly criticized for applying arithmetic averaging, a linear process, to its nonlinear percentage expectancy function.  For nonlinear systems in general, the expected score We computed from an average opposition rating, arithmetic or geometric, does not precisely equal the sum of the percentage expectancies over the individual opposition ratings.  The preferred method for calculating expected score is to take the sum of the percentage expectancies for each opponent.
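The discrepancy between the two methods is easy to exhibit numerically. The sketch below assumes the common logistic form of the Elo curve, 1 / (1 + 10^(-d/400)), as a stand-in for the percentage expectancy function; the ratings are invented for illustration, and the function names are hypothetical.

```python
def expectancy(rating, opponent):
    """Percentage expectancy under the logistic curve (an assumed form)."""
    return 1.0 / (1.0 + 10.0 ** (-(rating - opponent) / 400.0))

def expected_score_by_sum(rating, opponents):
    """Preferred method: sum the expectancy against each opponent."""
    return sum(expectancy(rating, r) for r in opponents)

def expected_score_by_average(rating, opponents):
    """Averaging method: expectancy against the arithmetic mean rating,
    multiplied by the number of games."""
    mean = sum(opponents) / len(opponents)
    return len(opponents) * expectancy(rating, mean)

if __name__ == "__main__":
    rating = 1500
    opponents = [1100, 1500, 2200]  # invented ratings, widely spread
    print(expected_score_by_sum(rating, opponents))
    print(expected_score_by_average(rating, opponents))
```

For this deliberately wide field the sum of expectancies comes to roughly 1.43 points, while the averaging method gives roughly 1.08. The gap narrows as the opposition ratings cluster together, and vanishes when they are all equal.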