Back to Basics in Chess Ratings

1. Beyond the Elo System

Rating systems in chess began in postwar Germany with the Ingo System, first described in the periodical Bayerische Schach [E1]. The system bears the name not of its originator, Anton Hoesslinger, but of its place of origin, Ingolstadt. It establishes the basics of ratings in a remarkably simple formula,

[1.1]    R = ERc - (Pct - 50),

where ERc is the arithmetic average of the opposition ratings and Pct is the player's score in percentage points. A peculiarity here, from the standpoint of subsequent systems, is that lower ratings represent greater playing strength: a score above 50 percent is subtracted from the opposition average, so a winning record lowers the number. There is no pretense of theory in the Ingo System, though its simplicity invites mathematical ideas.

The actual development of rating theory took a different tack about 1960 with the introduction of probability formulas. The main proponent of this idea was Arpad E. Elo, one of the founders of the United States Chess Federation (USCF), and his system was subsequently adopted by the International Chess Federation (FIDE). The Ingo and similar systems were henceforth condemned as being based on an inadequate rectangular (uniform) probability distribution, but the precise probability function on which Elo's Percentage Expectancy Curve should be based is by no means clear. Elo himself offered two complete systems: one based on the normal curve, another on the logistic. Apologists are quick to point out that there is little practical difference between the two systems, but the difference raises fundamental questions about the role of probability in rating systems and, ultimately, about whether probability is the proper tool for their analysis.

By analogy with scales of measurement, Elo distinguished three types of rating systems: ordinal, interval, and ratio. This classification is useful enough in discussing the alternatives of statistical treatment, but it can also be misleading.
It is tempting to think of ratings as measurements of chess performance in the same sense as measurements of physical phenomena. As a trained physicist, Elo was especially susceptible to this interpretation. It is important to remember that ratings are essentially nothing more than statistics. The information they convey is based solely on the data provided by pairings and outcomes, certainly not on analysis of chess performance, which is beyond the reckoning of a mere mathematician. An unfortunate consequence of the literal interpretation of ratings as measurements of performance is the tendency to speculate prematurely on probability distributions, which leads to circular arguments for probability treatments based on assumed distributions. The task of rating theory is to investigate the nature of statistics used to measure chess performance. Differences between systems by Elo's classification are essentially differences in statistical treatment.
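
Two quantitative points in this section can be made concrete with a short Python sketch. The first function evaluates formula [1.1] directly. The other two compare a logistic and a normal expectancy curve. The function names are hypothetical, and the curve parameters (a base-10 logistic with a 400-point scale, and a normal curve with a standard deviation of 200 points per player) are assumptions taken from the conventional statement of the Elo system, not from this section.

```python
import math


def ingo_rating(opponent_ratings, score_pct):
    """Formula [1.1]: R = ERc - (Pct - 50).

    ERc is the arithmetic average of the opposition ratings and
    Pct is the player's score in percentage points. Lower values
    represent greater playing strength.
    """
    if not opponent_ratings:
        raise ValueError("at least one opponent rating is required")
    erc = sum(opponent_ratings) / len(opponent_ratings)
    return erc - (score_pct - 50.0)


def expected_score_logistic(diff, scale=400.0):
    """Expected score for a rating difference, logistic curve.

    Assumption: base-10 logistic with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def expected_score_normal(diff, sigma=200.0):
    """Expected score for a rating difference, normal curve.

    Assumption: each player's performance has standard deviation
    sigma, so the difference of two performances has standard
    deviation sigma * sqrt(2). The normal CDF at diff is then
    0.5 * (1 + erf(diff / (2 * sigma))).
    """
    return 0.5 * (1.0 + math.erf(diff / (2.0 * sigma)))


if __name__ == "__main__":
    # A 60 percent score against opposition averaging 120 gives 110.0:
    # ten points stronger (lower) than the opposition average.
    print(ingo_rating([100, 120, 140], 60))

    # Tabulate both expectancy curves over a range of rating differences.
    for diff in range(0, 801, 100):
        logistic = expected_score_logistic(diff)
        normal = expected_score_normal(diff)
        print(f"{diff:4d}  {logistic:.4f}  {normal:.4f}  {normal - logistic:+.4f}")
```

Under these assumed parameters the two curves stay within about two percentage points of each other across the tabulated range. That is consistent with the remark that the practical difference is small, but it does nothing to settle which function, if either, is the right basis for the system.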