This article is from the Table Tennis (Ping Pong) FAQ, by ttennis@bu.edu with numerous contributions by others.

We want to extract from the data the probability of winning a match as a
function of the difference in ratings of the two players. Let's look at the
distribution of the matches by rating difference.

-------------------------------------------------------------
Rating     |      Pre       |    Adjusted    |      Post
difference |-------------------------------------------------
           | Matches Upsets | Matches Upsets | Matches Upsets
-------------------------------------------------------------
   0- 299  |     973    272 |    1126    260 |    1123    212
 300- 599  |     229     15 |     275      4 |     283      1
 600- 899  |      69      1 |      86      0 |      80      0
 900-1199  |      11      0 |      17      0 |      18      0
1200-3000  |       3      0 |       6      0 |       6      0
-------------------------------------------------------------

The "Pre" column has fewer total matches because we have excluded matches
that involve an unrated player. For our purposes, the main thing to notice
is how few matches there are with large rating differences and how few of
them are upsets. Hence any estimate we calculate for the probability of
winning when the rating difference is large will be of questionable
accuracy. Of course, we are using only 8 tournaments; there are over 200
tournaments per year.
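To make the thinning of the data concrete, the upset rates implied by the
table can be tabulated with a few lines of Python. This is only an
illustrative sketch: the names ROWS and upset_rate are not from the original
analysis, and the counts are simply copied from the table.

```python
# (matches, upsets) per rating-difference bin, copied from the table.
# ROWS and upset_rate are illustrative names, not part of the original analysis.
ROWS = {
    "0-299":     {"Pre": (973, 272), "Adjusted": (1126, 260), "Post": (1123, 212)},
    "300-599":   {"Pre": (229, 15),  "Adjusted": (275, 4),    "Post": (283, 1)},
    "600-899":   {"Pre": (69, 1),    "Adjusted": (86, 0),     "Post": (80, 0)},
    "900-1199":  {"Pre": (11, 0),    "Adjusted": (17, 0),     "Post": (18, 0)},
    "1200-3000": {"Pre": (3, 0),     "Adjusted": (6, 0),      "Post": (6, 0)},
}


def upset_rate(matches, upsets):
    """Fraction of the matches in a bin that were upsets."""
    return upsets / matches


for bin_label, columns in ROWS.items():
    cells = ", ".join(
        f"{name}: {upset_rate(*counts):.3f}" for name, counts in columns.items()
    )
    print(f"{bin_label:>9}  {cells}")
```

Only the first two bins contain enough upsets to say much about the shape
of the curve; above a difference of 600 the table records a single upset.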

TECHNICAL STUFF

To proceed we need a model for the probability of winning a nonhandicap
match as a function of the rating difference. This gets technical for a
while. We will use a logistic model. Let D be the rating difference, P be
the probability of winning a nonhandicap 2-out-of-3 match, and b be the
model parameter. The form of the logistic model is

P( D ) = exp( bD ) / ( 1 + exp( bD ) )

We fit the model to each of the three sets of data by maximum likelihood.
Here are the results.

------------------
Ratings  |    b
---------|--------
Pre      | 0.00795
Adjusted | 0.01115
Post     | 0.01517
------------------
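For readers who want to experiment, here is a minimal Python sketch of the
logistic model and of a one-parameter maximum-likelihood fit. It is not the
program used to produce the table: the Newton iteration, the function names
(win_probability, fit_b) and the toy data in the note below are all
assumptions made for illustration. Each match is represented by the rating
difference D seen by one player and a flag saying whether that player won.

```python
import math


def win_probability(b, d):
    """Logistic model: P(D) = exp(bD) / (1 + exp(bD))."""
    return 1.0 / (1.0 + math.exp(-b * d))


def fit_b(diffs, won, iterations=50, tol=1e-12):
    """Maximum-likelihood estimate of b by Newton's method (illustrative).

    diffs[i] is the rating difference for match i; won[i] is 1 if the
    player with that difference won the match, else 0.
    """
    b = 0.0
    for _ in range(iterations):
        probs = [win_probability(b, d) for d in diffs]
        # First derivative of the log-likelihood with respect to b.
        gradient = sum(d * (w - p) for d, w, p in zip(diffs, won, probs))
        # Negative of the second derivative; never negative.
        curvature = sum(d * d * p * (1.0 - p) for d, p in zip(diffs, probs))
        if curvature == 0.0:
            break
        step = gradient / curvature
        b += step
        if abs(step) < tol:
            break
    return b


# Probabilities implied by the fitted values of b reported in the table.
for label, b in (("Pre", 0.00795), ("Adjusted", 0.01115), ("Post", 0.01517)):
    print(label, [round(win_probability(b, d), 3) for d in (0, 100, 300)])
```

With toy data in which a player rated 100 points higher wins 7 of 10
matches, fit_b([100] * 10, [1] * 7 + [0] * 3) returns about 0.00847, which
is ln(0.7/0.3)/100, the value at which the model reproduces the observed
70% win rate.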

Each model lets us calculate the probability of winning a nonhandicap
2-out-of-3 match for any difference in rating. Given standard assumptions
(the probability of winning a point is independent of the score and of who
is serving), each probability of winning a nonhandicap 2-out-of-3 match
corresponds to exactly one probability of winning a point.
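The correspondence can be made explicit. The sketch below assumes games are
played to 21 points and must be won by two, which this section does not
state, so the target parameter and the function names are assumptions for
illustration. With independent points, a point-winning probability p
determines the game-winning probability, and that in turn determines the
2-out-of-3 match probability.

```python
from math import comb


def game_win_probability(p, target=21):
    """Probability of winning a game to `target` points, win by two.

    Assumes every point is won independently with probability p.
    """
    q = 1.0 - p
    # Win with `target` points while the opponent has k <= target - 2.
    direct = sum(
        comb(target - 1 + k, k) * p**target * q**k for k in range(target - 1)
    )
    # Reach (target-1)-all, then get two points ahead before the opponent does.
    reach_deuce = comb(2 * (target - 1), target - 1) * (p * q) ** (target - 1)
    win_from_deuce = p * p / (p * p + q * q)
    return direct + reach_deuce * win_from_deuce


def match_win_probability(p, target=21):
    """Probability of winning a 2-out-of-3 match given point probability p."""
    g = game_win_probability(p, target)
    # Win 2-0, or win 2-1 after dropping either of the first two games.
    return g * g * (3.0 - 2.0 * g)
```

Because the match probability rises steadily with p, the mapping can be
inverted numerically, for example by bisection, to recover the point
probability from a match probability.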

This suggests how to calculate a handicap chart. Pick one of the three
models. Pick a rating difference. Convert this to the probability of
winning a nonhandicap 2-out-of-3 match using the model. Convert this to the
probability of winning a point. Now find the handicap such that the
probability of winning a handicap match is 0.5 (i.e., the handicap match is
fair to both players). By the way, my 386 computer (no coprocessor) needed
about an hour to compute the charts.
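The procedure can be sketched in Python as follows. Several details are
assumptions, since this section does not spell them out: games to 21 points
won by two, a handicap given as a head start of h points to the lower-rated
player in every game, and a bisection search for the point probability. All
function names are hypothetical, and this is not the program that produced
the charts.

```python
import math
from functools import lru_cache


def game_win_probability(p, head_start=0, target=21):
    """Stronger player's chance of winning a game that starts 0-head_start.

    Assumes independent points won with probability p and win-by-two from
    (target-1)-all; both are modelling assumptions.
    """
    q = 1.0 - p
    win_from_deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == target - 1 and b == target - 1:
            return win_from_deuce
        if a == target:
            return 1.0
        if b == target:
            return 0.0
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(0, head_start)


def match_win_probability(p, head_start=0, target=21):
    """2-out-of-3 match probability with the same head start in each game."""
    g = game_win_probability(p, head_start, target)
    return g * g * (3.0 - 2.0 * g)


def point_probability(match_prob, target=21):
    """Invert the nonhandicap match probability by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if match_win_probability(mid, 0, target) < match_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def fair_handicap(rating_diff, b, target=21):
    """Head start (in points) that brings the match closest to 50-50."""
    match_prob = 1.0 / (1.0 + math.exp(-b * rating_diff))  # logistic model
    p = point_probability(match_prob, target)
    return min(
        range(target),
        key=lambda h: abs(match_win_probability(p, h, target) - 0.5),
    )
```

For example, fair_handicap(200, 0.01115) gives the head start for a
200-point rating difference under the "Adjusted" model; looping over rating
differences fills in one column of a chart. Since a head start is a whole
number of points, the result is the nearest-to-fair handicap rather than
one that makes the probability exactly 0.5.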
