Ghost-Rating is an unofficial rating system developed by TheGhostmaker as an alternative to points. It aims to provide a more accurate measure of players' ability, and to give a better idea of how well people are doing compared with their past results. The original version was developed in 2008 and was based on the Elo system used in chess; the algorithm has been improved several times since then.

How does Ghost-Rating differ from points?

There are several differences between the two systems. Unlike points, Ghost-Rating takes into account the ratings of your opponents, so if you beat some very good players, you will get more credit than if you beat weaker players. Ghost-Rating treats all games with equal value, rather than allowing one 1000 point game to carry much greater weight than lots of 35 point games. Ghost-Rating doesn't top up your rating if you drop below 100, so it gives players who aren't above 100 points a way of judging how well they are doing. It also doesn't have inflation, which means that you can compare your rating now to your rating six months ago to see if you've done better.

How do I find myself on the list?

Depending on your browser, use either F3 or Ctrl+F and search for your name on the page to find where you are on the list.

How do WTA and PPSC work in Ghost-Rating?

PPSC works just the same in Ghost-Rating as it does in points, in the sense that if you want to get as good a rating as possible, you should aim to maximise how many supply centres you hold. Similarly, WTA works the same, so you want to get a draw or a victory to maximise your rating. In a particular game, points and Ghost-Rating have exactly the same reward systems, so you don't have to choose between the two.

Do limited press and variant map games count in Ghost-Rating?

Both of these game types count; however, they are weighted down to reflect that they are often less well balanced.

I have fewer games in Ghost-Rating than on webDiplomacy- why?

The Ghost-Rating list filters out games that end before Autumn 1903. This is because such games cannot be won without CD (Civil Disorder), and before the cancel feature was introduced, people used to 7-way draw games that they wanted to cancel for whatever reason.

Can Ghost-Rating be integrated into webDiplomacy?

The biggest barrier to this happening is that it needs to be programmed. If you know PHP and want to see this happen, please contact TheGhostmaker about doing so, as he would be happy to help.

How exactly does it work?

This detailed explanation of the Ghost-Rating system was published in Diplomacy World 105. The following version is edited down to the areas relevant to webDiplomacy:

In his article on Internet Diplomacy in Diplomacy World 103, Jason Koelewyn
observed that “…most of us are geeks of one flavour or another, and geeks
love numbers and rankings.” Just the shortest glance at statistics will show
that this has held true at phpdiplomacy.net, the website where I play my Diplomacy.
It is the host of 380 active games at the time of writing, and boasts over 5000
completed games. Like diplomaticorp.com, this site has surpassed the 100 player
‘barrier’ for active members, and it has over 13000 registered members.

In August of 2007 the points system was introduced by the developer of
phpdiplomacy, Kestas Kuliukas, after the number of players had dropped due to
Civil Disorder ravaging the community over the past six months. He will be able
to give a far better account than me, but suffice it to say that a sudden boom
in players at the beginning of 2007 had swamped a small community, changing it
from one where you knew every player to one where you knew few. The site was in
trouble because the social responsibility that was once there was lost. Ever
since this introduction of the points system the growth of phpdiplomacy has
been dramatic. In just one year the number of unique ‘hits’ increased more than
sevenfold. Clearly, then, ranking players is of the utmost importance for
successful Internet Diplomacy. The reason for this is simple: there was a
number, a badge that said that you were a good player or a bad player. If you
went into CD or joined too many games, you never got more points. So you just
didn’t go into CD or join too many games if you actually wanted to play. There
is one thing that points don’t do however, and that is tell you with any
accuracy how good the various players are, and it would be much better to have
an accurate rating system for Diplomacy. I recognised this at once, and
actually left the site at about the time the points were introduced, for a
short interval. It is to this end that I have developed Ghost-rating, a system
designed for Internet Diplomacy rating, rather than tournament scoring, in that
it is meant to rate a large group of people.

There were two major aims for this system:

1. To promote desirable behaviour

2. To be an accurate rating system.

Sadly, these two may very well be antagonistic. What is really wanted, though,
is for people to play to the best of their ability: not playing so few games that
they don’t get a feel for how to play Diplomacy, not so many that they cannot
concentrate properly on each game, and never entering CD. The traits of a good
player are the traits that we wish to encourage, so if we rate players
properly, in theory it should all fall into place.

The single inspiration for my system comes from the work of Prof. Arpad Elo,
who developed the Elo rating system since adopted by FIDE (the Fédération
Internationale des Échecs, or World Chess Federation). His work underpinned
mine, with the formula:

$$\text{New rating} = \text{Old rating} + V \times (\text{Result} - \text{Expected result})$$

Here, the result is some method of scoring the game, such that the sum of
all players’ results always equals one. (It must always equal the same value, otherwise
it stops being a zero-sum game, which is silly. Equalling one is a nice
convention.) Expected result is defined as a function of the seven players’
starting ratings, and what that function is depends on the way the result is
defined. Clearly this too has to sum to 1 (you cannot expect anything else). V
defines how quickly the ratings change. It is desirable to have V such that a
player’s rating changes about the same amount no matter who they play.
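As a minimal sketch of the update rule described above (new rating = old rating + V × (result − expected result)), the following applies it to every player in one game. The function name `update_ratings` and the example value `v=40.0` are illustrative assumptions; the check that both lists sum to 1 mirrors the requirement stated in the text.

```python
def update_ratings(ratings, results, expected, v):
    """Apply new = old + v * (result - expected) to each player in one game."""
    # Both the results and the expected results must sum to 1 (zero-sum game).
    if abs(sum(results) - 1.0) > 1e-9 or abs(sum(expected) - 1.0) > 1e-9:
        raise ValueError("results and expected results must each sum to 1")
    return [r + v * (s - e) for r, s, e in zip(ratings, results, expected)]


# Seven players all rated 100; player 0 wins outright.
ratings = [100.0] * 7
results = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
expected = [1 / 7] * 7
new_ratings = update_ratings(ratings, results, expected, v=40.0)
print(new_ratings[0], sum(new_ratings))
```

Because the same V is used for every player and both lists sum to 1, the total of all ratings is unchanged by the update.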

This formula makes the rating system zero-sum, so a given rating means the same
thing over time, unless the average standard of play rises or falls. It is always hard
to compare over time, but this system gives us our best shot at that. Each
player starts on the average rating, which is 100. (This was chosen because firstly, it
seems natural to start on a power of ten, and secondly, 10 is too low to avoid
using decimal places, 1,000 is plausible but 10,000 is too high for ratings to
be memorable.)

This formula is all well and good, but we obviously need to define V,
Result and Expected Result. I have done this for two different rating
systems. The first, and simplest, is Winner Takes All. Basically, winning gives
you a score of 1, anything else gives you zero, except for n-way draws that
give you 1/n if you are part of the draw, otherwise, zero.
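The Winner Takes All scoring just described can be sketched as follows. The function name `wta_results` and the convention of identifying players by index are assumptions made for illustration.

```python
def wta_results(n_players, winners):
    """Return each player's WTA result: 1/n for the n players in the win or draw, else 0."""
    winners = set(winners)
    if not winners:
        raise ValueError("at least one player must share the result")
    share = 1.0 / len(winners)  # a solo win is simply a 1-way "draw"
    return [share if i in winners else 0.0 for i in range(n_players)]


solo = wta_results(7, [2])             # player 2 wins outright
three_way = wta_results(7, [0, 3, 5])  # players 0, 3 and 5 draw
print(solo)
print(three_way)
```

In both cases the results sum to 1, as the update formula requires.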

Now, for this, we can define ratings to follow a certain rule with expected result,
or rather take it as an axiom. I used the idea that ratings could be a win
ratio. So if player A has a rating of 120, and player B has a rating of 60, in
a game with both of them playing, player A is twice as likely to win as player
B. That gives us the following formula, where $r_i$ is the rating of player $i$:

$$\text{Expected result}_i = \frac{r_i}{r_1 + r_2 + \dots + r_7}$$
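The win-ratio idea can be sketched in a few lines: each player's expected result is their rating divided by the sum of all ratings in the game. The function name `expected_wta` and the example ratings are illustrative assumptions.

```python
def expected_wta(ratings):
    """Expected WTA result for each player: own rating over the sum of all ratings."""
    total = sum(ratings)
    return [r / total for r in ratings]


# Player 0 is rated 120 and player 1 is rated 60; the rest are made-up values.
ratings = [120, 60, 100, 100, 100, 100, 120]
expected = expected_wta(ratings)
print(expected[0] / expected[1])  # ratio of the two players' expected results
```

With these ratings the 120-rated player's expected result is twice that of the 60-rated player, and the expected results sum to 1.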

Now we need to work out how to get V. Clearly it has to be some function of the
ratings of the players involved, so let:

Consider the new rating of player 1, given that his real rating should
be r, and assuming that all other players are accurately rated, with the sum of
their ratings equal to k. Result, on average, should be given by

$$\frac{r}{r + k}$$

by virtue of the expected result formula. Then, on average:

And so it works to have

but only on average. If we were to actually do that, one defeat would be taken
to be precisely your average skill, and your rating would plummet; one win would
see your rating skyrocket. So we have to divide V by some constant to keep
ratings from boomeranging around. If we set the average (and starting) rating at 100,

gives a variance of 40, which seems about right from my models, although
discretion can be used.
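The averaging argument above can be written out as a short worked equation. This is only a sketch of the reasoning, not necessarily the exact expression intended: the symbol $\rho$ for player 1's current rating is introduced here for illustration, while $r$, $k$ and $V$ are as in the text.

```latex
\begin{align*}
\mathbb{E}[\text{new rating}] - \rho
  &= V\left(\frac{r}{r+k} - \frac{\rho}{\rho+k}\right) \\
  &= V\,\frac{k\,(r-\rho)}{(r+k)(\rho+k)}
\end{align*}
```

If the expected new rating is to land exactly on $r$, this suggests $V = (r+k)(\rho+k)/k$, which is close to $(\rho+k)^2/k$ when $\rho$ is near $r$. A value that large moves the rating all the way to a single game's performance, which is why it then has to be divided by a constant.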

Hence for Winner Takes All systems, you just combine the formulae above,
to get the ratings
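Putting the Winner Takes All pieces together might look like the sketch below. The exact form of V is an assumption: here it is taken as the sum of the players' ratings divided by 17.5, chosen only because it gives the value 40 mentioned above when all seven players are rated 100. The names `wta_update` and `V_DIVISOR` are hypothetical.

```python
V_DIVISOR = 17.5  # assumption: makes V = 40 for seven players rated 100


def wta_update(ratings, winners):
    """Return new ratings after one WTA game; `winners` holds the indices sharing the result."""
    total = sum(ratings)
    v = total / V_DIVISOR          # assumed form of V, see the note above
    share = 1.0 / len(winners)     # 1 for a solo win, 1/n for an n-way draw
    new_ratings = []
    for i, r in enumerate(ratings):
        result = share if i in winners else 0.0
        expected = r / total       # win-ratio expected result
        new_ratings.append(r + v * (result - expected))
    return new_ratings


after = wta_update([100.0] * 7, winners={0})
print(after)
```

Under these assumptions a solo win among equally rated players lifts the winner to roughly 134.3 and drops each of the others to roughly 94.3, leaving the total at 700.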

The second scoring system I shall look at is “Points Per Supply Centre”.
Basically, result = SCs owned/34. This is
rather more complicated in terms of expected result, because you clearly can
only win 18 centres maximum. The reasons for imposing the maximum are two-fold.
Firstly, it is not desirable for players to draw out a game in an attempt to
try to gain extra centres, and secondly, it would be impossible to quantify how
likely a player is to get 19 centres rather than 18, for instance. (It should
be noted that using this scoring system does mean that every game must be
played to the end, with no concessions, although this isn’t an article about
different scoring systems.)
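A small sketch of the Points Per Supply Centre result, with the 18-centre maximum applied. The function and constant names are illustrative assumptions.

```python
MAX_SCS = 18    # centres counted for a win; extra centres earn nothing
TOTAL_SCS = 34  # supply centres on the board


def ppsc_result(scs_owned):
    """PPSC result for one player: supply centres owned (capped at 18) out of 34."""
    return min(scs_owned, MAX_SCS) / TOTAL_SCS


print(ppsc_result(18), ppsc_result(20), ppsc_result(5))
```

A player finishing on 20 centres scores the same as one finishing on 18, which removes any incentive to drag a won game out.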

Because of the complication this maximum creates, it is necessary to
look at the outcome as having two possibilities. The first is winning, and
getting 18 SCs, the second is not winning and getting 16 or fewer SCs. You then
need to look at both of these, and calculate the Expected Result that way. In
essence:

So all we need to find is the expected
success in non-victory. Herein lies a problem: that depends on who the victor
is. If player 1 is the victor, clearly there cannot be any success in non-victory for player
1, and if player 2 is the victor, the chances of success are different than if
player 3 is the victor, because the people you are competing against are
different. In fact, if player j is the victor, with j ≠ 1, the success for player 1 in non-victory is
given by:

The chance of that actually ever happening is the same as player j’s victory
chance, so we must multiply by that. Summing this for all possible j winning
(other than j=1, where there is no chance of success in non-victory because you
have won) gives us the expected success in non-victory, Dr_{1}:

Now, for V, it doesn’t make too great a
difference if you use the same formula as for Winner Takes All. It would
clearly be possible to find one that does for PPSC what the Winner Takes All
formula does for Winner Takes All, but ultimately the exercise is
probably pointless, due to the approximation at the end and the inherent
problems with rating a game such as Diplomacy.
That, and the fact that the formula would no doubt be hideous, has meant
that I have not created a V formula specific to PPSC.

So you can again take all these formulae
and make the calculations necessary.
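As a rough sketch of how the PPSC expected result described above could be computed, the code below makes two assumptions that are not spelled out in the text: that when player j wins, the remaining 16 centres are shared among the other six players in proportion to their ratings, and that each player's victory chance is the same win ratio used for Winner Takes All. The function name `expected_ppsc` is hypothetical.

```python
def expected_ppsc(ratings):
    """Expected PPSC result per player under the proportional-share assumption."""
    total = sum(ratings)
    win_prob = [r / total for r in ratings]  # assumed: same win ratio as WTA
    expected = []
    for i, r_i in enumerate(ratings):
        # Expected success in non-victory: for each other possible victor j,
        # player i's share among the six non-winners, weighted by j's win chance.
        non_victory = sum(
            win_prob[j] * r_i / (total - ratings[j])
            for j in range(len(ratings))
            if j != i
        )
        expected.append(win_prob[i] * 18 / 34 + non_victory * 16 / 34)
    return expected


equal = expected_ppsc([100.0] * 7)
mixed = expected_ppsc([120.0, 60.0, 100.0, 100.0, 100.0, 100.0, 120.0])
print(equal)
print(sum(mixed))
```

Under these assumptions the expected results sum to 1 for any set of ratings, and seven equally rated players each expect 1/7.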