Ratings - GoldToken

Ratings

This site uses ratings to set up game matchups between players. Briefly, a rating is an imprecise indication of a player's skill in a game. In general, the more games a player wins, the higher the rating. Read on for more information about how we calculate ratings.

For Elo Rated Games to Achieve Rating
Number of Completed Games	Friendly	Tournament
Less than 4	Unrated Friendly (unrated)	Unrated Tournament (unrated)
4 to 19	Provisional Friendly (Provisional)	Provisional Tournament (Provisional)
20 or more	Friendly (Friendly)	Tournament (T)

For FIBS Rated Games to Achieve Rating
Number of Completed Games	Friendly	Tournament
Less than 4	Unrated Friendly (unrated)	Unrated Tournament (unrated)
4 to 399	Provisional Friendly (Provisional)	Provisional Tournament (Provisional)
400 or more	Friendly (Friendly)	Tournament (T)

Types of Ratings

GoldToken offers two kinds of ratings for all games: Friendly and Tournament. Each passes independently of the other from Unrated to Provisional to (Established) Friendly or (Established) Tournament. An established rating is displayed with "Friendly" or "T" after the rating number.

This site uses four different qualifiers across these two rating types. First is Unrated. For an unrated player there are no past games on which to base any indication of the player's skill. Once a player has completed four games, that player acquires a Provisional rating of that type (either Friendly or Tournament). The provisional rating indicates that, while we have some information about the player's skill, we are still unsure whether the games played accurately reflect that player's true ability in the game. Because of that, provisional ratings can fluctuate considerably as more games are completed.

Once a player has completed the minimum number of games required for an established rating, the rating passes from Provisional to either Friendly or Tournament. Friendly ratings are calculated from games where the player is free to choose his or her opponents. Since the process of choosing opponents can have some effect on the resulting rating, the Friendly rating is seen as less accurate than a Tournament rating, which is calculated only from games played in a site tournament. Note that we do not keep a separate rating for friendly games and tournament games. Instead, the rating itself is Friendly only if the required number of tournament games has not yet been played. Once they have been played, the rating passes from Friendly to Tournament.
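As a rough illustration of the thresholds in the table above, the qualifier for one rating type could be derived from the number of completed games like this. This is only a sketch: the function name, the dictionary and the "Established" label are made up for the example, and the site's actual code may count games differently.

```python
# Thresholds copied from the table on this page.
# All names here are illustrative, not part of the site's code.
PROVISIONAL_AT = 4
ESTABLISHED_AT = {"elo": 20, "fibs": 400}


def rating_qualifier(completed_games, system="elo"):
    """Return the qualifier for one rating type (Friendly or Tournament)."""
    if completed_games < PROVISIONAL_AT:
        return "Unrated"
    if completed_games < ESTABLISHED_AT[system]:
        return "Provisional"
    return "Established"


if __name__ == "__main__":
    for games in (0, 4, 19, 20):
        print(games, rating_qualifier(games, "elo"))
```

The same function would be applied once to the count of friendly games and once to the count of tournament games, since each type moves through the qualifiers on its own.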

Note: Tournament ratings tend to be lower than Friendly ratings despite being based on the same system. There are two main reasons for this: 1.) the pool of players is smaller for Tournament ratings and 2.) one cannot pick on weaker players to get easier wins. When one finally achieves a "T" rating, it tends to be 200 to 300 points lower than one's Friendly rating in the same game.

Ratings Formulas

Most games on this website use the Elo formula to calculate ratings. This formula was first used by the US Chess Federation and has increasingly become the standard rating system for games based on skill. Under the Elo formula as used here, a rating is provisional until 20 games have been played. Some of our games use the FIBS rating, a modified form of the Elo rating that takes the element of chance into account and is thus more suitable for games that involve some amount of luck, such as backgammon. Under the FIBS formula, a rating is provisional until 400 games have been played.

If you are the higher rated player, you gain fewer points for winning a game than the lower rated player would gain for beating you. The adjustment depends not only on the relative ratings but also on the length of the match: you gain or lose more for longer matches.
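To see why the higher rated player gains less, here is a minimal sketch of the textbook Elo update. It is not the site's exact calculation: the K value of 32 is an assumption chosen for illustration, and the sketch does not model provisional ratings or match length.

```python
def expected_score(rating, opponent_rating):
    """Textbook Elo expected score (between 0 and 1) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_change(rating, opponent_rating, score, k=32):
    """Rating change after one game.

    score: 1 for a win, 0.5 for a draw, 0 for a loss.
    k: assumed adjustment constant; the value used on the site is not stated.
    """
    return k * (score - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    # The favourite gains less for winning than the underdog would.
    print(round(elo_change(1600, 1400, 1), 1))
    print(round(elo_change(1400, 1600, 1), 1))
```

With these assumed numbers, a 1600 player beating a 1400 player gains roughly 8 points, while the 1400 player beating the 1600 player would gain roughly 24.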

There is a FIBS rating calculator at http://home.nordnet.fr/~fhochede/ratings.shtml

Feeding in the ratings will give you a good "guesstimate" of the outcome, but to be absolutely certain you need to know what the actual ratings were at the moment each game finished.

Note that it doesn't make any sense to talk about being 'penalized' when your rating goes down. The ratings are not an award-based system. A rating is just a statistical analysis of your wins and losses. It is adjusted after each game, the intention being to give an indication of how well you've done relative to the other people playing.

Losing rating points after a game, whether you won or lost, is not a penalty. It is just an adjustment. If you lose a match against a lower rated player, this is an indication that your rating may be a little higher than it really should be, and so it will likely be adjusted downwards. It makes no difference whether the match score was 14-0 or 9-8. A win is a win, and a loss is a loss. The amount by which the rating is adjusted depends primarily on the difference between your rating (at the time the match ends) and that of your opponent. There is a chart here [ Ratings in Practice ] which gives you an idea of how big the adjustment will be. Another factor that can affect the size of the adjustment is your experience score, if you have a provisional rating. Someone who has only played a small number of games will get a bigger adjustment, so that a rating that accurately represents their ability level can be arrived at more quickly.

Try not to put too much significance on the individual adjustments made after each game. Think instead about the overall aim of the system, which is to ensure that each player's rating, relative to everyone else's, is a reasonable approximation of their ability in the game.

Results of Ratings

The end result of these formulas is that if you win a game against an opponent who has a higher rating than you do, your new rating will reflect the extra effort it took you to win; likewise, if you lose to a higher rated opponent, your new rating will take that into consideration, and your rating won't drop as much. This prevents highly-rated players from picking on lower-rated players in an attempt to boost their rating. Pick on someone rated far below you, and your rating will remain unchanged if you win, but will drop dramatically if you lose. And yes, ratings can decrease upon victory if the rating difference is more than 400.

Player Experience

With FIBS, rating changes are scaled up until you have accumulated 400 experience points. Experience is the total length of the rated matches you have completed. For example, if you have completed three 5-point matches, your experience is 15.

FIBS takes a player's experience into account when determining the rating change after a match is completed. Only the player's own experience is used in this calculation, not the opponent's. FIBS considers a player "experienced" at an experience level of 400 or more. This number is simply the running total of the lengths of all matches completed. In other words, a "newbie" starts with experience 0; after completing a 1-point match, the experience changes to 1; after a 5-point match, it increases to 6; and so on. FIBS adds the length of the match to the player's experience before performing the ratings calculation.

If your experience is 400 or higher when the match is over, experience does not affect the ratings calculation. For example, if you win a 1-point match against someone with an identical rating, and your experience after completing that match is 400 or more, your rating will go up exactly 2 points.

If the experience level is less than 400, the rating change for that player is larger. At experience 300 the rating change is doubled, at 200 it is tripled, and at 100 it is quadrupled. At an experience level of 0 (let's see how closely you've been paying attention; why is this not possible?) the rating change would be multiplied by 5. This is a continuous function: experience of 350 results in an experience factor of 1.5, 385 results in 1.15, and so on. The experience factor never falls below 1. For those of you who have fond memories of your high school algebra class, the experience factor is either 1 or 5-(E/100), whichever is greater, where E is the individual's experience after adding in the length of the completed match.
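The experience factor described above can be sketched in a few lines. The experience_factor function follows the formula given on this page. The fibs_change function is an assumption: it uses the commonly published FIBS formula (the constants 4 and 2000 and the square root of the match length are not spelled out on this page), so treat it as an illustration rather than the site's exact code.

```python
import math


def experience_factor(experience):
    """max(1, 5 - E/100), where E already includes the match just completed."""
    return max(1.0, 5.0 - experience / 100.0)


def fibs_change(own_rating, opp_rating, match_length, won, experience_before):
    """Rating change for one player under the commonly published FIBS formula.

    Assumed, not taken from this page: the constants 4 and 2000 and the use
    of sqrt(match_length). Only the player's own experience is used.
    """
    k = experience_factor(experience_before + match_length)
    root_n = math.sqrt(match_length)
    gap = abs(own_rating - opp_rating)
    # Probability that the lower rated player wins the match.
    p_upset = 1.0 / (10 ** (gap * root_n / 2000.0) + 1.0)
    # Probability that this player wins the match.
    p_win = 1.0 - p_upset if own_rating >= opp_rating else p_upset
    if won:
        return 4.0 * k * root_n * (1.0 - p_win)
    return -4.0 * k * root_n * p_win


if __name__ == "__main__":
    # Equal ratings, 1-point match, experienced player: +2 points.
    print(fibs_change(1500, 1500, 1, True, 400))
    # Same match with experience 300 after the match: change is doubled.
    print(fibs_change(1500, 1500, 1, True, 299))
```

The first call reproduces the 2-point example from the text, and the second shows the doubling at experience 300.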

Shortcomings of Ratings

The rating system is imperfect. No mathematical system can perfectly assess your abilities in a game, nor can a comparison of ratings automatically determine the winner of a matchup. Your rating will change with almost every game, and you will find that it often fluctuates. Don't get too caught up in that. Some people mistakenly look to ratings as some sort of "reward" system. They think that if they win a lot of games, their rating will just keep on moving up. In reality, the rating should settle fairly quickly on a value somewhat close to the player's true ability. Once you have an Established rating, it shouldn't change by much more than about 100 points up or down unless you are taking a lot of time to study and learn new techniques in that game, or you are being careless and playing beneath your skill level.

The Purpose of Ratings

The real power of ratings is seen when they are used to find an opponent at a similar skill level. By pairing up two players of comparable skill, we think you will enjoy your games more. It's not very fun to always lose against the same person, nor is it fun to beat the same person every time. By trying out our suggested matchups, you can meet new people and get a real feel for the game against a worthy opponent.

Ratings are Opt-In

By default, each player starts out unable to see ratings. We think that most people probably just want to enjoy a few good games rather than get wrapped up in a "score" attached to their every move. So we have left the ratings invisible; if you want to see them, go to your Preferences Page to enable them. But even if you stick with the default, the ratings are still hard at work behind the scenes to determine your ideal opponent.

A second note just for gammon game ratings. Because some find it confusing, it needs to be extra clear:

Some people seem to think that their rating is something that should go up and up when they win games, and always go down when they lose. That is not what the ratings are for, and that is not how they work. Your rating is intended to be a measure of your skill, and it is calculated by taking into account how many games you've won and lost, compared to your opponents. It is expected that higher rated players will generally beat lower rated players most of the time, so when this happens the rating value will not be adjusted, or only by a very small amount.

However, if a lower rated player beats a higher rated player, this is unusual and suggests that the ratings are perhaps not accurately reflecting the true skill level of the two players, so a much larger adjustment is made in these circumstances. Obviously in games such as backgammon and salvo, where luck plays a part, this does happen from time to time, whereas in games where there is no luck involved, such as chess or outbreak, this hardly ever happens. This is why there are two different ratings systems. The formulas used are designed to take all these factors into account using mathematical and statistical techniques.

I might add that the ratings systems are designed by mathematicians and statisticians who know about these things, are used by many other sites and organizations, and have been proved to be relatively accurate over a long course of time. It is not just something that the squirrels have cobbled together.

How does the Top 25 work?

There are two sections that work differently. The top section has three options: it can display only players with a Tournament rating, also include players with a Friendly rating, or also include players with a Provisional rating. The bottom section always shows all players, whether their rating is T, F or P.

The top 25 are just the top 25 Tournament rated players. The others include friendly and provisional ratings also.

If you include Friendly ratings when checking the top players list, the top 25 will tend to consist of Friendly-rated players only, because a player's Friendly rating tends to be 200 to 300 points higher, as noted above.

Question

Why did my rating go down after I won a game?

Answer
Our rating system is pretty complex. However, it has no idea by looking at you how skilled you are in any game. So during your first few games, it absolutely refuses to tell you what it thinks of you. After you've finished four games, it will give you a provisional rating. Replace the word provisional with the word tentative, or unsure, and you'll understand the concept better. After each game, the system reevaluates your rating, taking into account your current provisional rating, your win/loss record, and the rating of the opponent you just played. It then issues you a new rating which it feels best reflects your skill level in relation to the skill level of your opponents. If it adjusts your rating downward even after a win, then our system is most likely saying that you are playing opponents that are too easy for your skill level. On the other hand, if your rating goes up even after a loss, it's like our system is saying that you're smarter than you look.

This is by no means a perfect system, but it tends to be surprisingly accurate after you've played enough games. In fact, after you've finished 20 games, it gives you an established rating, and you will no longer see the wild fluctuations which are possible for provisional ratings.

Question

Why do some games not show on my completed games list?

Answer
Non-ranked (unrated) games do not show in your completed games list. However, they are all listed under the specific game type when you click on that game type in your finished game list. Players still win tokens for unrated games. Watch your total Gold, Silver, and Wood tokens on your Game Sheet for your combined number of wins, losses and draws.

Question

Does my rating change when it goes from provisional to established after 20 games?

Answer
Kinda. The calculations our system uses to determine your rating change dramatically after you have completed 20 games. Before you have completed 20 games, it uses a math equation that I can understand. But afterward, well, let's just say I'm glad that a computer has the job of figuring out your new rating. The most noticeable change is that your rating won't fluctuate as much after each game, but it will still be constantly adjusting to assess your skill level.

A special note regarding the tournament rating vs a friendly rating

The more players who go for their tournament ratings, the bigger the player base and the more balanced the rating base. Just remember, the tournament ratings are honestly a better rating, as they are based on games in which players do not choose their opponents, unlike friendly games, in which some players deliberately seek out only new guests or weaker players and win by default in a larger number of games. I have found the friendly rating to be less honest because of this, and that is exactly why Chad created the tournament rating in the first place.

Regarding site sponsored tournaments and friendly ratings: The purpose of a Tournament Rating is to reflect the larger variety of opponents you face in a site-wide tournament. In a club, even a large club, you are limiting the opponents to a much smaller number, which defeats the reason for a tournament rating. Friendly ratings are calculated from games where the player is free to choose his or her opponents, and that is what you are doing when you limit your opponents to those within a single club. Since the process of choosing opponents can have some effect on the resulting rating, the Friendly rating is seen as less accurate than a Tournament rating, which is calculated only from games played in a site tournament.

Our concern with ratings is accuracy. We want the system used here to be as accurate as possible, even for friendly ratings. When a club with 4 players only plays among themselves, the friendly rating won't be very accurate at all, and a tournament rating would be even less so. With at least 6 players in any given match, you will find the ratings a little more accurate than with fewer players. If you are going to use the site system, these are the limits we are placing on it. If the limits do not suit you, you are still free to use the wiki system to set them up.

Posted on the Club Presidents discussion board 5.24.09

There has been much discussion regarding tournament ratings. Several, who want to be able to limit the participation of a tournament to just their club, want those games to be tournament rated.

Let's start by clarifying what each rating is and what its purpose is:

Friendly ratings are for those who select their opponents.
Tournament ratings are for those who do NOT select their opponents in any way.

Fact: When you have tournaments restricted to club members only, it does not allow everyone from the whole site to play (the extent of the true playing pool). In fact, you are limiting the playing field to a select few.

Fact: When you can pick and choose your own opponents, that's not reflecting a true tournament rating.

GoldToken was built on the premise that we will be the best of the best. If the system is compromised, we can no longer claim that. Therefore, in order to ensure the integrity of GoldToken, all tournament rated games are computer generated. If you want to do club tournaments and choose your partners from among a limited player group, we say go for it! Have fun! However, those ratings will not be reflected as tournament ratings. It would be like having a professional baseball team choosing a minor league team to play, so they would be assured of winning. I am sure all of you will agree that would not be right. Who would want to have a rating based on something that bogus? Not me!

GoldToken is for everyone. You don't have to be in tournaments to have fun here; that's why we have friendly games. But for those of us who want to play tournaments, we do not want them to be compromised. There are plenty of sites that players can go to, and manipulate, to make themselves feel like they are the best. If that is what is important, I wish you luck. But if you truly want to be the best, or display your true playing ability, you will not have any problems playing anyone who wants to play. You will win because you really are the best. Not because you picked someone who is not up to your speed, or is only in the same rating class as yourself so that your rating won't drop should you lose because you are more concerned about your rating than playing the game.

I understand that clubs want to limit tournaments to just those within their clubs. But when the playing field is so small, (even if the club has 30 or 40 active players in the tournament) what you are really doing is limiting the playing field...much like selecting your own opponents.

Tournament ratings are going to be generated by the database. If you want to show who is best on GoldToken, then it will be in the site sponsored tournaments open to everyone.

Clubs only had friendly ratings in all their club matches up until a year ago. A tournament rating is for those who do not select their opponents. Plain and simple. If we allow a club with 30 players to play only each other, that defeats the purpose of a tournament rating.

Looking forward to some hard fought matches,
~~ Badger

More regarding club ladders and the tournament rating:

The problem with tournament ratings for ladders in clubs is that it would give unfair rating values for small clubs. If all clubs were large, then we might feel differently, but as it is, a considerable number of clubs are on the small side with only 10 to 20 players (many with even fewer).

For tournament ratings, there needs to be a larger random player base in which one cannot select one's competition. Thus, to keep a true rating standard, club ladder games can only be friendly rated.

More Info

[ Ratings in Practice ]
[ Ratings FAQ ]