#22
CeeBee wrote:
> (Sterten) wrote in rec.games.chess.misc:
>> I don't see CeeBee's point: "it's not based on differences alone, so it can't work."
> No. You replace a system that is based on differences in strength from _actual_ play with differences in strength from a computer benchmark crudely translated from differences in strength in actual play. That doesn't give more "absolute" results, but worse results.

Sorry, I don't understand. My system should also consider the ratings of the opponents before that game, of course. How can you know that it's "crudely" translated?

> Your logical error is that you still want to translate strength differences to "more objective values", which mixes up the basic idea of absolute strength with differences in strength.

I'm probably too dumb to detect a logical error here. How can the struggle for more objective values, or mixing up something, be a logical error?

>> Sure it's based on differences, you have to fix a range. You can't know how well it works, since we've not discussed the details yet. If my system just weights the result with 1 and the move quality by an infinitesimal amount, then it's almost equal to the existing system. These parameters have to be optimised. If a player permanently loses on time, this will be taken into account; I don't ignore the result. But if one player permanently loses on time and played well, he should get a better rating than a player who permanently loses on time and plays poorly. This reflects the expectation of his future performance: he might one day learn how to manage his time problems.
> But you wanted a more absolute rating, and now you introduce the notion of possible future performance. You suggest that knowing the moves is enough to be qualified as a good player. This is not the case. If my opponent, who is as strong as me, thinks longer than me in an actual game, he finds better solutions, and might win.

If he "knows the moves", then why should he reflect so long?

> However the clock prevents him from doing that. So he loses on time. And now you state that shouldn't be taken into account, and he's the stronger player.

No, I didn't. I think that exceeding the time should be valued as just making one very big mistake. This is the best we can do, since the thinking times for each move are usually not reported.

> In that case your system leads to a qualification "ceebee's opponent is stronger than cee, because he found better moves in an actual game, yet not within his allotted time". But if we both stuck to our time, we would have drawn, because we're equal in strength.

How do you know this? If you make worse moves 39 times in a game, and then your opponent makes one big error on move 40, so you win, would you conclude that your opponent is good at making 39 moves in 2 hours but not 40 moves in 2 hours? Exceeding the time on move 40 is almost random; on move 35 you can't know whether it will happen or not.

> The Elo system values that correctly, your system doesn't.

Frankly, the fact that players exceed the time at all shows that there is something wrong, IMO. These players don't care so much about the result but more about the quality of their moves in the game. But the rating system assumes that they only play for the result.

>> It's just unreasonable to lose on time permanently. There is nothing you can do if someone wants to outfox the system by playing well but then resigning or losing on time. But that's unreasonable. It's just rated as a bad move. Suppose there are only multi-game matches, no tournaments. System A only rates the outcome of a match, which player won. System B also takes into account the score, so a 10:0 gives more Elo points than a 5.5:4.5. Which system would you prefer?
> Neither of the systems is a correct description of the Elo system. The Elo system takes into account both individual results _and_ the opponent's strength. A 10:0 victory by Kasparov over a patzer is worth less for his Elo points than a 5.5:4.5 win in a match against a 2700+ opponent.

Of course, I was assuming that the same players were playing.

> Let me explain at more length. Your idea was: "The computer analyses the positions and rates every single move of the game and finally calculates a rating-number for both players and that game based on the moves rather than the result."

Sorry, it should be: "rather than the result alone", as the example with time-exceeding shows.

> Recapitulating, you suggest replacing the current rating system with a more objective benchmark by computers based on valuing moves.

Yes. And I was asking whether similar things are already being done or experimented with.

> I have explained that the current rating system is not about strength, but about strength differences.

And thus also about strength. If you are stronger than anyone else, then you are "strong" in usual language.

> Even better: this is the prime objective of the system. Why then are people so convinced that ratings tell you something about absolute strength?

The statistical indication for this is overwhelming.

> It's because chess player pools intermix so fluidly with each other, both in time and in location. It won't surprise you that a 2200 player from, say, Australia is often on par with a 2200 player from, say, France. Players mix with each other in worldwide tournaments, they mix with players at home, and de facto those players mix with players worldwide. As a result ratings are often leveled. Older players with ratings established against retired players play younger players and thus transfer those strength differences from one age pool to the other.

Yes.

> But it does _not_ mean the rating system gives an accurate measure of strength.

Then what was the reason that you included the previous paragraph? You were showing that it works quite well, although not accurately. Nothing is 100% accurate.

> Differences of the same order just mix throughout pools, but they stay differences in strength. Often people don't understand that: they want to compare Fischer with Kasparov. But to no avail: the playing pool of Fischer is too much disconnected from Kasparov's pool of opponents. Their rating difference has no meaning.

You mean "less meaning". They are still "connected". The reliability of the system spreads a bit over time. Didn't I see Jerry Spinrad here recently with a rating system for ancient players back to the 1830s? I'm sure he would not have posted it if it weren't somehow reliable.

> Sometimes you'll read here that ratings are inflated, because in earlier days you were a top grandmaster at 2600, while now you have to be 2700 to be very strong.

No surprise. Players have computers available today for preparing, learning, training, storing. Also, when the number of players increases, then the spread between best and weakest (usually) increases too.

> Of course this is nonsense, the rating calculation hasn't been changed: a difference of 200 still means the same in winning/losing chances as 30 years ago. Rating inflation would mean that a bigger difference is needed to have the same winning chances than in the past.

Depends how you define it.

> That's only possible if the calculation method has changed, which isn't the case. Difference, not absolute strength. Now you want to develop a system that values chess players on individual moves.

That is not a new idea. Has it already been tried in practice? References?

> But what does it mean? It means you have to know what is important to be strong. Tactics? Positional knowledge? Recognition of standard patterns and characteristics? Knowledge of opening theory? Knowledge of endgames? Knowledge of games from past masters? Knowledge to not find one crucial move, but calculate the actual and correct move sequence? The ability to play a game without a losing move after ten strong moves? Psychological strength in a game? Physical fitness during a tournament? The ability to think undisturbed in a noisy room? The speed at which you solve a problem? The number of games you're able to play at a constant level? All these things determine the strength of a chess player, and even more.

Most of these are not open to computer measurement after the game. And they are _not_ considered by the current Elo system. If you think they should be considered, then you are just advocating "my" rating system.

> First problem: what is their comparative weight? You suggest that a computer can tell you better than the Elo system how strong a chess player is in a 20-game round robin tournament, at fast time controls, in a cold playing arena with a thousand spectators, on an uncomfortable chair, against a very strong and impressive opponent.

The Elo system can't. "My" system _maybe_ can handle this better, but certainly not worse.

> How would that be in a luxurious environment in one's hometown, against a homesick opponent? As important is the question of choosing and especially valuing those moves. Giving a value to a move is referring to another standard. Which standard is that? What is the value of move one? And what of move two? Why would move one be more valuable than move two?

As a start, just take the usual computer evaluations in pawn units.

> In practice you arrive at determining differences in strength between moves. It means you suppose an arbitrary difference is better than a difference based on actual play.

?

> But we want to know strength in actual play, and not in a theoretical situation. The proof of the pudding is in the eating, not in knowing the recipe by heart.

Only the actually played moves are evaluated, not any hypothetical moves, or moves of another game.

> Weird but true, but there is a system for just that: the Elo system.

The Elo system is only based on the result of a game and discards its moves.

> Your computer system doesn't do that. It can give you a player that scores better than Kasparov yet loses every game against him.

Unlikely. And less likely than with the Elo system, IMO.

> Put them both back around your computer test and the results will be the same: your player is better in the computer test than Kasparov.

?

> And worse, your system has no way of dealing with that discrepancy. But not so with the current rating system you consider inferior: that system takes into account every win and loss. In the above example, your tested player will see his rating drop rapidly below that of Kasparov, and _that_ tells you even more about his actual strength than your benchmark. And that while the Elo system wasn't even developed for that purpose. You made two errors in your idea: first of all you consider the current rating system as a system to determine strength, which is not true, and secondly you suppose that a computer-valued rating without a pool of peers and without proper understanding of all determining factors can give _better_ information than the current rating system.

If we can't consider _all_ factors, then we should try to determine _as many as possible_, right? Neglecting the moves is like an ostrich putting its head in the sand.

> Maybe one day we will be able to establish all factors defining absolute strength, but the current batch of computer programs is certainly no match for the well-established and proven Elo rating system.

Computers have developed a lot since that system was established. Time to adapt it to the actual situation.

Guenter Stertenbrink
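For readers who want to check the arithmetic behind the exchange above, here is a minimal sketch in Python. It assumes the commonly used logistic Elo expectation curve and a K-factor of 10; the ratings, the K-factor, and the `blended_score` helper (one possible reading of "weight the result with 1 and the move quality by a small amount") are illustrative assumptions, not anything either poster specified.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score per game of A against B (logistic Elo curve)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rating_change(rating_a, rating_b, points, games, k=10.0):
    """Rating change for A after scoring `points` in `games` against B.

    k=10 is an assumed K-factor, used only for illustration.
    """
    return k * (points - games * expected_score(rating_a, rating_b))


def blended_score(result, move_quality, weight):
    """Hypothetical blend of game result and move quality, both in 0..1.

    With weight 0 this is the plain result, i.e. the existing system.
    """
    return (result + weight * move_quality) / (1.0 + weight)


# Hypothetical ratings: a top player, a much weaker player, and a peer.
strong, weak, peer = 2800, 1800, 2800

# 10:0 against the weak player versus 5.5:4.5 against the peer.
gain_vs_weak = rating_change(strong, weak, points=10.0, games=10)
gain_vs_peer = rating_change(strong, peer, points=5.5, games=10)

# The same 200-point gap gives the same expectation at any rating level.
same_gap_then = expected_score(2600, 2400)
same_gap_now = expected_score(2700, 2500)

print(f"10:0 vs much weaker player: {gain_vs_weak:+.2f}")
print(f"5.5:4.5 vs equal player:    {gain_vs_peer:+.2f}")
print(f"expected score at +200:     {same_gap_then:.3f} / {same_gap_now:.3f}")
```

Under these assumptions the narrow win against an equal opponent earns more points than the whitewash of a far weaker one, and only the rating difference enters the expectation. The blend reduces to the plain result when the weight is zero, and with a positive weight it separates two players who both lost but played moves of different quality.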
|
#24
On 13 Oct 2003 10:49:50 GMT, CeeBee wrote:
> No. You replace a system that is based on differences in strength from _actual_ play with differences in strength from a computer benchmark crudely translated from differences in strength in actual play. That doesn't give more "absolute" results, but worse results.

It is fairly easy to see that the strength or quality of one's moves in a single game bears little relation to one's overall chess abilities. For instance I once, as white, played the following game, or gamelet:

1.e4 e5 2.Nf3 d6 3.Bc4 h6?! 4.Nc3 Bg4? 5.Nxe5 Bxd1?? 6.Bxf7+ Ke7 7.Nd5#

Now, if I put this to our chess computer it will find that I, as white, played pretty much perfectly and possibly give me a "rating" of 3000 or more. Of course the whole game had been played many times before and I had basically memorized it, so it required just about no thought on my part at all. At the time I was rated in the 1800s.

And this is true of any single game. Sometimes I really play quite well and sometimes I play like an idiot, and my "rating" is a prediction of my future *results* based on my overall past *results*.

However, if we get a very strong computer to go over a few thousand or perhaps a few million games of all available players and then compute and save in a database what we might call an "accuracy index" for each game and an overall "accuracy index" for each player, we will, presumably, find some correlation between this index and the player's actual rating. If so, we can then get the computer to rank all the players according to their overall "accuracy index".

Now since this is a computer analysis, what it will really be returning is an assessment of each player's tactical accuracy. I suspect that at the world championship level each player's index will be so high as to produce quite a low correlation between index and rating, and so we might find that, so far as distinguishing between world championship level players goes, the "index" is pretty well useless. At that level we might well discover that there are factors other than mere "accuracy" that affect results more.

Take Lasker for instance. Always playing second-best openings, he still managed to get massive results against the greatest players of his day. Yet a computer might well give him a significantly lower "accuracy index" than these other greats of his day based on his relatively poor opening play. Or it might not.

The results would be extremely interesting, I think. It would be very nice if someone did the analysis and published the results. It would take a long time and a computer dedicated to the task, I imagine. Maybe the results could give someone fodder for a doctoral thesis. Or maybe not; I don't really know enough about such things to tell.

BUT the only thing a computer can get from analyzing my or anyone's games move by move would be some kind of accuracy score. That such a score would reflect my or anyone else's actual results in tournament play is merely a hypothesis, and at the present time I think it is a hypothesis that is unsupported by sufficient evidence to accept blindly.

Certainly the idea of computing one's rating and deciding one's strength from a tactical analysis of a single game is naive at best.
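The "accuracy index" idea above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the index definition (100 minus the capped average evaluation loss per move), the 300-centipawn cap, and the ratings and indexes in the sample pool, which are invented and not measured data. A real study would take the per-move losses from engine analysis.

```python
from statistics import mean


def accuracy_index(losses_cp, cap=300):
    """Map per-move evaluation losses (centipawns) to a 0..100 index.

    Hypothetical definition: each loss is clamped to 0..cap, and the
    index is 100 when the average loss is 0 and 0 when it equals cap.
    """
    if not losses_cp:
        raise ValueError("need at least one move")
    capped = [min(max(loss, 0), cap) for loss in losses_cp]
    return 100.0 * (1.0 - mean(capped) / cap)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# A memorized seven-move miniature: the engine finds no loss on any move,
# so a single game yields a perfect index whatever the player's rating.
miniature_index = accuracy_index([0] * 7)

# Invented (rating, overall index) pairs standing in for a player database.
ratings = [1500, 1800, 2100, 2400]
indexes = [62.0, 70.5, 78.0, 88.5]
r = pearson(ratings, indexes)

print(f"index of the miniature: {miniature_index:.1f}")
print(f"correlation in the invented pool: {r:.3f}")
```

The miniature scores a perfect index from one game, which is the point made above about single games. Whether the correlation over a real pool is strong, and whether it stays strong among top players, is exactly the open question; the invented numbers here say nothing about that.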
|
#25
In a usual chess position there is no absolutely right move that is the only way to win or to draw (not to lose). The computer chooses one of them as the best only in the sense that "it seems no worse than anything else within the nearest n plies". Only critical moves matter: those that a human or computer can fail to recognize but that lead to a loss. Some critical moves, e.g. overlooking some material, or a move that leads to loss of material, can make the remaining moves redundant. The result is known: it is a loss. For computers this loss can be caused by the necessity to sacrifice some piece to save the game in the near term.

So such a rating system cannot say whether a player made a perfect move in a usual position, because it would have to search the game to the end before it could say whether a win, a draw, or a loss is possible. Which of win, draw, or loss it will be becomes clear only after a critical move.

But there is also a psychological element to the critical move. In a match game, a human thinking about the 40th move of a two-hour game, tired, can fail to find the proper move and lose the game. But if you had a comfortable place and time to think about just this one move, you could concentrate better and find a better move than some GM, one that doesn't lead to a loss. Is it worth thinking about your success and rating against that GM?
|
#26
Ed Seedhouse wrote in rec.games.chess.misc:
> However, if we get a very strong computer to go over a few thousand or perhaps a few million games of all available players and then compute and save in a database what we might call an "accuracy index" for each game and an overall "accuracy index" for each player we will, presumably, find some correlation between this index and the player's actual rating. If so we can then get the computer to rank players according to overall "accuracy index" and rank all the players accordingly.

This is another way to determine differences in strength. Maybe it's a more accurate way of determining those differences. Although interesting, it might turn out to be an even more complicated way to assess strength differences than the current Elo system. And it might well add little extra knowledge to the meaning of those strength differences we already know.

The original poster wanted to use such a database to test people against it (even without their having played a single game before) to establish a "more absolute" strength number. However, it will still be a translation of "how does this player calculate moves in comparison with a lot of other players?" The result will be "He is accurate at 57%." Now he will be placed in a pool with players who scored 57% as well. How will he perform in real play? "He will perform like people with a 57% score", say a 1700 player. Will he win against a player that has a 54% score? If so, he's stronger.

However, we do know that being stronger is not determined by winning one game. One can play a hundred games against an opponent, win 60 and lose 40, and one is considered stronger, despite the forty losses. His winning chances are 60/40. This is the only practical result of your calculation.

So you still come to the same result. His absolute strength is not 57%, or 1700; his win/lose chances against other players will depend on their rating difference with him.

In the end one has taken a long road to arrive at the same point the Elo rating system took in a fraction of the time.

> Certainly the idea of computing one's rating and deciding one's strength from a tactical analysis of a single game is naive at best.

Not exactly naive I think, but very inaccurate.

--
CeeBee

Uxbridge: "By God, sir, I've lost my leg!"
Wellington: "By God, sir, so you have!"

Google CeeBee @ www.geocities.com/ceebee_2
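The 60/40 example can be turned into a rating difference by inverting the logistic Elo expectation curve. The following is a small Python sketch under that assumption; `rating_gap` is an illustrative name, and draws are ignored, as in the example above.

```python
import math


def rating_gap(score_fraction: float) -> float:
    """Rating difference implied by a long-run score fraction.

    Inverts the logistic Elo expectation curve (an assumption of this
    sketch). The fraction must lie strictly between 0 and 1.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# 60 wins and 40 losses out of 100 games, as in the example above.
gap = rating_gap(60 / 100)
print(f"implied rating difference: about {gap:.0f} points")
```

A 60/40 split corresponds to a gap of roughly 70 rating points on this curve. It says nothing about where the two players sit on the scale, only how far apart they are, which is the point being argued.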
|
#27
If my 7 year old, my 4 year old and I each played a single game with Kasparov, we would all lose. The 0-1 result would give us no information about any of our strengths, because losing a single game to Kasparov is not an unlikely event for any chess player.

But anyone who plays chess better than we do and looked at the moves of the games would easily see which of us was stronger. For example, he could just count the frequency of errors at 1-ply or 2-ply.

I don't think that anyone was proposing eliminating the result-based rating system. The issue is a software tool using available information to assess playing strength. Obviously the moves of a game contain more information about playing strength than just the results of the game.

It would be very useful for software to have a rating function, and it doesn't seem like it would be difficult to implement.

If results of a chess game correlate with the quality of the moves played in the game, then analysis of the moves can be used to predict results. For a fixed number of games, this will be more accurate than a prediction based on results alone.
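Counting the frequency of errors, as suggested above, might look like the following Python sketch. The per-move losses (in pawn units) and the one-pawn threshold are invented for illustration; in practice they would come from a shallow engine search on each position.

```python
def error_rate(losses_pawns, threshold=1.0):
    """Fraction of moves that lose at least `threshold` pawns of evaluation.

    The threshold of one pawn is an arbitrary choice for this sketch.
    """
    if not losses_pawns:
        raise ValueError("need at least one move")
    errors = sum(1 for loss in losses_pawns if loss >= threshold)
    return errors / len(losses_pawns)


# Invented per-move losses for three players who all lost their game.
games = {
    "adult": [0.0, 0.2, 0.1, 1.5, 0.3, 0.0, 0.4, 2.0],
    "seven_year_old": [0.5, 1.2, 0.1, 3.0, 1.1, 0.2, 2.5, 0.9],
    "four_year_old": [1.5, 3.0, 0.4, 5.0, 2.2, 1.0, 9.0, 4.0],
}

rates = {name: error_rate(losses) for name, losses in games.items()}
ranking = sorted(rates, key=rates.get)  # fewest errors first

for name in ranking:
    print(f"{name}: {rates[name]:.3f}")
```

All three results are 0-1, so the results alone cannot separate the players, while the error rates order them. How well such an ordering predicts future results is the empirical question raised elsewhere in the thread.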
|
#28
"Alexander Belov" wrote in message ... In usual chess position there is no absolutely right move that is the only way to win or to draw (not to lose). Computer chooses one of them as the best only as "it seems not worse than anything else in the nearest n-plies". Only critical moves that human or computer can fail to recognize but that lead to lose are important. Some critical moves, e.g. oversight of some material, or move that leads to lose of material, are able to make the rest moves redundant. The result is known - it is a lose. For computers this lose can be caused by necessity to sacrifice some figure to save the nearest game. So such rating system cannot say if player made perfect move in usual position because it should search the game to the end before say all of win is possible, draw is possible, lose is possible. The one case of win, draw, lose will come only after critical move. But there exist psychological moment for the critical move. If it is a game in a match, if human thinks about 40th move of 2h game, so tired, he/she can fail to make proper move and lose the game. But in case you found comfortable place and time to think about this only move you can concentrate better and find better move than some GM, that doesn't lead to lose. Is it worth to think about you success and rating against that GM? Agreed ![]() Regards |
|
#29
"David Kane" wrote in rec.games.chess.misc:
But anyone who plays chess better than we do and looked at the moves of the games would easily see which of us was stronger. For example, he could just count the frequency of errors at 1-ply or 2-ply. That is true, yet it is not relevant for the current discussion. I did't doubt that one is able to see strength differences in actual play, I tried to explain that looking at actual moves is _not a better way_ to determine _absolute strength_. I don't think that anyone was proposing eliminating the result-based rating system. Again: it wasn't discussed. The original poster suggested adding a _better_ rating system with "more absolute values". With that he makes two mistakes: it suggests incorrectly that the current system gives "inaccurate absolute values". The current Elo system might give inaccurate values, but _no_ absolute values whatsoever; secondly he supposes that the validation of individual moves can give a useful official rating to even players who have never played a game before, which is also not true, because it might tell you something (very limited) about chess understanding, but doesn't tell you anything about _absolute_ chess strenght in actual play. The issue is a software tool using available information to assess playing strength. No, this is not the issue. The issue is the way to reach a correct assessment of chess strength, disconnected from actual play. Obviously the moves of a game contain more information about playing strength than just the results of the game. Again: the original poster didn't suggest a _way_ to assess strength, he suggested a _better_ way to determine playing strength. "Better" meant "better than the current Elo rating". He then suggested a system that will result in an even worse assessment of playing strength. It would be a very useful function for software to have a rating function and it doesn't seem like it would be difficult to implement. Many programs do have a rating function. 
The Fritz GUI rates you on basis of results against a "rated" Fritz opponent. ChessMaster even rates your play with puzzles in which you have to find the right moves. This rating is however not based on objective strenght determinants but on a comparison with a pool of players solving it with a similar rating. It's a derivative of the existing rating system. If results of a chess game correlate with the quality of the moves played in the game, then analysis of the moves can be used to predict results. For a fixed number of games, this will be more accurate than a prediction based on results alone. It's a useful _bypass_. But this is not what's the discussion is about. The discussion is not about the useability of such a bypass, but about the question if such a system is _better_ than the current system, which isn't the case. Bottom line of the problem (or misunderstanding) is that chess is a competitive sport. The opponent is not one of the many factors, she or he's the determining factor. And as long as this is not the basis of the calculation, every result of another calculation will always be a substitute for the rating you get based on play against opponents, like in the Elo-rating system. -- CeeBee Uxbridge: "By God, sir, I've lost my leg!" Wellington: "By God, sir, so you have!" Google CeeBee @ www.geocities.com/ceebee_2 |
|
#30
On 13 Oct 2003 16:32:50 GMT, CeeBee wrote:
> Ed Seedhouse wrote in rec.games.chess.misc:
>> However, if we get a very strong computer to go over a few thousand or perhaps a few million games of all available players and then compute and save in a database what we might call an "accuracy index" for each game and an overall "accuracy index" for each player we will, presumably, find some correlation between this index and the player's actual rating. If so we can then get the computer to rank players according to overall "accuracy index" and rank all the players accordingly.
> This is another way to determine differences in strength.

Well, I don't know if it is. The only really coherent measurement of strength I know of is how well you actually perform against actual players. Does an "accuracy index" reflect performance? I imagine it does, up to a point. But the evidence is not in, since the exercise has never (to my knowledge) been done. So I'd prefer not to jump to conclusions and simply say that the results of such an analysis would be very interesting, at least to me.

> Maybe it's a more accurate way of determining those differences. Although interesting, it might turn out to be an even more complicated way to assess strength differences than the current Elo system. And it might well add little extra knowledge to the meaning of those strength differences we already know.

Yes, I agree. That doesn't necessarily mean it isn't worth doing at some point. Maybe we should wait a few years until the micro programs are vastly stronger than all human grandmasters and so can be agreed to provide an objective evaluation of moves.

Nor can one just assume that computing an "accuracy index" will be simple and straightforward. For one thing, just what we mean by "accuracy" will have to be defined in numerical terms.

> The original poster wanted to use such a database to test people against such a database (even without playing a single game before) to establish a "more absolute" strength number.

In this he is being very naive, in my opinion. First get the evidence and *then* draw some conclusions from it, if possible.

> In the end one has taken a long road to arrive at the same point the Elo rating system took in a fraction of the time.

Well, possibly, but until we actually do some such project we really don't know if it will give us useful added information. I suspect it will, but it won't solve all problems by any means, of course.