| Tags: based, moves, rather, rating, result, than |
#11
|
|||
|
|||
|
How can someone accept the Elo rating system and still doubt computer strength, when computers have such good Elo ratings? I wouldn't mind humans assigning ratings to the moves of chess games instead, but computers are cheaper, impartial, and have better memory for comparing with other games.

The exact criteria for rating the moves are open to discussion, but even a simple algorithm should be better than the current system. Suppose you have two games with exactly the same moves. Then obviously the rating of the two games should be the same for the two White players and the same for the two Black players, and should not depend on the opponent's Elo. Quick draws should be (almost) discarded. Short games should carry less weight than long games.

As a rule of thumb, take the number of situations in which you avoided making a mistake, divided by the number of situations in which you had a chance to make one.

Guenter
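A minimal sketch of the rule of thumb above, in Python. It assumes an engine has already counted, per game, how many positions offered a chance to go wrong and how many of those the player got through cleanly. The `GameRecord` fields, the 40-move full-weight threshold and the 0.05 quick-draw factor are hypothetical choices for illustration, not anything proposed in the thread.

```python
from dataclasses import dataclass


@dataclass
class GameRecord:
    chances: int              # positions where a mistake was possible (assumed engine output)
    avoided: int              # of those, positions where the player avoided the mistake
    moves: int                # number of moves the player made in the game
    quick_draw: bool = False  # flagged as a premature draw


def game_weight(game, full_weight_moves=40, quick_draw_weight=0.05):
    """Short games count less than long ones; quick draws are almost discarded."""
    weight = min(game.moves / full_weight_moves, 1.0)
    if game.quick_draw:
        weight *= quick_draw_weight
    return weight


def accuracy_score(games):
    """Weighted ratio of avoided mistakes to chances for mistakes, or None if undefined."""
    avoided_total = 0.0
    chances_total = 0.0
    for game in games:
        weight = game_weight(game)
        avoided_total += weight * game.avoided
        chances_total += weight * game.chances
    if chances_total == 0:
        return None
    return avoided_total / chances_total


if __name__ == "__main__":
    long_game = GameRecord(chances=20, avoided=18, moves=40)
    short_draw = GameRecord(chances=4, avoided=4, moves=4, quick_draw=True)
    print(accuracy_score([long_game, short_draw]))
```

Because the score depends only on the per-game counts, two games with identical moves get identical scores regardless of who played them or what the opponent's rating was.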
|
#12
|
|||
|
|||
|
Even if your system were practicable (which I very much doubt), I think it would not be a good idea, for a very simple reason. If you want the rating difference to reflect the expected value of a player's RESULTS in a tournament, you have to base your system on past RESULTS (like the current system), not on the strength of the player's moves. Other important factors contribute to the outcome of his games.

Take the extreme example of a player who plays the best move in every position but keeps losing on time. In your system such a player would have a very high rating, yet one could confidently predict that in his next tournament he would perform as poorly as ever.

Tobi
|
#14
|
|||
|
|||
|
illspam (NoMoreChess) wrote in rec.games.chess.misc:

> It is ohhh so easy to shoot-down an idea which may not be perfect in practice!

The idea is not shot down. The idea is _discussed_ here because it has a flaw. The flaw is that Sterten, the originator of the (often proposed) idea, wants to establish an absolute value based on relative values. He wants to define "absolute strength from a fixed point" as "strength difference from an 'unfixed' point".

> But the idea of objectively measuring, not merely final results of games, but overall accuracy of moves, is a legitimate one.

There is no question about the legitimacy of the idea; the question is about the logic behind it.

> Unlike the flying ceebee and terrybean, I am unafraid of such an *objective* approach to my own games.

I'm glad you're unafraid. Casually following your messages here, I can see you have a special taste for joining discussions that end in a fight and a possible ad hominem attack. That's okay, Usenet is free and lets you be all that you want to be. It's your stage as well. But don't confuse your regular attitude in discussions with knowledge of the subject you're discussing. Worse, your message shows either that you didn't take the time to follow the discussion, or that the content is way above your head.

There's nothing wrong with benchmarking. The way Sterten proposes it, however, is flawed, because it rests on a factual misunderstanding of the chess rating system. He tries to solve a non-existent problem with a method based on incorrect assumptions. This has been explained to him, and I'm interested in discussing it for as long as he finds the subject interesting himself.

And I leave the "shooting down" to you, even if you don't seem to have a clue in what direction you are shooting, what you're shooting at, and why you're shooting at it in the first place. Happy hunting.

--
CeeBee

Uxbridge: "By God, sir, I've lost my leg!"
Wellington: "By God, sir, so you have!"

Google CeeBee @ www.geocities.com/ceebee_2
|
#15
|
|||
|
|||
|
I don't see CeeBee's point: "it's not based on differences alone, so it can't work." Sure it's based on differences; you have to fix a range. You can't know how well it works, since we haven't discussed the details yet. If my system just weights the result with 1 and the move quality by an infinitesimal amount, then it's almost equal to the existing system. These parameters have to be optimised.

If a player permanently loses on time, this will be taken into account; I don't ignore the result. But if one player permanently loses on time and played well, he should get a better rating than a player who permanently loses on time and plays poorly. This reflects the expectation of his future performance: he might one day learn to manage his time problems.

It's just unreasonable to lose on time permanently. There is nothing you can do if someone wants to outfox the system by playing well but then resigning or losing on time. But that's unreasonable; it's just rated as a bad move.

Suppose there are only multi-game matches, no tournaments. System A only rates the outcome of a match, i.e. which player won. System B also takes the score into account, so a 10:0 gives more Elo points than a 5.5:4.5. Which system would you prefer?
|
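The weighting idea in Sterten's post above (result weighted with 1, move quality with a tunable amount) can be sketched as follows. This is only an illustration under stated assumptions: `blended_score`, the 0-to-1 `move_quality` scale and the sample weight of 0.25 are hypothetical, and the thread never fixes how move quality would actually be measured.

```python
def blended_score(result, move_quality, quality_weight=0.0):
    """Combine a game result with a move-quality score.

    result:         1.0 for a win, 0.5 for a draw, 0.0 for a loss
    move_quality:   assumed engine-derived score between 0.0 and 1.0
    quality_weight: weight of move quality relative to the result (result has weight 1)
    """
    if quality_weight < 0:
        raise ValueError("quality_weight must be non-negative")
    if not 0.0 <= move_quality <= 1.0:
        raise ValueError("move_quality must be between 0 and 1")
    return (result + quality_weight * move_quality) / (1.0 + quality_weight)


if __name__ == "__main__":
    # With zero weight on move quality, only the result counts.
    print(blended_score(1.0, 0.3, quality_weight=0.0))
    # Two players who both lost on time: one played well, one played poorly.
    print(blended_score(0.0, 0.9, quality_weight=0.25))
    print(blended_score(0.0, 0.4, quality_weight=0.25))
```

With `quality_weight=0.0` the blended score is just the result, which corresponds to the existing results-only approach. Any positive weight separates the two players who lost on time, and the size of that weight is exactly the parameter the post says would have to be optimised.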
#16
|
|||
|
|||
|
..
While computers are not yet so superb at all aspects of the game as to judge positional play and long-range strategy accurately, they are good enough to passably rate the play of patzers, who constitute the vast majority of chess players. As one famous player put it: chess is 99% tactics.

As for the attempt to compare ratings of players from different pools, the whole point would be to eliminate such isolated pools by effectively tossing everyone into the same pool: the pool of all chess games ever played. All such games would be evaluated objectively by a chess program (or programs). The computer would not know or care who played the White pieces or the Black. Its evaluation would come out precisely the same for a game between you and me as for a game between Lasker and Capablanca, provided we followed the same moves:

/*
1.e4 e5 2.Nf3 Nf6 (draw agreed)
//
*zero significant errors detected*
*premature draw penalty applied*

ratings dump:
1. Terrybean = 2600
2. Nomorechess = 2625
3. Lasker = 2600
4. Capablanca = 2625
|
#17
|
|||
|
|||
|
..
(Longwinded ad hominem rant snipped).

Mr. Ceebee, you need to learn a few things before blindly shooting down someone who disagrees with one of your mere opinions.

First and foremost, the fact that a few others here have consistently resorted to personal attacks on me because I dared to disagree with something they wrote is hardly evidence that my criticism of YOUR attack is anything remotely resembling an ad hominem attack on YOU. This simply demonstrates your own problems with logic, a field in which you have attempted to pose as some kind of expert, which you obviously are not.

Secondly, you are right that I have leaped into the middle of a discussion; but you are dead wrong to assume that I need any knowledge of your prior postings in order to state an opinion on the one I did read, and to which I replied. Again, your difficulties with logic are revealing.

In my opinion, the real question is not whether it can be done by using computers, but *how well* it can be done, and whether that is well enough to justify all the trouble. In this context, I fully understand and sympathize with your position, for the most part. But there was a time, not so very long ago, when computers were ASSUMED by the "experts" to have no potential to reach a point which they stand far above today. The arguments for their severe limitations were as numerous as they were flawed. Let us try not to "lose the same way" twice, by vast underestimation of the computer's potential.
|
#18
|
|||
|
|||
|
"NoMoreChess" wrote in message ...

> While computers are not yet so superb at all aspects of the game as to be sufficient to accurately judge positional play/longrange strategy, they are sufficiently good to passably rate the play of patzers, which constitute the vast majority of chessplayers. As one famous player put it: chess is 99% tactics.

Disagree. Computers are nowhere near good enough to rate positional play or long-range strategy for patzers.

> As for the attempt to compare ratings of players from different pools, the whole point would be to eliminate such isolated pools, by effectively tossing everyone into the same pool: the pool of all chessgames ever played. All such games would be evaluated by a chess program (or programs) objectively. The computer would not know or care who played the White pieces, or the Black. Its evaluation would come out precisely the same for a game between you and I, as for a game between Lasker and Capablanca, provided we followed the same moves:
>
> /*
> 1.e4 e5 2.Nf3 Nf6 (draw agreed)
> //
> *zero significant errors detected*
> *premature draw penalty applied*
>
> ratings dump:
> 1. Terrybean = 2600
> 2. Nomorechess = 2625
> 3. Lasker = 2600
> 4. Capablanca = 2625

It is very nice of you to put me in such illustrious company :)

Regards
|
#19
|
|||
|
|||
|
|
|
#20
|
|||
|
|||
|
(Sterten) wrote in rec.games.chess.misc:

> I don't see CeeBee's point: "it's not based on differences alone, so it can't work."

No. You replace a system that is based on differences in strength from _actual_ play with differences in strength from a computer benchmark crudely translated from differences in strength in actual play. That doesn't give more "absolute" results, but worse results. Your logical error is that you still want to translate strength differences into "more objective values", which mixes up the basic idea of absolute strength with differences in strength.

> Sure it's based on differences, you have to fix a range. You can't know how well it works, since we haven't discussed the details yet. If my system just weights the result with 1 and the move quality by an infinitesimal amount, then it's almost equal to the existing system. These parameters have to be optimised. If a player permanently loses on time, this will be taken into account; I don't ignore the result. But if one player permanently loses on time and played well, he should get a better rating than a player who permanently loses on time and plays poorly. This reflects the expectation of his future performance: he might one day learn to manage his time problems.

But you wanted a more absolute rating, and now you introduce the notion of possible future performance. You suggest that knowing the moves is enough to qualify as a good player. This is not the case. If my opponent, who is as strong as I am, thinks longer than I do in an actual game, he finds better solutions and might win. However, the clock prevents him from doing that, so he loses on time. And now you state that this shouldn't be taken into account, and that he's the stronger player. In that case your system leads to the verdict "ceebee's opponent is stronger than cee, because he found better moves in an actual game, yet not within his allotted time". But if we had both stuck to our time, we would have drawn, because we're equal in strength.
The Elo system values that correctly; your system doesn't.

> It's just unreasonable to lose on time permanently. There is nothing you can do if someone wants to outfox the system by playing well but then resigning or losing on time. But that's unreasonable; it's just rated as a bad move. Suppose there are only multi-game matches, no tournaments. System A only rates the outcome of a match, i.e. which player won. System B also takes the score into account, so a 10:0 gives more Elo points than a 5.5:4.5. Which system would you prefer?

Neither of the systems is a correct description of the Elo system. The Elo system takes into account both individual results _and_ the opponent's strength. A 10:0 victory by Kasparov over a patzer is worth less for his Elo points than a 5.5:4.5 win in a match against a 2700+ opponent.

Let me explain at more length. Your idea was: "The computer analyses the positions and rates every single move of the game and finally calculates a rating-number for both players and that game based on the moves rather than the result." To recapitulate, you suggest replacing the current rating system with a more objective computer benchmark based on move valuation.

I have explained that the current rating system is not about strength, but about strength differences. Even better: this is the prime objective of the system. Why then are people so convinced that ratings tell you something about absolute strength? It's because chess player pools intermix so fluidly with each other, both in time and in location. It won't surprise you that a 2200 player from, say, Australia is often on par with a 2200 player from, say, France. Players mix with each other in worldwide tournaments, they mix with players at home, and de facto those players mix with players worldwide; as a result ratings are often leveled. Older players whose ratings were established against now-retired players play younger players and thus transfer those strength differences from one age pool to the other.
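To make the 10:0 versus 5.5:4.5 comparison concrete, here is a minimal sketch using the standard Elo expected-score formula. The specific ratings (2800, 1400) and the K-factor of 10 are assumptions chosen for illustration, and the sketch ignores federation-specific rules such as caps on the rating difference.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score per game for `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def rating_change(rating, opponent_rating, points, games, k=10):
    """Rating change after a match: K * (actual points - expected points)."""
    expected_points = games * expected_score(rating, opponent_rating)
    return k * (points - expected_points)


if __name__ == "__main__":
    # Hypothetical ratings: a 2800 player sweeps a 1400 player 10:0 ...
    sweep = rating_change(2800, 1400, points=10.0, games=10)
    # ... versus winning 5.5:4.5 against an equally rated 2800 opponent.
    narrow = rating_change(2800, 2800, points=5.5, games=10)
    print(f"10:0 vs 1400:    {sweep:+.2f}")
    print(f"5.5:4.5 vs 2800: {narrow:+.2f}")
```

Under these made-up ratings the sweep is worth only a few hundredths of a point, while the narrow win against an equal opponent gains 5 points. The comparison depends on the rating gap: with the same formula, a 5.5:4.5 result against an opponent rated 100 points lower would cost the higher-rated player points, because the expected score in that match is above 5.5.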
But it does _not_ mean the rating system gives an accurate measure of strength. Differences of the same order just mix throughout pools, but they remain differences in strength. People often don't understand that: they want to compare Fischer with Kasparov. But to no avail: Fischer's playing pool is too disconnected from Kasparov's pool of opponents. Their rating difference has no meaning.

Sometimes you'll read here that ratings are inflated, because in earlier days you were a top grandmaster at 2600, while now you have to be 2700 to be very strong. Of course this is nonsense; the rating calculation hasn't changed: a difference of 200 still means the same in winning/losing chances as 30 years ago. Rating inflation would mean that a bigger difference is needed to have the same winning chances as in the past. That's only possible if the calculation method has changed, which isn't the case. Difference, not absolute strength.

Now you want to develop a system that values chess players on individual moves. That is not a new idea, but what does it mean? It means you have to know what is important to be strong. Tactics? Positional knowledge? Recognition of standard patterns and characteristics? Knowledge of opening theory? Knowledge of endgames? Knowledge of games by past masters? The ability not merely to find one crucial move, but to calculate the actual, correct move sequence? The ability to play a game without a losing move after ten strong moves? Psychological strength in a game? Physical fitness during a tournament? The ability to think undisturbed in a noisy room? The speed at which you solve a problem? The number of games you're able to play at a constant level?

All these things determine the strength of a chess player, and more besides. First problem: what is their comparative weight?
You suggest that a computer can tell you, better than the Elo system, how strong a chess player is in a 20-game round-robin tournament, at fast time controls, in a cold playing arena with a thousand spectators, on an uncomfortable chair, against a very strong and imposing opponent. How would that be in a luxurious environment in one's hometown, against a homesick opponent?

Just as important is the question of choosing and especially valuing those moves. Giving a value to a move means referring to another standard. Which standard is that? What is the value of move one? And of move two? Why would move one be more valuable than move two? In practice you end up determining differences in strength between moves. It means you suppose an arbitrary difference is better than a difference based on actual play. But we want to know strength in actual play, not in a theoretical situation. The proof of the pudding is in the eating, not in knowing the recipe by heart.

Strange but true, there is a system for just that: the Elo system. Your computer system doesn't do that. It can give you a player who scores better than Kasparov yet loses every game against him. Put them both back in front of your computer test and the results will be the same: your player is better in the computer test than Kasparov. And worse, your system has no way of dealing with that discrepancy. Not so the current rating system you consider inferior: that system takes into account every win and loss. In the above example, your tested player will see his rating drop rapidly below that of Kasparov, and _that_ tells you even more about his actual strength than your benchmark. And that while the Elo system wasn't even developed for that purpose.
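The Kasparov example above can be played out numerically. The sketch below assumes a hypothetical challenger who enters with a benchmark-derived rating of 2900, an opponent at 2800, and a K-factor of 20 for both players; none of these numbers come from the thread. It applies the standard game-by-game Elo update while the challenger loses every game.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score per game."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def play_series(rating_a, rating_b, results_for_a, k=20):
    """Update both ratings after each game and return the rating history.

    results_for_a: sequence of 1.0 / 0.5 / 0.0 from player A's point of view.
    """
    history = []
    for score_a in results_for_a:
        delta = k * (score_a - expected_score(rating_a, rating_b))
        rating_a += delta   # A gains (or loses) what the result deviates from expectation
        rating_b -= delta   # B's change mirrors A's when both use the same K
        history.append((rating_a, rating_b))
    return history


if __name__ == "__main__":
    # Hypothetical: challenger starts at 2900 from the benchmark, opponent at 2800,
    # and the challenger loses ten games in a row.
    for game_no, (a, b) in enumerate(play_series(2900.0, 2800.0, [0.0] * 10), start=1):
        print(f"game {game_no:2d}: challenger {a:7.1f}  opponent {b:7.1f}")
```

The results-based update corrects the starting discrepancy on its own: each loss pulls the challenger's rating down and the opponent's up until the order flips, whereas a benchmark that only looks at move quality has no input through which those losses could register.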
You made two errors in your idea: first, you consider the current rating system a system for determining strength, which is not true; and second, you suppose that a computer-valued rating, without a pool of peers and without proper understanding of all determining factors, can give _better_ information than the current rating system. Maybe one day we will be able to establish all the factors defining absolute strength, but the current batch of computer programs is certainly no match for the well-established and proven Elo rating system.

--
CeeBee

Uxbridge: "By God, sir, I've lost my leg!"
Wellington: "By God, sir, so you have!"

Google CeeBee @ www.geocities.com/ceebee_2