| Tags: capa, chess, cuz, greatest, karpov, kasparov, kramnik, lie, order, players, puters |
#121
Dr A. N. Walker wrote:
> Also not entirely convinced by arguments about Crafty only being able to rank mere patzers. I'm well short of super-GM status myself, but I think I know enough about chess to be able to judge that [eg] Kramnik is a better player than the typical 2650-ish GM.

Sure, you can tell that Kramnik is better than these players. But it would be a bit rich for you to go around saying, ``Hmm, 23.Ng3 would be ever so slightly better than 23.Bf2, though there's very little in it. And 24.Qd7 would be a bit better than the move played, too.''

> For that matter, there are also players around who are *much* stronger analysts [esp in specialised areas, such as endings] than their OTB rating would suggest -- eg because of physical/mental limitations that affect them in tournaments, or because they can't handle the clock -- and such players [eg, top correspondence players] may be well equipped to rank others of much higher nominal rating.

True. But these players achieve their high level of analysis by spending more time than they would in a game. The researchers' crippled Crafty is spending less time than it would in a game.

>> One more criticism: in looking over a game at one Web site, I noticed that the depth of search DURING PLAY achieved by GM Kramnik's opponent was around 18 plies. Now why on earth would anyone try to rate the play of the world champions by cutting off crippled-Crafty's search at only 12 plies?
>
> The depth achieved during play was presumably in real time at, say, 5 hours per game? So a real-time 24-game match occupied 5 solid days of chess, and so of computer time even for a dedicated super-computer. Simply rating every WC game at that speed is several months; or years for the normal PCs that we have on our desks. The alternatives to crippled-Crafty rating every WC game are basically to have *no* results, 'cos it takes too long, or to have an utterly superficial analysis of *all* the games of Tal/Kasparov/....

False dichotomy. They could have gone to thirteen ply. Or fourteen. Of course, this would still bring the same criticism.

But the real problem is that you're assuming that only one computer is available. The Kramnik-Fritz match was played on commodity hardware. An academic research grant will easily run to buying, say, ten PCs. (And this was academic research.) Or you go round your department asking if the people with modern computers will let you run your analysis on their machines overnight. In a university, it shouldn't be too hard to get night-time use of several tens of decent PCs.

Dave.

--
David Richerby, www.chiark.greenend.org.uk/~davidr/
Cheese Lotion (TM): it's like a soothing hand lotion that's made of cheese!
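The trade-off in the post above, between one more ply and a few more machines, can be put in rough numbers. This is only a sketch under stated assumptions: it supposes that search time grows by a constant factor per ply (an effective branching factor, set here to a placeholder value of 2.5 that comes from neither the thread nor the paper) and that the games can be split evenly across identical PCs. The function names are hypothetical.

```python
import math

# ASSUMPTION: each extra ply multiplies search time by a constant factor.
# 2.5 is a placeholder for illustration, not a measured figure.
ASSUMED_BRANCHING = 2.5


def relative_cost(base_depth, target_depth, branching=ASSUMED_BRANCHING):
    """Search-time multiplier for going from base_depth to target_depth plies."""
    return branching ** (target_depth - base_depth)


def extra_plies(machines, branching=ASSUMED_BRANCHING):
    """Extra plies affordable in the same wall-clock time when the games
    are split evenly across `machines` identical PCs."""
    return math.log(machines) / math.log(branching)


if __name__ == "__main__":
    for depth in (13, 14, 18):
        print(f"12 -> {depth} plies: about {relative_cost(12, depth):.1f}x the time")
    print(f"10 PCs buy roughly {extra_plies(10):.1f} extra plies")
```

Under these assumptions the cost of depth is exponential while the gain from extra machines is only logarithmic, so both the depth cutoff and the hardware budget bear on the argument.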
#122
On May 3, 9:17 am, Ron wrote:
> And, it seems to me, that you've basically conceded the point. The biases of the program will affect its judgement. Just because it's not making 600-rating-point errors doesn't mean it isn't making 50-rating-point errors.

No, you misread me. I am saying YOU (or Camp #1) is making that error. I happen to believe chess programs rarely make 50 centipawn errors. I only admit that if the players are ranked extremely close (see my Roman Numeral examples in the previous post), such as Capa and Kramnik, then perhaps the bias of the program can reverse the true ranking (so perhaps Kramnik comes before Capa, but in any event Kramnik and Capa are ahead of Kasparov and Karpov).

RL
#123
raylopez99 wrote:
> No, you misread me. I am saying YOU (or Camp #1) is making that error. I happen to believe chess programs rarely make 50 centipawn errors.

In my own examination of games at various rating levels and skills, I have seen that differences of 50cp between engines are routine. I have often seen advantage-White versus advantage-Black differences, and I have seen differences of 100 cp and over (though more rarely). I do not have hard numbers, other than that it happens WAY more than rarely, and it seems to be the fundamental difference. You will see key moves hated by one engine and loved by another.

Ultimately, what seems to be desired is that the engine tells the "truth" about a position, or at least that whatever error it makes is uniform across all the test cases. My "camp", if there is such a thing, doesn't believe the premises.
#124
"raylopez99" wrote:
> On May 3, 8:14 am, "David Kane" wrote:
>> 2. Even if the experiment were performed with a perfect tool, the authors have not provided any evidence that the measure chosen (average error, corrected for position type) is the one that correlates with winning play. We simply do not know whether a string of 10 moves each with error 0.1 predicts the same winning chances as 9 perfect moves followed by a single error of 1.0.
>
> As chess is a game of errors, and the serial Markov chain probability of winning is probably weak in any given sequence of moves (that is, from move-to-move, as a book by Australian chess master Purdy once pointed out, and as is well known in the chess maxim that 'every board position has to be looked at with a fresh pair of eyes, de novo, without regard to what was played before'), I would imagine that the former dominates the latter, but I agree this needs to be investigated.

But the point is that we don't have to rely on vague general arguments, or notions of what seems reasonable. The authors have proposed a certain measure of move quality, and that measure can be directly tested on actual chess games.

>> In fact, the only data they gave supports the idea that the selected measure is not that great: the correlation between the difference in error and the outcome of the game was only 0.89. Moreover, they give no details of that analysis.
>
> 0.89 correlation is very high, no? Correlation is a real number X between 0 and 1.0, no? 89% is very high correlation.

They didn't give the actual data (itself suspicious), or the methodology (there is no mention of draws, e.g.), but if I'm guessing correctly what they are talking about, 89% isn't that great for this kind of thing. It means that a reasonable fraction of the time the player making the poorer moves will win. This suggests that their chosen definition of "poorer" isn't very good.

My own instinct is that move-rating will require a more sophisticated approach than examined here.
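The question raised in the post above, whether ten errors of 0.1 are equivalent to one error of 1.0, is easy to state in code. The following is a minimal sketch with made-up error lists (in pawns). `average_error` only stands in for the kind of averaged measure under discussion; it is not the authors' actual formula, which the thread does not give, and the correction for position type is left out. `worst_error` is a hypothetical alternative summary, included only for contrast.

```python
from statistics import mean

# Per-move errors in pawns for two hypothetical ten-move sequences.
steady_drift = [0.1] * 10            # ten small inaccuracies
single_blunder = [0.0] * 9 + [1.0]   # nine perfect moves, then one large error


def average_error(errors):
    """Mean per-move error: an ASSUMED stand-in for the averaged measure."""
    return mean(errors)


def worst_error(errors):
    """Largest single error: one possible alternative summary."""
    return max(errors)


if __name__ == "__main__":
    for name, errors in (("steady drift", steady_drift),
                         ("single blunder", single_blunder)):
        print(f"{name}: average {average_error(errors):.2f}, "
              f"worst {worst_error(errors):.2f}")
```

Both sequences get the same average, so an average-based measure cannot tell them apart; whether they predict the same winning chances is exactly what would have to be tested against real game outcomes.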
#125
On May 3, 5:20 am, David Richerby wrote:
> help bot wrote:
>> The articles I saw focused only on world championship play, which likely has a lower error rate because: 1) the players are the very best in the world, and 2) the time controls and playing conditions are close to ideal.
>
> I see your `close to ideal' and raise you a Toiletgate.

In the fantasy world of some "analysts", even the weakest chess programs are fully capable of accurately ranking the world champions on the basis of greatness, despite not having the slightest clue what that might be. It's about as likely as a chess-playing knight rushing in to slay a dragon.

For me, it would seem to require a working definition of what exactly constitutes "greatness", and a well thought out system for measuring it. As Shrek would have put it: Yeah, like THAT'S ever going to happen!

-- help bot
#126
On May 3, 7:59 am, "Chess One" wrote:
> We rehearsed this conversation before - but there is no data of GM play against raw chess engines. The engines are all optimised for winning, and not for anything useful, like learning - either about its own evaluation matrix of chess evaluation or even how people evaluate play. [because with book+table bases=off, it wins less] I understand the commercial need to do that, but don't understand any academic reason to choose emulation paradigms over [exigesic] real-time engine evaluation. I always thought Crafty in particular would be such a base model, since it is born out of a university system, widely distributed and adapted, and lots of people might have had a go at it. But I think Crafty got caught up in its own early success as W CH, and continued to go for 'win', rather than for 'learn'.

Ah, but you seem to forget that the reason for Crafty's success was simply the fact that Cray computers were faster. The enabler was raw speed, so why expect any "academic" attempt to learn (an approach which I believe was rejected early on because of poor initial results or a lack of adequate hardware)?

As far as I know, the learning approach has stemmed from neural networks, not Cray supercomputers. Mr. Hyatt, the creator of Crafty, simply exploited the raw speed of a mainframe he had (virtually unique) access to. The same thing might be said of the IBM team; they went for the win, not the far-sighted approach of gradual improvement which never ceases, but which lags the state of the art conventional programs by a wide margin.

Who wants to be the tortoise, when the hare gets all the glory and hype? When it is possible that the slow "learning" technique will never catch up? Supposing we could graph the two approaches and see the trend -- only if it were clear that the laggard was fast making up ground would it seem to programmers worth their while to switch methods, to risk being the tortoise in this race.

-- help bot
#127
On May 3, 11:13 am, "David Kane" wrote:
> "raylopez99" wrote:
>> On May 3, 8:14 am, "David Kane" wrote:
>>> 2. Even if the experiment were performed with a perfect tool, the authors have not provided any evidence that the measure chosen (average error, corrected for position type) is the one that correlates with winning play. We simply do not know whether a string of 10 moves each with error 0.1 predicts the same winning chances as 9 perfect moves followed by a single error of 1.0.
>>
>> As chess is a game of errors, and the serial Markov chain probability of winning is probably weak in any given sequence of moves (that is, from move-to-move, as a book by Australian chess master Purdy once pointed out, and as is well known in the chess maxim that 'every board position has to be looked at with a fresh pair of eyes, de novo, without regard to what was played before'), I would imagine that the former dominates the latter, but I agree this needs to be investigated.
>
> But the point is that we don't have to rely on vague general arguments, or notions of what seems reasonable. The authors have proposed a certain measure of move quality, and that measure can be directly tested on actual chess games.

They have? I thought the original paper, which the chessbase article quotes, was pulled from the website. So perhaps the original paper was half-baked, though it is an interesting and useful idea.

>>> In fact, the only data they gave supports the idea that the selected measure is not that great: the correlation between the difference in error and the outcome of the game was only 0.89. Moreover, they give no details of that analysis.
>>
>> 0.89 correlation is very high, no? Correlation is a real number X between 0 and 1.0, no? 89% is very high correlation.
>
> They didn't give the actual data (itself suspicious), or the methodology (there is no mention of draws, e.g.), but if I'm guessing correctly what they are talking about, 89% isn't that great for this kind of thing. It means that a reasonable fraction of the time the player making the poorer moves will win. This suggests that their chosen definition of "poorer" isn't very good.

Maybe, but maybe not, if the second move is deemed "poorer" when often in chess (it seems) the first two moves are roughly equally good.

> My own instinct is that move-rating will require a more sophisticated approach than examined here.

Yes, for "very close calls" between two equally matched players (Karpov or Kasparov--which is greater?) you need sophistication perhaps, as I stated earlier. But I am saying the "non-sophisticated" Crafty-as-chess-engine-rater can tell us, right now, with the methodology presented now, that Kasparov was a better player than Janowski for example--on this we all agree I hope. From there, it's a small step (or is it?) to make the claim:

!!! Greatest chess players ever? Capa, Kramnik, Karpov, Kasparov, *in that order* (cuz 'puters don't lie!) !!!

RL
#128
On May 3, 8:44 am, (Dr A. N. Walker) wrote:
> help bot wrote:
> [...]
>> But the main objection was, of course, that since Rybka (and Hiarcs, etc.) is available, why mess around with something vastly inferior, unless ranking mere patzers?
>
> Um, because Crafty is not only available, but available in source-code form to anyone, so that anyone can instrument it, piggle with it, and generally use it as a tool to investigate things? If you want something reproducible, then, short of help from the commercial companies, you have to do a fair amount of tweaking.

Nonsense. Every tweak you make to Crafty is a step toward irreproducibility, unless you expect everyone to go in and modify their source code to precisely match yours (just who do you think you are, anyway?).

A better method would be to take the strongest program available (i.e. Rybka) and have it score all the games, and only THEN apply some selected algorithm to the standardized results. For example, I could BY HAND add up all the "significant errors", as we might call them, and calculate just how often and how large they are for each player. Of course, I would never waste my time doing this with results from a crippled-Crafty, when far stronger programs are widely available.

> Also not entirely convinced by arguments about Crafty only being able to rank mere patzers.

No such argument has been made here. The argument is that crippled-Crafty, let's guess it is around 2600, is sufficiently strong to yield reasonably meaningful rankings of patzers, such that the use of Rybka or GM Kasparov is rendered unnecessary, overkill almost. With the world champions, however, the possibility of overkill is irrelevant; we desire an accurate ranking, period.

> I'm well short of super-GM status myself,

Hold on there! I beg to differ... (wait, what am I doing? I actually agree with you. Huh? Now what do I do? Okay, I will shut up. Good.) :D

> but I think I know enough about chess to be able to judge that [eg] Kramnik is a better player than the typical 2650-ish GM.

Can you do this with games you have never seen before, and do it for anybody? Without any clues apart from the moves of the games, and with a very small sample size like crippled-Crafty had? I doubt it. In short, you are incorporating outside knowledge, such as the fact that GK is the world champion, for instance.

> For that matter, there are also players around who are *much* stronger analysts [esp in specialised areas, such as endings] than their OTB rating would suggest -- eg because of physical/mental limitations that affect them in tournaments, or because they can't handle the clock -- and such players [eg, top correspondence players] may be well equipped to rank others of much higher nominal rating.

These players are especially well-suited for game annotations, IMO. But as for ranking the world champions based solely and objectively on their games, human bias always creeps in. I have very rarely seen any game between Garry Kasparov and Anatoly Karpov, for instance, where the annotator's personal feelings did not work their way into the commentary on the moves, dictating the tone.

An example of logic flying out the window is the long-standing custom of describing players who like to play 1.e4 as attacking players, and their games as open, interesting, or other such terms with positive connotations. Yet when a certain strongly-disliked player was involved, everything was reversed. He became "defensive", anti-chess, abnormal, ugly, greasy-haired and for that matter, they didn't seem to like his mother either. And he collected stamps. Stamps! Is this guy a nerd, or what? Let's all just go over and beat him up -- that will teach him. :D

This is why it is better to use computers to do the rankings and ratings -- zero personal bias.

>> One more criticism: in looking over a game at one Web site, I noticed that the depth of search DURING PLAY achieved by GM Kramnik's opponent was around 18 plies. Now why on earth would anyone try to rate the play of the world champions by cutting off crippled-Crafty's search at only 12 plies? I mean, get a REAL computer, and a clue!
>
> The depth achieved during play was presumably in real time at, say, 5 hours per game? So a real-time 24-game match occupied 5 solid days of chess, and so of computer time even for a dedicated super-computer.

The computer was a fast desktop, as far as I know. The article showed a snapshot of the computer's monitor, displaying 18 or 19 plies depth, with a score of exactly 0.0, or a forced draw. Note that in the opening, both players move more rapidly, and this can extend into the realm of move twenty. The point is, my guess is that this desktop passed the 12th ply in just a few seconds, so why are the games of the world champions (a small sample size) being dealt with so lightly?

> Simply rating every WC game at that speed is several months; or years for the normal PCs that we have on our desks.

Then someone is using an inefficient technique. In the old days, it took a week to properly analyze a single game, but not anymore. Especially when you consider that a large chunk of the moves were played by rote (i.e. the book openings). In many of the games between world champions, the book phase might well comprise a third of the game.

> The alternatives to crippled-Crafty rating every WC game are basically to have *no* results, 'cos it takes too long, or to have an utterly superficial analysis of *all* the games of Tal/Kasparov/....

No. There are plenty of other alternatives. I already showed that in a recent match, one computer was able to tackle WELL BEYOND the meager 12 plies of crippled-Crafty at OTB time controls. And while not everyone has a machine as fast as what they no doubt used against GM Kamsky, the fact remains that computers can work 24/7 (and even multi-task). And one is certainly not restricted to using a single computer.

> Once you set parameters such that you want to investigate a reasonable corpus of games [and WC matches seems quite sensible]

The matches of GM Steinitz amount to a decent sample size, but not all of the world champions churned out nearly so many games in W.C. play. For example, GM Fischer only competed in a single W.C. match, resulting in all his data being against a single opponent, in a single venue, in a single season.

> using a few weeks of time on a PC, the rest somewhat falls into place. Eg, if you want to analyse 1000 games and are willing to wait one month for the results, then you have perforce to analyse 30+ games/day, or 48 minutes/game, or around 30s/move, on whatever dedicated machine is available to you.

Wrong. I am typing this right now while sitting in a room with FOUR computers, albeit only one of which gets any use these days. By my calculations, that cuts your "month" down to about a week, your "1000 games" to only 250 apiece, and that is just using my own computers. Suppose I recruit a few people to help me out?

It's really a matter of standards; are you willing to accept half-baked analysis? Half-reasoned justifications for dodgy work?

> Whether the resulting investigation is worthwhile is another matter.

Indeed. My take is that it is silly to think a computer (at this point in time) can accurately determine who among the world champions was "the greatest". Greatness is not something which has yet been quantified, and computers suck at guessing.

-- help bot
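The time-budget arithmetic traded in the post above (1000 games in a month on one PC, or about a week on four) can be checked with a short calculation. This is a rough sketch: the function names are hypothetical, and the default of 96 plies per game is an assumption, chosen because 48 minutes/game at about 30 s/move implies roughly that many moves.

```python
MINUTES_PER_DAY = 24 * 60


def minutes_per_game(games, days, machines=1):
    """Wall-clock minutes available per game if `games` are split evenly
    across `machines` PCs running around the clock for `days` days."""
    return days * MINUTES_PER_DAY * machines / games


def seconds_per_move(games, days, machines=1, plies_per_game=96):
    """Seconds per move; plies_per_game=96 is an ASSUMED average game length."""
    return minutes_per_game(games, days, machines) * 60 / plies_per_game


def days_needed(games, minutes_each, machines=1):
    """Days to finish `games` at `minutes_each` per game on `machines` PCs."""
    return games * minutes_each / (machines * MINUTES_PER_DAY)


if __name__ == "__main__":
    per_game = minutes_per_game(1000, 30)
    print(f"1 PC, 30 days: {per_game:.1f} min/game, "
          f"{seconds_per_move(1000, 30):.0f} s/move")
    print(f"4 PCs, same time per game: {days_needed(1000, per_game, 4):.1f} days")
```

With 1000 games in exactly 30 days the per-game budget comes to a little over 43 minutes (the quoted 48 minutes corresponds to rounding down to 30 games a day), and four machines at the same per-game budget finish in 7.5 days, in line with "about a week".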
#129
raylopez99 wrote:
> You bothered to reply to help bot? help bot is pretty clueless, I think he's essentially a troll.

You're the troll here Lopez. Help bot's contributions to this ng, while occasionally somewhat 'longwinded', are infinitely more entertaining than your characteristically boring & nugatory execrations (I mean, do you seriously expect anyone [with a life, that is] to wade through your last but 1 or 2 or 3 posted screeds?)

This thread is surely flogged to death by now - no?..
#130
help bot wrote:
> Ah, but you seem to forget that the reason for Crafty's success was simply the fact that Cray computers were faster.

Not entirely. Was Crafty the first program to use bitboards?

> As far as I know, the learning approach has stemmed from neural networks, not Cray supercomputers.

That statement makes no sense. The neural network is a programming technique and is completely independent of the hardware it's run on.

> Mr. Hyatt, the creator of Crafty, simply exploited the raw speed of a mainframe he had (virtually unique) access to.

Um. Cray != mainframe. You clearly have no idea what you're talking about.

Dave.

--
David Richerby, www.chiark.greenend.org.uk/~davidr/
Mentholated Homicidal Chair (TM): it's like a chair but it wants to kill you and it's invigorating!