A Chess forum. ChessBanter


Greatest chess players ever? Capa, Kramnik, Karpov, Kasparov, *in that order* (cuz 'puters don't lie!)



 
 
  #201  
Old May 11th 07, 10:41 AM posted to rec.games.chess.misc,rec.games.chess.computer
Martin Brown
external usenet poster
 
Posts: 666

On May 10, 3:36 pm, raylopez99 wrote:
On May 10, 1:22 am, Martin Brown
wrote:

Well if it has to be open source then Fruit 2.1 (~2780) might be
another alternative to try against Crafty (~2670). An extra 100 points
and a bit less materialistic evaluation would be closer to human GM
level play. Fruit 2.2.1 just about stumbled onto that tricky line that
Phil Innes sets so much store by engines not finding before I pulled
the plug.


Lots of commercial chess programs will have a database of "tricky"
positions with "model" answers, just to fool people into rating them
higher. Tricks of the trade.


Don't try to teach me about computer chess. You clearly haven't a clue
what you are talking about. They are optimised to pass certain well
known tests but that was part of my motivation to start a new thread
asking for "interesting" new positions where engines score things
radically differently. A situation that persists even with the very
top engines run for times which would in principle make them beyond
superGM.

Fruit just scraped this test after about half an hour. It looked like
it was stuck in the obvious rut for a long while.

So in other words you would be happy to see different results
if we ran the experiment again next year,


Exact reproducibility probably isn't so important here. Getting the
maximum accuracy of the move evaluation function for the limited
amount of time available is the key. Fixed depth does not do that.


I disagree. Normalization, see one of my posts in this 200 post
thread.


You cannot "normalise" base metal into gold. Although you do seem to
believe that if you repeat a lie often enough it becomes true.

Crafty is a rather conservative chess engine. It has a *very* good
quiescence search at terminal nodes, but a relatively poor search
extension strategy. As a result it tends to get set in its ways and
miss important, much stronger lines, having settled into a comfortable
path that looks superficially OK. It leads to a false sense of
security in that the evaluations with increasing search depth remain
too stable (i.e. it doesn't learn much new with each successive ply).

Even the very best engines cannot agree to within 50cp on some key GM
positions after 2 days and at ply 22+.

In fact, on the Kasparov-Anand Riga 1995 game Shredder now thinks that
only 11. ... Kf8 (0.05/23) is playable,
whereas Rybka reckons that most of the obvious continuation lines are
playable but prefers one of:
11. ... g6 (-0.08) or 11. ... O-O (-0.07/23) or 11. ... Kf8
(-0.03/22)

AFAIK Kf8 is a novelty that has not been played in top level games. It
doesn't look pretty and the engines only see it at deep ply levels.

I would not be at all surprised if a few of the 70-80 centipawn
"blunders" turned out well at greater depth and a few non-blunders
turned out to be dubious. Swings, roundabouts. BICBW.


I don't think they are swings and roundabouts though. GM level games
are littered with precisely the sort of positions that chess engines
find really difficult to score accurately. And they usually occur at
pivotal moments.


A pivotal moment is immaterial if you use normalization. As I


Like hell it is. The GM gets penalised by |Machine_best_move -
GM_move| for every move where the computer fails to understand what he
is doing, and when he wins the exchange by playing a move whose point
Crafty only sees later, he gets nothing. They filter out (correctly)
all moves evaluated outside the range [-2.0, 2.0] to avoid penalising
the winning GM for playing for a safe win rather than a risky optimal
move, or the loser for playing fast and loose bluff moves.

Scoring GMs with Crafty penalises anyone who doesn't play *exactly*
like Crafty with a fixed 12 ply search strategy.
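
A minimal sketch of the scoring being argued about here, as described in this post: average |engine best - move played| over the moves whose evaluations stay inside [-2.0, 2.0]. The function name and the input layout are made up for illustration, and it is an assumption that a move is dropped when either evaluation falls outside the bound; the paper itself may define the filter differently.

```python
from typing import Iterable, Optional, Tuple


def mean_move_error(
    evals: Iterable[Tuple[float, float]],
    bound: float = 2.0,
) -> Optional[float]:
    """Average |best - played| in pawns over moves inside [-bound, bound].

    Each item is (engine_best_eval, played_move_eval), both taken from the
    mover's point of view at the same fixed search depth.
    """
    total = 0.0
    counted = 0
    for best, played in evals:
        # Assumption: skip the move if either score is outside the bound.
        if abs(best) > bound or abs(played) > bound:
            continue
        total += abs(best - played)
        counted += 1
    return total / counted if counted else None


# Hypothetical numbers: the second move is a sacrifice the engine
# dislikes, so it costs the player 0.40 whether or not it was sound.
sample = [(0.30, 0.30), (0.50, 0.10), (3.10, 2.50), (-0.20, -0.40)]
score = mean_move_error(sample)
```

The player's score depends only on agreement with the engine's own preference, which is the complaint in the paragraph above: a move the engine misjudges counts against the player exactly like a real mistake.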

No amount of "normalisation" bull**** will get you out of this hole.

explained in a post in this thread, the fact that a player enters a 60
move mating net set by his opponent, unseen by Crafty with a 14 ply
move horizon, is immaterial since at some point Crafty will see the
mating net (namely, 7 moves before checkmate) and rate the losing
player lower than the winning player.


No it won't. Their scoring system was entirely based on evaluating the
difference between the move played and the machine's idea of the best
move. And restricted to move 12 with evaluation bounded in [-2.0, 2.0].

OK, perhaps the implication is that they should have stopped
there and then. But if historical Elo ratings are of interest, then
I see no reason why another objective measure of *something* need not
be. They *do* have an objective measure. It *does* seem that their
results correlate well with *some* quality that we can recognise in
the play of Capablanca, Petrosian, Tal, etc. Their methodology is
at least interesting, even if flawed.


Agreed. The experiment is worth repeating with a much stronger
engine.


Yes, agreed. As I posted 47.5 posts ago, for very close, nearly tied
rankings, the stronger chess program might make a difference. But for
clear demarcation breakpoints, such as between Capa and Kramnik versus
Karpov and Kasparov, a stronger chess engine doesn't matter.


I reckon there were a fair proportion of Karpov's & Kasparov's moves that
Crafty didn't even begin to understand and it marked them down for it.
The blunder rates are much more believable (although some of them will
be wrong).

And, chess being 99% tactics (say many GMs, including Tarrasch or
Teichmann), the player with the lowest blunder rate is often the best
champion. Blunders = Function(overall strength). In fact, a study


You keep on repeating this lie. It isn't true and it never will be. It
is precisely because of the importance of long range strategic
planning that machine chess isn't massively stronger than it is.
Tactics are a necessary part of strong chess but they are not
sufficient on their own.

from a few years ago found that the difference in most moves between a
patzer and a GM was not so much in the unexpected move made, but
rather in the fact GMs blundered far less than a Class C player.


What a surprise - who would ever have guessed that?

Regards,
Martin Brown

  #202  
Old May 11th 07, 12:23 PM posted to rec.games.chess.misc,rec.games.chess.computer
Martin Brown
external usenet poster
 
Posts: 666

On May 10, 4:23 pm, "David Kane" wrote:
"Martin Brown" wrote in message


Agreed. But equally when the experiment has a systematic error due to
using a relatively shallow fixed depth (but reproducible) searching to
score the moves played it doesn't take much intuition to conclude that
an engine that cannot annotate club level games accurately at that
level is completely out of its depth on superGMs.


I'd wager that this method would give generally meaningful results
for club players, *despite* the fact that it will inaccurately analyze
certain positions. It's a pity that the authors did not apply
the method to the games of players with different ELO. That's


I agree entirely. Crafty would have enough headroom over most club
players that the relatively small errors it made would not matter
compared to the fairly gross blunders that determine the outcome of
games at our level.

I do not believe this is true when it tries to rate world class
players who are intrinsically much stronger and have significantly
better strategic intuition and positional understanding than the
engine.

an easy and obvious extension that would have gone a long way
to validating the worth of the method.

The argument that the method is refuted by finding one position
that the computer analyzes incorrectly is false. There are analogous


I am not saying that. Although it is more fun to study interesting key
positions in top level GM games with deep engine analysis than to
focus on the mundane obvious wood pushing moves that no GM will ever
get wrong.

issues in ELO rating: which games should be rated, and what is the
significance of each game.


If you look back to the start of this thread I originally said I
thought the engine probably had managed something like an accurate
assessment. That was before I read the bit in the original paper that
said it was hobbled to 12 ply fixed search. I still think that is true
as far as blunder rate is concerned, but not so for accuracy in
non-blunder play.

I revised my view after experimenting with Crafty at 12 ply annotating
a few of my already Shredder 10 (30s/move) annotated games. The
experiment is not difficult to do. Perhaps you will see a different
result?

Regards,
Martin Brown

PS Apologies if this is posted twice, but from here it looks like
Google dropped it on the floor again.

  #203  
Old May 11th 07, 05:06 PM posted to rec.games.chess.misc,rec.games.chess.computer
raylopez99
external usenet poster
 
Posts: 346

On May 10, 5:03 pm, Ron wrote:
In article .com,

raylopez99 wrote:
Yes, agreed. As I posted 47.5 posts ago, for very close, nearly tied
rankings, the stronger chess program might make a difference. But for
clear demarcation breakpoints, such as between Capa and Kramnik versus
Karpov and Kasparov, a stronger chess engine doesn't matter.


It appears, RayLopez, that you missed an earlier post of mine which had
two questions related to this very point.

Since I'm sure it was an innocent omission - it's easy to miss a single
post in a long thread, I'll repeat the questions here.

1) Would you feel equally confident if we only gave crafty 11 ply? 10?
8? 4? Where do you draw the line? What non-arbitrary criteria are you
using to suggest that 12-ply is meaningful whereas 3 ply, obviously,
would not be?

2) What objective criteria are you using to define "extremely close"
such that you don't trust the computer's ability to rank players
properly?

I'm very curious to hear your answers to these questions.

-Ron


Ron,

Don't confuse the PSEUDO-chess scientists' and programmers' answers on
this thread with REAL answers. Keep in mind I program as a hobby,
have an IQ of over 140, and am a successful and quite wealthy
businessman. My opponents *think* they have something to offer, but
they don't realize that AI (Artificial Intelligence) research largely
abandoned chess as the experimental "fruit fly" of AI roughly
15 years ago. Bridge and Go are the hot areas where AI is now being
applied, not to mention the quest to build a true Turing machine that
passes the Turing Test.

Another point: my opponents *think* they know the answer, but what is
their basis? Little better than a guess. In fact, little better than
my guess. But at least I base my guess/ hypothesis on having studied
chess and chess programmers as far back as 1990. I used to subscribe
to Ply mag, published by an outfit in Canada (some university up
there), and have read articles and papers on how real chess
programming works. My opponents are still upset Garry lost to Deep
Blue 2, and are 'fighting for the human race' or some such nonsense.

Now to get to the point of your questions: I don't know. My
intuition, like Bot states, says that ply will not matter unless
players are "close", and a visual inspection of the ratings in
the summary of the original article that started this thread shows that
"close" is between Capa and Kramnik, Karpov and Kasparov, and then the
"third tier". But more plies might not make a difference (that is,
won't change the relative rating) between say Capa and Kasparov, or
anybody in the third tier vs. Karpov, etc.

In truth, nobody in this thread really knows, and indeed further
research is needed. But the burden of persuasion is on Camp #1 to
make their case--that so called "positional sacrifice" positions are
rather common in a game of chess and that chess is NOT largely tactics
(these are the assumptions behind their claims--I claim the
contrary). History has shown otherwise. Indeed, on the last point,
Kramnik missed a mate in one last year. Chess is largely tactics, and
that's why it is fair to have a chess engine rate the champions. You
can make 30 brilliant "deep" positional moves in chess, have a clearly
winning position, and still lose a chess game in a mate in one. That
is chess. A PC would score you poorly in such a game, even though you
were "brilliant" up until your blunder (and perhaps unappreciated by
the PC, though I have argued in this thread that PCs are in fact not
so bad at rating positions that require positional moves, even
exchange sacs).

In fact, Camp #1's arguments are better if we were trying to rate
"correspondence chess" champions rather than OTB champions, since in
correspondence chess tactics are much less important than deep
positional moves. But that was not the inquiry of the original
article ranking of champions: it was for OTB world championship play.
However, that said, I would not be surprised that even for
correspondence chess players, rating such players with Fritz 5.31 at 5
seconds a move would give you a pretty clear indication of the best
correspondence chess players, since good positional moves and good
tactical moves are largely one and the same in chess (again, this goes
to chess being 99% tactics).

RL

  #204  
Old May 11th 07, 06:37 PM posted to rec.games.chess.misc,rec.games.chess.computer
David Kane
external usenet poster
 
Posts: 1,105


"Martin Brown" wrote in message
On May 10, 4:23 pm, "David Kane" wrote:
"Martin Brown" wrote in message


Agreed. But equally when the experiment has a systematic error due to
using a relatively shallow fixed depth (but reproducible) searching to
score the moves played it doesn't take much intuition to conclude that
an engine that cannot annotate club level games accurately at that
level is completely out of its depth on superGMs.


I'd wager that this method would give generally meaningful results
for club players, *despite* the fact that it will inaccurately analyze
certain positions. It's a pity that the authors did not apply
the method to the games of players with different ELO. That's


I agree entirely. Crafty would have enough headroom over most club
players that the relatively small errors it made would not matter
compared to the fairly gross blunders that determine the outcome of
games at our level.

I do not believe this is true when it tries to rate world class
players who are intrinsically much stronger and have significantly
better strategic intuition and positional understanding than the
engine.

an easy and obvious extension that would have gone a long way
to validating the worth of the method.

The argument that the method is refuted by finding one position
that the computer analyzes incorrectly is false. There are analogous


I am not saying that. Although it is more fun to study interesting key
positions in top level GM games with deep engine analysis than to
focus on the mundane obvious wood pushing moves that no GM will ever
get wrong.

issues in ELO rating: which games should be rated, and what is the
significance of each game.


If you look back to the start of this thread I originally said I
thought the engine probably had managed something like an accurate
assessment. That was before I read the bit in the original paper that
said it was hobbled to 12 ply fixed search. I still think that is true
as far as blunder rate is concerned, but not so for accuracy in
non-blunder play.


The proof of the pudding is in the eating. A claim that
an analytical method is meaningful must be supported with evidence,
and that is true whether you are talking about "average
error analyzed by 12 ply Crafty" or some sophisticated calculation
based on 20-ply Hydra analyses.

The paper lacks any supporting evidence and therefore
its conclusions are dubious. However, I consider the method
highly interesting and worthy of discussion.


I revised my view after experimenting with Crafty at 12 ply annotating
a few of my already Shredder 10 (30s/move) annotated games. The
experiment is not difficult to do. Perhaps you will see a different
result?

Regards,
Martin Brown

PS Apologies if this is posted twice, but from here it looks like
Google dropped it on the floor again.



  #205  
Old May 11th 07, 07:58 PM posted to rec.games.chess.misc,rec.games.chess.computer
Dr A. N. Walker
external usenet poster
 
Posts: 96

In article .com,
Martin Brown wrote:
[... I]f you have 36 computers
and a spare month available, feel free.

OK. But without doing that for the moment, what settings do you use to
analyse/annotate your own games?


I don't. I enter them via ChessBase with Fritz running.
In positional terms, I trust my own judgement more than Fritz,
so I'm really using the computer only for blunder-checking. If
Fritz doesn't see anything before I move on, tough. [Of course,
it is doing this not only for the moves actually played, but also
for my own annotations, plus any off-the-wall ideas I feel like
investigating, so it is not expected to spot things "early".]
So it usually gets a few seconds for "routine" moves, much longer
for positions that seem "interesting" [either directly to me, or
because Fritz seems to be finding something].

I would be prepared to bet it is nothing like as shallow as 12 ply
fixed + quiescence.


You might lose your bet, or at least part of it. It takes
Fritz a reasonable time to get past 12 ply [of course, that's usually
something like "12/27"] in the middle-game, and I very rarely wait
for it to reach a depth that is "nothing like as shallow". The ending
is different, of course.

[The G&B experiment:]
It will penalise GMs that have formed plans extending beyond 12 ply if
there is no obvious gain made inside its quiescence horizon. And it
hardly ever sees material sacrifices for gains in positional advantage
or tempo.


I have rarely used Crafty. But Fritz usually at least sees
some compensation -- eg you sacrifice a pawn and see a 0.6 drop in
the evaluation, even if Fritz has no idea of the true worth of the
sacrifice. The experience I *did* have with Crafty, some years ago,
was that it seemed to produce better evaluations than Fritz, but it
was less tactically aware, so it was much less use *to me* [as well
as weaker in the Elo sense], paradoxically despite perhaps being a
better match to actual IM/GM play. But computer chess has moved on
a long way since then.

There is also, of course, Bronstein's dictum -- "Against
computer, is advantage to be pawn down" [as he played a gambit
against MChess]. His point was that the computer completely mis-
understood his play, expecting him to be trying to regain the
pawn, and thereby not seeing his steadily increasing advantage in
other aspects of the position.

[...] GM level games
are littered with precisely the sort of positions that chess engines
find really difficult to score accurately. And they usually occur at
pivotal moments.


This is true. But -- until someone runs the experiment --
this does not necessarily mean that Crafty-12 makes a worse pig's
ear of this than a much stronger engine. What matters to the
experiment is not whether Crafty's evaluation of the position is
the same as the GM's or is better/worse than [eg] Rybka's. We
are accumulating the difference between Crafty's [or Rybka's]
score for its own and for the GM's move.

If, for example, Crafty completely misunderstands a pawn
sacrifice, then there is a 1-pawn "mistake" in Crafty's assessment
of [eg] Spassky's play. If Spassky does this every other game
[he surely doesn't do it more than that!], that's a 0.013 or so
systematic error in Spassky's results. That could take him
above Kasparov and Karpov in the rankings, but gets him nowhere
near Kramnik and Capablanca [who are 0.03 ahead]; on the other
hand, K&K have their own share of "mysterious" pawn sacrifices,
so quite probably Spassky would stay below them.
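
A rough reconstruction of the arithmetic in the paragraph above. The post gives only the result (0.013); the figure of about 38 scored moves per game is not in the post and is back-solved here purely as an assumption so that the numbers can be checked. The function name is likewise made up for illustration.

```python
def systematic_error(pawns_per_event: float,
                     games_per_event: float,
                     scored_moves_per_game: float) -> float:
    """Extra average error per scored move, in pawns.

    One misjudged sacrifice adds `pawns_per_event` to the running total,
    and the total is averaged over every scored move in the games played
    between two such events.
    """
    return pawns_per_event / (games_per_event * scored_moves_per_game)


# Assumption: roughly 38 scored moves per game (not stated in the post).
err = systematic_error(pawns_per_event=1.0,
                       games_per_event=2.0,
                       scored_moves_per_game=38.0)

# Gap to Kramnik and Capablanca quoted in the post.
gap_to_top = 0.03
closes_gap = err >= gap_to_top
```

Under that assumption the bias comes out at about 0.013, well short of the 0.03 gap, which matches the conclusion drawn in the post.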

Suppose also that Crafty has rather "static" positional
evaluations; in that case, it may well be that Crafty sees much
less difference between its own preference and Spassky's in most
relatively quiet positions than perhaps it should, or than Rybka
does. Crafty may in that case be misjudging Spassky's moves, and
his positions, but not in a way that makes his play seem bad;
whereas Rybka may be seeing and "understanding" more, but be
penalising Spassky much more for any discrepancies [which may or
may not be "real"].

It's not easy. We [someone!] should run the experiment
before jumping to conclusions. This may be a computer-chess
version of the fact that it is not always the best practitioners
who make the best teachers [or examiners].

[...] I think it mostly has found the players with the lowest
blunder rate fairly convincingly.


Yep. That's why my overall view is that their results
are probably not too far out, despite the obvious problems with
the methodology. If you were asked to rank the WCs in order of
the accuracy -- not necessarily the quality or success -- of
their play in WC matches, then who would argue with Capablanca
and Kramnik at the top, Karpov and Kasparov next, then very
little difference down to Smyslov, with Tal, Euwe and Steinitz
somewhat worse? The only surprise is perhaps iron man Botvinnik
below Tal; but MMB lost three WC matches, so perhaps we're not
seeing him at his best. If Crafty-12 is too "stupid" to have
reached this conclusion in a rational way, then it's been very
lucky [or else the chess world at large is equally stupid].

--
Andy Walker, School of MathSci., Univ. of Nott'm, UK.

  #206  
Old May 11th 07, 10:22 PM posted to rec.games.chess.misc,rec.games.chess.computer
JohnnyT
external usenet poster
 
Posts: 188

Dr A. N. Walker wrote:

I don't. I enter them via ChessBase with Fritz running.
In positional terms, I trust my own judgement more than Fritz,
so I'm really using the computer only for blunder-checking. If
Fritz doesn't see anything before I move on, tough. [Of course,
it is doing this not only for the moves actually played, but also
for my own annotations, plus any off-the-wall ideas I feel like
investigating, so it is not expected to spot things "early".]
So it usually gets a few seconds for "routine" moves, much longer
for positions that seem "interesting" [either directly to me, or
because Fritz seems to be finding something].


I do something very similar, and it in general works pretty well for me.

However, I use the add kibitzer command and have Rybka running as well
as Fritz. Though I used to run Toga as my second engine until I broke
down and purchased Rybka as well. (For me, I didn't want to use a
Chessbase product for this purpose. But I have nothing but prejudice
for that decision, it may make as much or more sense to use something
like Junior, Shredder or Zap! for this).

This adds one more point of interest: when the two engines diverge
dramatically, or when I disagree with them, or whatever.
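
A small sketch of how that "two engines diverge" check could be automated once both engines' scores for a game have been exported. The dictionaries keyed by ply and the 0.5 pawn threshold (borrowed from the 50cp figure mentioned earlier in the thread) are assumptions for illustration, not anything ChessBase provides in this form.

```python
from typing import Dict, List, Tuple


def divergent_plies(evals_a: Dict[int, float],
                    evals_b: Dict[int, float],
                    threshold: float = 0.5) -> List[Tuple[int, float]]:
    """Return (ply, gap) for plies where two engines disagree.

    Both inputs map ply number to an evaluation in pawns from White's
    point of view. Only plies scored by both engines are compared.
    """
    flagged = []
    for ply in sorted(evals_a.keys() & evals_b.keys()):
        gap = abs(evals_a[ply] - evals_b[ply])
        if gap >= threshold:
            flagged.append((ply, gap))
    return flagged


# Hypothetical scores from two engines for the same three plies.
engine_one = {1: 0.20, 2: 0.30, 3: -0.10}
engine_two = {1: 0.25, 2: 1.10, 3: -0.10}
interesting = divergent_plies(engine_one, engine_two)
```

The flagged plies are only candidates for a closer look; a large gap says the engines disagree, not which one is right.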

Many times these extra points of understanding and extra viewpoints
add more in extra knowledge than they cost in any noise that is created.

The whole Kibitzer feature is one of my favorite features of the
Chessbase family.
  #207  
Old May 11th 07, 10:53 PM posted to rec.games.chess.misc,rec.games.chess.computer
raylopez99
external usenet poster
 
Posts: 346

On May 11, 10:58 am, (Dr A. N. Walker) wrote:

[...] I think it mostly has found the players with the lowest
blunder rate fairly convincingly.


Yep. That's why my overall view is that their results
are probably not too far out, despite the obvious problems with
the methodology. If you were asked to rank the WCs in order of
the accuracy -- not necessarily the quality or success -- of
their play in WC matches, then who would argue with Capablanca
and Kramnik at the top, Karpov and Kasparov next, then very
little difference down to Smyslov, with Tal, Euwe and Steinitz
somewhat worse? The only surprise is perhaps iron man Botvinnik
below Tal; but MMB lost three WC matches, so perhaps we're not
seeing him at his best. If Crafty-12 is too "stupid" to have
reached this conclusion in a rational way, then it's been very
lucky [or else the chess world at large is equally stupid].

--
Andy Walker, School of MathSci., Univ. of Nott'm, UK.


I fully adopt Andy Walker's opinion here as my own.

This will be my last post in this thread, unless Camp #1 provokes me.

RL




  #208  
Old May 11th 07, 11:53 PM posted to rec.games.chess.misc,rec.games.chess.computer
JohnnyT
external usenet poster
 
Posts: 188

Dr A. N. Walker wrote:
If you were asked to rank the WCs in order of
the accuracy


Wow, that word. That is the key to the whole thing. Lack of blunders
is, in my mind and I think in many others', a long, long way from
the word "accuracy".

And some of the questions had to do with #1 move correlation. Which
again raises the question of "accuracy". And not of blunders.

I think that Crafty-12 as an arbiter of accuracy, is a pretty tough row
to hoe.
  #209  
Old May 14th 07, 12:01 PM posted to rec.games.chess.misc,rec.games.chess.computer
Martin Brown
external usenet poster
 
Posts: 666

On May 11, 6:58 pm, (Dr A. N. Walker) wrote:
In article .com,
Martin Brown wrote:

OK. But without doing that for the moment, what settings do you use to
analyse/annotate your own games?


I don't. I enter them via ChessBase with Fritz running.
In positional terms, I trust my own judgement more than Fritz,
so I'm really using the computer only for blunder-checking. If


In that case it is certainly worth downloading and running something
like Fruit 2.2.1 (free evaluation for 14 days) as a kibitzer to see the
sort of things that you are missing. If you only buy one new chess
engine a year I would still recommend Shredder 10 (or 11 if it comes
out soon) - the ultra compact and fast RAM-based endgame tablebases
for 3-4 and 3-4-5 pieces make it well worth having.

Fritz doesn't see anything before I move on, tough. [Of course,


Fritz does miss some important tactical motifs - especially at 12 ply.
If you have the entire game entered then using blundercheck inside the
chess program GUI takes only about 10s per move to reach 12 ply if you
play reasonably accurately. It stalls each time you deviate and the
cache ceases to be useful. Roughly, Crafty 19.19 takes 1-3 mins to reach
12 ply in this mode, but in 60s Shredder 10 typically reaches 15-16 ply
in all but the most complex positions.

I guess my way of doing it comes from the fact I have muddled along
without a proper database for a long time and have still not adjusted
to using Chessbase for manipulating my own games. I still haven't
found where the blundercheck button is hidden in Chessbase - it's not
on the tools menu that I can see.

because Fritz seems to be finding something].


Worth running another engine alongside it for a while. I find Fritz
blundercheck a bit dull; YMMV.

I would be prepared to bet it is nothing like as shallow as 12 ply
fixed + quiescence.


You might lose your bet, or at least part of it. It takes
Fritz a reasonable time to get past 12 ply [of course, that's usually
something like "12/27"] in the middle-game, and I very rarely wait
for it to reach a depth that is "nothing like as shallow". The ending
is different, of course.


You should definitely try one of the other engines. And/or take half a
dozen games and annotate them with blundercheck set to something like
30s/move with one of Fruit/Shredder/Rybka.

[The G&B experiment:]

It will penalise GMs that have formed plans extending beyond 12 ply if
there is no obvious gain made inside its quiescence horizon. And it
hardly ever sees material sacrifices for gains in positional advantage
or tempo.


I have rarely used Crafty. But Fritz usually at least sees
some compensation -- eg you sacrifice a pawn and see a 0.6 drop in
the evaluation, even if Fritz has no idea of the true worth of the
sacrifice. The experience I *did* have with Crafty, some years ago,
was that it seemed to produce better evaluations than Fritz, but it


I have run a few tests on (in this case) randomly chosen games, with
somewhat interesting results. Sort of what I expected, but with a few
surprises thrown in as well. AFAIK neither of these games is a known
engine trap.

The first was a pre-computer-chess, very short 25 move miniature: Boris
Spassky vs Jan Timman, Amsterdam 1977 (with Powerbooks strong.cbh
loaded). The first annotation was a big shock! Black was already a
rook down out of the opening book and almost inexorably set on a path
leading to a forced queen sacrifice to avoid a mate. I thought
strong.cbh was supposed to contain only the strongest opening lines
for balanced play - and not lines where one side is already dead in
the water. I have found the odd similar one in the Sicilian too
(including one highly rated line leading to immediate loss of a
piece).

Are there any tools around to debug opening books and run a sanity
check on the nodes to remove branches where one player is already more
than the exchange down?
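
As a rough idea of what such a sanity check might look like, here is a minimal sketch that counts material from the FEN of each book leaf and flags lines where one side is more than the exchange (taken here as 2 pawns) down. The book layout (a dict from line name to FEN), the function names and the piece values are all assumptions for illustration; this is not a description of any existing ChessBase tool.

```python
from typing import Dict, List

# Conventional piece values in pawns; kings are ignored.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_balance(fen: str) -> int:
    """White material minus Black material, in pawns, from a FEN string."""
    placement = fen.split()[0]  # first FEN field is the piece placement
    balance = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits, '/' separators and kings
        balance += value if ch.isupper() else -value
    return balance


def suspicious_lines(book_leaves: Dict[str, str],
                     threshold: int = 2) -> List[str]:
    """Names of book lines whose final position is lopsided in material."""
    return [name for name, fen in book_leaves.items()
            if abs(material_balance(fen)) > threshold]


# Hypothetical two-line book: the start position, and one with a Black
# rook missing.
book = {
    "start": "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    "rook down": "1nbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQk - 0 1",
}
flagged = suspicious_lines(book)
```

A bare material count will also flag sound gambits and positions caught in the middle of a capture sequence, so the output is a list of lines to inspect rather than branches to delete automatically.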

I created myself a null opening book to force annotation back to the
beginning of the game. Ideally, to mimic the experiment, one culled to
exactly 24 ply would be perfect, but I don't know how to do that in
Chessbase.

The second game was a Kasparov vs Ivanchuk 1995 Riga game [E62], 53
moves. I chose it as a long balanced game leading to a draw in the
endgame. Crafty 19.19 really struggled with this one at 12 ply. Not only
did it fail to find the win for Kasparov at move 43 (hxg5 instead of
Kf3), but it ground my machine to a complete standstill considering
move 20. ... Qg7, and although it found 20. ... Rb8 (preferred Qg5) it
took nearly as long (over 30 mins) on this single move as Shredder at
12 ply took for the entire game!

was less tactically aware, so it was much less use *to me* [as well


If you want to see interesting tactical awareness that you can learn
from then you definitely want Shredder 10. I am not yet convinced by
Rybka: it may be immensely strong in Elo rating, but some of the lines
it finds are, well, inhuman.

as weaker in the Elo sense], paradoxically despite perhaps being a
better match to actual IM/GM play. But computer chess has moved on
a long way since then.


Indeed. Despite the clear fact that Rybka benchmarks stronger in
engine-engine matches, it seems to lack something in the endgame and
endgame transition stage. I guess it matters little how it plays the
endgame if it usually wins in the middlegame.

[...] GM level games
are littered with precisely the sort of positions that chess engines
find really difficult to score accurately. And they usually occur at
pivotal moments.


This is true. But -- until someone runs the experiment --
this does not necessarily mean that Crafty-12 makes a worse pig's
ear of this than a much stronger engine. What matters to the
experiment is not whether Crafty's evaluation of the position is
the same as the GM's or is better/worse than [eg] Rybka's. We
are accumulating the difference between Crafty's [or Rybka's]
score for its own and for the GM's move.
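
The quantity being accumulated can be sketched roughly as follows. This is only an illustration of the idea in Python, not the G&B code: the function name and the input format (pairs of scores in pawns, both from the mover's point of view) are assumptions.

```python
from typing import List, Tuple


def mean_loss(evals: List[Tuple[float, float]]) -> float:
    """Average difference per move between the engine's score for its own
    preferred move and its score for the move the GM actually played.

    evals holds (score_of_engine_choice, score_of_played_move) pairs.
    Differences are not clipped at zero, so a played move that the engine
    happens to score higher than its own choice reduces the total.
    """
    if not evals:
        raise ValueError("no moves to score")
    total = sum(best - played for best, played in evals)
    return total / len(evals)


if __name__ == "__main__":
    # Three hypothetical moves: one agreeing with the engine, two not.
    sample = [(0.30, 0.30), (0.10, -0.15), (0.50, 0.40)]
    print(round(mean_loss(sample), 3))
```

Note that nothing in this number depends on whether the engine's scores agree with the GM's own judgement of the position, only on the gap the engine sees between the two moves.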


The problem here is that Crafty is frequently out by more than 50cp on
key variations and has been in all the GM games I have fed it so far.
Admittedly the first two were engine showpieces but the second pair
were randomly chosen high level games. You can see it happen most
prominently in the longer game, where it misses the crucial winning
line and systematically mis-scores a host of moves because it
doesn't understand what is going on.

If, for example, Crafty completely misunderstands a pawn
sacrifice, then there is a 1-pawn "mistake" in Crafty's assessment
of [eg] Spassky's play. If Spassky does this every other game
[he surely doesn't do it more than that!], that's a 0.013 or so
systematic error in Spassky's results. That could take him
above Kasparov and Karpov in the rankings, but gets him nowhere
near Kramnik and Capablanca [who are 0.03 ahead]; on the other
hand, K&K have their own share of "mysterious" pawn sacrifices,
so quite probably Spassky would stay below them.
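
The arithmetic behind the 0.013 figure can be checked with a few lines of Python. The number of scored moves per game is not stated above; the value of 38 used here is purely an assumption chosen because it reproduces the quoted figure.

```python
def systematic_error(pawns_per_mistake: float,
                     mistakes_per_game: float,
                     scored_moves_per_game: float) -> float:
    """Per-move bias introduced by a recurring mis-scored sacrifice."""
    return pawns_per_mistake * mistakes_per_game / scored_moves_per_game


# One 1-pawn "mistake" every other game, i.e. 0.5 per game.
# ASSUMPTION: about 38 scored moves per game (not given in the thread).
bias = systematic_error(1.0, 0.5, 38)

# The gap to Kramnik and Capablanca quoted above, in the same units.
GAP_TO_LEADERS = 0.03

if __name__ == "__main__":
    print(f"bias per move: {bias:.4f}")
    print("closes the gap" if bias >= GAP_TO_LEADERS else "does not close the gap")
```

On these assumptions the bias is less than half of the 0.03 gap, which is the point being made above.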


I don't think it is quite so clear cut. I do think that a fair
proportion of the "errors" that the G&B analysis says the GMs have
made are in reality just the rms error of Crafty's evaluation which is
something like 30cp multiplied by the number of times they do
something that it doesn't expect.

Suppose also that Crafty has rather "static" positional
evaluations; in that case, it may well be that Crafty sees much
less difference between its own preference and Spassky's in most
relatively quiet positions than perhaps it should, or than Rybka
does. Crafty may in that case be misjudging Spassky's moves, and
his positions, but not in a way that makes his play seem bad;
whereas Rybka may be seeing and "understanding" more, but be
penalising Spassky much more for any discrepancies [which may or
may not be "real"].

It's not easy. We [someone!] should run the experiment
before jumping to conclusions. This may be a computer-chess
version of the fact that it is not always the best practitioners
who make the best teachers [or examiners].


That is possible, although an engine that cannot detect important wins
and tactical lines is not a good choice, and hobbling it to 12ply, even
if it was the only way to do the experiment, makes matters even worse.
[...] I think it mostly has found the players with the lowest
blunder rate fairly convincingly.


Yep. That's why my overall view is that their results
are probably not too far out, despite the obvious problems with
the methodology. If you were asked to rank the WCs in order


That was my initial impression too until I started tormenting engines
with a few top level games to see how well Crafty 12ply fared. The
initial results are not good. OK I admit it is possible that the 4
games I picked are totally unrepresentative, but I think it more
likely that the same sorts of errors are present in almost every GM
game.

We could eliminate this possibility if a few more people would pick a
game and annotate it with their favourite engine hobbled to 12ply,
favourite engine 60s/move and Crafty12ply. I am not sure the resulting
games are exciting enough to post here - multiple annotations in PGN
look a real mess. But a summary of the outcome would be OK.

It is time to turn the question around slightly. Can anyone find a GM
level game where Crafty at 12ply avoids missing important winning
lines and obtains reasonable blundercheck agreement to within say 20cp
against any other top rated engine run for 60s/move? So far all the
games I have tested have shown serious discrepancies (50cp).
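
For anyone who wants to try this, the agreement check is simple to automate once both sets of per-move scores have been pulled out of the annotated PGN. A minimal Python sketch, assuming the scores are already available as two equal-length lists of centipawn values (the function name and the sample numbers are made up for illustration):

```python
from typing import List


def discrepancies(scores_a: List[int],
                  scores_b: List[int],
                  tolerance_cp: int = 20) -> List[int]:
    """Return the 1-based positions where two engines' scores differ by
    more than tolerance_cp centipawns."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same moves")
    return [
        i
        for i, (a, b) in enumerate(zip(scores_a, scores_b), start=1)
        if abs(a - b) > tolerance_cp
    ]


if __name__ == "__main__":
    crafty_12ply = [15, 30, -10, 120]   # hypothetical scores
    other_60s = [20, 35, 60, 110]       # hypothetical scores
    print(discrepancies(crafty_12ply, other_60s))
```

A game passes the test posed above if the returned list is empty at the 20cp tolerance.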

Regards,
Martin Brown

  #210  
Old May 14th 07, 12:05 PM posted to rec.games.chess.misc,rec.games.chess.computer
Martin Brown
external usenet poster
 
Posts: 666
Default Greatest chess players ever? Capa, Kramnik, Karpov, Kasparov, *in that order* (cuz 'puters don't lie!)

On May 11, 6:58 pm, (Dr A. N. Walker) wrote:
In article .com,


Threading got messed up. First copy of my reply to you was dropped on
the floor by Google and the repeat post has been incorrectly threaded
under David Kane above.

Regards,
Martin Brown




 





