May 15th 07, 08:00 PM, posted to rec.games.chess.misc, rec.games.chess.computer
Dr A. N. Walker
Greatest chess players ever? Capa, Kramnik, Karpov, Kasparov, *in that order* (cuz 'puters don't lie!)

In article .com,
Martin Brown wrote:
[...] The one that Phil
Innes challenged me with earlier in the thread being a canonical
example. The key move there Nh4 aiming for a weak spot on g6 where it
becomes a major thorn in black's side is beyond hope of Fritz ever
seeing it. Rybka & Shredder both find it quickly [...]


Yes, but I don't *need* Fritz to see it -- I need Fritz
to confirm to me that Nh4 isn't getting trapped by ... g5 and/or
that it wasn't doing something important relative to d4/e5 [not,
of course, difficult in this particular position, but arguing
much more generally]. That's why I think we are somewhat at
cross purposes. You see Rybka/whatever as a strong player going
through your game and pointing out the best moves; I see Fritz
as a slightly annoying spectator saying [Harry Enfield voice:]
"You can't do that, you've just dropped a piece" [/HE] as I look
at the things I or my opponent might have done.

If you run an engine-vs-engine match with the stronger engine penalised
on time to give Crafty a chance, you can watch as the game unfolds.
Both sides claim to be winning for a while until one gets a deep
tactical edge over the other.


Sure. But you surely aren't claiming that Crafty is
so stupid that it thinks doubled pawns are good and centralised
pieces are bad? So I'm guessing that when both sides think they
are winning [by something significant, not by 20cp or so], one
of them has overlooked something of tactical importance, which
is why, after a bit, it turns into a tactical win.

I did one last night which illustrates
my point - annotated here by the victor at 60s/move.
[Event "AOI, Blitz:4'+2""] [...]


OK, so around 8-second chess, and an amusing crunch.
It looks as though Crafty-sans-book has not the foggiest idea
about developing and getting castled, and is also tactically
unaware. But not relevant to the present debate!

[There are some interesting discrepancies between
evaluations on successive ply in the annotations, but these
too are not that relevant, unless we find that Shredder is
much more or less prone to these than other engines.]

I reckon the rms noise on most lines is always around 10cp no matter
how deep you go. A few quiet lines may have smaller rms errors, but
the active ones tend to bounce around a bit.


[In which case Capa/Kramnik's 10cp difference per move is
startlingly good ....]

But that is still enough
to have some confidence in finding gross evaluation errors of 50cp or
more (which is what Crafty at 12 ply does).
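
[Editorial sketch, not part of the original exchange: one possible reading of "rms noise" is the root-mean-square change in an engine's evaluation of the same line between successive search depths. The sample evaluations below are hypothetical, and the 50cp figure is simply the "gross error" size quoted above.]

```python
import math

def rms_fluctuation(evals_cp):
    """RMS change in evaluation (centipawns) between successive depths.

    This definition is an assumption made for illustration; the post
    does not say how the rms figure was measured.
    """
    diffs = [b - a for a, b in zip(evals_cp, evals_cp[1:])]
    if not diffs:
        raise ValueError("need at least two evaluations")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical evaluations of one line at six successive depths.
line_evals = [30, 42, 25, 38, 28, 40]
noise = rms_fluctuation(line_evals)

# How many "noise units" a 50cp discrepancy would represent.
gross_error_cp = 50
ratio = gross_error_cp / noise
print(f"rms noise = {noise:.1f}cp, 50cp is {ratio:.1f}x the noise")
```

On this reading, a discrepancy of 50cp stands several times clear of the noise, which is the sense in which it could be detected with "some confidence".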


Yes, but you still seem to be missing something. 100cp is a
pawn, and you can understand that very directly. 50cp is what? It
will matter if at some point we swap a 50cp advantage for a pawn-up
with 50cp compensation, but until then it's an arbitrary measure.
And even after that, it matters only if the implied equation "it's
worth giving up the two bishops in order to win a doubled pawn" [or
whatever] is so wrong that [eg] a won position is now drawn. GMs
don't normally talk in those terms, nor about a 50cp advantage, but
in terms of concrete material, specific positional pros and cons,
and plans in a specific position.

[...] Further, it's interesting that
the strongest and best WCs, by reasonably common consent, are
those whose judgement differs least from that of Crafty.

But they may well differ even less from the output of a stronger
engine.


Possibly. But if G&B's table 3 is showing anything at all
objective about Crafty, it is that Crafty12 plays "rather like" all
the WCs except perhaps Steinitz, and much more like Capablanca and
Kramnik than like other WCs. If Crafty12 is so rotten, it's been
amazingly lucky. After all, in the game you showed above, Crafty10
[assuming that's roughly what it was managing in 8s] deviates by
around 35cp/move from Shredder by G&B rules. So either there's a
*huge* improvement between C10 and C12 [and C12 would agree almost
exactly with Shredder] or else C12 is not only strong enough to
assess WC play, but is actually closer than you might expect to
emulating it.
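
[Editorial sketch, not part of the original exchange: a "cp/move" deviation of the kind discussed here can be computed as the average gap between the engine's evaluation of its own preferred move and its evaluation of the move actually played. The function name, the sample numbers and the 200cp cutoff for ignoring already-decided positions are assumptions for illustration; the exact G&B rules are not stated in this post.]

```python
def mean_cp_loss(pairs, cutoff_cp=200):
    """Average centipawn loss per move.

    pairs: (best_eval, played_eval) tuples, both in centipawns from the
    point of view of the side to move, as scored by the reference engine.
    cutoff_cp: positions where both evaluations lie beyond this value
    are skipped as already decided (an assumed rule, see the lead-in).
    """
    losses = []
    for best, played in pairs:
        if abs(best) > cutoff_cp and abs(played) > cutoff_cp:
            continue  # clearly won or lost either way, so not counted
        losses.append(max(0, best - played))
    if not losses:
        raise ValueError("no positions left after applying the cutoff")
    return sum(losses) / len(losses)

# Hypothetical evaluations for five moves by one player.
sample = [(20, 20), (35, -15), (10, 0), (320, 250), (-40, -85)]
average = mean_cp_loss(sample)
print(f"{average:.2f}cp/move")
```

A player who always chose the reference engine's first choice would score 0cp/move on this measure, whatever the absolute evaluations were; only the gaps between moves count.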

If the GM makes the move Crafty expects it doesn't matter how wrong
the evaluation is.


Yes, but this doesn't matter *anyway* unless it results in
scoring moves in the wrong order -- and in that case, the GM should
*not* be playing Crafty's move. You can't have it all ways!

I think it is worth trying to agree a test protocol that could be used
to produce say 100 top level games consistently annotated by multiple
engines. Then we might be able to get some half decent stats. Hunches
really don't cut it.


Absolutely. But I don't think the stats will mean what you
seem to think they mean.

--
Andy Walker, School of MathSci., Univ. of Nott'm, UK.
