May 9th 07, 04:20 PM, posted to rec.games.chess.computer, rec.games.chess.analysis
Test Positions

I know the various Nunn test sets that exist, and most engines can now
handle them pretty well. There are a few old chestnuts that still
distress particular engines and are instantly handled by others, e.g.

6kr/5b1p/2p3pP/rpPp1pP1/pP1PpP2/P3P3/1K6/8 w - -

(Anti-computer puzzle composed by William Hartston, cited by Roger
Penrose in "Shadows of the Mind")
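
The FEN string above stops after the en passant field and leaves out the two move counters, which some tools may not accept. Here is a minimal Python sketch, assuming only the standard six-field FEN layout, that checks the board field and pads the missing counters with the usual defaults 0 and 1. The function name `normalise_fen` is illustrative and not part of any engine or GUI.

```python
PIECES = set("pnbrqkPNBRQK")


def normalise_fen(fen):
    """Check the board field of a FEN string and pad missing move counters.

    Accepts the 4-field short form (board, side, castling, en passant)
    or the full 6-field form. Raises ValueError on a malformed board.
    """
    fields = fen.split()
    if len(fields) not in (4, 6):
        raise ValueError("expected 4 or 6 FEN fields, got %d" % len(fields))

    ranks = fields[0].split("/")
    if len(ranks) != 8:
        raise ValueError("board must have 8 ranks")

    for rank in ranks:
        width = 0
        for ch in rank:
            if ch in "12345678":
                width += int(ch)      # a digit is a run of empty squares
            elif ch in PIECES:
                width += 1            # a letter is one piece
            else:
                raise ValueError("unexpected character %r" % ch)
        if width != 8:
            raise ValueError("rank %r does not describe 8 squares" % rank)

    if len(fields) == 4:
        fields += ["0", "1"]          # default halfmove clock and move number
    return " ".join(fields)
```

This only checks the shape of the string; it does not verify that the position is legal.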

The classic blocked pawn structure, with a rook ready to be taken only
if you want to lose. Very few chess engines can see this until they
reach around 23 ply, which takes 2-15 minutes. Shredder is the exception,
taking about 1s.

The measure of goodness is the time an engine takes to realise that
taking the rook loses.
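
One way to read that measure off an engine log is sketched below in Python. It assumes the engine's output has been captured as UCI "info" lines (the `depth`, `time`, `multipv` and `pv` tokens are part of the UCI protocol), and that the rook capture is written b4a5 in UCI coordinate notation. The function names are illustrative, and "realised" is taken here to mean that the top line stops starting with the capture and never returns to it in the rest of the log.

```python
def parse_info(line):
    """Pull depth, time (ms), multipv and the first PV move from a UCI info line.

    Returns None for lines that carry no principal variation.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "info" or tokens[1] == "string":
        return None
    if "pv" not in tokens:
        return None
    pv_index = tokens.index("pv")
    if pv_index + 1 >= len(tokens):
        return None
    info = {"first_move": tokens[pv_index + 1]}
    for key in ("depth", "time", "multipv"):
        if key in tokens[:pv_index]:
            info[key] = int(tokens[tokens.index(key) + 1])
    return info


def time_to_reject(info_lines, bad_move):
    """Return (depth, time_ms) at which the top line abandons bad_move for good.

    Returns None if the engine still prefers bad_move at the end of the log.
    """
    rejected_at = None
    for line in info_lines:
        info = parse_info(line)
        if info is None or info.get("multipv", 1) != 1:
            continue                  # only the best line counts
        if info["first_move"] == bad_move:
            rejected_at = None        # engine went back to the capture
        elif rejected_at is None:
            rejected_at = (info.get("depth"), info.get("time"))
    return rejected_at
```

Calling `time_to_reject(lines, "b4a5")` on a saved log gives a depth and elapsed time that can be compared between engines, provided they ran on similar hardware.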

There are also constructed positions intended as parts of endgame
studies or puzzles. I have found a class of these that distress the
Chessbase engine interface when submitted to infinite analysis with
multiple lines displayed (UCI engines like Rybka are not crippled to
quite the same extent). The following fairly innocent looking position
grinds to a standstill within 60s. Almost all the time after that is
spent looking at lines where mate in N has already been declared. Only
a few useful seconds are spent on the unresolved part of the problem;
the rest of the time is wasted. Chessbase say this is "normal" and not
a bug. I will let you try it and decide.

Display all the continuation lines and then watch what happens when
the search reaches ply 20.
It only runs at full speed for about 30s, and then it is like wading
through treacle.

k6K/P1p3rP/qQP3Pb/1PP3p1/8/8/8/8 w - - 0 1

High mobility N-queens type puzzle positions can also crash the
Chessbase interface software (and in a way that is sometimes useful if
you want to reset your games won/lost/drawn counters to zero). Perhaps
unsafe.

Incidentally, I would really like there to be a lock, so that when an
infinite analysis has already run for more than a few hours, clicking
on the moves graph or moves list shows an "are you sure?" message. It
is too easy to destroy some hard won analysis with a misplaced mouse
click off the edge of an overlapping window.

But I am now interested in finding new example positions from real
games where different engines and humans score variations radically
differently. This question arises out of the "crippled Crafty rates
world champions" thread.

New constructed positions are also OK, but the idea is to find a small
subset of positions that highlight the differences between various
engines to the maximum extent. If it comes from a game please give
details. I am keen to compare engine analysis of a position taken in
isolation against the analysis of the whole game tree.

Positions that highlight the virtues of your favourite engine are
especially welcome!

Standard test conditions: infinite analysis, top N lines displayed
(please specify N).
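
For anyone driving a UCI engine directly instead of through a GUI, these conditions correspond roughly to the command sequence sketched below in Python. `MultiPV` is the option name most UCI engines use for the number of lines shown, but an engine is not obliged to support it, so treat that as an assumption; the helper name `analysis_commands` is illustrative.

```python
def analysis_commands(fen, n_lines):
    """Build the UCI commands for infinite analysis showing the top n_lines.

    The caller still has to send these to the engine process and wait for
    'uciok' after 'uci' and 'readyok' after 'isready' before continuing.
    """
    if n_lines < 1:
        raise ValueError("n_lines must be at least 1")
    return [
        "uci",                                        # engine identifies itself
        "setoption name MultiPV value %d" % n_lines,  # top N lines (assumed supported)
        "isready",
        "ucinewgame",                                 # treat the position in isolation
        "position fen %s" % fen,
        "go infinite",                                # runs until 'stop' is sent
    ]
```

Sending `stop` ends the search, and the engine then reports its `bestmove`.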

We already have the line suggested by Phil Innes from Najer vs
Bezgodov 2003

rnbk1b1r/pp3ppp/8/2p1P3/2p5/2P2N2/P4PPP/R1B1KB1R b KQ - 0 10

Q: Which engines can see the continuation line (probably as 3rd or 4th
best):
10. ... h6 11. Bxc4 Be6 12. Bxe6 fxe6 13. Nh4

So far Rybka2.3.1 (13ply, 15s), Shredder10 (16 ply, 4m) and
Fruit2.2.1 (18ply, 30min) pass this test.

Later in the same game there is a tricky endgame position that exposes
some gaps in Rybka's evaluation function. It lacks discrimination in
this particular endgame when compared to Shredder10.

4r3/p5p1/1p4Pp/3KBP2/P1p3P1/3p4/2k5/7R b - - 0 35

One of 35. ... Re7, 35. ... d2, 35. ... c3 might save the day or
might not. None of the engines can see sufficiently deeply into this
position to get a handle on it. Rybka notably flatlines at ply 19 with
most lines having the same evaluation (either 2.45 or 3.71).

And a very effective position for sorting the wheat from the chaff was
suggested by HelpBot. It comes from the Kasparov-Anand Tal memorial
match at Riga 1995 [C51]. The key position after move 11 is

r1bqk2r/ppppbppp/2n5/3nP3/8/2P2NQ1/P3BPPP/RNB1K2R b KQkq - 0 11

At ply 16 Rybka favours 11. ... O-O (expecting 12. O-O in response); by
ply 18 it sees 12. Bh6 etc. in a slightly long-winded form. Other
engines get wildly different scores for the same moves and lines of
play. Hard to say that Rybka is exactly right, but if it agrees with
Shredder then the odd man out is usually wrong. Worth looking at the
top 10 lines in this position if you believe Rybka - other engines
like Fritz8 have *very* different ideas. The maximum difference is for
Rybka's top line at 4 mins elapsed, 11. ... O-O [rybka -0.20; fritz 0.75],
a massive 95cp!
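
The 95cp figure is just the gap between the two evaluations, converted from pawns to centipawns. A small Python sketch of that arithmetic follows, assuming both engines report the score from the same side's point of view; the function name `spread_cp` is illustrative.

```python
def spread_cp(scores_in_pawns):
    """Return the largest disagreement between engines, in centipawns.

    scores_in_pawns maps an engine name to its evaluation of one move,
    in pawns, all from the same side's point of view.
    """
    centipawns = [round(score * 100) for score in scores_in_pawns.values()]
    return max(centipawns) - min(centipawns)


# The two evaluations quoted above for 11. ... O-O
print(spread_cp({"rybka": -0.20, "fritz": 0.75}))  # prints 95
```

The same function takes any number of engines, so a third evaluation can be added to the dictionary to see whether it widens the gap.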

Fritz clearly doesn't like to risk losing the exchange by castling
into it; 11. ... g6 gets similar evaluations.

I am hoping that others will add to the thread with unusual positions
where engines give significantly different answers - and preferably
where there is a GM level game to provide a context.

Thanks for any enlightenment.

Regards,
Martin Brown
