#1
I've been thinking about something and would appreciate the comments of others.

Most chess programs use an opening book, which is typically based on a set of GM games. My understanding is that a chess analysis program, whilst 'in book', will evaluate a position based on the frequency of occurrence of the moves in this list of GM games, since that is what was used to create the opening book. So if the opening move played in most grandmaster games is d4, then the chess program will give the highest score if white plays d4, rather than, say, a4, which would be a rare opening move for a grandmaster. If the most common reply by black is Nf6, and one plays that in a game, the chess engine will think that is black's best response. Is my understanding of this correct?

If one then uses a set of GM games as a 'reference database' in a database program (chessbase, scid etc.) to evaluate the best opening move, then if the same set of GM games is used, it will of course have d4 as the most common opening move. Then when you evaluate the position with a chess engine, it will give the highest score to white if he plays d4. Likewise the database will show Nf6 as black's most common (and so, one might infer, best) response.

This suggests to me that the results of computer analysis of positions get skewed if the ***same*** set of GM games is used both to create an opening book in a chess analysis program AND as a reference database in scid, chessbase or whatever. Clearly, with the first few opening moves it makes no difference if the list of GM games is reasonably long. But as you progress further into the game, the skewing of results might become significant. Statistics never was my strongest subject, so I'd hate to guess when this becomes significant, but I can't help feeling one develops a false sense of what is the best move if both the chess engine and the database program use the same set of GM games.

I'm using a set of about 27,000 GM games downloaded from the Crafty ftp site to create an opening book and also the same set in scid as a reference database. I can't help feeling that this is not very sensible. It seemed reasonably logical at first to use a set of high quality games for both the opening book and the reference database, but now I'm not so sure.

Comments?

Dr. David Kirkby.
|
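The worry in the post above has two parts that a small sketch may make concrete: statistics drawn twice from the same games agree by construction, and the number of games behind each statistic shrinks as the line gets deeper. The game list, the move sequences and the helper name `move_counts` below are made up for illustration; this is not how scid or any engine stores its data.

```python
from collections import Counter

# Hypothetical toy collection: each game is just a list of moves.
games = [
    ["d4", "Nf6", "c4", "e6"],
    ["d4", "Nf6", "c4", "g6"],
    ["d4", "d5", "c4", "e6"],
    ["e4", "c5", "Nf3", "d6"],
    ["d4", "Nf6", "Nf3", "e6"],
]

def move_counts(games, prefix):
    """Count which moves were played after the given move sequence."""
    n = len(prefix)
    return Counter(g[n] for g in games if g[:n] == prefix and len(g) > n)

# A "book" and a "reference database" built from the same games
# necessarily report the same frequencies, so their agreement
# is not independent evidence that a move is good.
book_view = move_counts(games, [])
database_view = move_counts(games, [])
assert book_view == database_view

# The sample behind each statistic thins out with depth.
for prefix in ([], ["d4"], ["d4", "Nf6"], ["d4", "Nf6", "c4"]):
    counts = move_counts(games, prefix)
    print(prefix, "games:", sum(counts.values()), dict(counts))
```

With a real collection of 27,000 games the early moves would rest on thousands of games each, while positions a dozen moves deep may rest on only a handful, which is where agreement between book and database says the least.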
#2
"Dr. David Kirkby" wrote in message ...

> If the most common reply by black is Nf6, and one plays that in a game, the chess engine will think that is black's best response. Is my understanding of this correct?

Not exactly. Frequency of appearance is just one criterion of move applicability in book. Here is an example, from Rookie 2.0, of how opening moves are weighted:

"If n moves are available to be played then the probability for a move m (leading to p) to get chosen is proportional to P(m) = (1-Wp) x Gp x (1-Rp). This way we favor moves with higher winning expectations and also we prefer to choose moves that will keep us in the book as long as possible."

Backing up for variable definitions:

Each position p carries a weight, Wp (0 <= Wp <= 1), that reflects the goodness of the position. Also, we know the number of games Gp where this position has occurred before. Finally, we have the average game result Rp from these games for the player to move. (For example, Rp is 0.75 if we have one won game and one drawn game for this position.)

Even more detailed is how the book-learning functions are applied once that move has been played in-game by the player. Not to mention the user-selectable weighting of frequency, game results and book learning, which governs how the engine will respond to the next occurrence of said position.

Opening book move selection is a complex process, more refined than mere frequency of appearance, and hopefully more successful in ultimate result.
|
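The weighting quoted in the post above can be sketched in a few lines. This is only an illustration of the formula as quoted, not Rookie's actual code: the `BookEntry` type, the function names and all numbers are invented, and the sketch assumes that "the player to move" in position p is the opponent of the side choosing move m, which is why low Wp and low Rp make a move more attractive.

```python
import random
from dataclasses import dataclass

@dataclass
class BookEntry:
    move: str          # candidate book move m
    weight: float      # Wp, 0 <= Wp <= 1, for the resulting position p
    games: int         # Gp, number of games that reached p
    avg_result: float  # Rp, average result for the player to move in p

def selection_weight(entry):
    """P(m) = (1 - Wp) x Gp x (1 - Rp), as quoted from Rookie 2.0."""
    return (1.0 - entry.weight) * entry.games * (1.0 - entry.avg_result)

def choose_book_move(entries, rng=random):
    """Pick a move with probability proportional to its selection weight."""
    weights = [selection_weight(e) for e in entries]
    if sum(weights) <= 0:
        return None  # nothing playable in book
    return rng.choices(entries, weights=weights, k=1)[0].move

# Hypothetical numbers, purely for illustration.
entries = [
    BookEntry("d4", weight=0.4, games=1000, avg_result=0.45),
    BookEntry("a4", weight=0.6, games=5, avg_result=0.60),
]
print([round(selection_weight(e), 2) for e in entries])
print(choose_book_move(entries, random.Random(0)))
```

Note that Gp still makes frequently played moves far more likely to be chosen, so frequency matters, but it is scaled by how well the move has done rather than used alone.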
#4
"Derek Wildstar" wrote in message news:KOtKb.140584$VB2.538049@attbi_s51...

> "Dr. David Kirkby" wrote in message ...
>
> > If the most common reply by black is Nf6, and one plays that in a game, the chess engine will think that is black's best response. Is my understanding of this correct?
>
> Not exactly. Frequency of appearance is just one criterion of move applicability in book. Here is an example, from Rookie 2.0, of how opening moves are weighted:
>
> "If n moves are available to be played then the probability for a move m (leading to p) to get chosen is proportional to P(m) = (1-Wp) x Gp x (1-Rp). This way we favor moves with higher winning expectations and also we prefer to choose moves that will keep us in the book as long as possible."

snip

I accept from your description that the chess engine will not play moves in a way directly proportional to their occurrence in the games in its database. You have clearly shown that, although you state that one factor will be the number of wins achieved by playing a particular move.

But will the chess engine evaluate the position on the board based on the data in that database, if the position is still in its book? I assume there is no random number generator called to score the position.

If so, then that still leaves the possibility that you erroneously believe the best move to play from Position A is move B, basing this on (a) the chess engine thinks it's the best move AND (b) it led to the most wins by GMs from that position. Based on (a) and (b), one might conclude incorrectly that move B is best, simply because the chess engine's opening book and your database are both based on an identical set of games.

Again, I could be completely off track here. It is certainly not something I know much about (as I've clearly shown), but the issue did get me thinking a bit.

Dr. David Kirkby
|
#5
"Dr. David Kirkby" wrote in message ...

> I accept from your description that the chess engine will not play moves in a way directly proportional to their occurrence in the games in its database. You have clearly shown that, although you state that one factor will be the number of wins achieved by playing a particular move.

So far so good, stipulating that frequency is just one factor in the decision-making process about the 'goodness' of the move in book.

> But will the chess engine evaluate the position on the board based on the data in that database, if the position is still in its book? I assume there is no random number generator called to score the position.

Based on the prior written description of Rookie 2.0's in-book methodology, yes, there is a form of evaluation of the position's 'goodness', which is not the exact same evaluation as for a position out of book. This evaluation is again just a single weighted factor in determining the probability of a book move being played, i.e. its frequency of occurrence. I suspect that in cases where a 50/50 result occurs, some randomness is to be expected. Of course, when this position is seen again it will no longer produce that same 50/50 evaluation, if the result of the last occurrence is now weighted as well.

> If so, then that still leaves the possibility that you erroneously believe the best move to play from Position A is move B, basing this on (a) the chess engine thinks it's the best move AND (b) it led to the most wins by GMs from that position. Based on (a) and (b), one might conclude incorrectly that move B is best, simply because the chess engine's opening book and your database are both based on an identical set of games.
>
> Again, I could be completely off track here. It is certainly not something I know much about (as I've clearly shown), but the issue did get me thinking a bit.

I think you have unrealistic expectations of opening books. And before you think I'm the last word on books, I'm certainly not, but hopefully any errors will be corrected before they give you any bad habits!

Are you assuming that a decided-upon book move is the best move in that position? This is the theoretical aim of pattern recognition, but the reality falls short of the mark, especially when you are under the impression that a population of GM games is the appropriate foundation for a book. Perhaps basing a book on a set of games is correct for opening practice, or specialized training, but as a general purpose aid to increasing the overall strength of a program, it's not correct.

A general purpose opening book should instead have a hand-picked set of positions that covers most, if not all, tournament variations; most, if not all, principal variations and main lines; and known traps, pitfalls and blunders, especially for lines that can cause the depth of search to stay depressingly shallow, keeping the correct line hidden. None of those conditions are met by including a list of games, however broad, especially by high-rated players. They simply do not go into the lines that amateurs do.

I think you know all this already, or are close to knowing this, and you are wondering what exactly that DB of 27K games is going to do for you... I do not think incorporating that list into a general purpose book will do you much good. However, not willing to discount it as a worthwhile experiment, may I offer a suggestion to put the issue to a test: incorporate the games into an existing book, and then run an engine-match tournament with four players:

1) General Purpose Book
2) General Purpose Book + GM Games
3) GM Games
4) No Book

See where I'm going with this...?
|
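One way to lay out the four-player test suggested above is a double round robin, so that every book meets every other book with both colours. The sketch below only generates the pairings; the function name is made up, and how the games are actually run (which engine, which interface, what time control) is left open.

```python
from itertools import permutations

# The four "players" are the same engine with different book setups.
books = [
    "General Purpose Book",
    "General Purpose Book + GM Games",
    "GM Games",
    "No Book",
]

def double_round_robin(players):
    """Every ordered pair once, so each pairing is played with both colours."""
    return list(permutations(players, 2))

schedule = double_round_robin(books)
for white, black in schedule:
    print(f"{white} (White) vs {black} (Black)")
```

Four players give twelve pairings. In practice each pairing would need to be repeated many times before the differences between books mean much.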
#6
On Wed, 07 Jan 2004 05:47:06 GMT, "Derek Wildstar" wrote:

> Perhaps basing a book on a set of games is correct for opening practice, or specialized training, but as a general purpose aid to increasing the overall strength of a program, it's not correct.

I wanted to say a bit more about the problems of using a game list to try to generate a perfect opening book:

1. As Steve Lopez (in the Chessbase Technical Notes) has pointed out, statistics (regarding the frequency with which a particular move has been played and that move's performance) can be deceptive. Say that in some opening line White has played 14.Nxf6 in 20 games with a 60% performance, and as a result Black may even have abandoned the line that led to the position where White plays 14.Nxf6. Then some enterprising GM deeply analyzes Black's subsequent play and finds a way for Black to get an advantage by force. Even if he wins the game, White's performance will only drop to 57% (12 points from 21 games instead of 12 from 20), falsely indicating that 14.Nxf6 is still a pretty good move.

2. One way of getting around the statistical problem is to use a feature of Bookup called backsolving. Essentially, backsolving starts from the end of all games in a collection and propagates the evaluation back up the move tree (using +- as the evaluation when White wins, and -+ as the evaluation when Black wins). So if you backsolved the 21 games with 14.Nxf6, the game where the GM forced an advantage for Black and won would override the other games and give 14.Nxf6 an evaluation of -+. Unfortunately, backsolving has certain limitations when it relies solely on game results:

* A player can lose on time in a won position.
* A player could play a three-fold repetition, allowing a draw in a won position.
* A player can reach a won position and, through a series of mistakes, throw away the win and even the draw, and eventually lose.

This can be accounted for by doing extensive analysis of games rather than relying on their results to get a good evaluation for a given position.

3. Lastly, even if the evaluations for all moves are completely accurate, they can result in a situation where the computer as White has an advantage (+/-) or even a won game (+-) when it leaves its opening book, but the program does not know how to play the subsequent positions. This is often true if the advantage is positional and requires a player to do long-term strategic planning in order to realize the win; this is exactly what most programs are notoriously bad at. I have seen humorous situations where a computer drops out of book with a position that a human being recognizes as completely won, and the computer's evaluation of the position is that its opponent has a winning advantage.

Because of the above, many commercial chess program vendors have an IM or even a GM on staff to help tweak the opening book. A few years ago the chess site for Rebel would include some notes about the tweaking done to the opening book for Rebel; I don't know if they still do this. In some of the man-machine matches of the last few years with Kasparov and other GMs, the reports on the matches would talk about how the opening book needed to be modified to stay away from closed positions where a GM's ability at strategic maneuvering would give him the advantage.

When a computer is playing humans below master level, the errors in the opening book aren't so critical. Although we do see people bragging about how they can beat a strong program time after time, which is only because they have found a new flaw in the program's opening book.

Mike Ogush
|
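Points 1 and 2 in the post above can be put side by side in a short sketch: the averaged score barely moves after the one decisive game, while a results-only backsolve flips the evaluation. This is a minimal minimax illustration of the idea as described, not Bookup's implementation; the `Node` type, the move labels and the split of the 20 earlier games into wins and draws are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    white_to_move: bool
    result: Optional[float] = None  # leaves only: White's score (1, 0.5 or 0)
    children: Dict[str, "Node"] = field(default_factory=dict)

def backsolve(node):
    """Propagate leaf results up the tree: each side picks its best child."""
    if not node.children:
        return node.result
    values = [backsolve(child) for child in node.children.values()]
    return max(values) if node.white_to_move else min(values)

# Position after 14.Nxf6, Black to move. Labels are placeholders.
after_nxf6 = Node(white_to_move=False, children={
    "old_reply": Node(white_to_move=True, children={
        "line_a": Node(white_to_move=False, result=1.0),  # White won
        "line_b": Node(white_to_move=False, result=0.5),  # draw
    }),
    "new_reply": Node(white_to_move=True, result=0.0),    # Black's new win
})

# Averaged statistics: 12 points from 20 games, then one extra loss.
score_before = 12 / 20
score_after = 12 / 21
print(f"average score: {score_before:.0%} -> {score_after:.0%}")

# Backsolved value from White's point of view: 0.0 corresponds to -+.
print("backsolved:", backsolve(after_nxf6))
```

The same sketch also shows the weakness listed in the post: if the leaf marked as Black's win had really been a loss on time in a won position, the backsolved value would be just as confidently wrong.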
#7
"Mike Ogush" wrote in message ...

[relevant snippage]

Thanks for those points Mike, I hope the OP appreciates the info. I know I do.

I've taken my own advice and, using the Chesspartner (5.3) interface, I've assembled a few personalities based on their bonus book downloads (thank you Lokasoft). So far I have:

* Classic Openings
* Gambits
* Grand Master Games
* LCHESS
* Modern Openings
* John Nunn's Picks (!)
* Sharp Openings

I'll set up CP's "Engine Research Tool" and, using Ruffian, sic them upon each other. I do not have any expectations of opening revelation, but I'm curious to see if things play out as expected: Gambit losing to GM, Classic losing to Nunn...

Even better, see which book(s) play best against people, and which play best against AI. I suspect the Modern book and Ruffian will have the most difficult time together: Ruff has a deep affinity for the principal variation, and freaky indirect openings frighten it into shallowness.
|
#8
Mike Ogush wrote in message ...

On 5 Jan 2004 21:12:37 -0800, Dr. David Kirkby wrote:

> If the most common reply by black is Nf6, and one plays that in a game, the chess engine will think that is black's best response. Is my understanding of this correct?

Many chess programs also factor in the performance of a move when selecting it. Some programs (Fritz for one) also have the ability to "learn" from the games they play and update the opening book accordingly.

> This suggests to me that the results of computer analysis of positions get skewed if the ***same*** set of GM games is used both to create an opening book in a chess analysis program AND as a reference database in scid, chessbase or whatever.
>
> I'm using a set of about 27,000 GM games downloaded from the Crafty ftp site to create an opening book and also the same set in scid as a reference database.

Yes, there is some skewing of a program's opening book based on the games or game fragments being used to compile the book. In some cases you may want the skewing:

* You are studying a particular opening and want to play a number of practice games in that opening. You can do this by setting up a position and starting the games from that position, but this can be complex if you are studying a set of positions/variations (such as the Sicilian Dragon) and still want the program to remain in book. It is often easier to define the opening book so the program will always stay within main book lines when using that book. [Aside: Steve Lopez wrote a series of articles on using chess software to aid in the learning of a new opening as part of his technical notes series.]

* You are studying the repertoire of a particular player (a possible future opponent) and you want the computer to only play the lines that player plays regularly.

Also, many people like to play lines that have some opening theory but aren't played much by GMs (the Morra and Blackmar-Diemer gambits come to mind). Limiting the opening book to just the games that GMs play means that the program will be out of book much earlier than published opening theory.

That being said, if you do want just GM games, there are a number of downloadable collections of games where both players have a minimum rating; I have seen this for 2500 and 2600 Elo at www.uni-klu.ac.at/~gossimit/c/curious.htm. Alternatively, you could download a large collection (such as ChessLib) and use a filtering program like pgn-extract to create your own collection of games with a minimum rating of your own choosing.

Hint: If you do try the filtering approach, be sure not to filter out games of players that show no rating but are known to be of GM strength (Capablanca, Alekhine, etc.).

Mike Ogush
|
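The filtering rule and the hint in the post above (keep games where both players meet a minimum rating, but do not lose unrated players of known GM strength) can be sketched as follows. This is not pgn-extract and does not use its options; it is a hand-rolled check on PGN tag pairs, and the function names, the sample game and the spelling of the player names in the allow-list are assumptions.

```python
import re

# Matches one PGN tag pair per line, e.g. [WhiteElo "2610"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$', re.MULTILINE)

def parse_tags(game_text):
    """Return the tag pairs of one PGN game as a dict."""
    return dict(TAG_RE.findall(game_text))

def rating(tags, key):
    """Rating as an int, or None when missing, empty, '?' or 0."""
    try:
        value = int(tags.get(key, ""))
    except ValueError:
        return None
    return value if value > 0 else None

def keep_game(tags, min_elo, known_strong):
    """Keep a game only if each player is rated high enough, or is unrated
    but on the allow-list of players known to be of GM strength."""
    for colour in ("White", "Black"):
        elo = rating(tags, colour + "Elo")
        if elo is not None and elo >= min_elo:
            continue
        if elo is None and tags.get(colour, "") in known_strong:
            continue
        return False
    return True

sample = """[White "Capablanca, Jose Raul"]
[Black "Alekhine, Alexander"]
[Result "1/2-1/2"]

1. d4 d5 1/2-1/2"""

allow = {"Capablanca, Jose Raul", "Alekhine, Alexander"}
print(keep_game(parse_tags(sample), 2500, allow))
```

Player names are spelled inconsistently across collections, so an allow-list like this would need to be matched against whatever spelling the chosen collection actually uses.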
#9
Dr. David Kirkby wrote:

> This suggests to me that the results of computer analysis of positions get skewed if the ***same*** set of GM games is used both to create an opening book in a chess analysis program AND as a reference database in scid, chessbase or whatever.

Most of the programs I've seen may choose opening moves based on frequency, but they don't typically use the book information in the search (other than perhaps to order moves, which would be a subtle form of influence indeed). So the analysis the computer returns is probably tactically sound unless the database person has tried to get clever.

Currently GNU Chess has a fairly crude method for using its book games, which picks the best-scoring moves, with a popularity cut-off (b4 has an excellent score but hasn't been played enough). This very crude algorithm occasionally throws a wobbly, but against myself it often leaves it with a winning advantage by the time it leaves book (sigh).

GNU Chess now thinks (on its opponent's time only) in the opening, but this is purely to prevent it from reaching a middlegame with no analysis, which once cost it a blitz game against a FIDE master: it made a really bad blunder because the position was so complex that it was unable to complete even a preliminary search in time.

Contrary to "folk wisdom", computers often do fairly well without an opening book, and I'm tempted to go to the complete opposite and just use opening books as a source of move-ordering information for the search algorithm. But I think the time penalty against players with carefully prepared books may be too large.
|
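The "best score with a popularity cut-off" selection described in the post above might look roughly like this. It is a sketch of the idea only, not GNU Chess's code; the function name, the statistics and the cut-off value are invented.

```python
def best_scoring_move(stats, min_games):
    """stats maps move -> (games played, points scored by the side moving).
    Moves played fewer than min_games times are ignored."""
    eligible = {
        move: points / games
        for move, (games, points) in stats.items()
        if games >= min_games
    }
    if not eligible:
        return None  # no move is popular enough; fall back to search
    return max(eligible, key=eligible.get)

# Hypothetical first-move statistics.
stats = {
    "e4": (400, 220.0),
    "d4": (380, 213.0),
    "b4": (3, 2.5),   # excellent score, but hardly ever played
}

print(best_scoring_move(stats, min_games=20))
print(best_scoring_move(stats, min_games=1))
```

With the cut-off in place the rarely played b4 is excluded despite its high score. Lowering the cut-off to 1 lets it win, which is the kind of small-sample accident the cut-off is there to prevent.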