Reply
 
LinkBack Thread Tools Display Modes
  #1   Report Post  
Old March 25th 04, 11:27 AM
Paul Rushmer
 
Posts: n/a
Default hash table collisions

Hi

I'm writing my own chess program, and I'm just trying to implement a
transposition table using a hash table. My Zobrist keys seem to be working
ok, and I've loaded up some opening books at program start and the positions
are recognised when played. But when I use the table with my a/b function, i
get lots of collisions, many more than i expected. Typically during a 4 ply
search of a midgame I'll get many 1000's of collisions. Any idea of a
typical collision frequency? I'm using a 64 bit key, and the C rand
function (with bit shifts) to generate the random numbers, and I can't find
duplicates in the zobrist keys at initialisation, but like i say, during
play lots of positions hash to a key that when used to access my table i
find the "key lock" does not match. (key lock i use is the hash index used
to store the position) . My table size is currently 4 Mbyte and I've tried
increasing this but it makes little difference, so I think i have a
fundamental problem. Any ideas greatly appreciated.

Many thanks - Paul Rushmer




  #2   Report Post  
Old March 25th 04, 11:56 AM
Mikko Nummelin
 
Posts: n/a
Default hash table collisions

On Thu, 25 Mar 2004, Paul Rushmer wrote:

Hi

I'm writing my own chess program, and I'm just trying to implement a
transposition table using a hash table. My Zobrist keys seem to be
working ok, and I've loaded up some opening books at program start and
the positions are recognised when played. But when I use the table with
my a/b function, i get lots of collisions, many more than i expected.
Typically during a 4 ply search of a midgame I'll get many 1000's of
collisions. Any idea of a typical collision frequency? I'm using a 64
bit key, and the C rand function (with bit shifts) to generate the
random numbers, and I can't find duplicates in the zobrist keys at
initialisation, but like i say, during play lots of positions hash to a
key that when used to access my table i find the "key lock" does not
match. (key lock i use is the hash index used to store the position) .
My table size is currently 4 Mbyte and I've tried increasing this but it
makes little difference, so I think i have a fundamental problem. Any
ideas greatly appreciated.


There is lots of things to check, first:

(1) Are the generated keys really 64 bits and all random stuff, i.e. not
32 bits and the rest of the key truncated.

(2) When comparing the keys, are you sure you are really comparing the
whole keys and not just, say 32 lowest bits. Some C-compilers can mess
this up really badly, so something like

int are_keys_equivalent(unsigned long long key1, unsigned long long key2)
{
unsigned long *key1p,*key2p;
key1p=(unsigned long *)(&key1);
key2p=(unsigned long *)(&key2);
return ((key1p[0]==key2p[0]) &&
(key1p[1]==key2p[1]));
}

might be a good idea to test for equivalence.

(3) When updating the positions, i.e. making moves and taking them back,
do the keys get updated correctly? It is important that there isn't a
single bug in sight in this part. An easy way to test this is to take a
moderate-sized game including castling, en passant and promotions, start
from a particular position in the opening, play the game move to move,
take the moves back and control that when taking back, the position keys
match the same values when playing forward. If there is a mismatch, you
know there is a bug in making/unmaking a particular type of move.

(4) What is the exact method you use to get a hash table index from
zobrist key? Debug all code which is affected by this, i.e. inserts and
lookups.


Mikko Nummelin
  #3   Report Post  
Old March 25th 04, 12:51 PM
David Richerby
 
Posts: n/a
Default hash table collisions

Mikko Nummelin wrote:
Paul Rushmer wrote:
I'm using a 64 bit key, and the C rand function (with bit shifts) to
generate the random numbers


On some systems, rand doesn't give very good randomness at all. The most
common implementation (the linear congruential generator) is guaranteed to
return alternating odd and even numbers, which would mean that bit 63 of
your Zobrist key would always be zero and bit 31 would always be one (or
vice versa) if you're just shifting two successive values of rand() into a
64-bit integer type.

Have a look at your C library documentation to see if there are better
random number generators available.


There is lots of things to check, first:

(1) Are the generated keys really 64 bits and all random stuff, i.e.
not 32 bits and the rest of the key truncated.


In particular, note that the following code is broken:

uint32_t i,j;
uint64_t k;

k = (i 32) | j;

The type of (i 32) is still uint32_t so its value is zero and the
effect of the above assignment is just k = j. You need

k = ((uint64_t)i 32) | j;

Always compile with all possible warnings turned on to give you the best
possible chance of trapping this kind of error. For bonus marks, pass
your code through a version of lint, too. (I don't use lint myself but
that doesn't stop it being good advice.)

Also, don't write code that assumes that an int is 32 bits wide and a long
long 64 bits. This will break badly when you come to recompile on a
64-bit processor in five years' time. typedef a 32-bit type and a 64-bit
type and use those for any variable whose width actually matters: that
way, you only have to change the typedef and not the code.


(2) When comparing the keys, are you sure you are really comparing the
whole keys and not just, say 32 lowest bits. Some C-compilers can mess
this up really badly, so something like

int are_keys_equivalent(unsigned long long key1, unsigned long long key2)
{
unsigned long *key1p,*key2p;
key1p=(unsigned long *)(&key1);
key2p=(unsigned long *)(&key2);
return ((key1p[0]==key2p[0]) && (key1p[1]==key2p[1]));
}

might be a good idea to test for equivalence.


!!!

It isn't. For a start, you're assuming that sizeof (long long) ==
2*sizeof (long), which is not justified by the standard.

If your C-compiler is only comparing the bottom 32 bits of a 64-bit type,
it's fundamentally broken and you should get a new one. Moral: don't
attempt to write code using 64-bit numbers on a system where int is only
32 bits unless you understand C's type system. Time spent in learning
about the type system (from, e.g., Harbison and Steele or K&R) will repay
itself many times over in the reduction in type promotion bugs.


Dave.

--
David Richerby Unholy Crystal Robot (TM): it's like
www.chiark.greenend.org.uk/~davidr/ a high-tech robot but it's completely
transparent and also a crime against
nature!
  #4   Report Post  
Old March 25th 04, 01:38 PM
Robert Hyatt
 
Posts: n/a
Default hash table collisions

Paul Rushmer wrote:
Hi


I'm writing my own chess program, and I'm just trying to implement a
transposition table using a hash table. My Zobrist keys seem to be working
ok, and I've loaded up some opening books at program start and the positions
are recognised when played. But when I use the table with my a/b function, i
get lots of collisions, many more than i expected. Typically during a 4 ply
search of a midgame I'll get many 1000's of collisions. Any idea of a
typical collision frequency? I'm using a 64 bit key, and the C rand
function (with bit shifts) to generate the random numbers, and I can't find
duplicates in the zobrist keys at initialisation, but like i say, during
play lots of positions hash to a key that when used to access my table i
find the "key lock" does not match. (key lock i use is the hash index used
to store the position) . My table size is currently 4 Mbyte and I've tried
increasing this but it makes little difference, so I think i have a
fundamental problem. Any ideas greatly appreciated.


Many thanks - Paul Rushmer



it sounds like you are defining "collision" incorrectly. Many different positions
will hash to the same table entry, but with different hash signatures. These are not
collisions...




--
Robert M. Hyatt, Ph.D. Computer and Information Sciences
University of Alabama at Birmingham
(205) 934-2213 136A Campbell Hall
(205) 934-5473 FAX Birmingham, AL 35294-1170
  #5   Report Post  
Old March 25th 04, 02:42 PM
David Richerby
 
Posts: n/a
Default hash table collisions

Robert Hyatt wrote:
it sounds like you are defining "collision" incorrectly. Many different
positions will hash to the same table entry, but with different hash
signatures. These are not collisions...


What is a collision, then? Of the first ten hits on google for `"hash
table" collision', seven define the term and all appear to be giving the
definition you dismiss. I'd refer to more authoritative sources but all
my textbooks on data structures are in the office and I'm not.


Dave.

--
David Richerby Accelerated Poetic Robot (TM): it's
www.chiark.greenend.org.uk/~davidr/ like a high-tech robot but it's in
verse and twice as fast!


  #6   Report Post  
Old March 25th 04, 03:10 PM
David Richerby
 
Posts: n/a
Default hash table collisions

David Richerby wrote:
Robert Hyatt wrote:
it sounds like you are defining "collision" incorrectly. Many different
positions will hash to the same table entry, but with different hash
signatures. These are not collisions...


What is a collision, then? Of the first ten hits on google for `"hash
table" collision', seven define the term and all appear to be giving the
definition you dismiss. I'd refer to more authoritative sources but all
my textbooks on data structures are in the office and I'm not.


On the other hand, my flatmate's copies are at home.

``The fly in the ointment of this beautiful idea is that two keys may
hash to the same slot -- a collision.''
-- Cormen, Leiserson and Rivest, _Introduction to Algorithms_, 1st
edition, p222.

``Ideally, different keys should map to different addresses, but no hash
function is perfect, and two or more figgerent keys may map to the
same table address. The second part of hashing is thus a collision-
resolution process which deals with such keys.''
-- Sedgewick, _Algorithms_, 2nd edition, p231.


Dave.

--
David Richerby Hilarious Electronic Wine (TM):
www.chiark.greenend.org.uk/~davidr/ it's like a vintage Beaujolais but
it uses electricity and it's a bundle
of laughs!
  #7   Report Post  
Old March 25th 04, 04:43 PM
Noah Roberts
 
Posts: n/a
Default hash table collisions

David Richerby wrote:
David Richerby wrote:

Robert Hyatt wrote:

it sounds like you are defining "collision" incorrectly. Many different
positions will hash to the same table entry, but with different hash
signatures. These are not collisions...


What is a collision, then? Of the first ten hits on google for `"hash
table" collision', seven define the term and all appear to be giving the
definition you dismiss. I'd refer to more authoritative sources but all
my textbooks on data structures are in the office and I'm not.



On the other hand, my flatmate's copies are at home.

``The fly in the ointment of this beautiful idea is that two keys may
hash to the same slot -- a collision.''
-- Cormen, Leiserson and Rivest, _Introduction to Algorithms_, 1st
edition, p222.

``Ideally, different keys should map to different addresses, but no hash
function is perfect, and two or more figgerent keys may map to the
same table address. The second part of hashing is thus a collision-
resolution process which deals with such keys.''
-- Sedgewick, _Algorithms_, 2nd edition, p231.


Dave.


They are talking about the hash table where "key" is some sort of data
type that gets munged into an index, such as string to int. This index
can be the same for different keys. This would compare in chess to the
board itself being the key and the zobrist hash being the munged index.
The fact is that two different boards can hash to the same zobrist
index and this is a collision.

--
"I'm a war president. I make decisions here in the Oval Office
in foreign policy matters with war on my mind." - Bush

  #8   Report Post  
Old March 25th 04, 09:16 PM
Robert Hyatt
 
Posts: n/a
Default hash table collisions

David Richerby wrote:
Robert Hyatt wrote:
it sounds like you are defining "collision" incorrectly. Many different
positions will hash to the same table entry, but with different hash
signatures. These are not collisions...


What is a collision, then?


A collision is what happens when two different "things" produce the same
signature. That isn't happening in your case. You hash to a (probably)
64 bit signature, then you probably take the right N bits as the table
address to probe. But I hope you are storing the signature (64 bits) and
using that to match before you consider the probe a "hit"...



Of the first ten hits on google for `"hash
table" collision', seven define the term and all appear to be giving the
definition you dismiss. I'd refer to more authoritative sources but all
my textbooks on data structures are in the office and I'm not.


What is happening is you are mixing terms. A hash collision is what happens
when you take some entity, like a SSN or name, and transform it to a smaller
entity like an N-bit table index. We don't do that in chess. We compute the
Zobrist signature to 64 bits. A chess position can't be compressed to 64 bits
so we stand to have multiple chess positions (perhaps 168 bits is the most
compatc position storage known) that produce the same Zobrist signature.

_those_ are collisions.

Taking some sub-set of the Zobrist signature and using it to index a table
won't produce false matches if you use the original Zobrist signature as a
match key... Multiple Zobrist signatures definitely lead to the same table
index, but we don't call those collisions since we don't get a false match
and we don't maje an error...



Dave.


--
David Richerby Accelerated Poetic Robot (TM): it's
www.chiark.greenend.org.uk/~davidr/ like a high-tech robot but it's in
verse and twice as fast!


--
Robert M. Hyatt, Ph.D. Computer and Information Sciences
University of Alabama at Birmingham
(205) 934-2213 136A Campbell Hall
(205) 934-5473 FAX Birmingham, AL 35294-1170
  #9   Report Post  
Old March 25th 04, 09:18 PM
Robert Hyatt
 
Posts: n/a
Default hash table collisions

David Richerby wrote:
David Richerby wrote:
Robert Hyatt wrote:
it sounds like you are defining "collision" incorrectly. Many different
positions will hash to the same table entry, but with different hash
signatures. These are not collisions...


What is a collision, then? Of the first ten hits on google for `"hash
table" collision', seven define the term and all appear to be giving the
definition you dismiss. I'd refer to more authoritative sources but all
my textbooks on data structures are in the office and I'm not.


On the other hand, my flatmate's copies are at home.


``The fly in the ointment of this beautiful idea is that two keys may
hash to the same slot -- a collision.''
-- Cormen, Leiserson and Rivest, _Introduction to Algorithms_, 1st
edition, p222.


``Ideally, different keys should map to different addresses, but no hash
function is perfect, and two or more figgerent keys may map to the
same table address. The second part of hashing is thus a collision-
resolution process which deals with such keys.''
-- Sedgewick, _Algorithms_, 2nd edition, p231.


We are doing two-level hashing. 168 bit position - 64 bit Zobrist
signature, then 64 bit Zobrist signatire - N bit table index.

The only collisions that are significant are 168-64 producing errors. The
"second kind" is 100% harmless...



Dave.


--
David Richerby Hilarious Electronic Wine (TM):
www.chiark.greenend.org.uk/~davidr/ it's like a vintage Beaujolais but
it uses electricity and it's a bundle
of laughs!


--
Robert M. Hyatt, Ph.D. Computer and Information Sciences
University of Alabama at Birmingham
(205) 934-2213 136A Campbell Hall
(205) 934-5473 FAX Birmingham, AL 35294-1170
  #10   Report Post  
Old March 26th 04, 12:25 AM
Paul Rushmer
 
Posts: n/a
Default hash table collisions

Thanks to the group for all the suggestions, I'll definitely examine more
closely my key
values for true randomness and do a step by step play & take back, checking
keys return to same value at each step. I give below the function I use to
gen. a 64 bit random number. This was taken from Bruce Morland's article on
position comparison which I found very helpful.

Paul R

typedef unsigned _int64 hashkey;

hashkey gen64bitRandNum(void)
{
rand()^((hashkey)rand()15)^((hashkey)rand()30) ^
((hashkey)rand()45)^
((hashkey)rand()60);
}

"Paul Rushmer" wrote in message
...
Hi

I'm writing my own chess program, and I'm just trying to implement a
transposition table using a hash table. My Zobrist keys seem to be working
ok, and I've loaded up some opening books at program start and the

positions
are recognised when played. But when I use the table with my a/b function,

i
get lots of collisions, many more than i expected. Typically during a 4

ply
search of a midgame I'll get many 1000's of collisions. Any idea of a
typical collision frequency? I'm using a 64 bit key, and the C rand
function (with bit shifts) to generate the random numbers, and I can't

find
duplicates in the zobrist keys at initialisation, but like i say, during
play lots of positions hash to a key that when used to access my table i
find the "key lock" does not match. (key lock i use is the hash index used
to store the position) . My table size is currently 4 Mbyte and I've tried
increasing this but it makes little difference, so I think i have a
fundamental problem. Any ideas greatly appreciated.

Many thanks - Paul Rushmer






Reply
Thread Tools
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Table Top Chess Computers: Recommendations? Isidor Gunsberg rec.games.chess.analysis (Chess Analysis) 3 July 11th 04 04:14 AM
Rebel 12 for DOS - New Version Simon rec.games.chess.computer (Computer Chess) 9 January 20th 04 10:56 PM
Chessbase engines loses so easy! kostas_1966 rec.games.chess.computer (Computer Chess) 14 November 5th 03 11:57 PM
Large Chess Variants and Type 1 hash table collisions John Weathers rec.games.chess.computer (Computer Chess) 1 August 26th 03 08:52 PM
Hash Table and Quiescence Search Jih-tung Pai rec.games.chess.computer (Computer Chess) 4 August 25th 03 04:59 PM


All times are GMT +1. The time now is 02:09 AM.

Powered by vBulletin® Copyright ©2000 - 2019, Jelsoft Enterprises Ltd.
Copyright 2004-2019 ChessBanter.
The comments are property of their posters.
 

About Us

"It's about Chess"

 

Copyright © 2017