hash table size
"Anders Thulin" wrote in message
...
Guest wrote:
Basically, the larger the better provided it all fits into physical
memory
with no chance of paging out to disk. It's not going to fit into the L2
cache anyway....
I don't quite agree: every hash table entry imposes some administrative
overhead that offsets the benefit of having a hash table. If that
overhead
gets too large, you lose.
The only overhead is:
HashIndex = Hash % TableSize;
That's constant regardless of table size.
And most tables are implented one of four ways.
1) Check one entry.
2) Check two entries.
3) Check one hash line, which may contain two or more entries.
4) One small table and one large table.
All four cases behave the same regardless of table size.
The only additional overhead would be an increase in the number of TLB
entries. (The cpu internal entries that convert addresses into physical
memory addresses.) But since there are already a lot of them and they
aren't going to be able to stay in the L2 cache, every hash check is nearly
guaranteed to be a full cost random memory access.
So unless your hash table is *very* small (like small enough to fit into L2
or only slightly larger), there is no extra overhead when increasing the
size, as long as you stay within physical memory.
Now, if you try to clear the hash, then a larger hash will take more time.
But that's a one time thing before each search, and most people don't
actually clear the hash. Instead other things are done so that old entries
are very unlikely to match for a new search.
Just where the cutoff is depends on how the hash table was implemented.
Testing can be used to detect the cut-off, but it will probably take a lot
of careful testing to discover.
So: ask the software writers. They should have an idea if there
is a sweet spot or not.
I am a writer. Not a major program writer, no. But I have written a few
chess programs over the years.
And this question has come up repeatedly over the years. And the answer is
always the same by the major chess program writers. (Hyatt, etc. etc.)
People who have done these kinds of tests and always get the same basic
answer.
As to whether there is a 'sweet spot', that would suggest the most benefit
for the amount of memory consumed.
Like 75% at 16m, 95% at 32m and 96% at 64m.
In that case, the 'sweet spot' would be 95% at 32m. The most 'bang for the
buck'.
Finding that 'sweet spot' would require testing. Which I did suggest he do.
But although going to 64m would cost you an extra 32m for only 1%
improvement would make it unatractive, it's still not a bad setting if you
have the extra memory to spare.
Except for the occasional random 'bad' position that might hurt from a
larger hash, the majority of time, for normal positions, more memory is
better.
|