hash table size
In article , Guest wrote:
(a) Even in a well-written engine, you may want/need to "clear
out" the table from time to time. Initialising several terabytes
may take a little while even on a modern machine ....
Actually, current programs don't do that. They leave the hash alone. They
don't even initialize it when the program is first run.
I don't have access to the source code of the commercial
programs, so you seem to have an advantage in this respect. But:
Either they actually use the data from the previous search, or they
invalidate it at the time of access...
There are several ways to invalidate at access... [...]
OK. But they all involve trade-offs, inc extra computation
and/or extra data stored in the table; which way the trade-offs
balance would seem to depend on depth of search, expected number
of reads vs writes, relative speed of memory, etc., and it would
be at least a slight surprise if they balanced the same way for
all engines on all hardware for all time.
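To make the trade-off concrete, here is a minimal sketch of one invalidate-at-access scheme, a per-search "generation" stamp. It is not taken from any particular engine; the names (tt_entry, tt_new_search, tt_probe, tt_store), the field widths and the tiny table size are all assumptions chosen for illustration. The cost is one extra byte per entry and one extra comparison per probe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout; field names and widths are illustrative only. */
typedef struct {
    uint64_t key;        /* full position signature, checked on probe */
    int16_t  score;
    uint8_t  depth;
    uint8_t  generation; /* which search wrote this entry */
} tt_entry;

#define TT_BITS 4u                     /* tiny table, for the sketch only */
#define TT_SIZE ((size_t)1 << TT_BITS)

static tt_entry tt[TT_SIZE];           /* static storage starts zeroed */
static uint8_t  tt_generation = 1;     /* 0 is reserved for "never written" */

/* "Clear" the table in O(1): entries from older searches become stale.
   With only 8 bits the counter wraps, so a very old entry can look fresh
   again after 255 searches; that is part of the trade-off. */
static void tt_new_search(void) {
    tt_generation++;
    if (tt_generation == 0)
        tt_generation = 1;
}

static void tt_store(uint64_t key, int score, int depth) {
    assert(depth >= 0 && depth <= 255);
    tt_entry *e = &tt[key & (TT_SIZE - 1)];
    e->key = key;
    e->score = (int16_t)score;
    e->depth = (uint8_t)depth;
    e->generation = tt_generation;
}

/* Returns 1 and fills *score on a fresh hit, 0 on a miss or a stale entry. */
static int tt_probe(uint64_t key, int *score) {
    const tt_entry *e = &tt[key & (TT_SIZE - 1)];
    if (e->generation != tt_generation || e->key != key)
        return 0;
    *score = e->score;
    return 1;
}
```

An engine that prefers to reuse data from the previous search would simply drop the generation test in tt_probe and perhaps use the stamp only to decide which entry to overwrite.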
(b) Some of the standard operations on a hash table involve
indexing and finding remainders modulo the table size. [...]
It is true that doing an AND is faster than a modulo. But the time
difference isn't really all that significant considering how long main
memory is going to take to access. [...]
OK, but DR's contention was that swapping was the *only*
reason not to have the table as large as possible. Presumably,
it is very likely that a table with [eg] 2^25 entries is better
than one with 2^25 + 1?
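For what it is worth, the two indexing methods under discussion can be sketched as follows. The helper names (index_mod, index_mask, floor_pow2) are hypothetical; the point is only that the AND form is valid when the number of entries is a power of two, while the modulo form works for any size at the cost of a division.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index for any table size: needs an integer division. */
static size_t index_mod(uint64_t key, size_t entries) {
    assert(entries > 0);
    return (size_t)(key % entries);
}

/* Index for a power-of-two table size: a single AND. */
static size_t index_mask(uint64_t key, size_t entries) {
    assert(entries != 0 && (entries & (entries - 1)) == 0);
    return (size_t)(key & (entries - 1));
}

/* Largest power of two not exceeding n (n > 0), e.g. to round a
   user-requested entry count down to a size usable with index_mask. */
static size_t floor_pow2(size_t n) {
    assert(n > 0);
    size_t p = 1;
    while (p <= n / 2)   /* equivalent to 2*p <= n, without overflow */
        p *= 2;
    return p;
}
```

For a power-of-two size the two functions return the same index, so the choice affects only speed, not which slot a position lands in. Whether the saved division is measurable next to a main-memory access is exactly the sort of thing a simple test would settle.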
And as for the 'ints' vs. 'longs'... Sorry. That was true in the days of 16
bit compilers, but these days on 32 bit systems, both int & long will be 32
bits. And on a 32 bit system, 32 bits will be enough to access all of
physical memory. [...]
I was recently using a system with 32-bit ints and 64-bit
longs and pointers; I didn't think it particularly unusual. But
perhaps it was.
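Since the widths of int, long and pointers differ between platforms, a cautious program can avoid assuming any of them. The sketch below, with hypothetical helper names (print_widths, table_bytes), reports the widths on the machine at hand and sizes the table with size_t, refusing a request that would overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Report the widths this compiler actually uses; nothing is assumed. */
static void print_widths(void) {
    printf("int: %zu bits, long: %zu bits, pointer: %zu bits\n",
           sizeof(int) * CHAR_BIT,
           sizeof(long) * CHAR_BIT,
           sizeof(void *) * CHAR_BIT);
}

/* Bytes needed for `entries` entries of `entry_size` bytes each.
   Returns 0 if the product does not fit in size_t, so the caller
   can reject the request instead of allocating a truncated table. */
static size_t table_bytes(size_t entries, size_t entry_size) {
    assert(entry_size > 0);
    if (entries > SIZE_MAX / entry_size)
        return 0;
    return entries * entry_size;
}
```

Using size_t for counts and indices, and a fixed-width type such as uint64_t for hash keys, keeps the same source correct whether long happens to be 32 or 64 bits.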
(c) If you manage to lock down all, or most, of the storage of
a machine, you may perhaps be unpopular, in a multi-user environment,
Well DUH.
However, on a multi-user environment you are likely to have memory
limitations imposed so you can't use more than your fair share. In fact,
you may only be given a couple meg of physical memory and all the rest of
your data will be paged in & out as needed. Definitely not a good thing for
large hash tables.
I don't think I've used a system like *that* since the days when
2M of physical memory was more than "elephantine"! ...
[...]
And if you are on a system without that restriction and you don't have
agreements etc. and you do it anyway and cause problems... (shrug) Tough.
That's your problem when the sysadmin catches you.
... And ever since those days I've *been* the sysadmin, at
least for the machine on my desk. But not really the point:
If you are running a chess program for a reason (such as a tournament), then
you know not to use it for other things.
OK.
If you intend to use it for other things, then just don't tell it to use all
your available memory for the hash. (Although your other programs will
reduce the cpu power available to the chess program, which may cost it a
full ply of search.)
Then we're back to contention with DR's claim that the larger
the better until swapping occurs. Personally, I've just acquired my
first "Vista" machine [no flowers, by request], with its annoying
habits of firing off all sorts of stuff to and from M$ without any
notice, suddenly deciding to re-arrange its discs, telling me at
awkward moments to insert such-and-such disc to do dumping, and so
on. But even my Unix box is liable to receive e-mail, or to start
"cron" jobs. None of these things seem likely to occupy that much
CPU, but they do load up some surprisingly large programs.
And if you do want to run some other program and a chess program with
massive hash tables at the same time.... (shrug) Get a second computer.
It's not like they are expensive. A second computer can be bought for $400
that's more than good enough for basic usage, web surfing, etc. For just a
few dollars more, you could get a low end laptop instead.
Perhaps that's a left- vs right-pondian viewpoint. The machines
you can get for that sort of price over here really aren't "good enough"
at all. If they were merely last year's low-end models reduced to clear,
that might be different, but we only seem to get a teensy discount for
that. Instead, the very cheap ones are totally inadequate. YMMV.
As it happens, Prof Kraft [the OP] really gave us no idea what
hardware or software he was asking about, if any [eg, perhaps he is
writing his own program, or perhaps he needs background info for an
academic paper] so all bets are off. Your initial advice ["Basically,
run a simple test..."] to him is sound, of course. My only concern
was that one should not be quite so confident that "bigger is better"
in all [non-swapping] cases.
--
Andy Walker
Nottingham