February 9th 10, 11:12 PM, posted to rec.games.chess.misc, rec.games.chess.computer
Blasts from the Past

On February 14, 2003 at 07:10:40, Rolf Tueschen wrote:

Just to explain some basics for new readers, I will show why the whole list is
worthless. The rankings, as presented, are the way they are by chance.

Since only a few here have basic knowledge of statistics, I will explain the
most apparent things.

We are told that, for instance, the first two programs are separated by 8 points.
Regardless, Stefan gets all the credit here for his first place. But is it true
that Shredder is stronger than Fritz?

Here I must tell you that we simply don't know. The SSDF pretends to know it,
but it is NOT true. How can I say such things? Easy! Look at the deviations,
the numbers marked + or -. We see that most programs have an expected Elo number
varying plus/minus about 30 points! Note that the Elo minus 5 is as probable
as the finally given Elo for the ranking!

If you then take a look at the Elo of the opponents at the far right, you can see
that even for the top programs the SSDF was unable to create equal conditions.
This influence of different opponents also makes the 8-point difference at the
top meaningless.

In sum we can say that the SSDF failed to show exactly what they pretend to
show: the differences between the actual top programs. The SSDF presents a new
leader, but that is against its own results! So the conclusion is allowed that
the SSDF deliberately makes its own new number 1!


Your comment that being number 1 in the list is not an absolute is completely
correct. The SSDF doesn't claim it is a statistical absolute either, which is
why they present the data: rating performance, number of games, AND the error
margin.


THE SSDF RATING LIST 2003-02-13    90961 games played by 251 computers

                                          Rating    +    -  Games  Won  Oppo
                                          ------  ---  ---  -----  ---  ----
1 Shredder 7.0 256MB Athlon 1200 MHz        2768   33  -31    547  72%  2606
2 Deep Fritz 7.0 256MB Athlon 1200 MHz      2760   29  -28    654  70%  2612
3 Fritz 7.0 256MB Athlon 1200 MHz           2740   30  -29    574  64%  2635
4 Chess Tiger 15.0 256MB Athlon 1200 MHz    2726   27  -26    704  64%  2623
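
To make the disagreement concrete, here is a minimal sketch that takes the four rows of the list above and checks whether the error intervals overlap, and how large each rating gap is relative to its combined uncertainty. It rests on two assumptions that are not stated in the list itself: that the + and - columns are roughly 95% margins, and that the ratings can be treated as independent estimates (they are not strictly independent, since the programs play each other). The names `ROWS`, `Z95`, `interval`, `intervals_overlap` and `z_score` are made up for this illustration.

```python
import math

# Rows copied from the list above: (name, rating, plus margin, minus margin).
ROWS = [
    ("Shredder 7.0", 2768, 33, 31),
    ("Deep Fritz 7.0", 2760, 29, 28),
    ("Fritz 7.0", 2740, 30, 29),
    ("Chess Tiger 15.0", 2726, 27, 26),
]

Z95 = 1.96  # ASSUMPTION: the +/- columns are roughly 95% margins


def interval(row):
    """Return (low, high) bounds of the rating using the listed margins."""
    _, rating, plus, minus = row
    return rating - minus, rating + plus


def intervals_overlap(a, b):
    """True if the two error intervals share at least one rating value."""
    lo_a, hi_a = interval(a)
    lo_b, hi_b = interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


def z_score(a, b):
    """Rating gap divided by the combined standard error.

    ASSUMPTION: the two ratings are independent, so their variances add.
    """
    se_a = (a[2] + a[3]) / 2 / Z95
    se_b = (b[2] + b[3]) / 2 / Z95
    return (a[1] - b[1]) / math.hypot(se_a, se_b)


if __name__ == "__main__":
    top = ROWS[0]
    for other in ROWS[1:]:
        print(
            f"{top[0]} vs {other[0]}: "
            f"overlap={intervals_overlap(top, other)}, "
            f"z={z_score(top, other):.2f}"
        )
```

Under these assumptions the 8-point gap between Shredder 7.0 and Deep Fritz 7.0 comes to well under half of one combined standard error, while the 42-point gap between Shredder 7.0 and Chess Tiger 15.0 comes to about two. That is consistent with both posts: the order of the top two is not settled by this data, but the gap to fourth place is much harder to put down to chance.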


If they present the error margin, doesn't this *clearly* mean that the result
may be off by that much? However, so far the current performance is 2768 SSDF
points. How many games does a human play to get their rating? I won't even
mention the ridiculously low requirement by FIDE to play only 9 games to get a
first rating. Suppose I had no rating and played 100 games against a 2000 Elo
player and I scored 75/100. My performance is 2200 exactly. Is it absolute? No,
there is a good margin of error, yet no one will question the rating and start
telling me I'm not rated 2200, I'm just rated anywhere between 2140 and 2260. I
see no difference. They had Shredder 7 play 547 games against other programs,
and presented the results PLUS the error margin. It *may* still be a fraction
weaker than Deep Fritz 7, but already it is clear that it performs better than
Chess Tiger 15 against other computers. But even if another 200 games changed
the top ratings to Shredder 7 = 2762 and DF7 = 2763, would anyone be so foolish
as to claim one program is actually any stronger?? I certainly would never think
of an opponent rated 10 points more as stronger. The fact that two such
different playing styles achieve almost identical performances shows how rich
and flexible chess is.
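
The 75/100 example can be worked through with a short sketch. It assumes the standard logistic Elo expectation formula (a score fraction p corresponds to a rating difference of 400 * log10(p / (1 - p))), which may differ by a few points from FIDE's published conversion table, and it treats every game as a plain win or loss, which overstates the spread when many games are drawn. The function names are made up for this illustration.

```python
import math


def performance_rating(opponent_elo, score_fraction):
    """Performance rating from the logistic Elo expectation formula."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score_fraction must be strictly between 0 and 1")
    odds = score_fraction / (1.0 - score_fraction)
    return opponent_elo + 400.0 * math.log10(odds)


def performance_interval(opponent_elo, points, games, z=1.96):
    """Rough interval for the performance rating.

    ASSUMPTION: each game is an independent win/loss trial (binomial),
    so draws are ignored and the interval comes out wider than it would
    with a realistic draw rate.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    # Clamp so the logarithm stays defined for extreme scores.
    lo_p = max(p - z * se, 1e-6)
    hi_p = min(p + z * se, 1.0 - 1e-6)
    return (
        performance_rating(opponent_elo, lo_p),
        performance_rating(opponent_elo, hi_p),
    )


if __name__ == "__main__":
    perf = performance_rating(2000, 0.75)
    lo, hi = performance_interval(2000, 75, 100)
    print(f"performance {perf:.0f}, rough 95% interval {lo:.0f} to {hi:.0f}")
```

With this formula the 75% score lands at about 2191 rather than exactly 2200, and the rough interval comes out somewhat wider than 2140 to 2260, partly because of the win/loss simplification. Either way the point of the example stands: even 100 games leave a margin of several dozen Elo points.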

Albert


(Please note that this is not a political speech; it is what statistics
demands. The SSDF has received this criticism so often in the past, but they
still didn't change their experimental setting.)

Rolf Tueschen
