Author: Vincent Diepeveen
Date: 21:21:38 12/17/02
On December 17, 2002 at 16:45:37, Robert Hyatt wrote:

>On December 17, 2002 at 11:46:57, Vincent Diepeveen wrote:
>
>>On December 17, 2002 at 11:29:19, Robert Hyatt wrote:
>>
>>hello Bob,
>>
>>please do the same tests i did with DIEP too with crafty.
>>
>>Of course as you always say that doing a few tests proof nothing,
>>please repeat them twice.
>>
>>For me doing a test twice with crafty is sufficient.
>
>I generally say "your doing a few tests proves nothing." "Why?" you ask?
>Because you seem incapable of understanding simple ideas. Remember your
>nonsense about "on a dual Crafty runs _no_ faster at all here..."???
>So if I remain a bit suspicious of

It was not doing anything weird in that test. I was running analyze mode
in Crafty, and you have no asymmetric king safety. In fact I *always* did
all my tests with Crafty like that. I wonder why you even have different
forms of Crafty within the same program! Why not always turn it off or on?

Note I also have a dual K7. Well, I guess you gotta pay a price to get a
Wintel license... your choice.

>any number(s) you report, there is a reason for it...
>
>
>I ran 24 positions twice and reported the NPS for 1 thread, no SMT,
>two threads, no SMT, three threads, SMT on, and four threads, SMT on.
>
>What more can I run???

You could email those runs to diep@xs4all.nl. I'd love to calculate it
with an objective calculator.

Yours also probably didn't show a 12.7% speedup for the position below.

It's impossible for me to believe Crafty has an NPS speedup better than
20% from SMT (2 threads with SMT turned off versus 4 threads with SMT
turned on).

20% is the maximum that HT, as it is in the 2.8 Xeon, could give to a
decent program. No 100% as in Nalimov's dreams. No 30-50% either.

>
>>
>>I am especially interested in the completed logs too so that we all can
>>see what mainline you took to compare the speedup and absolute speeds
>>in nps.
>
>All I reported was NPS. I'm not going to post such a huge wad of output here.
>I will take one position from each of the four tests and give you those.
>I have no idea what you mean by "what mainline you took to compare the
>speedup and absolute speeds in NPS."
>I didn't take _any_ mainline. I didn't report _any_ speedup. I only
>reported the increase in raw NPS numbers. So the rest of your query simply
>makes no sense to me. This is not about parallel search efficiency. It is
>about whether SMT speeds things up or not, and the answer (so far) is
>clearly "yes it does."
>
>This is the last position from the 24 I ran. It is one of the Kopec
>positions but It doesn't say which one. The position is this (FEN):
>3rn2k/ppb2rpp/2ppqp2/5N2/2P1P3/1P5Q/PB3PPP/3RR1K1 w
>
>Run 1. one thread, no SMT:
>White(1): move
>  clearing hash tables
>  time surplus   0.00  time limit 166:39 (166:39)
> depth    time  score   variation (1)
>   1     0.00   0.81   1. Bd4
>   1->   0.00   0.81   1. Bd4
>   2     0.00   0.73   1. Bd4 Bb6
>   2->   0.00   0.73   1. Bd4 Bb6
>   3     0.00     --   1. Bd4
>   3     0.00   0.29   1. Bd4 g6 2. Nh6 Qxh3 3. gxh3
>   3     0.01   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   3     0.01   0.63   1. Qf3 g6 2. Nh6
>   3->   0.01   0.63   1. Qf3 g6 2. Nh6
>   4     0.01   0.53   1. Qf3 g6 2. Nh6 Re7
>   4     0.01   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   4->   0.04   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   5     0.04     ++   1. Nh6!!
>   5     0.05   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   5->   0.06   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   6     0.06   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4
>   6->   0.10   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4
>   7     0.11   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6
>   7->   0.16   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6
>   8     0.20   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5
>   8->   0.33   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5
>   9     0.37   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6
>   9->   0.54   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6
>  10     0.61   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5
>  10->   1.49   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5
>  11     1.62   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5
>  11->   3.93   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5
>  12     4.22   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1
>  12->   7.90   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1
>  13     8.84   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6
>  13->  25.57   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6
>  time=25.57  cpu=99%  mat=0  n=31129467  fh=94%  nps=1217k
>  ext-> chk=917605 cap=59244 pp=3275 1rep=119119 mate=34241
>  predicted=0  nodes=31129467  evals=3804978
>  endgame tablebase-> probes done=0  successful=0
>  SMP->  split=0  stop=0  data=0/64  cpu=25.50  elap=25.57
>
>Run 2: two threads, no SMT:
>
> depth    time  score   variation (1)
>starting thread 1
>   1     0.00   0.81   1. Bd4
>   1->   0.00   0.81   1. Bd4
>   2     0.00   0.73   1. Bd4 Bb6
>   2->   0.00   0.73   1. Bd4 Bb6
>   3     0.00     --   1. Bd4
>   3     0.00   0.29   1. Bd4 g6 2. Nh6 Qxh3 3. gxh3
>   3     0.00   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   3     0.01   0.63   1. Qf3 g6 2. Nh6
>   3->   0.07   0.63   1. Qf3 g6 2. Nh6
>   4     0.07   0.53   1. Qf3 g6 2. Nh6 Re7
>   4     0.07   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   4->   0.10   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   5     0.10     ++   1. Nh6!!
>   5     0.11   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   5->   0.14   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   6     0.15   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4
>   6->   0.17   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 (s=4)
>   7     0.18   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=3)
>   7->   0.35   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=6)
>   8     0.37   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=5)
>   8->   0.50   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=4)
>   9     0.57   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=3)
>   9->   0.73   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=5)
>  10     0.78   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=4)
>  10->   1.38   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=8)
>  11     1.49   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=7)
>  11->   2.98   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=10)
>  12     3.19   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=9)
>  12->   6.02   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=9)
>  13     6.70   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=8)
>  13->  18.42   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=12)
>  time=18.42  cpu=188%  mat=0  n=34809013  fh=94%  nps=1889k
>  ext-> chk=992883 cap=70671 pp=5213 1rep=135350 mate=27885
>  predicted=0  nodes=34809013  evals=4659923
>  endgame tablebase-> probes done=0  successful=0
>  SMP->  split=761  stop=38  data=7/64  cpu=34.74  elap=18.42
>
>Run three: three threads, SMT _on_:
>
>White(1): move
>  clearing hash tables
>  time surplus   0.00  time limit 166:39 (166:39)
> depth    time  score   variation (1)
>starting thread 1
>starting thread 2
>   1     0.00   0.81   1. Bd4
>   1->   0.01   0.81   1. Bd4
>   2     0.01   0.73   1. Bd4 Bb6
>   2->   0.01   0.73   1. Bd4 Bb6
>   3     0.01     --   1. Bd4
>   3     0.01   0.29   1. Bd4 g6 2. Nh6 Qxh3 3. gxh3
>   3     0.01   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   3     0.02   0.63   1. Qf3 g6 2. Nh6
>   3->   0.08   0.63   1. Qf3 g6 2. Nh6
>   4     0.08   0.53   1. Qf3 g6 2. Nh6 Re7
>   4     0.09   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   4->   0.10   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   5     0.11     ++   1. Nh6!!
>   5     0.11   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   5->   0.16   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   6     0.17   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4
>   6->   0.19   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 (s=4)
>   7     0.20   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=3)
>   7->   0.24   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=7)
>   8     0.26   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=6)
>   8->   0.36   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=4)
>   9     0.38   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=3)
>   9->   0.50   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=5)
>  10     0.54   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=4)
>  10->   1.07   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=8)
>  11     1.16   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=7)
>  11->   2.66   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=10)
>  12     2.86   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=9)
>  12->   5.40   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=8)
>  13     6.02   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=7)
>  13->  17.69   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=13)
>  time=17.69  cpu=296%  mat=0  n=38066840  fh=94%  nps=2151k
>  ext-> chk=1076447 cap=74646 pp=5437 1rep=149438 mate=35093
>  predicted=0  nodes=38066840  evals=5253700
>  endgame tablebase-> probes done=0  successful=0
>  SMP->  split=2994  stop=217  data=15/64  cpu=52.47  elap=17.69
>
>Run four: four threads, SMT _on_:
>
>White(1): move
>  clearing hash tables
>  time surplus   0.00  time limit 166:39 (166:39)
> depth    time  score   variation (1)
>starting thread 1
>starting thread 2
>starting thread 3
>   1     0.00   0.81   1. Bd4
>   1->   0.03   0.81   1. Bd4
>   2     0.06   0.73   1. Bd4 Bb6
>   2->   0.10   0.73   1. Bd4 Bb6
>   3     0.10     --   1. Bd4
>   3     0.11   0.29   1. Bd4 g6 2. Nh6 Qxh3 3. gxh3
>   3     0.11   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   3     0.11   0.63   1. Qf3 g6 2. Nh6
>   3->   0.14   0.63   1. Qf3 g6 2. Nh6
>   4     0.14   0.53   1. Qf3 g6 2. Nh6 Re7
>   4     0.14   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   4->   0.15   0.58   1. Nh6 Re7 2. Qxe6 Rxe6
>   5     0.16     ++   1. Nh6!!
>   5     0.19   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   5->   0.20   2.19   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>   6     0.20   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4
>   6->   0.28   2.40   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 (s=4)
>   7     0.35   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=3)
>   7->   0.43   2.32   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Bd4 Bb6 (s=7)
>   8     0.45   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=6)
>   8->   0.60   2.31   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 b5 5. cxb5 cxb5 (s=4)
>   9     0.65   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=3)
>   9->   0.76   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 Ke6 (s=5)
>  10     0.83   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=4)
>  10->   1.36   2.41   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Ba5 5. Rg1 b5 6. cxb5 cxb5 (s=8)
>  11     1.46   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=7)
>  11->   3.06   2.37   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Rg1 d5 7.
>                       cxd5 cxd5 (s=10)
>  12     3.27   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=9)
>  12->   5.66   2.46   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Rd7 5. Kf3 Re7 6. a4 Bb6 7.
>                       Rg1 (s=8)
>  13     6.34   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=7)
>  13->  16.09   2.34   1. Nh6 Qxh3 2. Nxf7+ Kg8 3. gxh3 Kxf7
>                       4. Kg2 Bb6 5. Kf3 Nc7 6. Ba3 Ke7 7.
>                       Bb4 Ne6 (s=13)
>  time=16.09  cpu=377%  mat=0  n=36609790  fh=94%  nps=2275k
>  ext-> chk=1035481 cap=72332 pp=4034 1rep=144147 mate=40133
>  predicted=0  nodes=36609790  evals=4935572
>  endgame tablebase-> probes done=0  successful=0
>  SMP->  split=5501  stop=394  data=16/64  cpu=1:00  elap=16.09
>
>
>
>
>>>>>parallel search overhead, you have a problem on _normal_ SMP machines as well.
>>>>
>>>>Indeed it is true that the first seconds the HT/SMT gives big problems
>>>>in speed. Only after a couple of minutes the speed shows. I see only
>>>>a speedup after a minute or 3 each position.
>>>
>>>So? That is _your_ program's results. Mine are just like they have always
>>>been. I get a reasonable speedup whether it is one second per move or one
>>>hour per move. No difference.
>>>
>>>
>>>>
>>>>I need to add however that i could improve a few issues in this version
>>>>which could get that down to 1 minute but like you i doubt whether the 11.4%
>>>>of HT is worth it.
>>>
>>>11.4% is _always_ worth it IMHO...
>>>
>>>
>>>>
>>>>I prefer a dual AMD instead for the moment!
>>>>
>>>
>>>
>>>Nothing wrong with that. I got the dual xeon because I wanted a chance to play
>>>with the SMT stuff since it is obviously going to be "the future" of
>>>microprocessor architecture...
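
For anyone who wants to redo the arithmetic on the four summary lines quoted above, here is a minimal Python sketch. It assumes nothing beyond the nps= values in those lines (1217k, 1889k, 2151k, 2275k); the names RUNS, gain_percent and report are made up for illustration and come from neither program. It covers this single position and raw NPS only, so it says nothing about time-to-depth speedup or about the other 23 positions.

```python
# NPS figures copied from the four "nps=" summary lines quoted above,
# in thousands of nodes per second. One position only.
RUNS = {
    "1 thread, SMT off": 1217,
    "2 threads, SMT off": 1889,
    "3 threads, SMT on": 2151,
    "4 threads, SMT on": 2275,
}


def gain_percent(base_nps, new_nps):
    """Relative raw-NPS increase of new_nps over base_nps, in percent."""
    return (new_nps / base_nps - 1.0) * 100.0


def report(runs, baseline):
    """Map each run name to its NPS gain over the chosen baseline run."""
    base = runs[baseline]
    return {name: round(gain_percent(base, nps), 1) for name, nps in runs.items()}


if __name__ == "__main__":
    # Baseline: both physical CPUs busy, SMT off (the comparison discussed above).
    for name, pct in report(RUNS, "2 threads, SMT off").items():
        print(f"{name:20s} {pct:+6.1f}%")
```

With two threads and SMT off as the baseline, the four-thread SMT run comes out at about +20.4% raw NPS for this position, and the three-thread run at about +13.9%.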
Last modified: Thu, 15 Apr 21 08:11:13 -0700