Re: GTX 480 Core

Posted: Thu Oct 14, 2010 12:41 pm
by Rolf
Before the Fermi release, NVIDIA defined the SP count as SM (MCU) * 8.
When the sm_20 Fermi GPUs were released (GTX 465, 470, 480), it was changed to SM * 32.
But some guy at NV is a smartass, so it was changed again to SM * 48 for sm_21 GPUs (GTX 460 and others).

SM (MCU) - Streaming Multiprocessor.
sm_xx - compute capability of GPU.

Since there is no GetSP() function, the application computes the SP count by getting the SM count and multiplying it by 8, which is wrong on Fermi.
Ignore that number and use GPU-Z to get the correct values.
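
The per-generation rule above can be sketched as a small lookup. This is only an illustration in Python, not the application's actual code: the function names are made up, and the table covers only the three cases named in this post.

```python
def cores_per_sm(major, minor):
    """SPs per SM for the compute capabilities mentioned in this post.

    Hypothetical helper; anything outside sm_1x, sm_20 and sm_21 is rejected.
    """
    if major == 1:
        return 8    # pre-Fermi (sm_1x)
    if (major, minor) == (2, 0):
        return 32   # sm_20 Fermi: GTX 465, 470, 480
    if (major, minor) == (2, 1):
        return 48   # sm_21 Fermi: GTX 460 and others
    raise ValueError(f"compute capability {major}.{minor} not covered here")


def sp_count(sm_count, major, minor):
    """SP count = SM count * SPs per SM for that compute capability."""
    return sm_count * cores_per_sm(major, minor)


def naive_sp_count(sm_count):
    """What the post says the application does: always multiply by 8."""
    return sm_count * 8


if __name__ == "__main__":
    # A GTX 480 has 15 SMs at compute capability 2.0.
    print(sp_count(15, 2, 0), naive_sp_count(15))
```

For a GTX 480 (15 SMs, sm_20) the lookup gives 480 SPs, while the multiply-by-8 shortcut reports 120, which is why the number shown by the application is off on Fermi.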

Re: GTX 480 Core

Posted: Tue Oct 19, 2010 11:42 pm
by Bitweasil
Well, I have access to a 480 box to test on now. :)

Re: GTX 480 Core

Posted: Sat Oct 30, 2010 12:18 pm
by blazer
For optimal results on the GF100, I believe you can use -t 1024 and the usual -b shaders/8. It seems to be working wonders for me ;)

I'm gaining about 120 M more compared to -t 512.

These are the rough numbers using -t 1024 -b 56 for my GTX470
NT: 1.1 Billion
MD5: 850 M

Not too bad, I guess, Bitweasil.
Oh, I remember spotting a bug with the -u option. I can't remember exactly what it was, but it kept causing a crash; it had to do with the way the chars were arranged.

So I guess other readers can copy and paste the following to get optimal results:

GTX480 users
-t 1024 -b 60 -m 1000 -l

GTX470 users
-t 1024 -b 56 -m 1000 -l
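
The "-b shaders/8" rule of thumb behind these two lines can be written out as a short Python sketch. The helper names are hypothetical, and the shader counts (480 and 448) are inferred from the posted -b values of 60 and 56; the flags themselves are copied from the post, not generated from any documentation.

```python
def suggested_blocks(shader_count):
    """Block count from the 'shaders / 8' rule of thumb in the post."""
    return shader_count // 8


def suggested_flags(shader_count, threads=1024):
    """Build the flag string in the same form as the lines posted above."""
    return f"-t {threads} -b {suggested_blocks(shader_count)} -m 1000 -l"


if __name__ == "__main__":
    for name, shaders in (("GTX480", 480), ("GTX470", 448)):
        print(name, suggested_flags(shaders))
```

With 480 shaders this reproduces "-t 1024 -b 60 -m 1000 -l", and with 448 shaders "-t 1024 -b 56 -m 1000 -l".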

Re: GTX 480 Core

Posted: Sat Oct 30, 2010 5:34 pm
by Bitweasil
Ah, I'll have to try that when I get back to my 480s on Monday... :)

I'm working on getting a wiki & bug tracking system set up too.

Re: GTX 480 Core

Posted: Wed Nov 03, 2010 10:02 am
by Rolf
Did some benchmarks.

Code:
timethis "CUDA-Multiforcer.exe --blocks=44 --threads=1024 --min=7 --max=7 -h MD5 -c charsets\charsetlowernumeric -l -m 250 -f MD6.txt"

TimeThis : Command Line : test.bat
TimeThis : Start Time : Wed Nov 03 12:10:15 2010
TimeThis : End Time : Wed Nov 03 12:11:50 2010
TimeThis : Elapsed Time : 00:01:34.395



Code:
timethis "CUDA-Multiforcer.exe --blocks=44 --threads=512 --min=7 --max=7 -h MD5 -c charsets\charsetlowernumeric -l -m 250 -f MD6.txt"

TimeThis : Command Line : test.bat
TimeThis : Start Time : Wed Nov 03 12:15:46 2010
TimeThis : End Time : Wed Nov 03 12:17:19 2010
TimeThis : Elapsed Time : 00:01:33.397


Maybe the kernel runtime of 250 ms makes 1024 threads pointless: the two runs differ by only about one second.
In theory, sm_20 Fermi supports 1024 threads per block.
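
To put a number on the gap between the two runs, the TimeThis elapsed times can be compared with a few lines of Python. This is just a sketch for the arithmetic; the helper name is made up, and the two time strings are copied from the output above.

```python
def elapsed_seconds(text):
    """Convert a TimeThis 'HH:MM:SS.mmm' elapsed time to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


t_1024 = elapsed_seconds("00:01:34.395")  # --threads=1024 run
t_512 = elapsed_seconds("00:01:33.397")   # --threads=512 run

diff = t_1024 - t_512            # positive means the 1024-thread run was slower
percent = 100.0 * diff / t_512   # difference relative to the 512-thread run

print(f"1024 threads: {t_1024:.3f} s, 512 threads: {t_512:.3f} s")
print(f"difference: {diff:.3f} s ({percent:.2f}%)")
```

The 1024-thread run comes out roughly one second (about 1%) slower, which for a single run of each is probably within noise.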

Re: GTX 480 Core

Posted: Fri Nov 05, 2010 4:24 am
by blazer
Yeah, same here. I guess it was a false alarm. No idea what I was doing last time.