Cryptohaze.com

Posted: **Thu Mar 15, 2012 1:03 pm**

So I did some optimizations, and yours is about 6.5% faster on a compute capability 1.1 card (probably the same for all 1.x), 24% faster on a compute capability 2.1 card, and probably somewhere in between for compute capability 2.0.

md5_loweralpha-numeric#1-7

Code:
9800 GTX+ (128 cores, 1836 MHz, compute capability 1.1)
330 MLinks/sec 1.065x CryptoHaze (generation)
310 MLinks/sec 1.00x mine (generation)
240 MLinks/sec 0.77x rcrack's GPU version (pre-work 100k)
82 MLinks/sec 0.26x rcracki_mt 0.7b (pre-work 100k)

GTS 450 (192 cores, 1566 MHz, compute capability 2.1)
360 MLinks/sec 1.24x CryptoHaze (generation)
290 MLinks/sec 1.00x mine (generation)
200 MLinks/sec 0.69x rcrack's GPU version (pre-work 100k)
91 MLinks/sec 0.31x rcracki_mt 0.7b (pre-work 100k)

Posted: **Thu Mar 15, 2012 3:52 pm**

What algorithm are you using for your reduction function, since this is the main factor to consider?

//EDIT: And Atom has apparently gotten into a speed war with me after I proved I could outrun hashcat in multihash brute forcing on nVidia.

Posted: **Fri Mar 16, 2012 4:22 am**

This is the standard rcrack method. I guess it was more apparent in the context when I posted it on FRT because I had it next to CPU benchmarks.

This is a single 32 bit thread of a 2.5GHz Q9300.

It looks like the winner is "divcfl-3" for the CPU version:
Code:
10.24 MLinks/sec md5_loweralpha-numeric#1-6
9.35 MLinks/sec md5_alpha-space#1-9
9.87 MLinks/sec md5_loweralpha#1-10
9.54 MLinks/sec md5_loweralpha-numeric-space#1-8
9.56 MLinks/sec md5_loweralpha-numeric-space#1-9
9.70 MLinks/sec md5_loweralpha-numeric-symbol32-space#1-7
10.40 MLinks/sec md5_loweralpha-numeric-symbol32-space#1-8
9.32 MLinks/sec md5_loweralpha-space#1-9
10.37 MLinks/sec md5_mixalpha-numeric#1-8
10.28 MLinks/sec md5_mixalpha-numeric-all-space#1-7
10.69 MLinks/sec md5_mixalpha-numeric-all-space#1-8
9.89 MLinks/sec md5_mixalpha-numeric-space#1-7
10.38 MLinks/sec md5_mixalpha-numeric-space#1-8
8.95 MLinks/sec md5_numeric#1-12
9.03 MLinks/sec md5_numeric#1-14
8.97 MLinks/sec md5_hybrid3(omni6.txt)#0-0
8.70 MLinks/sec md5_hybrid3(omni7.txt)#0-0

Posted: **Fri Mar 16, 2012 4:27 am**

Damn. Using the multiply to divide trick?
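
For anyone reading along who hasn't seen the multiply-to-divide trick: division by a divisor that is fixed ahead of time can be replaced by a multiply with a precomputed reciprocal plus a shift. Below is a minimal sketch for 32 bit operands, assuming the divisor is at least 2. The helper names and the divisor 36 (the size of a loweralpha-numeric charset) are only illustrative; this is not taken from rcrack or any of the tools benchmarked here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names, for illustration only.

// Precompute the reciprocal m = floor((2^64 - 1) / d) + 1 for a fixed divisor d.
// d must be at least 2 (d == 1 would wrap the 64-bit value around to 0).
inline std::uint64_t make_magic(std::uint32_t d) {
    assert(d >= 2);
    return UINT64_MAX / d + 1;
}

// Return n / d using only multiplies, an add, and shifts.
// This is the high 64 bits of the 96-bit product magic * n,
// computed in two halves so no 128-bit type is needed.
inline std::uint32_t div_by_magic(std::uint32_t n, std::uint64_t magic) {
    const std::uint64_t hi = magic >> 32;          // upper 32 bits of magic
    const std::uint64_t lo = magic & 0xFFFFFFFFu;  // lower 32 bits of magic
    return static_cast<std::uint32_t>((hi * n + ((lo * n) >> 32)) >> 32);
}

// Return n % d, derived from the quotient.
inline std::uint32_t mod_by_magic(std::uint32_t n, std::uint32_t d,
                                  std::uint64_t magic) {
    return n - div_by_magic(n, magic) * d;
}
```

The reciprocal is computed once per divisor (for example once per charset), so the inner loop only pays for the multiplies.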

Posted: **Fri Mar 16, 2012 7:31 am**

Yes. I'm so glad I was too lazy to finish and properly test the fixed point multiply (FPM) reduction function I came up with. I basically couldn't decide whether to do a 32 bit or a 24 bit multiply. The difference in speed is probably negligible now, but FPM is less uniformly distributed (and has problems with sub key spaces that are small compared to the total key space, which is why I dropped password lengths 1-4).
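
To make the comparison concrete, here is a rough sketch of what a 32 bit fixed point multiply reduction could look like next to a plain modulo reduction. It assumes a single 32 bit hash word and a key space no larger than 2^32. The function names are made up for illustration, and this is not the unfinished function described above, just the general idea.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only.

// Fixed point multiply: treat hash_word / 2^32 as a fraction in [0, 1)
// and scale it by the key space size. Requires 1 <= keyspace <= 2^32.
inline std::uint64_t fpm_reduce(std::uint32_t hash_word, std::uint64_t keyspace) {
    assert(keyspace >= 1 && keyspace <= (std::uint64_t{1} << 32));
    return (static_cast<std::uint64_t>(hash_word) * keyspace) >> 32;
}

// Modulo reduction on the same inputs, for comparison.
inline std::uint64_t mod_reduce(std::uint32_t hash_word, std::uint64_t keyspace) {
    assert(keyspace >= 1);
    return hash_word % keyspace;
}
```

Both map the hash word into [0, keyspace). The multiply version uses the high bits of the hash and needs no division at all, while the modulo version depends on the low bits. The sketch does not try to reproduce the distribution problem with small sub key spaces.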

Posted: **Thu Apr 05, 2012 5:03 am**

ARTGen 0.1a

GTS 450:
273 MLinks/second for md5_mixalpha-numeric#1-9
282 MLinks/second for md5_loweralpha-numeric#1-7

Too lazy to swap out the GPU for the 9800 GTX+. Also, if you use a compute capability 1.x card you should recompile it without defining USE___fmul_rd, since that should be faster. On that note, does anyone know how to tell at compile time which compute capability a .cu file is being compiled for?
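
On the compile-time question: nvcc defines the `__CUDA_ARCH__` macro while it compiles device code, so a preprocessor check is one way to branch on the target. A minimal sketch follows. `SKETCH_USE_FMUL_RD` and `sketch_target_arch` are made-up names, and tying the switch to 2.x and newer is an assumption based on the recompile advice above, not something ARTGen is known to do.

```cpp
// nvcc defines __CUDA_ARCH__ only while compiling device code. Its value is
// major * 100 + minor * 10 (110 for compute capability 1.1, 210 for 2.1).
// In host code, or under a plain C++ compiler, the macro is undefined.

#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 200)
// Assumption: a build could opt in to the __fmul_rd path here for 2.x and
// newer targets instead of requiring a manual define.
#define SKETCH_USE_FMUL_RD 1
#else
#define SKETCH_USE_FMUL_RD 0
#endif

// Hypothetical helper: the architecture value seen by the current
// compilation pass, or 0 when no device target is being compiled.
constexpr int sketch_target_arch() {
#if defined(__CUDA_ARCH__)
    return __CUDA_ARCH__;
#else
    return 0;
#endif
}
```

Since a .cu file built for several targets is compiled once per target, the check is evaluated separately in each device pass.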

Posted: **Fri Apr 25, 2014 6:25 am**

Sc00bz wrote: ARTGen 0.1a

GTS 450:
273 MLinks/second for md5_mixalpha-numeric#1-9
282 MLinks/second for md5_loweralpha-numeric#1-7

Too lazy to swap out the GPU for the 9800 GTX+. Also, if you use a compute capability 1.x card you should recompile it without defining USE___fmul_rd, since that should be faster. On that note, does anyone know how to tell at compile time which compute capability a .cu file is being compiled for?

Your program is good, but it has bugs:

artgen rt MD5 div alpha-numeric#8-8 10 48000 1048576 0 .\ test 0 0
pause

Error: gpudivsb.cpp(468) : getLastCudaError() CUDA error : Failed kernel launch.
: (7) too many resources requested for launch.

If you can fix this, it will be a good RT table generator.

Speed War :)
