by Bitweasil » Mon Mar 15, 2010 8:36 pm
Current progress:
I've got single-GPU generation code for MD5, lengths 6-10, working. NTLM is a quick patch, no big deal there. Just a different reduction function.
The tables now have an 8 KB header containing, among other things:
- Hash type
- Password length
- Character set
- Chain length
- Number of chains
- Table index
- Comments
This means the filenames no longer carry important metadata - all the metadata is stored in the file itself. You give the cracking tool a hash and a table file to work with, and it does the rest for you. Obviously the cracking tool can't distinguish between an MD5 hash and an NTLM hash, but the rest of the details are stored in the table. Eventually, I'll probably build a file-based system that can take a list of hashes and a list of files, but I'll work on that later.
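To make the header idea concrete, here is a minimal sketch of packing and unpacking a fixed 8 KB header holding the fields listed above. The byte layout, the field widths, the `RTABLE01` magic value, and the function names are all assumptions for illustration; the post does not specify the actual on-disk format.

```python
import struct

HEADER_SIZE = 8192  # 8 KB, as described in the post

# Hypothetical layout (not the real format): magic, hash type, password
# length, chain length, number of chains, table index, charset, comments.
HEADER_FORMAT = "<8s16sIIQI256s1024s"
MAGIC = b"RTABLE01"  # made-up marker for this sketch


def pack_header(hash_type, password_length, charset, chain_length,
                num_chains, table_index, comments=""):
    """Serialize the metadata and zero-pad it to exactly HEADER_SIZE bytes."""
    body = struct.pack(
        HEADER_FORMAT,
        MAGIC,
        hash_type.encode("ascii"),
        password_length,
        chain_length,
        num_chains,
        table_index,
        charset.encode("ascii"),
        comments.encode("utf-8"),
    )
    return body.ljust(HEADER_SIZE, b"\x00")


def unpack_header(raw):
    """Parse the first HEADER_SIZE bytes of a table file into a dict."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("file too short to contain a header")
    size = struct.calcsize(HEADER_FORMAT)
    (magic, hash_type, password_length, chain_length,
     num_chains, table_index, charset, comments) = struct.unpack(
        HEADER_FORMAT, raw[:size])
    if magic != MAGIC:
        raise ValueError("not a table file")
    return {
        "hash_type": hash_type.rstrip(b"\x00").decode("ascii"),
        "password_length": password_length,
        "charset": charset.rstrip(b"\x00").decode("ascii"),
        "chain_length": chain_length,
        "num_chains": num_chains,
        "table_index": table_index,
        "comments": comments.rstrip(b"\x00").decode("utf-8"),
    }
```

With something like this, a cracking tool only needs to read the first 8192 bytes of a table to learn everything except which hash algorithm the user's input hash was produced with.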
I also have table verification code finished - this is a CPU implementation of the algorithm that's useful for table checking. You pass in a table and a stride N, and it will check every Nth chain. The idea here is to catch tables that are catastrophically wrong (also, to catch bugs in my implementation - that's the primary purpose).
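A rough sketch of what stride-based verification could look like on the CPU: regenerate every Nth chain from its start point and compare the result with the stored endpoint. The reduction function here is a simple placeholder, not the one the GPU code uses, and the in-memory `(start, end)` list stands in for the real table file; both are assumptions.

```python
import hashlib


def reduce_hash(digest, step, charset, length):
    """Placeholder reduction: map a digest plus the step index to a password."""
    value = int.from_bytes(digest, "big") + step
    chars = []
    for _ in range(length):
        value, idx = divmod(value, len(charset))
        chars.append(charset[idx])
    return "".join(chars)


def run_chain(start, chain_length, charset, length):
    """Walk one chain: hash, reduce, repeat chain_length times."""
    password = start
    for step in range(chain_length):
        digest = hashlib.md5(password.encode("ascii")).digest()
        password = reduce_hash(digest, step, charset, length)
    return password


def verify_table(chains, stride, chain_length, charset, length):
    """Regenerate every stride-th chain and return the indices that mismatch."""
    bad = []
    for i in range(0, len(chains), stride):
        start, stored_end = chains[i]
        if run_chain(start, chain_length, charset, length) != stored_end:
            bad.append(i)
    return bad
```

A large stride only samples the table, so it catches tables that are broken across the board but can miss an isolated bad chain that falls between samples.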
I'm working on the candidate hash generation code (making it GUI friendly), and then I will be plugging in my existing table search code. The table searching will be 64-bit only - sorry, I /really/ like memory mapped files and can't memory map a 1TB file into a 32-bit address space without some serious work that I don't care to do. Chain regeneration/searching will probably also be CPU based, for now - it'll be slower than GPU based, but it's not doing much work, and it's a lot faster to implement.
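For illustration, a memory-mapped table search might look roughly like the sketch below. It assumes fixed-size records of (start point, endpoint) stored after the 8 KB header and sorted by endpoint, so a binary search works directly on the mapping. That record layout and the sort order are assumptions, not something the post states.

```python
import mmap

HEADER_SIZE = 8192  # skip the 8 KB metadata header


def find_endpoint(path, endpoint, password_length):
    """Binary-search a memory-mapped table for an endpoint.

    Assumes each record is start + end, both password_length bytes, with
    records sorted by endpoint. Returns the matching start point or None.
    """
    record_size = 2 * password_length
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as table:
        count = (len(table) - HEADER_SIZE) // record_size
        lo, hi = 0, count
        while lo < hi:
            mid = (lo + hi) // 2
            offset = HEADER_SIZE + mid * record_size
            end = table[offset + password_length:offset + record_size]
            if end < endpoint:
                lo = mid + 1
            else:
                hi = mid
        if lo < count:
            offset = HEADER_SIZE + lo * record_size
            if table[offset + password_length:offset + record_size] == endpoint:
                return table[offset:offset + password_length]
    return None
```

The OS pages in only the parts of the file the search touches, which is why the approach stays cheap on very large tables, but the whole file still has to fit in the process's address space. That is the constraint that rules out 32-bit builds for a 1TB table.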
Once that's done, I'll release a beta and let people hammer on it. It'll be single GPU only, but all the code should function properly, generate tables, and do hash cracking. I'll be happily taking bug reports here.
The next stage, once everything is working, is to learn me some multithreading and vector operations (pthreads/SSE), and then recode my stuff to be multithreaded. Realistically, if I do it properly, I think I can work in some network-enabled cracking here too - at least over a LAN. I'm currently brainstorming how to handle the work units to make this feasible, but if I can do it properly, it may be a sweet setup - able to use a variety of boxes on the network for all aspects of use. This would also mean that the GPU tables would be a bit more useful - throw 3-4 multi-core boxes on a LAN together and you'd have a CPU based setup that could still handle the GPU chain lengths.
Oh, and at some point I really should look into either OpenCL or code for ATI... OpenCL is an interesting option, but I'm not sure how fast it will be on CPUs/ATI cards. Might be worth playing with, though.