Progress on the GPU accelerated RTs!

So...
I'm sure some of you have noticed there's been pretty much zero progress lately on this whole "GPU accelerated rainbow tables" thing.
I blame doing PHP for a year. That stuff rots the brain... all those typeless variables, SQL, ... ick.
And then a move & other chaos.
I was also frustrated by a lack of uniformity in my reduction function. It was fast, but non-ideal - it represented the first characters in the charset more frequently, because it wrapped a non-power-of-2 length charset into a power-of-2 lookup array.
The good news is that I've fixed this finally (actually a simple fix, just took a year for my brain to come up with it), and am working on this stuff again.
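To make the bias concrete, here is a minimal Python sketch (not the actual GPU code). The 36-character charset, the 64-entry table, and the function names are all made up for illustration. The "unbiased" half shows one common alternative, treating an index as a base-N number, and is not necessarily the fix described above.

```python
from collections import Counter

# Hypothetical charset: 36 symbols, which is not a power of 2.
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
TABLE_SIZE = 64  # power-of-2 lookup table, so indexing is a cheap bit mask

# Biased approach: fill the 64-entry table by wrapping the 36-char charset.
# Entries 36..63 repeat the first 28 characters.
WRAPPED = [CHARSET[i % len(CHARSET)] for i in range(TABLE_SIZE)]


def biased_pick(byte):
    """Map one byte of hash output to a character via the wrapped table."""
    return WRAPPED[byte & (TABLE_SIZE - 1)]


def index_to_password(index, length):
    """Treat `index` as a base-len(CHARSET) number and emit one char per digit.

    Uniform as long as `index` is uniform over [0, len(CHARSET) ** length).
    """
    chars = []
    for _ in range(length):
        index, digit = divmod(index, len(CHARSET))
        chars.append(CHARSET[digit])
    return "".join(chars)


# Feed every possible byte value through the biased mapping and count.
biased_counts = Counter(biased_pick(b) for b in range(256))

# Enumerate the whole 2-character space through the base-N mapping.
uniform_counts = Counter(index_to_password(i, 2)[0] for i in range(36 ** 2))
```

In the biased table the first 28 characters come up twice as often as the last 8. The base-N version is only as uniform as the index fed into it, so reducing a hash to that index (for example with a modulo) still needs some care.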
The other major problem is that, right now, the code, while working, relies on a MySQL backend, headless Linux compute servers (the kernels run for 30+ seconds at a time), and some rather inefficient merge code.
In other words, you wouldn't want the code, the way it is.
My goals for the next few weeks (assuming I stay motivated):
- Standalone table generator, with an internal pseudorandom number generator for table generation based on a seed (useful for distributed stuff later on). This will be a big challenge, as I need to take my current "Run until it's done" code and make it friendly for systems with GUIs - running in short bursts at a time. This also means moving data around on the GPU such that the startup time is shorter - right now, it spends a /lot/ of its time initializing. Not a problem when running for 30s at a time, big problem when it takes half a second to get set up and you want to run 100ms kernels. Fortunately, I've learned a lot with the multiforcer, so this shouldn't be too terribly challenging. It may not work well for those with low-end video cards... I'm not going to do much tuning for them. Sorry, an 8600M isn't going to do much.
- Tweak my table merger so it's faster. I know what I need to do, I just haven't done it as it wasn't an issue. I can't promise Windows code here, but I'll try. This code may end up being 64-bit only unless I change how I do things.
- Make the candidate hash generation code standalone. Right now it's heavily reliant on SQL to process things.
- Same goes for the table search/chain regeneration code. It may come out as a CPU version first, I'll see. See "64-bit only" here.
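As a rough illustration of the seed-based generation idea in the first goal, here is a minimal Python sketch of deriving chain start points deterministically from a seed. Everything in it is an assumption for illustration: the function name, the `work_unit` parameter, and the use of SHA-256 as the mixing function are not taken from the actual generator.

```python
import hashlib
import struct


def start_points(seed, work_unit, count, keyspace):
    """Yield `count` deterministic start indices in [0, keyspace).

    `seed` and `work_unit` are hypothetical 64-bit identifiers: the same
    pair always produces the same start points, so any machine can
    regenerate (or verify) a given chunk of a table without shared state.
    """
    for i in range(count):
        # Hash (seed, work_unit, counter) to get a reproducible pseudorandom value.
        packed = struct.pack("<QQQ", seed, work_unit, i)
        digest = hashlib.sha256(packed).digest()
        # 128 bits reduced modulo the keyspace; the modulo bias is negligible
        # when the keyspace is far smaller than 2**128.
        yield int.from_bytes(digest[:16], "little") % keyspace


# Example: two different work units from the same table seed.
KEYSPACE = 95 ** 7  # assumed: 95 printable ASCII characters, length 7
unit0 = list(start_points(1234, 0, 5, KEYSPACE))
unit1 = list(start_points(1234, 1, 5, KEYSPACE))
```

The point of deriving start points this way is that a distributed client would only need the seed and a work unit number, not a central database handing out start values.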
In general, the system requirements for this project are going to be somewhere between "absurd" and "obscene." There are plenty of CPU-based rainbow tables out there, and plenty of GPU-based crackers for smaller spaces. I'm not interested in those; I'm exploring things like "Full US character set, NTLM lengths 7 through 9." This involves GPU-years of time, TB worth of tables, etc.
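For a sense of scale, here is a back-of-the-envelope calculation in Python. It assumes "full US character set" means the 95 printable ASCII characters, and the hash rate is a made-up placeholder rather than a measured number.

```python
# Assumption: "full US character set" = 95 printable ASCII characters.
CHARSET_SIZE = 95
LENGTHS = range(7, 10)  # password lengths 7, 8, and 9

# Total number of candidate passwords across all three lengths.
keyspace = sum(CHARSET_SIZE ** n for n in LENGTHS)

# Hypothetical sustained rate for a single GPU, in hashes per second.
ASSUMED_HASHES_PER_SECOND = 1_000_000_000

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
gpu_years_single_pass = keyspace / ASSUMED_HASHES_PER_SECOND / SECONDS_PER_YEAR

print(f"keyspace: {keyspace:.3e} candidates")
print(f"one pass at the assumed rate: {gpu_years_single_pass:.1f} GPU-years")
```

That works out to roughly 6.4e17 candidates, or about 20 GPU-years for a single pass over the space at the assumed rate, before accounting for any rainbow table overhead.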
I also intend to make this a distributed project of some form or another - not sure if I'll use BOINC, but I'd like to allow people to contribute and eventually pull tables down (though how one distributes 5TB of tables is beyond me).
Anyway, I'll update as things happen.