I've sort of been off the grid lately, as I'm sure some people have noticed. I've had a bunch of stuff going on in my life, and have been working on the work/life/code balance. I swung hard towards code for a while, and lost some things in my life that were of great value to me. I swung life-ways for a while, and am now trying to find some balance in the middle.
Anyway, in that vein, I'm starting to code on things again.
The current codebase is C/CUDA/(ugly and not expandable).
I'm working on converting all my existing code to a new paradigm: C++/OpenCL. The reasons are as follows:
- OpenCL is now usable, and decently quick. It's damned fast on ATI as well, which is my primary focus here. ATI + bitalign = "ZOMGFAST on crypto." CPU code should also get easier, if OpenCL does a decent job of vectorizing things. This opens up the code to a larger target audience.
- C++ lets me do things more cleanly, and more easily accept new code from people in modular form. To expand:
-- By going to C++, I can use factory functions to switch between functionally equivalent implementations. I can, for instance, support a tighter RT format as well as my current format, and simply switch based on the table version in the headers. I can also support a 32-bit compatible table search that works on Windows.
-- If all the functionality for a hash type is encapsulated in a C++ class & .cl files, it should be very easy to add new hash types. And, importantly, it should be easy for other people to submit hash types to plug in.
-- The code should be much easier to read if it's broken up into class files that encapsulate sane functionality.
-- C++ lets me use Boost for cross-platform networking calls, among other things. This will again make cross-platform code easier to write.
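To make the factory idea above a bit more concrete, here is a rough sketch of what switching on the table version could look like. Nothing here is the real code: `TableHeader`, `TableReader`, the two reader classes, and the version numbers 1 and 2 are all made-up names and values for illustration, and the real header layout will certainly hold more than a version field.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical table header. Only the version field matters for this sketch.
struct TableHeader {
    std::uint32_t version;  // assumed: 1 = current format, 2 = tighter format
};

// Common interface that every table format reader implements.
class TableReader {
public:
    virtual ~TableReader() = default;
    virtual std::string formatName() const = 0;
};

// Reader for the existing table format (placeholder implementation).
class CurrentFormatReader : public TableReader {
public:
    std::string formatName() const override { return "current"; }
};

// Reader for a tighter, more compact table format (placeholder implementation).
class CompactFormatReader : public TableReader {
public:
    std::string formatName() const override { return "compact"; }
};

// Factory function: picks the reader that matches the version in the header.
// Callers only ever see the TableReader interface.
std::unique_ptr<TableReader> makeTableReader(const TableHeader& header) {
    switch (header.version) {
        case 1:
            return std::make_unique<CurrentFormatReader>();
        case 2:
            return std::make_unique<CompactFormatReader>();
        default:
            throw std::invalid_argument(
                "unsupported table version: " + std::to_string(header.version));
    }
}
```

The rest of the code would call `makeTableReader(header)` and work only through the `TableReader` interface, so adding a new table format means adding one class and one `case`. The same pattern should carry over to hash types.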
The goal is a single codebase that compiles sanely on Windows/Linux/OSX, and functions the same on all platforms. If I do OpenCL properly, the same code can also run on CPU hosts at a sane speed. Not *fast*, mind you, but usable. Hopefully. At least for smaller table formats.
I'm also planning to write my code in a manner that utilizes all the CPUs/GPUs on a system. This will take some work, but should be doable. This should help immensely on laptops that don't have any single powerful device, but that, between 2 CPU cores and 1 or 2 GPUs, add up to a respectable amount of processing power.
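One piece of the "use every device" plan is deciding how much work each device gets. Below is a minimal sketch of one way to do it, assuming each device has been given a rough relative speed weight (say, from a quick benchmark at startup). The `Device` struct, the `splitWork` function, and the weights are hypothetical, not anything from the existing codebase.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical description of one compute device (a CPU core or a GPU).
struct Device {
    std::string name;
    std::uint32_t weight;  // assumed relative speed, e.g. from a quick benchmark
};

// Split totalItems work items across devices in proportion to their weights.
// The returned shares are in the same order as devices and sum to totalItems.
std::vector<std::uint64_t> splitWork(std::uint64_t totalItems,
                                     const std::vector<Device>& devices) {
    std::uint64_t weightSum = 0;
    for (const Device& d : devices) {
        weightSum += d.weight;
    }
    if (weightSum == 0) {
        throw std::invalid_argument("need at least one device with a non-zero weight");
    }

    std::vector<std::uint64_t> shares;
    shares.reserve(devices.size());
    std::uint64_t assigned = 0;
    for (const Device& d : devices) {
        // Handle whole multiples of weightSum first, then the remainder, so the
        // intermediate product stays small. Integer division rounds down.
        const std::uint64_t share =
            (totalItems / weightSum) * d.weight +
            ((totalItems % weightSum) * d.weight) / weightSum;
        shares.push_back(share);
        assigned += share;
    }

    // Rounding down can leave a few items unassigned; give them to the
    // fastest device.
    const auto fastest = std::max_element(
        devices.begin(), devices.end(),
        [](const Device& a, const Device& b) { return a.weight < b.weight; });
    shares[static_cast<std::size_t>(fastest - devices.begin())] += totalItems - assigned;
    return shares;
}
```

On a laptop with two CPU cores and one GPU weighted 1, 1, and 8, a batch of 1000 work items would split into 100, 100, and 800. Each share would then be handed to a worker driving that device; that part is left out of the sketch.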
And, on a related note, I'm going to be setting up a wiki, a bug tracking system, and other useful utilities to help turn this into a better-quality project.
So, I guess, just keep your eyes open. I'm starting up the new codebase now, and should have something usable in a month or two for testing. I could especially use some ATI people to test my code on that platform.
Enjoy!