For those familiar with what Atom did with oclHashcat, I'm going to do pretty much exactly the same thing.

The nVidia OpenCL implementation, right now, is a giant pain in the ass & generally glitchy. Oh, and did I mention a giant pain in the ass?
However, the AMD/ATI OpenCL implementation is pretty solid (GPU + CPU), and the Intel OpenCL (CPU only for now) implementation seems solid.
So, I'll pull an Atom & release the nVidia version as CUDA, and the ATI/CPU version as OpenCL. The really good news is that the bulk of the code doesn't change - much of my code works identically between the CUDA & OpenCL versions. I don't think I'll go as far as ifdef'ing everything in the same files, but I'll probably have a combined source tree for both.
The downside of this is that doing multi-device work (GPU + CPU on the same box) for cracking will be difficult. I suppose I'll just add network support so you can run multiple clients...
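
To give a rough idea of what "works identically between the CUDA & OpenCL versions" can look like, here is a minimal sketch of a shared header. It is not taken from the actual source tree: the names HASH_FN, hash_u32, rotl32, md5_f, md5_g and md5_ff are made up for illustration. It relies on nvcc defining __CUDACC__ and OpenCL C compilers defining __OPENCL_VERSION__, and uses the MD5 round-1 step from RFC 1321 as the shared math.

```c
/* Hypothetical shared header: all names here are illustrative only. */
#if defined(__OPENCL_VERSION__)
  /* OpenCL C: no <stdint.h>, but uint is a built-in 32-bit type. */
  typedef uint hash_u32;
  #define HASH_FN inline
#elif defined(__CUDACC__)
  /* CUDA: make the helpers callable from both host and device code. */
  #include <stdint.h>
  typedef uint32_t hash_u32;
  #define HASH_FN __host__ __device__ inline
#else
  /* Plain C on the host, handy for testing the math off the GPU. */
  #include <stdint.h>
  typedef uint32_t hash_u32;
  #define HASH_FN static inline
#endif

/* 32-bit rotate left; n must be in the range 1..31. */
HASH_FN hash_u32 rotl32(hash_u32 x, unsigned int n)
{
    return (x << n) | (x >> (32u - n));
}

/* MD5 auxiliary functions F and G (RFC 1321). */
HASH_FN hash_u32 md5_f(hash_u32 x, hash_u32 y, hash_u32 z)
{
    return (x & y) | (~x & z);
}

HASH_FN hash_u32 md5_g(hash_u32 x, hash_u32 y, hash_u32 z)
{
    return (x & z) | (y & ~z);
}

/* One MD5 round-1 step: returns b + rotl(a + F(b,c,d) + x + t, s). */
HASH_FN hash_u32 md5_ff(hash_u32 a, hash_u32 b, hash_u32 c, hash_u32 d,
                        hash_u32 x, unsigned int s, hash_u32 t)
{
    return b + rotl32(a + md5_f(b, c, d) + x + t, s);
}
```

The only per-platform bits are the type alias and the function qualifier at the top; the hash math underneath is the same text for all three compilers. The kernel entry points and the host-side launch code are where the two versions would really diverge, so those would likely live in separate files.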

I'm hoping to have an OpenCL toolchain done in the next few weeks. The generate code is done for MD5/NTLM, so now it's just the tricky bits...
Thoughts?