Going from CUDA (runtime API) to OpenCL was a bit of a jump - if I had been using the driver API, the port would have been much faster.
The CPU actually does have some value - it's nowhere near as fast as a discrete GPU, but it's comparable to the integrated GPUs in laptops, and for rainbow tables, being able to use both the GPUs and the CPU in a laptop will be very useful.
For instance, the early unibody MacBook Pros have a Core 2 Duo, a 9400M, and a 9600M. Each of them is slow, but combined I hope to see 100M links/sec or more for NTLM. That's a useful speed.
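To make the "combined" idea concrete, here's a rough sketch of one way to split chain generation across several devices in proportion to their speed. This is not how my code does it, just an illustration: `split_work` is a made-up helper, and the per-device rates are placeholder numbers (not benchmarks) that I picked so they add up to the 100M links/sec target.

```python
def split_work(total_chains, rates):
    """Split total_chains across devices in proportion to their rates.

    rates maps a device name to its throughput (any consistent unit).
    Returns a dict mapping each device name to its share of chains.
    """
    total_rate = sum(rates.values())
    # Integer share per device, rounded down.
    shares = {name: int(total_chains * rate / total_rate)
              for name, rate in rates.items()}
    # Give any chains lost to rounding to the fastest device.
    fastest = max(rates, key=rates.get)
    shares[fastest] += total_chains - sum(shares.values())
    return shares


# Placeholder rates in millions of links/sec: assumptions, not measurements.
example_rates = {"core2duo": 10.0, "9400M": 30.0, "9600M": 60.0}
shares = split_work(1_000_000, example_rates)
```

With a split like this, every device should finish its share at roughly the same time, so the slow ones add to the total instead of holding it up. In practice the rates would come from a quick benchmark on each device.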