Is that the GTX 260 Core 216 or the GTX 260 "Core 192"?
192 / 8 = 24 MPs
216 / 8 = 27 MPs
Blocks should be N * number of MPs.
Threads should be M * 32 (a multiple of the warp size), probably >= 128.
I'd try N = 1, 2, 3, 4 with M = 4, 5, 6, 7, 8, 10, 12, 14, 16, 18, 20, 22, 24, maybe higher,
and see which one is best. I find that N = 1 with a fairly high M gets the best results.
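If it helps, here is a rough sketch of that search grid in Python. It only lists the (blocks, threads) pairs to try; it does not run anything on the GPU. The function name and the 8 cores per MP constant are my own assumptions, taken from the arithmetic above.

```python
# Hypothetical helper: lists the (blocks, threads) pairs suggested above.
# Assumes 8 cores per multiprocessor (MP), as in 192 / 8 and 216 / 8.
CORES_PER_MP = 8

N_VALUES = (1, 2, 3, 4)
M_VALUES = (4, 5, 6, 7, 8, 10, 12, 14, 16, 18, 20, 22, 24)


def launch_configs(cores, n_values=N_VALUES, m_values=M_VALUES):
    """Return (blocks, threads) pairs: blocks = N * MPs, threads = M * 32."""
    mps = cores // CORES_PER_MP
    return [(n * mps, m * 32) for n in n_values for m in m_values]


if __name__ == "__main__":
    for blocks, threads in launch_configs(216):
        print(blocks, threads)
```

Time each pair with your real hash list and keep the fastest one.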
-l / --lookup is for large lists of hashes; it uses a 512 MiB lookup table. Hmm, it looks like the small hash list code was removed (because the large list code is always faster), so it's always a good idea to use "-l". You have 896 MiB of memory, so you can actually use this. It is just a flag with no value: just "-l", not "-l 768". I made a patch for, I think, 0.7 that let you select a smaller lookup table for those with less memory (like me, with 256 MiB). So with the patch it was "-l 30" (30 bits, 2 ^ 30 / 8 bytes) for a 128 MiB lookup table.
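To check the table size arithmetic, here is a small Python sketch. It assumes one bit per table entry, which is what the "2 ^ 30 / 8" above implies; the function name is made up for illustration and is not part of the tool.

```python
# Hypothetical helper: size of a lookup table with 2^bits one-bit entries.
def lookup_table_mib(bits):
    """Return the table size in MiB for a given number of index bits."""
    size_bytes = (2 ** bits) // 8       # 8 one-bit entries per byte
    return size_bytes // (1024 * 1024)  # bytes -> MiB


if __name__ == "__main__":
    for bits in (30, 31, 32):
        print(bits, lookup_table_mib(bits))
```

By the same arithmetic, the default 512 MiB table corresponds to 32 bits.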