Need help porting a small program to CUDA

Hello everybody, I am trying to crack something, just for fun.
I have a 96-byte message encrypted with AES-256. I also have the SHA1 digest of the decrypted message, so I can easily tell whether a brute-force attempt has found the right key.
The 32-byte AES key used to decrypt this message is derived from a passphrase, and I know this passphrase has the format "2xxxxxxxx2",
where xxxxxxxx is 8 characters, each of which can be A, B, D, E, F, I, M, N, O, R, S, T, Y or Z.
So brute force is theoretically "easy", but the hard part is that the 32-byte AES key is generated from the passphrase with the PKCS5_PBKDF2_HMAC_SHA1() function from libcrypto, with the iteration parameter set to... 10000!
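To give an idea of what that iteration count costs, here is a rough sketch based on my reading of how PBKDF2 is defined (it is not taken from the libcrypto source, and the function name pbkdf2_hmac_calls is my own). PBKDF2 produces its output in blocks the size of the hash, and each block needs a chain of `iterations` HMAC computations.

```c
#include <assert.h>

/* HMAC invocations PBKDF2 needs for one passphrase:
   one chain of `iterations` HMACs per output block of hash_len bytes. */
unsigned long pbkdf2_hmac_calls(unsigned key_len, unsigned hash_len,
                                unsigned long iterations)
{
    assert(hash_len > 0);
    /* ceil(key_len / hash_len) output blocks */
    unsigned long blocks = (key_len + hash_len - 1) / hash_len;
    return blocks * iterations;
}
```

With a 32-byte key and SHA1's 20-byte output, pbkdf2_hmac_calls(32, 20, 10000) gives 2 blocks, so 20000 HMAC-SHA1 computations per passphrase. Within one chain each HMAC depends on the previous one, so I suppose the parallelism has to come from testing many passphrases at the same time.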
So on my computer, when I run my brute-force program on one core, I can test only 17.7 combinations per second.
A friend has a more powerful computer: on one core it can test 35 combinations per second, and because he has 8 cores, he could potentially test all 14^8 combinations of the passphrase in approximately 60 days.
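For those who want to check the estimate, the arithmetic can be sketched as below. The helper names keyspace_size and days_to_exhaust are only for illustration, and the sketch assumes the rate scales perfectly with the number of cores, which is optimistic.

```c
#include <assert.h>
#include <stdint.h>

/* Number of candidate passphrases: alphabet_size ^ length. */
uint64_t keyspace_size(uint64_t alphabet_size, unsigned length)
{
    uint64_t n = 1;
    while (length--)
        n *= alphabet_size;
    return n;
}

/* Days needed to try every candidate at a total rate in tests per second. */
double days_to_exhaust(uint64_t candidates, double tests_per_sec)
{
    assert(tests_per_sec > 0.0);
    return (double)candidates / tests_per_sec / 86400.0;
}
```

keyspace_size(14, 8) is 1475789056 candidates. At 8 x 35 = 280 tests per second, days_to_exhaust() gives about 61 days for the whole space, and at my 17.7 tests per second about 965 days. On average the right passphrase should turn up after about half of that.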
But that's too much! My friend has an Nvidia GTX680 in his computer, and he told me it is possible to use it to run a lot of computations in parallel because this card has more than 1000 cores!
So yesterday evening I searched for a lot of information about this and found this forum. I looked at how to develop in CUDA, but it is really difficult for me to understand how to port my program so that it compiles for CUDA and exploits the power of the GTX680.
For someone who knows CUDA, I think it could take less than 10 minutes to port the program!? In fact, what my program does is really simple. For every passphrase combination from "2AAAAAAAA2" to "2ZZZZZZZZ2", it:
- Generates the 32 byte AES key with the PKCS5_PBKDF2_HMAC_SHA1() function
- Decrypts the encrypted message with the 32 byte AES key
- Calculates the SHA1 digest of the decrypted message
- Compares this SHA1 digest with the known SHA1 digest of the decrypted message and, if they differ, tries the next passphrase combination
Here is my program source for those who want to look at it:
#include <stdio.h>
#include <string.h>
#include <openssl/evp.h> // EVP_* cipher functions and PKCS5_PBKDF2_HMAC_SHA1
#include <openssl/sha.h>
#define u8 unsigned char
// Available letters on the keyboard
const char keyboard[14] = { 'A','B','D','E','F','I','M','N','O','R','S','T','Y','Z' };
// Password string 2xxxxxxxx2 with xxxxxxxx = letters from the keyboard
char password[11] = { '2', 0, 0, 0, 0, 0, 0, 0, 0, '2', 0 };
// Code to bruteforce
u8 keyToTest[8] = { 0, 0, 0, 0, 0, 0, 0, 0 };
// Salt for function PKCS5_PBKDF2_HMAC_SHA1
const char* salt = "\x0B\x5C\x74\x63\xC0\xFA\x79\x20";
// 256 bits AES key
u8 aes256key[32];
// Initialisation vector for AES (AES-CBC only uses the first 16 bytes)
const u8* iv = (u8*)"\xA1\xF3\x4A\x0B\x53\x2A\xD8\xC5\x88\xA1\xA1\xA1\xA1\xA1\xA1\xA1\xF5\xD4\xE0\x9D\xBA\x57\x64\xFE\x04\xF5\xF5\xF5\xF5\xF5\xF5\xF5";
// Cipher descriptor (a stack-allocated EVP_CIPHER_CTX only compiles with OpenSSL older than 1.1.0)
EVP_CIPHER_CTX evp_cipher_ctx;
const EVP_CIPHER* evp_cipher;
// Encrypted message (96 bytes)
u8 cryptedMsg[] =
{
    0x39, 0xAA, 0x07, 0xA2, 0xFE, 0x56, 0x12, 0xFA, 0xC8, 0xF4,
    0x50, 0xD2, 0xFB, 0x5E, 0x6F, 0xB5, 0x2E, 0x54, 0xCB, 0xAC,
    0x1B, 0xE3, 0x5D, 0x51, 0xC8, 0xC5, 0xC6, 0x82, 0x0B, 0x7F,
    0xA0, 0x08, 0xCE, 0x54, 0xFF, 0x6B, 0x88, 0x84, 0x81, 0x1A,
    0xB8, 0x6E, 0x15, 0x71, 0x3D, 0xE9, 0x76, 0xFE, 0xC7, 0x40,
    0x95, 0x2F, 0xFF, 0xEC, 0xC1, 0x8F, 0x3A, 0xA9, 0xE0, 0x5E,
    0x2E, 0xEC, 0x1A, 0xD8, 0xDD, 0xD3, 0xF8, 0xB0, 0xDA, 0x1C,
    0x37, 0xFF, 0xEA, 0xBD, 0xF1, 0x36, 0x5D, 0x78, 0xF4, 0x75,
    0xD3, 0xE1, 0xF3, 0x0C, 0xC3, 0xE9, 0x42, 0xEB, 0xFA, 0x09,
    0x0F, 0x62, 0x2B, 0x4F, 0x9D, 0xAC
};
// buffer for decrypted message
u8 out[512];
// SHA1 digest of the decrypted message, to know if decryption is okay
const u8 digestWanted[20] =
{
    0xC0, 0x84, 0xF3, 0xC3, 0x4D, 0x8C, 0x9A, 0x6D, 0xB2, 0xB4,
    0xAE, 0x13, 0xE5, 0x17, 0x48, 0xB6, 0xBF, 0xDD, 0x6D, 0x7A
};
// Print len bytes of data in hexadecimal (debugging helper)
void dump ( u8* data, int len )
{
    while( len-- )
    {   printf( "%02x ", *data++ );
    }
    printf( "\n" );
}
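// Decrypt crypto into the global buffer out.
// Returns 1 if the final block has valid padding, 0 otherwise.
int aesDecrypt ( u8* crypto, int cryptoLen )
{
    int outl;
    outl = cryptoLen;
    EVP_DecryptUpdate( &evp_cipher_ctx, out, &outl, crypto, cryptoLen );
    return( EVP_DecryptFinal_ex( &evp_cipher_ctx, &out[outl], &outl ) );
}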
// Try the current password. Returns 1 if it decrypts the message correctly, 0 otherwise.
int decrypt ( void )
{
    u8 digest[20]; // Digest of the decrypted message, to verify its validity
    int found = 0;
    EVP_CIPHER_CTX_init(&evp_cipher_ctx);
    evp_cipher = EVP_aes_256_cbc();
    // Build the 256-bit AES key from the password
    PKCS5_PBKDF2_HMAC_SHA1(password, 10, (const void*)salt, 8, 10000, 32, aes256key);
    // Init the AES decryption
    EVP_DecryptInit_ex( &evp_cipher_ctx, evp_cipher, NULL, aes256key, iv);
    // If AES decryption is okay, test the validity of the decrypted message by SHA1 digest comparison
    if ( aesDecrypt( cryptedMsg, sizeof(cryptedMsg) ) == 1 )
    {   // printf( "aesDecrypt OK with code %s\n", password );
        // printf( "key:\n" );
        // dump( aes256key, 32 );
        // printf( "Decoded message:\n" );
        // dump( out, 0x60 );
        // Hash the first 0x5F (95) bytes of the decrypted buffer
        SHA1( out, 0x5F, digest );
        // printf( "sha1:\n" );
        // dump( digest, 20 );
        if ( memcmp( digestWanted, digest, 20 ) == 0 )
        {   found = 1;
        }
    }
    // Always release the cipher context, including when the key has been found
    EVP_CIPHER_CTX_cleanup(&evp_cipher_ctx);
    return found;
}
// Increment the key like an odometer in base 14, starting at position keyIndex
void nextKey ( u8* key, int keyIndex )
{
    // Test the end of the recursive incrementation
    if ( keyIndex < 0 )
        return;
    // Increment the current key index value
    key[keyIndex]++;
    // If past the last possible character, wrap around and increment the previous position
    if ( key[keyIndex] >= 14 )
    {   key[keyIndex] = 0;
        nextKey( key, keyIndex-1 );
    }
}
int main ( int nbargs, char* args[] )
{
    int i=0;
    while( 1 )
    {   // Build the password to test from the current key indexes
        password[1] = keyboard[keyToTest[0]];
        password[2] = keyboard[keyToTest[1]];
        password[3] = keyboard[keyToTest[2]];
        password[4] = keyboard[keyToTest[3]];
        password[5] = keyboard[keyToTest[4]];
        password[6] = keyboard[keyToTest[5]];
        password[7] = keyboard[keyToTest[6]];
        password[8] = keyboard[keyToTest[7]];
        // Show progress every 10000 passwords
        if ( ++i%10000 == 0 )
            printf( "%s\n", password );
        if ( decrypt() )
        {   printf( "Code found! %s\n", password );
            printf( "%s\n", out );
            break;
        }
        // Stop after the last possible password
        if ( memcmp( password, "2ZZZZZZZZ2", 10 ) == 0 )
        {   printf( ":'(\n" );
            break;
        }
        nextKey( keyToTest, 7 );
    }
    return 0;
}
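From what I have read so far, each GPU thread would have to work out its own candidate from a thread number instead of calling nextKey() in sequence. Here is a minimal sketch of that mapping in plain C. The name index_to_passphrase and the constants are mine, and I am assuming the index would come from something like blockIdx.x * blockDim.x + threadIdx.x plus a batch offset in a real CUDA kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ALPHABET_SIZE 14
#define VARIABLE_LEN  8
#define KEYSPACE      1475789056ULL   /* 14^8 candidates */

static const char alphabet[ALPHABET_SIZE] =
    { 'A','B','D','E','F','I','M','N','O','R','S','T','Y','Z' };

/* Write candidate number idx into out (11 bytes including the NUL).
   idx 0 is "2AAAAAAAA2" and idx KEYSPACE-1 is "2ZZZZZZZZ2"; the rightmost
   letter changes fastest, in the same order as nextKey( keyToTest, 7 ). */
void index_to_passphrase(uint64_t idx, char out[11])
{
    assert(idx < KEYSPACE);
    memcpy(out, "2AAAAAAAA2", 11);    /* fixed '2' at both ends plus NUL */
    for (int pos = VARIABLE_LEN; pos >= 1; pos--) {
        out[pos] = alphabet[idx % ALPHABET_SIZE];  /* base-14 digit */
        idx /= ALPHABET_SIZE;
    }
}
```

For example, index 1 gives "2AAAAAAAB2" and index 14 gives "2AAAAAABA2". This only covers the enumeration; the PBKDF2, AES and SHA1 steps would still have to run inside each thread.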
As I said above, for someone who knows CUDA programming it may be easy to adapt this code? But what about the OpenSSL cryptographic functions? I think they must be ported too? Or maybe their original source code is sufficient and there is no need to rewrite them?
Regards