Allocating 2D arrays in CUDA can be a little confusing at first. There are a couple of mistakes you may make while trying to allocate your first 2D array.

Wrong Way #1:

    int rowCount = 10;
    float** d_array = 0; // array on device
    cudaMalloc(d_array, rowCount*sizeof(float*));
    for(int i = 0 ; i < rowCount ; i++) { [...]
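
For contrast with the pointer-to-pointer attempt above, here is a minimal sketch of one common alternative: store the 2D array as a single contiguous block and compute the index by hand. This is only an illustration under stated assumptions, not necessarily the approach this post goes on to recommend. The names scaleKernel and scaleFlat2D, the row-major layout, and the 16x16 block size are all hypothetical choices made for the example. Note that cudaMalloc writes the device address through its first argument, so it is given the address of a host pointer variable.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: treats `data` as a row-major rowCount x colCount array,
// so element (row, col) lives at index row * colCount + col.
__global__ void scaleKernel(float* data, int rowCount, int colCount, float factor)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rowCount && col < colCount) {
        data[row * colCount + col] *= factor;
    }
}

// Hypothetical helper: copies a flat host array to the device, multiplies every
// element by `factor`, and copies the result back. Returns false on any CUDA error
// or if the host vector does not hold exactly rowCount * colCount elements.
bool scaleFlat2D(std::vector<float>& host, int rowCount, int colCount, float factor)
{
    if (rowCount <= 0 || colCount <= 0 ||
        host.size() != static_cast<size_t>(rowCount) * static_cast<size_t>(colCount)) {
        return false;
    }

    size_t bytes = host.size() * sizeof(float);
    float* d_array = nullptr; // single device pointer, no array of row pointers

    // Pass the ADDRESS of the pointer so cudaMalloc can store the device address in it.
    if (cudaMalloc(reinterpret_cast<void**>(&d_array), bytes) != cudaSuccess) {
        return false;
    }

    bool ok = cudaMemcpy(d_array, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        dim3 block(16, 16);
        dim3 grid((colCount + block.x - 1) / block.x,
                  (rowCount + block.y - 1) / block.y);
        scaleKernel<<<grid, block>>>(d_array, rowCount, colCount, factor);

        // The blocking device-to-host copy also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), d_array, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_array);
    return ok;
}
```

A caller would fill a std::vector<float> of size rowCount * colCount and pass it to scaleFlat2D. The flat layout needs only one allocation and one copy in each direction, at the cost of writing the index arithmetic yourself.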