Calling CUDA Functions from Fortran
Author: Austen C. Duffy, Florida State University
CUDA functions can be called directly from Fortran programs through a kernel wrapper, as long as a few simple rules are followed.
1. Data Types: Use equivalent data types on both sides; these follow the usual Fortran --> C conventions. Declare the sizes of Fortran integers and reals explicitly rather than relying on compiler defaults. Note that integer*2 corresponds to a short int in C; I have had a lot of problems trying to use these, so I would suggest using integer*4 instead.
integer*4 --> int
real*4 --> float
real*8 --> double
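One way to guard against a mismatch is to let the compiler check the sizes on the C side. The snippet below is a small sketch, assuming the .cu file is compiled as C++11 or later; the sizes are the ones implied by the Fortran declarations above.

```cuda
// Compile-time checks that the C types used in a .cu file have the byte
// sizes implied by the Fortran declarations integer*4, real*4 and real*8.
static_assert(sizeof(int)    == 4, "integer*4 needs a 4-byte int");
static_assert(sizeof(float)  == 4, "real*4 needs a 4-byte float");
static_assert(sizeof(double) == 8, "real*8 needs an 8-byte double");
```

If any of these checks fails on a given platform, compilation stops with the message shown, and the wrapper's parameter types would need adjusting before the Fortran call can work.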
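2. Function Names: Fortran compilers such as gfortran append an underscore to external procedure names, so you need to account for this in the name of your CUDA function. For example, a call to 'kernel_wrapper( )' in Fortran refers to the symbol 'kernel_wrapper_' in the compiled object file, so the function in your .cu file should be named 'kernel_wrapper_( )'. Because nvcc compiles .cu files as C++, the wrapper must also be declared extern "C" to keep that name from being mangled, and because Fortran passes arguments by reference, its parameters must be pointers. None of this applies to the CUDA kernels themselves, since they are never called from the Fortran code.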
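Rules 1 and 2 can be sketched together in a small .cu file. This is a minimal illustration, not the sample code from the next page: the names scale_kernel, n, a and factor are hypothetical, and it assumes the Fortran compiler appends a single trailing underscore (gfortran's default). Fortran passes arguments by reference, so every wrapper parameter is a pointer, and extern "C" is used because nvcc compiles .cu files as C++ and would otherwise mangle the name.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical kernel: multiplies each element of a by factor.
// Kernels keep their plain names because Fortran never calls them directly.
__global__ void scale_kernel(float *a, float factor, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) a[idx] *= factor;
}

// Intended to be called from Fortran as:  call kernel_wrapper(n, a, factor)
// with  integer*4 :: n  and  real*4 :: a(n), factor
extern "C" void kernel_wrapper_(int *n, float *a, float *factor)
{
    float *a_d = NULL;
    size_t bytes = (size_t)(*n) * sizeof(float);

    // Copy the Fortran array to the device.
    cudaMalloc((void **)&a_d, bytes);
    cudaMemcpy(a_d, a, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (*n + threads - 1) / threads;
    scale_kernel<<<blocks, threads>>>(a_d, *factor, *n);

    // Copy the result back into the caller's array and report any error.
    cudaError_t err = cudaMemcpy(a, a_d, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    cudaFree(a_d);
}
```

The results are written back into the array that Fortran passed in, so no return value is needed and the routine is called as a subroutine.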
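3. Arrays: Fortran and C use opposite storage orders: Fortran is column-major (the first index varies fastest) and C is row-major (the last index varies fastest), so, apart from Fortran's 1-based indexing, Fortran array(i,j,k) corresponds to C array[k][j][i]. A Fortran array is stored as one contiguous block of linear memory, however, and so will be passed to CUDA as a 1-D array. With Fortran's default 1-based indices, element array(i,j,k) sits at the 0-based linear offset
(i-1) + NX*(j-1) + NX*NY*(k-1)
where NX, NY and NZ are the sizes of the x, y and z dimensions respectively. The wrapper copies its results back into the caller's memory, so after the call Fortran sees the arrays as their original 3D versions.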
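The index arithmetic for rule 3 can be sketched as follows. Because Fortran stores the first index contiguously, the element at 0-based position (i, j, k) of an NX x NY x NZ array lies at offset i + NX*(j + NY*k) in the 1-D array the kernel receives. The helper idx3 and the kernel add_one are hypothetical names chosen for illustration.

```cuda
#include <cuda_runtime.h>

// Column-major offset of the element Fortran calls a(i+1, j+1, k+1),
// where i, j, k are 0-based and nx, ny are the first two dimension sizes.
__host__ __device__ inline int idx3(int i, int j, int k, int nx, int ny)
{
    return i + nx * (j + ny * k);
}

// Hypothetical kernel: adds 1.0 to every element of an nx x ny x nz array
// that was passed from Fortran as a flat 1-D array.
__global__ void add_one(float *a, int nx, int ny, int nz)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // Fortran's first (fastest) index
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    int k = blockIdx.z * blockDim.z + threadIdx.z;
    if (i < nx && j < ny && k < nz)
        a[idx3(i, j, k, nx, ny)] += 1.0f;
}
```

Mapping threadIdx.x to the first Fortran index keeps neighbouring threads on neighbouring memory locations. A launch such as add_one<<<dim3((nx+7)/8, (ny+7)/8, (nz+7)/8), dim3(8, 8, 8)>>>(a_d, nx, ny, nz) would cover the whole array, assuming a device that supports 3-D grids (compute capability 2.0 or higher).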
4. Compilation: First use the nvcc compiler with the -c option to create an object file from the .cu file, e.g. 'nvcc -c cudatest.cu' creates cudatest.o. Then compile your Fortran code together with that object file, linking to the CUDA libraries (-L) and includes (-I) on your machine. The libraries are listed after the source and object files so that the linker can resolve the CUDA symbols those files reference, e.g.
nvcc -c cudatest.cu
gfortran fortest.f95 cudatest.o -L/usr/local/cuda/lib -I/usr/local/cuda/include -lcudart -lcuda
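The libraries and includes may be in a different location on your machine; on 64-bit Linux installations the libraries are typically under /usr/local/cuda/lib64. If the link step reports undefined C++ symbols, adding -lstdc++ usually resolves it. Note that if your code runs in double precision, you will need to add the nvcc compiler option -arch sm_13, which requires a GPU of compute capability 1.3 or higher. This applies to toolkits that still target 1.x devices; later toolkits have dropped those targets and compile double precision by default.
A sample code set, complete with a makefile, demonstrating rules 1, 2 and 4 above is on the next page.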
Code Example with Makefile