monolish
0.17.3-dev.23
MONOlithic LInear equation Solvers for Highly-parallel architecture
The following four classes have the computable attribute:
These classes support computing on the GPU and have five functions for GPU programming.
When libmonolish_cpu.so is linked, send() and recv() do nothing, so the same code can be shared between CPU and GPU builds.
When libmonolish_gpu.so is linked, these functions enable data communication with the GPU.
Each class is mapped to GPU memory by calling send(). An object whose data has been transferred to the GPU behaves differently: operations on individual elements of vectors and matrices are no longer possible.
Whether the data has been transferred to the GPU can be checked with the get_device_mem_stat() function.
The data mapped to the GPU is released from the GPU by recv() or device_free().
Most functions are executed on either the CPU or the GPU according to get_device_mem_stat(). The copy constructor is a special case: because it copies an instance of a class, both the CPU and GPU data are copied.
For developers, there is a nonfree_recv() function that receives data from the GPU without freeing the GPU memory. However, in the current version, there is no way to explicitly change the status of GPU memory, so it is not useful for most users.
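The state transitions described above can be illustrated with a small mock. This is only a sketch under stated assumptions: MockComputableVector, its at() accessor, and its scale() routine are hypothetical names invented for illustration, the "device" is simulated by a second host-side buffer, and none of this is monolish's actual implementation. Only the names send(), recv(), nonfree_recv(), device_free(), and get_device_mem_stat() are taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a "computable" class; NOT monolish's implementation.
// The "device" is simulated by a second host-side buffer.
class MockComputableVector {
public:
  explicit MockComputableVector(std::size_t n, double value = 0.0)
      : host_(n, value) {}

  // Map the data to the (simulated) device.
  void send() {
    device_ = host_;
    on_device_ = true;
  }

  // Copy the data back to the host and release the device buffer.
  void recv() {
    if (!on_device_) return;
    host_ = device_;
    device_free();
  }

  // Copy the data back to the host but keep the device buffer allocated.
  void nonfree_recv() {
    if (on_device_) host_ = device_;
  }

  // Release the device buffer without copying anything back.
  void device_free() {
    device_.clear();
    device_.shrink_to_fit();
    on_device_ = false;
  }

  // Report whether the data is currently mapped to the device.
  bool get_device_mem_stat() const { return on_device_; }

  // Element access is only allowed while the data lives on the host.
  double& at(std::size_t i) {
    if (on_device_) throw std::runtime_error("data is on the device");
    return host_.at(i);
  }

  // Stand-in for a library routine: runs on whichever side holds the data.
  void scale(double alpha) {
    std::vector<double>& target = on_device_ ? device_ : host_;
    for (double& v : target) v *= alpha;
  }

private:
  std::vector<double> host_;
  std::vector<double> device_;
  bool on_device_ = false;
};
```

In this mock, calling at() after send() throws, scale() updates the device-side copy, and recv() brings the result back and resets the status flag, which mirrors the behavior the text attributes to the GPU build.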
GPU programs using monolish are implemented with the following flow in mind.
To reduce data transfers, it is important to call send() and recv() as few times as possible.
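The cost of repeated transfers can be made concrete with a counting mock. This is a minimal sketch, not monolish code: CountingVector, add_one(), and the two helper functions are hypothetical names, and a "transfer" here is just a counter increment rather than a real host-to-device copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mock, NOT part of monolish: counts simulated host<->device copies.
struct CountingVector {
  std::vector<double> data;
  std::size_t transfers = 0;
  bool on_device = false;

  explicit CountingVector(std::size_t n) : data(n, 0.0) {}

  void send() { on_device = true; ++transfers; }
  void recv() { on_device = false; ++transfers; }

  // Stand-in for a computation that would run on the device.
  void add_one() {
    for (double& v : data) v += 1.0;
  }
};

// Anti-pattern: one send()/recv() pair per iteration.
std::size_t transfers_inside_loop(CountingVector& x, int iterations) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    x.send();
    x.add_one();
    x.recv();
  }
  return x.transfers;
}

// Preferred flow: send once, compute many times, receive once.
std::size_t transfers_outside_loop(CountingVector& x, int iterations) {
  assert(iterations >= 0);
  x.send();
  for (int i = 0; i < iterations; ++i) x.add_one();
  x.recv();
  return x.transfers;
}
```

Both helpers produce the same data, but the first performs two transfers per iteration while the second performs two in total, which is the pattern the flow above is meant to encourage.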
A simple inner product program for GPU is shown below:
This sample code can be found in /sample/blas/innerproduct/.
This program can be compiled by the following command.
The following command runs this.
A description of this program is given below:
send() and recv() functions.
x and y do not need to be received on the CPU, so their memory management is left to the automatic release by the destructor.
For a more advanced example, sample programs that implement CG methods using monolish::BLAS and monolish::VML can be found in /sample/blas/cg_impl/.
The following is a sample program that solves the linear system Ax=b on the GPU using the conjugate gradient method with a Jacobi preconditioner.
This program requires only two lines of changes from the CPU program.
This sample code can be found in /sample/equation/cg/.
A sample program for a templated linear equation solver can be found at sample/equation/templated_solver.