2008-10-01

[IC] Neural Network Implementation using CUDA and OpenMP

Abstract.

Many algorithms for image processing and pattern recognition have recently been implemented on the GPU (graphics processing unit) for faster computation. However, GPU implementations encounter two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between CPU and GPU, which is usual in image processing and pattern recognition (unlike in graphics), the CPU should generate as much raw feature data as possible for GPU processing in order to utilize GPU performance effectively. This paper proposes a quicker and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which is easy to program thanks to its simple C-like style, instead of GPGPU. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently with a single instruction on the multi-core CPU, which results in effective use of GPU memory. In the experiments, we implemented a neural network-based text detection system using the proposed architecture, and it ran about 15 times faster than a CPU implementation and about 4 times faster than a GPU-only implementation without OpenMP.
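
The division of labor described in the abstract can be sketched roughly as follows. This is not the paper's code, only a minimal illustration under stated assumptions: the sizes `N_IN` and `N_OUT`, the function names `extract_features` and `layer_forward`, the pixel scaling used as "feature extraction", and the softsign activation are all hypothetical. The CPU stage uses an OpenMP `parallel for` to build feature vectors for many windows at once. The layer computation is written in plain C as a stand-in for the part that a CUDA kernel would run on the GPU over the whole batch.

```c
#include <stddef.h>

/* Hypothetical layer sizes, chosen only for illustration. */
#define N_IN  4
#define N_OUT 2

/* Softsign activation: an assumed stand-in, not the paper's activation. */
static float softsign(float x)
{
    float a = (x < 0.0f) ? -x : x;
    return x / (1.0f + a);
}

/* CPU stage: build one feature vector per image window.
 * Windows are independent, so OpenMP can spread them across cores.
 * The "feature" here is just pixel scaling to [0, 1] (an assumption). */
void extract_features(const float *pixels, size_t n_windows, float *features)
{
    long w;
    #pragma omp parallel for
    for (w = 0; w < (long)n_windows; ++w) {
        int i;
        for (i = 0; i < N_IN; ++i)
            features[w * N_IN + i] = pixels[w * N_IN + i] / 255.0f;
    }
}

/* Stand-in for the GPU stage: one fully connected layer applied to the
 * whole batch. weights is N_OUT x N_IN in row-major order. In a CUDA
 * version, each (window, output) pair could map to one thread. */
void layer_forward(const float *features, size_t n_windows,
                   const float *weights, const float *bias, float *out)
{
    size_t w;
    for (w = 0; w < n_windows; ++w) {
        int o;
        for (o = 0; o < N_OUT; ++o) {
            float sum = bias[o];
            int i;
            for (i = 0; i < N_IN; ++i)
                sum += weights[o * N_IN + i] * features[w * N_IN + i];
            out[w * N_OUT + o] = softsign(sum);
        }
    }
}
```

Compiled with an OpenMP flag such as `-fopenmp` on GCC, the feature loop runs on several cores; without the flag the pragma is ignored and the code runs serially with the same result. Batching many windows before the layer computation is what lets a single transfer to the device carry enough work to keep the GPU busy.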
Click http://hci.ssu.ac.kr/ajpark/[IC]CUDAforNN.pdf to download the paper.

2008.
