2008-10-01

[IJ] Graph Cuts-based Automatic Color Image Segmentation using Mean Shift Analysis

Under Review
(IEEE Transactions on Systems, Man and Cybernetics, Part B - SCI)

Abstract.

Graph cuts methods have recently attracted a lot of attention for image segmentation, as they can provide exact globally optimal solutions for certain energy functions composed of data (regional) terms estimated in feature space and smoothness (boundary) terms estimated in the image domain. Although previous approaches using graph cuts have shown good performance for image segmentation, they obtained the prior information needed to estimate the data term manually. To estimate the data term automatically, Gaussian mixture models (GMMs) are generally used, but they are practicable only for classes with a hyper-spherical or hyper-ellipsoidal shape in feature space, as each class is represented by a covariance matrix centered on its mean. For arbitrary-shaped classes, this paper proposes a graph cuts-based automatic image segmentation method using mean shift analysis. We use the set of mean trajectories from the initial means towards each mode as prior information to estimate the data term, and the data not included in this prior information are covered by a smoothness term that can preserve discontinuities in the image domain. Then, a graph cuts method is used to globally optimize the energy function. The main drawback of the mean shift procedure is its high computational cost. To tackle this drawback, we transform features in the continuous feature space, i.e. the L*u*v* color space in this paper, into a discrete 3D grid, and use a 3D kernel based on the first moment to move the means to modes in the grid. In the experiments, we investigated the problems of the mean shift-based and normalized cuts-based image segmentation methods that have recently become popular, and of graph cuts-based automatic image segmentation using GMMs. The proposed method showed better performance than these three methods on the Berkeley Segmentation Dataset.
Keywords: Color Image Segmentation, Graph Cuts, Mean Shift Analysis.

click
http://hci.ssu.ac.kr/ajpark/[SMC]ColorSegmentation.pdf
to download the paper.
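
The grid-based mode seeking mentioned in the abstract can be illustrated roughly as follows. This is a minimal Python sketch, not the paper's implementation: the bin count, the cubic window radius, the flat (uniform) kernel, and the function names `build_grid` and `seek_mode` are all assumptions made for illustration. Features are quantized into a 3D histogram, and each mean moves to the count-weighted centroid (first moment) of its window until it stops moving; the visited cells form the trajectory.

```python
import numpy as np

def build_grid(features, bins=16):
    """Quantize 3D features (N x 3) into a bins^3 histogram of counts."""
    features = np.asarray(features, dtype=float)
    lo = features.min(axis=0)
    span = np.maximum(features.max(axis=0) - lo, 1e-12)  # avoid division by zero
    idx = np.minimum((bins * (features - lo) / span).astype(int), bins - 1)
    grid = np.zeros((bins, bins, bins))
    np.add.at(grid, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return grid

def seek_mode(grid, start, radius=2, max_iter=100):
    """Move a mean towards a mode; return the list of visited grid cells."""
    pos = np.array(start, dtype=int)
    trajectory = [tuple(int(v) for v in pos)]
    for _ in range(max_iter):
        # Cubic window around the current mean, clipped to the grid (assumed flat kernel).
        lo = np.maximum(pos - radius, 0)
        hi = np.minimum(pos + radius + 1, grid.shape)
        window = grid[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
        mass = window.sum()
        if mass == 0:
            break  # empty neighborhood: nothing to move towards
        # First moment: count-weighted centroid of the window, in grid coordinates.
        coords = np.indices(window.shape).reshape(3, -1)
        centroid = (coords * window.ravel()).sum(axis=1) / mass + lo
        new_pos = np.rint(centroid).astype(int)
        if np.array_equal(new_pos, pos):
            break  # converged to a mode of the discrete grid
        pos = new_pos
        trajectory.append(tuple(int(v) for v in pos))
    return trajectory
```

Because every step works on integer cells of a fixed-size grid, the cost per step depends on the window size rather than on the number of pixels, which is the motivation the abstract gives for discretizing the feature space.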

2008.

[IJ] Automatic Grouping in Trained SOFM via Graph Cuts

Under Review
(Pattern Recognition Letters - SCIE)

Abstract.

A Self-Organizing Feature Map (SOFM) is an unsupervised neural network and a very powerful tool for classifying and visualizing high-dimensional data sets. However, even though SOFMs have already been applied to many engineering problems, post-processing is still required, in which similar output neurons of the trained SOFM are grouped into classes, and this grouping is invariably performed manually. Moreover, existing algorithms that automatically group the neurons of a trained SOFM, such as k-means, do not yield satisfactory results, especially when the grouped data show unrestricted and arbitrary shapes. This paper proposes an automatic grouping method for a trained SOFM that uses graph cuts and can deal with arbitrary shapes of grouped data. In previous approaches using graph cuts, the graph is manually constructed based on prior data given by users, which hinders researchers from using it in automatic systems. In the proposed method, mode seeking in a distance matrix obtains the prior data automatically and can also analyze arbitrary-shaped classes. Experimental results demonstrated the effectiveness of the proposed method for texture segmentation, with improved precision rates compared with conventional clustering algorithms.

click
http://hci.ssu.ac.kr/ajpark/[PRL]GroupingofSOFM.pdf
to download the paper.

2008.

[IJ] Flying Cake: Augmented Game on Mobile Devices

Accepted
(ACM Computers in Entertainment)

Abstract.

In the current age of ubiquitous computing that uses high bandwidth networks, wearable and hand-held mobile devices with small cameras and wireless communication will be widespread in the near future. Thus, research on augmented games for mobile devices has recently attracted a lot of attention. Most existing augmented games use a traditional ‘backpack’ system and a ‘pattern marker’. However, ‘backpack’ systems are expensive, cumbersome, and inconvenient to use, while the use of a ‘pattern marker’ means the game can be played only at a previously-installed location. Accordingly, this paper proposes an augmented game, called Flying Cake, where face regions are used to create virtual objects (characters) without a predefined ‘pattern marker’, and the location of the virtual objects is measured relative to the real world on a small mobile PDA, instead of using cumbersome hardware. Flying Cake is an augmented shooting game with two playing modes: 1) single player, where the player attacks a virtual character overlaid on images captured by a PDA camera in the physical world; and 2) two players, where each player attacks a virtual character in an image received via a wireless LAN from their opponent. The virtual character overlaps a face region obtained using a real-time face detection technique. As a result, Flying Cake provides an exciting experience for players based on a new game paradigm where the user interacts with both the physical world captured by a PDA camera and the virtual world.

Keywords: Augmented Game, Mobile Vision, Face Detection, 3D Augmented Shooting Game, CAMShift.

click
http://hci.ssu.ac.kr/ajpark/[CIE]FlyingCake_Final.pdf
to download the paper.

2008.

[IJ] PDA-based Text Extraction System using a Pipelined Client/Server Architecture

Under Review
(Journal of Zhejiang University, SCIENCE A - SCIE)

Abstract.

Recently, much research on mobile vision using a personal digital assistant (PDA) has been attempted. However, many PDA CPUs are integer CPUs, which results in slow computation of computer vision algorithms. To implement a real-time text extraction system on the PDA, we propose a fast and efficient pipelined client (PDA)/server (PC) architecture with load balancing. The client extracts tentative text regions using edge density, which shortens transmission time because the resulting images are well suited to JPEG encoding. The server improves accuracy by performing definitive text extraction using a multi-layer perceptron and connected component analysis, and extracts text only within the client’s results to reduce the server’s processing time. We construct the pipelined client/server architecture for sequential images so that the client and server processes can be performed in parallel. Moreover, the proposed architecture can balance the load between the client and server by regulating the amount of tentative text regions transferred from the client to the server. As a result, we can enhance the processing speed of the overall architecture by reducing data transmission time and idle time between both processors.

Keywords: Mobile Vision, Text Extraction, MLP.

click
http://hci.ssu.ac.kr/ajpark/[IJ]PDAbased.pdf
to download the paper.
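
The client-side step of picking tentative text regions by edge density might look roughly like the sketch below. The abstract does not give the edge operator, block size, or thresholds, so everything here is an assumption for illustration: a simple absolute-difference gradient, 8x8 blocks, and the hypothetical parameters `edge_thresh` and `density_thresh`.

```python
import numpy as np

def edge_density_mask(gray, block=8, edge_thresh=30.0, density_thresh=0.2):
    """Return a per-block boolean mask of candidate (tentative) text blocks."""
    gray = np.asarray(gray, dtype=float)
    # Absolute horizontal and vertical differences as a cheap edge measure (assumed operator).
    gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
    gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1, :]))
    edges = (gx + gy) > edge_thresh
    # Fraction of edge pixels in each block; partial blocks at the border are dropped.
    h, w = edges.shape
    hb, wb = h // block, w // block
    blocks = edges[:hb * block, :wb * block].reshape(hb, block, wb, block)
    density = blocks.mean(axis=(1, 3))
    return density > density_thresh
```

Raising `density_thresh` keeps fewer blocks, which is one plausible way to regulate how much tentative data is sent to the server.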

[IC] Graph-based High Level Motion Segmentation using Normalized Cuts

Abstract.

Motion capture devices have been utilized in producing various content, such as movies and video games. However, since motion capture devices are expensive and inconvenient to use, motions segmented from captured data have been recycled and synthesized for use in other content, but these motions have generally been segmented manually by content producers. Therefore, automatic motion segmentation has recently been attracting a lot of attention. Previous approaches are divided into on-line and off-line approaches, where on-line approaches segment motions based on similarities between neighboring frames and off-line approaches segment motions by capturing the global characteristics in feature space. In this paper, we propose a graph-based high-level motion segmentation method. Since high-level motions consist of several repeated frames within temporal distances, we consider the similarities among all frames within the temporal distance. This is achieved by constructing a graph, where each vertex represents a frame and the edges between the frames are weighted by their similarity. Then, the normalized cuts algorithm is used to partition the constructed graph into several sub-graphs by globally finding minimum cuts. In the experiments, the proposed method showed better performance than the on-line PCA-based method and the off-line GMM-based method, as the proposed method globally segments motions from a graph constructed from similarities between neighboring frames as well as similarities among all frames within temporal distances.
Keywords: Capture Devices, High-Level Motions, Motion Segmentation, Normalized Cuts.

click
http://hci.ssu.ac.kr/ajpark/[IC]MotionSegmentation.pdf
to download the paper.
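
The graph construction and partitioning described in the abstract can be sketched as follows. This is a hedged illustration rather than the paper's method: the Gaussian similarity on Euclidean distance, the parameters `max_dt` and `sigma`, and the function names are assumptions, and the split uses the standard spectral relaxation of a two-way normalized cut (sign of the second-smallest generalized eigenvector), applied once rather than recursively.

```python
import numpy as np

def frame_affinity(frames, max_dt=10, sigma=1.0):
    """Similarity between every pair of frames at most max_dt steps apart."""
    frames = np.asarray(frames, dtype=float)
    n = len(frames)
    d2 = ((frames[:, None, :] - frames[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))  # assumed Gaussian similarity
    t = np.arange(n)
    w[np.abs(t[:, None] - t[None, :]) > max_dt] = 0.0  # drop edges beyond the temporal distance
    return w

def two_way_ncut(w):
    """Split the graph in two using the relaxed normalized cut criterion."""
    d = w.sum(axis=1)  # vertex degrees (positive, since each frame is similar to itself)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    indicator = d_inv_sqrt * vecs[:, 1]  # second-smallest generalized eigenvector
    return indicator > 0
```

Applying `two_way_ncut` again to each resulting sub-graph would yield more than two segments; which side is labeled `True` is arbitrary, since the sign of an eigenvector is not fixed.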

2008.

[IC] Real-Time Vision-based Korean Finger Spelling Recognition System

Abstract.

Finger spelling is the art of communicating by signs made with the fingers, and has been introduced into sign language to serve as a bridge between sign language and verbal language. Previous approaches to finger spelling recognition are classified into two categories: glove-based and vision-based approaches. The glove-based approach recognizes hand postures more simply and accurately than the vision-based approach, yet its interfaces require the user to wear a cumbersome device and carry a load of cables connecting the device to a computer. In contrast, the vision-based approaches provide an attractive alternative to the cumbersome interface, and promise more natural and unobtrusive human-computer interaction. The vision-based approaches generally consist of two steps, hand extraction and recognition, and the two steps are processed independently. This paper proposes a real-time vision-based Korean finger spelling recognition system that integrates hand extraction into recognition. First, we tentatively detect a hand region using the CAMShift algorithm. Then, the fill factor and aspect ratio, estimated from the width and height of the detected hand region, are used to choose candidates from the database, which reduces the number of matches in the recognition step. To recognize the finger spelling, we use DTW (dynamic time warping) based on modified chain codes, to be robust to scale and orientation variations. In this procedure, since accurate hand regions, without holes and noise, should be extracted to improve the precision, we use a graph cuts algorithm that globally minimizes the energy function elegantly expressed by Markov random fields (MRFs). In the experiments, the computational times are less than 130 ms, and they do not depend on the number of finger spelling templates in the database, as candidate templates are selected in the extraction step.
Keywords: CAMShift, DTW, Graph Cuts, MRF.

click
http://hci.ssu.ac.kr/ajpark/[IC]GestureRecognition.pdf
to download the paper.
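
The DTW matching step named in the abstract can be sketched as below. The abstract does not define its "modified chain codes", so this minimal Python example substitutes plain 8-direction chain codes with a circular difference as the local cost; that substitution and the function names `chain_cost` and `dtw_distance` are assumptions for illustration only.

```python
def chain_cost(a, b):
    """Circular distance between two 8-direction chain codes (values 0..7)."""
    d = abs(a - b) % 8
    return min(d, 8 - d)

def dtw_distance(seq_a, seq_b, cost=chain_cost):
    """Classic dynamic time warping distance between two code sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # table[i][j] = cheapest alignment cost of seq_a[:i] with seq_b[:j]
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(seq_a[i - 1], seq_b[j - 1])
            table[i][j] = step + min(table[i - 1][j],      # repeat an element of seq_b
                                     table[i][j - 1],      # repeat an element of seq_a
                                     table[i - 1][j - 1])  # advance both sequences
    return table[n][m]
```

Because the warping path may repeat elements, two contours traced with different numbers of samples can still align at zero cost, which is what makes the comparison tolerant to changes in scale.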

2008.

[IC] Graph Cuts-based Automatic Color Image Segmentation using Mean Shift Analysis

Abstract.

Graph cuts methods have recently attracted a lot of attention for image segmentation, as they can minimize an energy function composed of a data term estimated in feature space and a smoothness term estimated in the image domain. Although previous approaches using graph cuts have shown good performance for image segmentation, they obtained the prior information needed to estimate the data term manually, so automatic image segmentation remains an open issue in applications of the graph cuts method. To estimate the data term automatically, a GMM (Gaussian mixture model) is generally used, but it is practicable only for classes with a hyper-spherical or hyper-ellipsoidal shape, as each class is represented by a covariance matrix centered on its mean. For arbitrary-shaped classes, this paper proposes graph cuts-based image segmentation using mean shift analysis. As prior information to estimate the data term, we use the set of mean trajectories toward each mode from initial means randomly selected in the L*u*v* feature space. Since the mean shift procedure requires a great deal of computation time, we transform features in the continuous feature space into a discrete 3D grid, and use a 3D kernel based on the first moment in the grid to move the means to modes. In the experiments, we investigated the problems of normalized cuts-based and mean shift-based segmentation and of graph cuts-based segmentation using a GMM. As a result, the proposed method showed better performance than these three methods on the Berkeley segmentation dataset.

click
http://hci.ssu.ac.kr/ajpark/[IC]ColorImageSegmentation.pdf
to download the paper.

2008.

[IC] Neural Network Implementation using CUDA and OpenMP

Abstract.

Many algorithms for image processing and pattern recognition have recently been implemented on the GPU (graphics processing unit) for faster computation. However, GPU implementations encounter two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between the CPU and GPU, which is usual in image processing and pattern recognition contrary to the graphics area, the CPU should generate as much raw feature data for GPU processing as possible to effectively utilize GPU performance. This paper proposes a quicker and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (compute unified device architecture), which can be easily programmed due to its simple C language-like style, instead of GPGPU. Moreover, OpenMP (Open Multi-Processing) is used to concurrently process multiple data with a single instruction on the multi-core CPU, which results in effectively utilizing the memories of the GPU. In the experiments, we implemented a neural network-based text detection system using the proposed architecture, and it ran about 15 times faster than an implementation using the CPU and about 4 times faster than an implementation on the GPU only, without OpenMP.
click
http://hci.ssu.ac.kr/ajpark/[IC]CUDAforNN.pdf
to download the paper.

2008.