2008-10-01

[IJ] Graph Cuts-based Automatic Color Image Segmentation using Mean Shift Analysis

Under Review
(IEEE Transactions on Systems, Man and Cybernetics, Part B - SCI)

Abstract.

A graph cuts method has recently attracted a lot of attention for image segmentation, as it can provide exact globally optimal solutions for certain energy functions composed of data (regional) terms estimated in feature space and smoothness (boundary) terms estimated in the image domain. Although previous approaches using graph cuts have shown good segmentation performance, they relied on manually obtained prior information to estimate the data term. To estimate the data term automatically, Gaussian mixture models (GMMs) are generally used, but they are practicable only for classes with a hyper-spherical or hyper-ellipsoidal shape in feature space, since each class is represented by a covariance matrix centered on its mean. For arbitrarily shaped classes, this paper proposes a graph cuts-based automatic image segmentation method using mean shift analysis. We use the set of mean trajectories toward each mode from initial means as prior information to estimate the data term, and data not included in this prior set are handled by a smoothness term that preserves discontinuities in the image domain. A graph cuts method is then used to globally optimize the energy function. The main drawback of the mean shift procedure is its heavy computational cost. To address this, we quantize the continuous feature space (the L*u*v* color space in this paper) into a discrete 3D grid and use a 3D kernel based on the first moment to move the means toward modes on the grid. In the experiments, we examined the weaknesses of mean shift-based and normalized cuts-based image segmentation, two recently popular methods, as well as graph cuts-based automatic segmentation using GMMs. The proposed method outperformed all three on the Berkeley Segmentation Dataset.
Keywords: Color Image Segmentation, Graph Cuts, Mean Shift Analysis.
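As a rough illustration of the grid-based acceleration described in the abstract, the Python sketch below quantizes continuous features into a 3D histogram and moves a mean toward a mode by iterating the first moment (count-weighted centroid) inside a cubic window. The window radius, bin counts, and tolerance are illustrative assumptions, not values from the paper.

```python
import numpy as np

def quantize(features, bins, ranges):
    """Histogram continuous features (e.g. L*u*v* triples) into a 3D grid of counts."""
    hist, edges = np.histogramdd(features, bins=bins, range=ranges)
    return hist, edges

def mean_shift_on_grid(hist, start, radius=2, max_iter=100, tol=1e-3):
    """Move a mean to its mode by repeatedly taking the first moment
    (count-weighted centroid) inside a cubic window on the grid."""
    mean = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        lo = np.maximum(np.round(mean).astype(int) - radius, 0)
        hi = np.minimum(np.round(mean).astype(int) + radius + 1, hist.shape)
        window = hist[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
        if window.sum() == 0:
            break
        idx = np.indices(window.shape).reshape(3, -1)
        weights = window.reshape(-1)
        new_mean = (idx * weights).sum(axis=1) / weights.sum() + lo
        if np.linalg.norm(new_mean - mean) < tol:
            mean = new_mean
            break
        mean = new_mean
    return mean
```

Because the grid stores only bin counts, each shift step touches a small fixed window rather than every feature point, which is the source of the speed-up the abstract claims.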

click
http://hci.ssu.ac.kr/ajpark/[SMC]ColorSegmentation.pdf
to download the paper.

2008.

[IJ] Automatic Grouping in Trained SOFM via Graph Cuts

Under Review
(Pattern Recognition Letters - SCIE)

Abstract.

A Self-Organizing Feature Map (SOFM) is an unsupervised neural network and a very powerful tool for classifying and visualizing high-dimensional data sets. However, even though SOFMs have been applied to many engineering problems, post-processing is still required to group similar output neurons of a trained SOFM into classes, and this grouping is almost invariably performed manually. Moreover, existing algorithms that automatically group the neurons of a trained SOFM, such as k-means, do not yield satisfactory results, especially when the grouped data has unrestricted, arbitrary shapes. This paper proposes an automatic grouping method for a trained SOFM that can handle arbitrarily shaped groups using graph cuts. In previous approaches using graph cuts, the graph is constructed manually from prior data supplied by users, which prevents its use in automatic systems. In our method, mode-seeking on a distance matrix obtains the prior data automatically and can also analyze arbitrarily shaped classes. Experimental results demonstrated the effectiveness of the proposed method for texture segmentation, with improved precision rates compared with conventional clustering algorithms.

click
http://hci.ssu.ac.kr/ajpark/[PRL]GroupingofSOFM.pdf
to download the paper.

2008.

[IJ] Flying Cake: Augmented Game on Mobile Devices

Accepted
(ACM Computers in Entertainment)

Abstract.

In the current age of ubiquitous computing with high-bandwidth networks, wearable and hand-held mobile devices with small cameras and wireless communication will be widespread in the near future. Research on augmented games for mobile devices has therefore recently attracted a lot of attention. Most existing augmented games use a traditional ‘backpack’ system and a ‘pattern marker’. However, ‘backpack’ systems are expensive, cumbersome, and inconvenient to use, while a ‘pattern marker’ means the game can only be played at a previously prepared location. Accordingly, this paper proposes an augmented game, called Flying Cake, in which face regions are used to place virtual objects (characters) without a predefined ‘pattern marker’, and the locations of the virtual objects are measured relative to the real world on a small mobile PDA instead of cumbersome hardware. Flying Cake is an augmented shooting game with two playing modes: 1) single player, where the player attacks a virtual character overlaid on images captured by a PDA camera in the physical world; and 2) two players, where each player attacks a virtual character in an image received via a wireless LAN from their opponent. The virtual character is overlaid on a face region obtained using a real-time face detection technique. As a result, Flying Cake provides an exciting experience based on a new game paradigm in which the user interacts with both the physical world captured by a PDA camera and the virtual world.

Keywords: Augmented Game, Mobile Vision, Face Detection, 3D Augmented Shooting Game, CAMShift.

click
http://hci.ssu.ac.kr/ajpark/[CIE]FlyingCake_Final.pdf
to download the paper.

2008.

[IJ] PDA-based Text Extraction System using a Pipelined Client/Server Architecture

Under Review
(Journal of Zhejiang University, SCIENCE A - SCIE)

Abstract.

Recently, much research on mobile vision using personal digital assistants (PDAs) has been attempted. However, many PDA CPUs are integer-only, which makes computer vision algorithms slow. To implement a real-time text extraction system on a PDA, we propose a fast and efficient pipelined client (PDA)/server (PC) architecture with load balancing. The client extracts tentative text regions using edge density, which also shortens transmission time because the resulting images are well suited to JPEG encoding. The server improves accuracy by performing definitive text extraction using a multi-layer perceptron and connected component analysis, and it processes only the client's tentative regions to reduce its own processing time. We construct a pipelined client/server architecture for sequential images so that the client and server stages run in parallel. Moreover, the proposed architecture can balance the client and server loads by regulating the amount of tentative text regions transferred from the client to the server. As a result, we enhance the processing speed of the overall architecture by reducing data transmission time and the idle time of both processors.

Keywords: Mobile Vision, Text Extraction, MLP.
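The client-side edge-density stage can be sketched as follows in Python: compute a simple gradient-based edge map, then flag fixed-size windows whose fraction of edge pixels exceeds a threshold as tentative text regions. The gradient operator, window size, and both thresholds are illustrative assumptions rather than the paper's actual parameters.

```python
import numpy as np

def tentative_text_regions(gray, win=16, density_thresh=0.15, edge_thresh=40):
    """Flag windows whose edge density suggests text (client-side sketch).
    Edges come from simple horizontal/vertical differences."""
    gx = np.abs(np.diff(gray.astype(int), axis=1))
    gy = np.abs(np.diff(gray.astype(int), axis=0))
    edges = np.zeros(gray.shape, dtype=bool)
    edges[:, :-1] |= gx > edge_thresh
    edges[:-1, :] |= gy > edge_thresh
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            # text regions are dense in edges; flat background is not
            if edges[y:y+win, x:x+win].mean() > density_thresh:
                mask[y:y+win, x:x+win] = True
    return mask
```

Masking out low-density windows before JPEG compression is also what makes the transmitted images cheap to encode, since the discarded regions are smooth.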

click
http://hci.ssu.ac.kr/ajpark/[IJ]PDAbased.pdf
to download the paper.

[IC] Graph-based High Level Motion Segmentation using Normalized Cuts

Abstract.

Motion capture devices have been utilized in producing content such as movies and video games. However, since motion capture devices are expensive and inconvenient to use, motions segmented from captured data are recycled and synthesized for reuse in other content, yet these motions have generally been segmented manually by content producers. Automatic motion segmentation has therefore recently attracted a lot of attention. Previous approaches divide into on-line and off-line: on-line approaches segment motions based on similarities between neighboring frames, while off-line approaches segment motions by capturing global characteristics in feature space. In this paper, we propose a graph-based high-level motion segmentation method. Since high-level motions consist of frames repeated within a temporal distance, we consider all similarities among frames within that distance. This is achieved by constructing a graph in which each vertex represents a frame and the edges between frames are weighted by their similarity. The normalized cuts algorithm is then used to partition the constructed graph into several sub-graphs by globally finding minimum cuts. In the experiments, the proposed method outperformed a PCA-based on-line method and a GMM-based off-line method, as it segments motions globally from a graph built on similarities between neighboring frames as well as similarities among all frames within the temporal distance.
Keywords: Capture Devices, High-Level Motions, Motion Segmentation, Normalized Cuts.
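A minimal sketch of the graph construction and a single two-way normalized cut, in the spirit of Shi and Malik's spectral relaxation: affinities are computed only between frames within a temporal distance, and the second-smallest eigenvector of the normalized Laplacian is thresholded to bipartition the frames. The Gaussian affinity, `sigma`, and the median threshold are assumptions for illustration; the paper may differ in all of these.

```python
import numpy as np

def two_way_ncut(frames, temporal_dist=10, sigma=1.0):
    """Bipartition frames with one normalized cut (spectral relaxation sketch).
    Affinity is nonzero only between frames within `temporal_dist`."""
    n = len(frames)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(max(0, i - temporal_dist), min(n, i + temporal_dist + 1)):
            d = np.linalg.norm(np.asarray(frames[i]) - np.asarray(frames[j]))
            W[i, j] = np.exp(-d**2 / (2 * sigma**2))
    deg = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    L_sym = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L_sym)               # ascending eigenvalues
    fiedler = vecs[:, 1]                             # 2nd-smallest eigenvector
    return fiedler > np.median(fiedler)
```

Recursively applying such cuts to each sub-graph yields the multi-segment partition the abstract describes.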

Click
http://hci.ssu.ac.kr/ajpark/[IC]MotionSegmentation.pdf
to download the paper.

2008.

[IC] Real-Time Vision-based Korean Finger Spelling Recognition System

Abstract.

Finger spelling is the art of communicating with signs made by the fingers, and it has been introduced into sign language to serve as a bridge between sign language and verbal language. Previous approaches to finger spelling recognition fall into two categories: glove-based and vision-based. Glove-based approaches recognize hand postures more simply and accurately than vision-based ones, yet their interfaces require the user to wear a cumbersome glove and carry a load of cables connecting the device to a computer. In contrast, vision-based approaches provide an attractive alternative to this cumbersome interface and promise more natural and unobtrusive human-computer interaction. Vision-based approaches generally consist of two independently processed steps: hand extraction and recognition. This paper proposes a real-time vision-based Korean finger spelling recognition system that integrates hand extraction into recognition. First, we tentatively detect a hand region using the CAMShift algorithm. The fill factor and aspect ratio estimated from the width and height of the detected hand region are then used to choose candidates from the database, which reduces the number of matches in the recognition step. To recognize the finger spelling, we use DTW (dynamic time warping) based on modified chain codes, which is robust to scale and orientation variations. Since accurate hand regions, free of holes and noise, should be extracted to improve precision, we use a graph cuts algorithm that globally minimizes an energy function elegantly expressed with Markov random fields (MRFs). In the experiments, the computational time was less than 130 ms and was independent of the number of finger spelling templates in the database, as candidate templates are selected in the extraction step.
Keywords: CAMShift, DTW, Graph Cuts, and MRF.
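The DTW matching over chain codes can be sketched as the standard dynamic program below. The circular cost function, which treats directions 0 and 7 of an 8-direction chain code as neighbors, is an assumption standing in for the paper's "modified chain codes"; the recurrence itself is textbook DTW.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW between two 8-direction chain codes. The local cost is the
    circular difference, so directions 0 and 7 are adjacent."""
    def cost(x, y):
        return min(abs(x - y), 8 - abs(x - y))
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # extend the cheapest of the three admissible warping steps
            D[i, j] = cost(a[i-1], b[j-1]) + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    return D[n, m]
```

Because DTW aligns sequences of different lengths, the same template can match a contour traced at a different scale, which is the robustness the abstract refers to.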

click
http://hci.ssu.ac.kr/ajpark/[IC]GestureRecognition.pdf
to download the paper.

2008.

[IC] Graph Cuts-based Automatic Color Image Segmentation using Mean Shift Analysis

Abstract.

A graph cuts method has recently attracted a lot of attention for image segmentation, as it can minimize an energy function composed of a data term estimated in feature space and a smoothness term estimated in the image domain. Although previous approaches using graph cuts have shown good segmentation performance, they relied on manually obtained prior information to estimate the data term, so automatic image segmentation remains an open issue for applications of graph cuts. To estimate the data term automatically, a GMM (Gaussian mixture model) is generally used, but it is practicable only for classes with a hyper-spherical or hyper-ellipsoidal shape, since each class is represented by a covariance matrix centered on its mean. For arbitrarily shaped classes, this paper proposes graph cuts-based image segmentation using mean shift analysis. As prior information to estimate the data term, we use the set of mean trajectories toward each mode from initial means randomly selected in the L*u*v* feature space. Since the mean shift procedure is computationally expensive, we quantize the continuous feature space into a discrete 3D grid and use a 3D kernel based on the first moment to move the means toward modes on the grid. In the experiments, we examined the weaknesses of normalized cuts-based and mean shift-based segmentation and of graph cuts-based segmentation using a GMM. As a result, the proposed method outperformed all three on the Berkeley Segmentation Dataset.
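The global optimization behind graph cuts rests on the max-flow/min-cut equivalence: the minimum s-t cut of the constructed graph equals the maximum flow from source to sink. The Python sketch below uses Edmonds-Karp on a dense capacity matrix purely to illustrate that equivalence; production graph-cuts segmenters use specialized solvers (e.g. the Boykov-Kolmogorov algorithm) on sparse grid graphs, not this.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp max-flow on an adjacency matrix. By max-flow/min-cut,
    the returned value equals the cost of the minimum s-t cut."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and capacity[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            break  # no augmenting path: flow is maximal, cut is minimal
        v, bottleneck = t, float('inf')
        while v != s:
            u = parent[v]
            bottleneck = min(bottleneck, capacity[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += bottleneck
            flow[v][u] -= bottleneck  # residual edge for later undo
            v = u
        total += bottleneck
    return total
```

In segmentation, terminal edge weights encode the data term and neighbor edge weights the smoothness term, so the minimum cut is exactly the minimum of the energy function.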

click
http://hci.ssu.ac.kr/ajpark/[IC]ColorImageSegmentation.pdf
to download the paper.

2008.

[IC] Neural Network Implementation using CUDA and OpenMP

Abstract.

Many algorithms for image processing and pattern recognition have recently been implemented on the GPU (graphics processing unit) for faster computation. However, GPU implementation faces two problems. First, the programmer must master the fundamentals of graphics shading languages, which require prior knowledge of computer graphics. Second, in a job that needs much cooperation between the CPU and GPU, which is common in image processing and pattern recognition, unlike pure graphics work, the CPU should generate as much raw feature data for GPU processing as possible to utilize GPU performance effectively. This paper proposes a quicker and more efficient implementation of neural networks on both the GPU and a multi-core CPU. To solve the first problem, we use CUDA (Compute Unified Device Architecture), which can be programmed easily thanks to its simple C-like style, instead of general GPGPU shader programming. Moreover, OpenMP (Open Multi-Processing) is used to process multiple data concurrently with a single instruction on the multi-core CPU, which effectively keeps the GPU's memory utilized. In the experiments, we implemented a neural network-based text detection system using the proposed architecture; it ran about 15 times faster than a CPU-only implementation and about 4 times faster than a GPU-only implementation without OpenMP.
click
http://hci.ssu.ac.kr/ajpark/[IC]CUDAforNN.pdf
to download the paper.

2008.

2008-09-30

[IC] Clustering of Trained Self-Organizing Feature Maps based on s-t Graph Cuts

Abstract.

The Self-Organizing Feature Map (SOFM), an unsupervised neural network, is a very powerful tool for clustering and visualizing high-dimensional data sets. Although the SOFM has been applied to many engineering problems, similar weights on the trained SOFM still need to be clustered into one class as a post-processing step, which is performed manually in many cases. Traditional clustering algorithms such as k-means can be applied to the trained SOFM, but they do not yield satisfactory results, especially when clusters have arbitrary shapes. This paper proposes automatic clustering of a trained SOFM via graph cuts, which can both handle arbitrary cluster shapes and be globally optimized. When using graph cuts, the graph must have two additional nodes, called terminals, and the weights between the terminals and the graph nodes are generally set from data obtained manually from users. The proposed method sets these weights automatically based on mode-seeking on a distance matrix. Experimental results demonstrated the effectiveness of the proposed method for texture segmentation: it improved precision rates over traditional clustering algorithms, as it handles arbitrary cluster shapes through graph-theoretic clustering and globally optimizes the clustering of the trained SOFM by graph cuts.

click
http://hci.ssu.ac.kr/ajpark/[MLDM]Clustering.pdf
to download the paper.

2007.

[IC] Edge-based Eye Region Detection in Rotated Face using Global Orientation Histogram

Abstract.

Automatic human face analysis and recognition have become among the most important research topics in the robotics community, and research on automatic eye region detection has recently attracted a lot of attention, as eyes are the most important facial feature. Although much effort has been spent, automatic eye region detection remains challenging, as most existing methods focus on eye detection in frontal faces without considering factors such as non-frontal faces and lighting conditions. This paper proposes a fast, edge-based eye region detection method for faces rotated around the front-to-back axis, called rotated faces. The proposed method consists of two steps: rectifying the face to frontal, and then detecting the eye regions. The rotation angle of a face region is estimated by analyzing a histogram accumulated from the edge orientations of the face region, called the global orientation histogram. The rotated face is rectified to a frontal face based on the estimated angle, and the eye regions are then detected in the frontal face by analyzing the edge orientation histograms of components formed by grouping adjacent edges, called local orientation histograms, an approach already verified in previous work. Experimental results on 300 face images provided by the Weizmann Institute of Science demonstrated the effectiveness of the proposed method, which achieved a precision rate of 83.5% and a computational time of 0.5 seconds.
click
http://hci.ssu.ac.kr/ajpark/[ICRA]Edgebased.pdf
to download the paper.

2007.

[IC] e-Sports Live: e-Sports Relay Broadcasting on Demand

Abstract.

Electronic sports (e-Sports) are computer and video games played as competitive sports over the Internet or a local area network, and due to their tremendous recent popularity they are provided to users through TV relay broadcasting systems and the Internet. The main drawback of previous e-Sports relays is that they broadcast matches selected by the provider without regard to viewers' intentions, so viewers cannot watch the information they want. Accordingly, this paper proposes a message-based semi-interactive relay broadcasting system over the Internet called e-Sports Live. The proposed system captures all regions of a game in play and transfers them to a client viewer; the client can watch the desired information by selecting a slice of these regions. However, continually transferring all regions to the client over the Internet leads to high traffic because of the data volume. To reduce the traffic, the system transfers the messages generated in the game, such as character movements, instead of continually transferring all regions; resources used repeatedly in the game, such as the whole map and the game characters, are transferred to the client viewer once, in an initial stage. Consequently, the proposed system reduces Internet traffic by transferring only messages, and supports semi-interaction by letting viewers watch desired information, as the client viewer reconstructs all regions of the game from the repeatedly used resources and the messages. At present our system cannot be applied to TV relay broadcasting, as TV systems lack buffers for the resources, but it may be utilized in a variety of interactive TV applications.
click
http://hci.ssu.ac.kr/ajpark/LNCS_e_Sports.pdf
to download the paper.

2007.

[IC] Automatic Word Detection System for Document Image using Mobile Devices

Abstract.

In the current age of ubiquitous computing with high-bandwidth networks, wearable and hand-held mobile devices with small cameras and wireless communication will be widespread in the near future. Computer vision and image processing for mobile devices have therefore recently attracted a lot of attention. In particular, detecting image texts, which carry useful information for automatic annotation, indexing, and structuring of images, is a prerequisite for recognition in dictionary applications on camera-equipped mobile devices. To detect image texts on mobile devices with limited computational resources, recent works follow two methodologies: texts are detected not automatically but manually, with a stylus pen, to reduce the computational load; or a server is used to carry out the detection, which requires many floating-point computations. The main disadvantage of the manual method is that users directly select tentative text regions, so recall and precision rates are determined by the selected regions. The second, automatic method is difficult to run in real time because of the transmission time between the mobile device and the server. Accordingly, this paper proposes a real-time automatic word detection system that needs no server support. To minimize computational time, the system targets one word in the central region of the image. The word region is tentatively extracted using edge density and window transitions, and the tentatively extracted region is then verified by measuring the uniformity of the distribution among its sub-windows. In the experiments, the proposed method showed high precision rates for one word in the central region of the image, with fast computational times on mobile devices.
click
http://hci.ssu.ac.kr/ajpark/LNCS_AutomaticWord.pdf
to download the paper.

2007.

[IC] Augmented Galaga on Mobile Devices

Abstract.

Recently, research on augmented games as a new game genre has attracted a lot of attention. An augmented game overlays virtual objects on an augmented reality (AR) environment, allowing users to interact with the AR environment by manipulating real and virtual objects. However, it is difficult to bring existing augmented games to ordinary gamers, as these games generally use very expensive and inconvenient ‘backpack’ systems. Accordingly, this paper introduces an augmented game called Augmented Galaga, based on the well-known traditional Galaga and executed on mobile devices, so that players can experience the game without any economic burden. Augmented Galaga uses real space, such as a room, as the game environment, and covers this wide real space with the small screen of a mobile device by physically moving the device. In an initial stage, specific objects are selected by the players; during play they are recognized automatically using scale-invariant features. Virtual insect-like aliens then appear randomly in several specific regions displayed on the mobile screen, and players attack them by moving the device toward a region with an alien and clicking a button. As a result, we expect Augmented Galaga to provide an exciting experience without economic burdens, based on a game paradigm in which the user interacts with both the physical world captured by the mobile camera and the virtual aliens generated automatically by the mobile device.
click
http://hci.ssu.ac.kr/ajpark/Augmented_Galaga.pdf
to download the paper.

2007.

[IC] Automatic Cartoon Image Re-authoring using SOFM

Abstract.

With the growth of the mobile industry, a lot of on/off-line content is being converted into mobile content. Although cartoons are among the most popular mobile content, it is difficult to deliver existing on/off-line content to users unmodified because of the small size of the mobile screen. In existing approaches to this problem, cartoon content for mobile devices is produced manually with software such as Photoshop. In this paper, we automatically produce cartoon content that fits the small screen, and introduce a clustering method applicable to various types of cartoon images as a prerequisite stage for preserving semantic meaning. Texture information, which is useful for gray-scale image segmentation, gives a good clue for semantic analysis, and self-organizing feature maps (SOFMs) are used to cluster similar texture information. In addition, we automatically segment the clustered SOFM outputs using agglomerative clustering. In our experimental results, the combined approach shows good clustering results on several cartoons.

click
http://hci.ssu.ac.kr/ajpark/LNCS_MRCS_Automatic.pdf
to download the paper.

2006.

[IC] Effective Image Retrieval for the M-Learning System

Abstract.

In this paper, we propose augmented learning contents (ALC) with blended learning on mobile devices. ALC augments on-line contents by indexing the corresponding off-line contents using traditional pattern recognition methods, which minimizes the labor of conversion. Among pattern recognition methods, the marker-based approach is one of the most general, but it requires reconstructing the off-line contents with pattern markers. To avoid two drawbacks, the use of pattern markers and the difficulty of color-based image retrieval with a low-resolution PDA camera, we use a shape-based system. Content-based image retrieval (CBIR) based on object shapes is used instead of pattern markers to link off-line contents with on-line ones, and shapes are represented by a differential chain code with estimated new starting points to obtain a rotation-invariant representation, which suits the low computational resources of mobile devices. Consequently, ALC can provide learners with fast and accurate multimedia contents (video, audio, text) for static off-line contents using mobile devices, without space limitations.
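The rotation-invariant shape representation mentioned above can be sketched in Python as follows: the differential chain code (first difference modulo the number of directions) cancels out rotation, and normalizing the starting point removes the dependence on where the contour trace begins. Choosing the lexicographically smallest rotation as the canonical starting point is an assumption standing in for the paper's "estimated new starting points".

```python
def differential_chain_code(code, directions=8):
    """First difference of a closed chain code (mod the direction count).
    Rotating the shape shifts every code value equally, so the
    differences are unchanged."""
    n = len(code)
    return [(code[(i + 1) % n] - code[i]) % directions for i in range(n)]

def canonical(code):
    """Starting-point normalization: the lexicographically smallest
    cyclic rotation of the code (an illustrative convention)."""
    rotations = [code[i:] + code[:i] for i in range(len(code))]
    return min(rotations)
```

With both steps applied, the same contour traced from any starting pixel at any 45-degree rotation maps to one signature, which is what makes template matching on a PDA cheap.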

click
http://hci.ssu.ac.kr/ajpark/LNCS_ICADL_Effective.pdf
to download the paper.

2006.

[IC] Image Texts-based Navigation for Augmented Game

Abstract.

In an augmented game, where virtual objects are overlaid on a real environment and attacked, accurate location estimation in the real environment is an important issue. Existing global positioning systems (GPS) for tracking users' positions do not work inside buildings, and systems using sensors such as Active Badge are expensive to install and maintain. Therefore, low-cost vision-based navigation systems have been investigated. Since most indoor scenes consist of a floor, ceiling, and walls, it is difficult to characterize them distinctively. We propose an image matching method for navigation that uses image texts instead of the objects found uniformly in natural scenes. Image texts are widely distributed in our environments, are very useful for describing the contents of an image, and can be easily extracted compared with other semantic contents; we obtain them using a method combining edge density and multi-layer perceptrons with CAMShift. However, since a camera attached to a moving vehicle (robot) or hand-held device has low resolution, extraction by binarization and text recognition is not easy. Therefore, we perform image matching using a matching window based on the scale and orientation of image texts and their neighborhood, to recognize distinct places containing the same image texts.

click
http://hci.ssu.ac.kr/ajpark/LNCS_EDUTAINMENT_Image.pdf
to download the paper.

2006.

[IC] Flying Cake: Augmented Game on Mobile Devices

Abstract.

In the ubiquitous computing age, which uses high-bandwidth networks, mobile devices such as wearable and hand-held ones with a small camera and a wireless communication module will be widely used in the near future. A lot of research on augmented games for mobile devices has therefore been attempted recently. Existing augmented games use a traditional ‘backpack’ system and a ‘pattern marker’. The ‘backpack’ system is expensive, cumbersome, and inconvenient to use, and a game using a ‘pattern marker’ can only be played in a previously prepared place. In this paper, we propose an augmented game called Flying Cake that uses a face region to create virtual objects (characters) without a predefined ‘pattern marker’, measuring the location of the virtual objects relative to the real world, on a small, light, mobile PDA instead of cumbersome hardware. We augment the virtual character on the face region with a face detection technique that combines a skin-color model with the CAMShift algorithm, which detects face regions effectively on a PDA with low computational resources. Flying Cake offers players new enjoyment through a new game paradigm in which the user in the physical world interacts with a virtual character in the virtual world through the camera attached to the PDA.
Keywords: Augmented Game, Face Detection, 3D Augmented Shooting Game, Mobile Vision.

click
http://hci.ssu.ac.kr/ajpark/CGAMES_Flying.pdf
to download the paper.

2005.

[IC] Automatic Conversion System for Mobile Cartoon Contents

Abstract.

As the production of mobile content increases and many people use it, existing mobile content providers manually split cartoons into frame images fitted to the screens of mobile devices, which takes much time and is very expensive. This paper proposes an Automatic Conversion System (ACS) for mobile cartoon content. It automatically converts existing cartoon content into mobile cartoon content using image processing as follows: 1) a scanned cartoon image is segmented into frames by structure layout analysis; 2) the frames are split at regions that do not contain the semantic structure of the original image; 3) texts are extracted from the split frames and placed at the bottom of the screen. Our experiments show that the proposed ACS is more efficient than existing methods for providing mobile cartoon content.
click
http://hci.ssu.ac.kr/ajpark/LNCS_ICADL_Automatic.pdf
to download the paper.

2005.

[IC] Contents Recycling using Content-based Image Retrieval on Mobile Devices

Abstract.

Although many studies have addressed mobile learning, content-based image recycling on mobile devices is not well known. This paper presents a new approach that recycles and augments existing off-line contents using a camera-equipped mobile device. Each learner has a PDA and an off-line textbook (Picture English Book, PEB). During the PEB-watching learning activity, users are dynamically provided with on-line information, such as texts, videos, and audio, corresponding to the off-line contents via the PDA. A content-based image retrieval (CBIR) system is constructed to provide learners with the required information using image recognition and multimedia technologies, so that the objective of m-learning can be achieved. We believe it is worth developing a mobile learning system that provides learners with a new educational environment that can recycle existing PEBs.

click
http://hci.ssu.ac.kr/ajpark/LNCS_CIVR_Contents.pdf
to download the paper.

2005.

[IC] Intelligent Document Scanning with Active Camera

Abstract.

Document scanning is important as a prerequisite stage for analysis and recognition. Recently, much research on document image acquisition with a camera has been attempted, and the camera can be an alternative input device for document scanning if problems such as low resolution can be solved. We use image registration to overcome the low resolution of a camera. An ordinary image registration method needs pre-processing, such as camera calibration, to reduce distortions in the composite, and therefore incurs extra running time. In this paper, we propose a component-based image registration method focused on reducing distortions and acquiring a seamless image using a PTZ (pan-tilt-zoom) camera without pre-processing. Since we divide the input document image into components using a text-specific characteristic, the method reduces object (text) distortions in the composite, and we save the extra running time because the method needs no such pre-processing.

click
http://hci.ssu.ac.kr/ajpark/ICDAL_Intelligent.pdf
to download the paper.

2005.

[IC] PDA-based Text Localization System using Client/Server Architecture

Abstract.

Recently, several image processing results have been proposed for mobile vision systems. Many CPUs for personal digital assistants (PDAs) are integer CPUs with no floating-point unit, which makes algorithms built on neural networks, with their heavy floating-point computation, slow. To resolve this weakness, we propose an effective text localization system with a client (PDA)/server (PC) architecture connected via a wireless LAN. The client (PDA) compresses tentative text localization results in JPEG format to minimize transmission time to the server (PC). The server (PC) uses both a multi-layer perceptron (MLP)-based texture classifier and connected components (CCs)-based filtering for precise text localization based on the client's tentative extraction results. The proposed method achieves not only faster running time but also efficient text localization.

Click
http://hci.ssu.ac.kr/ajpark/LNCS_PRICAI_PDA-based.pdf
to download the paper.

Aug. 2004.