2008-09-30

[IC] Clustering of Trained Self-Organizing Feature Maps based on s-t Graph Cuts

Abstract.

The Self-Organizing Feature Map (SOFM), an unsupervised neural network, is a powerful tool for clustering and visualizing high-dimensional data sets. Although the SOFM has been applied to many engineering problems, similar weights on the trained SOFM must be grouped into classes as a post-processing step, which in many cases is performed manually. Traditional clustering algorithms such as k-means can be applied to the trained SOFM, but they do not yield satisfactory results, especially when clusters have arbitrary shapes. This paper proposes automatic clustering of the trained SOFM via graph cuts, which can both handle arbitrary cluster shapes and be globally optimized. When using graph cuts, the graph must have two additional nodes, called terminals, and the weights between the terminals and the nodes of the graph are generally set based on data obtained manually from users. The proposed method sets these weights automatically, based on mode-seeking on a distance matrix. Experimental results demonstrated the effectiveness of the proposed method in texture segmentation: it improved precision rates compared with traditional clustering algorithms, as it can handle arbitrary cluster shapes through graph-theoretic clustering and globally optimizes the clustering of the trained SOFM by graph cuts.
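The clustering step the abstract describes can be sketched in a few lines: treat the trained SOFM prototypes as graph nodes, connect them with similarity weights, attach two terminals to automatically chosen seed nodes, and take the s-t minimum cut. The prototype vectors, the Gaussian similarity, and the farthest-pair seeding below are illustrative assumptions standing in for the paper's mode-seeking step, not the authors' exact formulation.

```python
# Hedged sketch: cluster trained SOFM prototype vectors by an s-t minimum
# cut.  SOM training is omitted; prototypes, similarity weighting, and seed
# selection are illustrative assumptions.
import math
from collections import deque

prototypes = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0),   # cluster A
              (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]   # cluster B

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

n = len(prototypes)
S, T = n, n + 1                       # the two terminal nodes
cap = [[0.0] * (n + 2) for _ in range(n + 2)]

# n-links: Gaussian similarity between prototypes
for i in range(n):
    for j in range(i + 1, n):
        w = math.exp(-dist(prototypes[i], prototypes[j]) ** 2)
        cap[i][j] = cap[j][i] = w

# t-links: the paper sets these automatically via mode-seeking on the
# distance matrix; here we seed the two prototypes farthest apart
# (an assumed stand-in for that step) with large pinning capacities.
seed_a, seed_b = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                     key=lambda ij: dist(prototypes[ij[0]], prototypes[ij[1]]))
cap[S][seed_a] = cap[seed_b][T] = float(n)

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a dense capacity matrix."""
    flow = 0.0
    while True:
        parent = {s: s}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in range(len(cap)):
                if v not in parent and cap[u][v] > 1e-9:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow, parent       # parent keys = s-side of the min cut
        path, v = [], t               # reconstruct the augmenting path
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:             # push flow, update residual graph
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

_, reachable = max_flow(cap, S, T)
labels = [0 if i in reachable else 1 for i in range(n)]
```

Here the cut separates the two natural groups of prototypes; in the paper the same machinery runs on the full trained map, with terminal weights derived from mode-seeking on the distance matrix.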

click
http://hci.ssu.ac.kr/ajpark/[MLDM]Clustering.pdf
to download the paper.

2007.

[IC] Edge-based Eye Region Detection in Rotated Face using Global Orientation Histogram

Abstract.

Automatic human face analysis and recognition has become one of the most important research topics in the robotics community, and research on automatic eye region detection has recently attracted a lot of attention, as the eyes are the most important feature of the human face. Although much effort has been spent, automatic eye region detection remains challenging, as most existing methods focus on eye detection in frontal faces without considering factors such as non-frontal faces and lighting conditions. This paper proposes an eye region detection method for faces rotated around the front-to-back axis (called rotated faces), based on edge information, which yields fast computation times. The proposed method consists of two steps: making the face frontal, and then detecting the eye regions. The rotation angle of a face region is estimated by analyzing a histogram accumulated from the edge orientations of the face region, called the global orientation histogram. The rotated face is transformed into a frontal face based on the estimated angle, and the eye regions are then detected in the frontal face by analyzing the edge orientation histograms of components formed by grouping adjacent edges, called local orientation histograms, an approach already verified experimentally in previous work. Experimental results on 300 face images provided by the Weizmann Institute of Science demonstrated the effectiveness of the proposed method, which achieved a precision rate of 83.5% and a computation time of 0.5 seconds.
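The angle-estimation step can be illustrated with a toy global orientation histogram: the dominant edge orientation of a frontal face shifts by the in-plane rotation angle, so comparing histogram peaks yields an angle estimate. The synthetic edge orientations and the 10-degree bins below are assumptions for illustration, not the paper's parameters.

```python
# Hedged sketch of rotation estimation from a global orientation histogram.
# Edge orientations are treated modulo 180 degrees.

BIN = 10  # histogram bin width in degrees (assumed)

def orientation_histogram(angles_deg):
    hist = [0] * (180 // BIN)
    for a in angles_deg:
        hist[int(a % 180) // BIN] += 1
    return hist

def dominant_angle(angles_deg):
    hist = orientation_histogram(angles_deg)
    peak = max(range(len(hist)), key=hist.__getitem__)
    return peak * BIN + BIN // 2          # centre of the peak bin

# Synthetic frontal face: mostly horizontal edges (eyes, brows, mouth).
frontal = [0, 2, 178, 1, 3, 179, 0, 90, 45, 1]
# The same face rotated by 20 degrees in the image plane.
rotated = [(a + 20) % 180 for a in frontal]

estimated_rotation = dominant_angle(rotated) - dominant_angle(frontal)
```

With the peak shift recovered, the face can be counter-rotated to frontal before the local orientation histograms are applied to find the eyes.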
click
http://hci.ssu.ac.kr/ajpark/[ICRA]Edgebased.pdf
to download the paper.

2007.

[IC] e-Sports Live: e-Sports Relay Broadcasting on Demand

Abstract.

Electronic Sports (e-Sports) refers to computer and video games played as competitive sports over the Internet or a local area network, and has recently been provided to users through TV relay broadcasting systems and the Internet due to its tremendous popularity. The main drawback of previous e-Sports relays is that they broadcast the matches selected by the provider regardless of the viewers' intentions, so viewers cannot watch the information they want. Accordingly, this paper proposes a message-based semi-interactive relay broadcasting system over the Internet, called e-Sports Live. The proposed system captures all regions of a game in play and transfers them to a client's viewer, where the client can watch desired information by selecting a slice of those regions. However, continually transferring all regions to the client over the Internet causes high traffic due to the large amount of data. To reduce the traffic, the system instead transfers all messages generated in the game, such as character movements, relying on resources used repeatedly in the game, such as the whole map and the game characters, which are transferred to the client's viewer once, in an initial stage. Consequently, the proposed system reduces Internet traffic by transferring only messages, and supports semi-interaction by letting viewers watch desired information, as the client's viewer reconstructs everything that happens in the game from the repeatedly used resources and the messages. At present, the system cannot be applied to TV relay broadcasting, as TV sets have no buffers for the resources, but it may be useful in a variety of interactive TV applications.
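The traffic saving the abstract claims can be sketched by comparing the two relay strategies: stream every frame region, or send resources once and then only small event messages. All sizes and message formats below are made-up numbers for illustration, not measurements from the system.

```python
# Hedged sketch comparing the naive frame relay with the message-based relay.
# FRAME_BYTES and RESOURCE_BYTES are assumed values, not measured ones.
import json

FRAME_BYTES = 50_000          # assumed size of one transferred frame region
RESOURCE_BYTES = 2_000_000    # assumed one-time map/character resources

def frame_relay_traffic(n_frames):
    """Naive relay: every frame region is transferred."""
    return n_frames * FRAME_BYTES

def message_relay_traffic(events):
    """Proposed relay: resources once, then serialized event messages."""
    payload = sum(len(json.dumps(e).encode()) for e in events)
    return RESOURCE_BYTES + payload

# One minute of play at 10 fps vs. two movement messages per second.
events = [{"unit": i % 8, "move": [i % 32, (i * 3) % 32]} for i in range(120)]
naive = frame_relay_traffic(600)
proposed = message_relay_traffic(events)
```

After the one-time resource cost is amortized, each additional second of play costs only a few hundred bytes of messages instead of hundreds of kilobytes of frames.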
click
http://hci.ssu.ac.kr/ajpark/LNCS_e_Sports.pdf
to download the paper.

2007.

[IC] Automatic Word Detection System for Document Image using Mobile Devices

Abstract.

In the current ubiquitous computing age, which uses high-bandwidth networks, wearable and hand-held mobile devices with small cameras and wireless communication will be widespread in the near future. Thus, computer vision and image processing for mobile devices have recently attracted a lot of attention. In particular, detecting image texts, which carry useful information for automatic annotation, indexing, and structuring of images, is a prerequisite stage for recognition in dictionary applications on camera-equipped mobile devices. To detect image texts on mobile devices with limited computational resources, recent works follow two methodologies: the image texts are detected not automatically but manually, using a stylus pen, to reduce the computational load; or a server is used to perform the detection, which requires many floating-point computations. The main disadvantage of the manual method is that users directly select tentative text regions, so the recall and precision rates are determined by the selected regions. The second method, automatic detection on a server, is difficult to perform in real time due to the transmission time between the mobile device and the server. Accordingly, this paper proposes a real-time automatic word detection system that needs no server support. To minimize the computational time, one word in the central region of the image is taken as the target of the system. The word region is tentatively extracted using edge density and window transition, and the tentatively extracted region is then verified by measuring the uniformity of the distribution among its sub-windows. In the experiments, the proposed method showed high precision rates for one word in the central region of the image, with fast computation times on mobile devices.
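The two-stage detection idea above — find a high edge-density candidate, then verify that edges are spread uniformly across its sub-windows — can be sketched on a toy edge map. The map, the window, and the thresholds are illustrative assumptions, not the paper's values.

```python
# Hedged sketch: text regions are edge-dense, and (unlike a single photo
# edge) their edges are spread across all sub-windows of the region.

edge_map = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def density(rows, r0, r1, c0, c1):
    """Fraction of edge pixels in the half-open window [r0,r1) x [c0,c1)."""
    area = (r1 - r0) * (c1 - c0)
    return sum(rows[r][c] for r in range(r0, r1) for c in range(c0, c1)) / area

def verify_uniform(rows, r0, r1, c0, c1, n_sub=2, min_d=0.3):
    """Accept only if every vertical sub-window has enough edges."""
    width = (c1 - c0) // n_sub
    return all(
        density(rows, r0, r1, c0 + k * width, c0 + (k + 1) * width) >= min_d
        for k in range(n_sub))

# Candidate: the dense central band (rows 1-3, columns 1-6).
candidate_ok = (density(edge_map, 1, 4, 1, 7) > 0.5
                and verify_uniform(edge_map, 1, 4, 1, 7))
```

Both functions use only integer arithmetic and small windows, which matches the abstract's constraint of integer-only mobile CPUs.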
click
http://hci.ssu.ac.kr/ajpark/LNCS_AutomaticWord.pdf
to download the paper.

2007.

[IC] Augmented Galaga on Mobile Devices

Abstract.

Recently, research on augmented games as a new game genre has attracted a lot of attention. An augmented game overlays virtual objects on an augmented reality (AR) environment, allowing users to interact with the AR environment by manipulating real and virtual objects. However, it is difficult to release existing augmented games to ordinary gamers, as the games generally use very expensive and inconvenient ‘backpack’ systems. Accordingly, this paper introduces an augmented game called Augmented Galaga, based on the well-known traditional game Galaga and executed on mobile devices, so that players can experience the game without any economic burden. Augmented Galaga uses real space, such as a room, as the game environment, and covers the wide real space with the small screen of a mobile device by physically moving the device. In an initial stage, specific objects are selected by the players and are then recognized automatically using scale-invariant features during play. Virtual insect-like aliens randomly appear in several specific regions displayed on the mobile screen, and players attack them by moving the mobile device towards a region with virtual aliens and clicking a button. As a result, we expect Augmented Galaga to provide an exciting experience without any economic burden, based on a game paradigm in which the user interacts with both the physical world captured by the mobile camera and the virtual aliens automatically generated by the mobile device.
click
http://hci.ssu.ac.kr/ajpark/Augmented_Galaga.pdf
to download the paper.

2007.

[IC] Automatic Cartoon Image Re-authoring using SOFM

Abstract.

With the growth of the mobile industry, a lot of on/off-line contents are being converted into mobile contents. Although cartoons in particular are among the most popular mobile contents, it is difficult to provide users with existing on/off-line contents as-is, due to the small size of the mobile screen. In existing approaches to this problem, the cartoon contents for mobile devices are produced manually with computer software such as Photoshop. In this paper, we automatically produce cartoon contents fitted to the small screen, and introduce a clustering method applicable to various types of cartoon images as a prerequisite stage for preserving semantic meaning. Texture information, which is useful for gray-scale image segmentation, provides a good clue for semantic analysis, and a self-organizing feature map (SOFM) is used to cluster similar texture information. In addition, we automatically segment the clustered SOFM outputs using agglomerative clustering. In our experimental results, the combined approaches show good clustering results on several cartoons.
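The post-processing step above — agglomerative clustering of the SOFM outputs — can be sketched with single-linkage merging on toy prototype vectors; the vectors, the linkage choice, and the stopping threshold are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch: merge SOFM prototype vectors bottom-up until the closest
# pair of clusters is farther apart than a distance threshold.
import math

protos = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),    # one texture cluster
          (1.0, 1.0), (1.0, 0.9)]                # another texture cluster

def single_link(a, b):
    """Smallest pairwise distance between two clusters of points."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, threshold):
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # find the closest pair of clusters
        i, j = min(((i, j) for i in range(len(clusters))
                           for j in range(i + 1, len(clusters))),
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        if single_link(clusters[i], clusters[j]) > threshold:
            break                      # no pair close enough: stop merging
        clusters[i] += clusters.pop(j)
    return clusters

clusters = agglomerate(protos, threshold=0.5)
```

On this toy data the two texture groups survive as two clusters; in the paper the input would be the trained SOFM's weight vectors rather than 2-D points.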

click
http://hci.ssu.ac.kr/ajpark/LNCS_MRCS_Automatic.pdf
to download the paper.

2006.

[IC] Effective Image Retrieval for the M-Learning System

Abstract.

In this paper, we propose augmented learning contents (ALC) with blended learning on mobile devices. The ALC augments on-line contents by indexing the corresponding off-line contents using a traditional pattern recognition method, which minimizes the labor required for conversion. Among pattern recognition methods, the marker-based approach is one of the most common; however, it requires reconstructing the off-line contents with pattern markers. To address both drawbacks — the use of pattern markers, and the difficulty of color-based image retrieval with a low-resolution PDA camera — we use a shape-based system. Content-based image retrieval (CBIR) based on object shapes is used instead of pattern markers to link off-line contents with on-line contents, and shapes are represented by a differential chain code with estimated new starting points to obtain a rotation-invariant representation, which suits the low computational resources of mobile devices. Consequently, the ALC can provide learners with fast and accurate multimedia contents (video, audio, text) for static off-line contents using mobile devices, without space limitations.
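The shape signature mentioned above — a differential chain code with a normalized starting point — can be sketched directly: the first difference of an 8-direction chain code cancels rotations by multiples of 45 degrees, and taking the smallest cyclic rotation removes the dependence on where boundary tracing began. The example chain codes are illustrative, and the minimal-rotation normalization is one common choice for the "estimated new starting point", not necessarily the paper's.

```python
# Hedged sketch of a rotation- and start-point-invariant chain code signature.

def differential(chain):
    """First difference of an 8-direction chain code (modulo 8)."""
    n = len(chain)
    return [(chain[(i + 1) % n] - chain[i]) % 8 for i in range(n)]

def normalize_start(code):
    """Smallest cyclic rotation of the code sequence (the 'shape number')."""
    rotations = [code[i:] + code[:i] for i in range(len(code))]
    return min(rotations)

def signature(chain):
    return normalize_start(differential(chain))

square = [0, 0, 6, 6, 4, 4, 2, 2]          # a 2x2 square traced clockwise
rotated = [(c + 2) % 8 for c in square]    # same square rotated by 90 degrees
shifted = square[3:] + square[:3]          # traced from a different start

sig = signature(square)
```

All three tracings of the square yield the same signature, which is what lets a low-resolution PDA camera match a shape regardless of its orientation in the frame.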

click
http://hci.ssu.ac.kr/ajpark/LNCS_ICADL_Effective.pdf
to download the paper.

2006.

[IC] Image Texts-based Navigation for Augmented Game

Abstract.

In an augmented game, in which virtual objects are overlaid on a real environment and attacked by players, accurate location estimation in the real environment is one of the important issues. Existing global positioning systems (GPS) for tracking users' positions do not work inside buildings, and systems using sensors, such as Active Badge, are expensive to install and maintain. Therefore, low-cost vision-based navigation systems have been studied. Since most scenes inside a building consist of a floor, a ceiling, and walls, it is difficult to represent the characteristics of such scenes. We propose an image matching method for navigation that uses image texts instead of the objects found uniformly in natural scenes. Image texts are widely distributed in our environment, are very useful for describing the contents of an image, and can be easily extracted compared with other semantic contents; we obtain them using a method combining edge density and multi-layer perceptrons with CAMShift. However, since a camera attached to a moving vehicle (robot) or hand-held device has low resolution, it is not easy to perform extraction using binarization and text recognition. Therefore, we perform image matching using a matching window based on the scale and orientation of the image texts and their neighborhood, to distinguish places containing the same image texts.

click
http://hci.ssu.ac.kr/ajpark/LNCS_EDUTAINMENT_Image.pdf
to download the paper.

2006.

[IC] Flying Cake: Augmented Game on Mobile Devices

Abstract.

In the ubiquitous computing age, which uses high-bandwidth networks, mobile devices such as wearable and hand-held ones with a small camera and a wireless communication module will be widely used in the near future. Thus, a lot of research on augmented games for mobile devices has recently been attempted. Existing augmented games used a traditional ‘backpack’ system and a ‘pattern marker’. The ‘backpack’ system is expensive, cumbersome, and inconvenient to use, and a game using a ‘pattern marker’ can only be played where the markers have been installed in advance. In this paper, we propose an augmented game called Flying Cake that uses a face region to place virtual objects (characters), measuring their location relative to the real world without a predefined ‘pattern marker’, on a small, light, mobile PDA instead of the cumbersome hardware. We augment a virtual character on the face region using a face detection technique that combines a skin-color model with the CAMShift algorithm, which is effective for detecting the face region on a PDA with low computational resources. Flying Cake supplies new pleasure to players through a new game paradigm of interaction between the user in the physical world and the virtual character in a virtual world, using the camera attached to the PDA.
Keywords: Augmented Game, Face Detection, 3D Augmented Shooting Game, Mobile Vision.
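The skin-colour stage of the face detector can be sketched as a per-pixel rule (the CAMShift tracking step is omitted). The RGB thresholds below are a widely used heuristic for skin under uniform daylight, not the authors' trained model, and the sample pixels are made up for illustration.

```python
# Hedged sketch of a per-pixel skin-colour classifier; a real detector would
# run CAMShift on the resulting mask to track the face region.

def is_skin(r, g, b):
    """Common RGB skin heuristic (uniform-daylight variant, assumed here)."""
    return (r > 95 and g > 40 and b > 20 and
            max(r, g, b) - min(r, g, b) > 15 and
            abs(r - g) > 15 and r > g and r > b)

pixels = [(220, 170, 140),   # typical skin tone
          (40, 90, 200),     # blue background
          (60, 180, 70)]     # green background

mask = [is_skin(*p) for p in pixels]
```

The rule is cheap integer arithmetic per pixel, which fits the PDA's low computational resources noted in the abstract.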

click
http://hci.ssu.ac.kr/ajpark/CGAMES_Flying.pdf
to download the paper.

2005.

[IC] Automatic Conversion System for Mobile Cartoon Contents

Abstract.

As the production of mobile contents increases and many people use them, existing mobile content providers manually split cartoons into frame images fitted to the screens of mobile devices, which takes much time and is very expensive. This paper proposes an Automatic Conversion System (ACS) for mobile cartoon contents. It automatically converts existing cartoon contents into mobile cartoon contents using image processing technology, as follows: 1) a scanned cartoon image is segmented into frames by structure layout analysis; 2) the frames are split at regions that do not include the semantic structure of the original image; 3) texts are extracted from the split frames and placed at the bottom of the screen. Our experiments show that the proposed ACS is more efficient than existing methods in providing mobile cartoon contents.
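One plausible reading of step 1 (structure layout analysis) can be sketched with projection profiles: cartoon frames are separated by white gutters, which appear as empty rows in the row-wise ink profile of the binarized page. The toy page and the zero-ink gutter criterion are illustrative assumptions, not the paper's exact method.

```python
# Hedged sketch: segment a binarized cartoon page into frames by finding
# runs of non-empty rows in its horizontal projection profile.

page = [
    [1, 1, 0, 1],   # frame 1
    [1, 0, 1, 1],
    [0, 0, 0, 0],   # white gutter
    [0, 0, 0, 0],
    [1, 1, 1, 0],   # frame 2
    [0, 1, 1, 1],
]

profile = [sum(row) for row in page]      # ink count per row

def split_frames(profile):
    """Return (start, end) row ranges of consecutive non-empty rows."""
    frames, start = [], None
    for r, ink in enumerate(profile + [0]):   # sentinel closes the last run
        if ink > 0 and start is None:
            start = r
        elif ink == 0 and start is not None:
            frames.append((start, r))
            start = None
    return frames

frames = split_frames(profile)
```

The same profile trick applied column-wise within each row band would separate frames laid out side by side.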
click
http://hci.ssu.ac.kr/ajpark/LNCS_ICADL_Automatic.pdf
to download the paper.

2005.

[IC] Contents Recycling using Content-based Image Retrieval on Mobile Devices

Abstract.

Although a lot of studies have been made on mobile learning, content-based image recycling on mobile devices is not well studied. This paper presents a new approach that recycles and augments existing off-line contents using a camera-equipped mobile device. Each individual learner has a PDA and an off-line textbook (Picture English Book: PEB). During the PEB-watching learning activity, users are dynamically provided, via the PDA, with on-line information such as texts, videos, and audio corresponding to the off-line contents. A content-based image retrieval (CBIR) system is constructed to provide learners with the required information using image recognition and multimedia technologies, so that the objective of m-learning can be achieved. We believe it is worth developing a mobile learning system that provides learners with a new educational environment recycling the existing PEBs.

click
http://hci.ssu.ac.kr/ajpark/LNCS_CIVR_Contents.pdf
to download the paper.

2005.

[IC] Intelligent Document Scanning with Active Camera

Abstract.

Document scanning is important as a prerequisite stage for analysis and recognition. Recently, much research on document image acquisition using cameras has been attempted, and the camera can be an alternative input device for document scanning if some problems, such as low resolution, can be solved. We use image registration to overcome the low resolution of a camera. An ordinary image registration method needs pre-processing, such as camera calibration, to reduce distortions in the composite, and therefore incurs extra running time. In this paper, we propose a component-based image registration method that concentrates on reducing the distortions and acquiring a seamless image using a PTZ (pan-tilt-zoom) camera without pre-processing. Since we divide the input document image into components using a text-specific characteristic, the method reduces object (text) distortions in the composite, and we save the extra running time because the method does not perform the pre-processing.

click
http://hci.ssu.ac.kr/ajpark/ICDAL_Intelligent.pdf
to download the paper.

2005.

[IC] PDA-based Text Localization System using Client/Server Architecture

Abstract.

Recently, several image processing research results have been proposed for mobile vision systems. Many CPUs for Personal Digital Assistants (PDAs) are integer CPUs with no floating-point unit, which results in slow execution of algorithms based on neural networks, which require many floating-point computations. In this paper, to resolve this weakness, we propose an effective text localization system with a Client (PDA)/Server (PC) architecture connected via wireless LAN. The Client (PDA) compresses tentative text localization results in JPEG format to minimize the transmission time to the Server (PC). The Server (PC) uses both a Multi-Layer Perceptron (MLP)-based texture classifier and Connected Components (CCs)-based filtering for precise text localization based on the Client (PDA)'s tentative extraction results. The proposed method achieves not only faster running times but also efficient text localization.
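The server-side connected-components filtering can be sketched as flood-fill labelling followed by a size filter; the toy mask and the size threshold are illustrative assumptions, and the MLP texture classifier that would produce the mask is omitted.

```python
# Hedged sketch of CC-based filtering: label 4-connected components in a
# binary text mask, then discard components too small to be text.
from collections import deque

mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

def connected_components(mask):
    """4-connected component labelling by BFS flood fill."""
    h, w = len(mask), len(mask[0])
    seen, comps = set(), []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and (r, c) not in seen:
                comp, q = [], deque([(r, c)])
                seen.add((r, c))
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                                and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            q.append((ny, nx))
                comps.append(comp)
    return comps

# Keep only components big enough to be text strokes (threshold assumed).
text_comps = [c for c in connected_components(mask) if len(c) >= 3]
```

Here the isolated single pixel is rejected while the two larger blobs survive; the labelling uses only integer set operations, so the same routine could in principle run on the integer-only client as well.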

Click
http://hci.ssu.ac.kr/ajpark/LNCS_PRICAI_PDA-based.pdf
to download the paper.

Aug. 2004.