SIGMAP 2006 Abstracts


Area 1 - Multimedia Communications

Full Papers
Paper Nr: 9
Title:

WEIGHT UPDATING METHODS FOR A DYNAMIC WEIGHTED FAIR QUEUING (WFQ) SCHEDULER

Authors:

Gianmarco Panza, Valentin Besoiu, Catherine Lamy-Bergot and Filippo Sidoti

Abstract: This work analyzes different weight updating methods for a dynamic Weighted Fair Queuing (WFQ) scheduler providing Quality of Service (QoS) guarantees for the applications of the IST PHOENIX project and for new value-added services in general. Two weight updating methods are investigated in terms of the delays granted to the service classes concerned and the buffer utilization of the related queues at a given IP interface. In particular, a novel weight updating method based on Knightly's theory is proposed. Simulation results demonstrate that a dynamic WFQ based on either weight updating method can support a proportional relative QoS model in a Diff-Serv architecture within an IP-based Next Generation Network. The designed system is simple and effective, with low computational overhead: an innovative technique evaluates the trend of the entering traffic aggregates so that the scheduler's weights are updated only when needed.
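The core WFQ mechanism that the scheduler builds on can be sketched with virtual finish times. The snippet below is a minimal, static illustration (all packets queued at once, simplified virtual time); the function name and data layout are our own, and it is not the paper's dynamic weight-updating scheme.

```python
def wfq_order(queues, weights):
    """Compute a WFQ service order from per-class FIFO queues.

    queues:  {class_id: [packet_length, ...]}
    weights: {class_id: weight}  (larger weight => larger service share)
    Each packet gets a virtual finish time F = F_prev + length / weight;
    packets are then served in increasing finish-time order.
    """
    tagged = []
    for c, lengths in queues.items():
        f = 0.0
        for length in lengths:
            f += length / weights[c]
            tagged.append((f, c, length))
    tagged.sort()
    return [(c, length) for _, c, length in tagged]
```

With weights 2:1 and equal-length packets, any service window gives class A roughly twice the service of class B; a dynamic scheduler would simply recompute the weights when the traffic trend changes.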
Download

Paper Nr: 17
Title:

USING PLACEHOLDER SLICES AND MPEG-21 BSDL FOR ROI EXTRACTION IN H.264/AVC FMO-ENCODED BITSTREAMS

Authors:

Peter Lambert, Wesley De Neve, Davy De Schrijver, Yves Dhondt and Rik Van De Walle

Abstract: The concept of Regions of Interest (ROIs) within a video sequence is useful for many application scenarios. This paper concentrates on the exploitation of ROI coding within the H.264/AVC specification by making use of Flexible Macroblock Ordering. It shows how ROIs can be coded in an H.264/AVC compliant bitstream and how the MPEG-21 BSDL framework can be used for the extraction of the ROIs. The first type of ROI extraction described is simply dropping the slices that are not part of one of the ROIs. The second type is the replacement of these slices with so-called placeholder slices, the latter being implemented as P slices containing only macroblocks that are marked as 'skipped'. The exploitation of ROI scalability, as achieved by the presented methodology, illustrates the possibilities offered by the single-layered H.264/AVC specification for content adaptation. The results show that the bit rate needed to transmit the adapted bitstreams can be reduced significantly. Especially in the case of a static camera and a fixed background, this bit rate reduction has very little impact on the visual quality. Another advantage of the adaptation process is that the execution speed of the receiving decoder increases considerably.
Download

Paper Nr: 25
Title:

A PREDICTIVE MULTI-CHANNEL MBAC TECHNIQUE FOR ON-LINE VIDEO STREAMING

Authors:

Pietro Camarda, Cataldo Guaragnella and Domenico Striccoli

Abstract: A measurement-based admission control predictive technique is introduced for on-line streaming systems, exploiting GOP-length demultiplexing of the aggregate bit stream in conjunction with a linear predictive algorithm. Thanks to the long latency of the statistical aggregate, the technique can predict the bit rate about two seconds ahead. The prediction is used in an admission control system to estimate the bit rate and the margin with respect to the channel capacity in the proposed streaming system. These measures are then used to estimate the overflow probability in a general aggregate situation. Tests conducted on real video sequences confirm the feasibility of the proposed technique.
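The linear predictive step can be illustrated with a least-squares autoregressive fit on a bit-rate series. This is a generic sketch; the `ar_predict` helper and its interface are our assumptions, not the paper's algorithm.

```python
import numpy as np

def ar_predict(series, order):
    """Fit AR(order) coefficients by least squares and predict the
    next value of the series (e.g. per-GOP bit counts)."""
    x = np.asarray(series, dtype=float)
    # rows: sliding windows of past samples; target: the following sample
    A = np.array([x[i:i + order] for i in range(len(x) - order)])
    b = x[order:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(x[-order:] @ coef)
```

On a noise-free AR(2) sequence the fit recovers the generating coefficients and the one-step prediction is exact up to rounding.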
Download

Paper Nr: 38
Title:

AN EFFICIENT PACKETIZATION SCHEME FOR VOIP

Authors:

Antonio Estepa, Rafael Estepa and Juan Manuel Vozmediano

Abstract: A number of VoIP audio codecs generate Silence Insertion Descriptor (SID) frames during the talk-gaps of a conversation to update the comfort-noise generator parameters at the receiver. Under the RFC 3551 packetization scheme, discontinuously generated SID frames cannot be carried in the same IP packet, which increases the conversation's bandwidth consumption. We define a novel packetization scheme in which a set of non-consecutive SID frames may share the same packet, reducing the overhead while preserving the timing between them. We provide analytical expressions and experimental validation for the bandwidth savings obtained with the new scheme, which grow to as much as 14% for the G.729B codec.
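The source of the saving can be illustrated with a back-of-the-envelope calculation, assuming the usual 40-byte IP/UDP/RTP header stack and the 2-byte G.729B SID frame; bundling non-consecutive frames also needs a few bytes of timing information per frame, which this sketch omits.

```python
IP_UDP_RTP = 40  # bytes of header per packet (20 IP + 8 UDP + 12 RTP)
SID_SIZE = 2     # bytes per G.729B SID frame

def sid_bytes_on_wire(n_frames, frames_per_packet):
    """Total bytes needed to carry n_frames SID frames when bundling
    frames_per_packet of them per IP packet. RFC 3551 in effect forces
    frames_per_packet = 1 for non-consecutive SID frames."""
    packets = -(-n_frames // frames_per_packet)  # ceil division
    return packets * IP_UDP_RTP + n_frames * SID_SIZE

unbundled = sid_bytes_on_wire(10, 1)  # one SID frame per packet: 420 bytes
bundled = sid_bytes_on_wire(10, 5)    # five SID frames per packet: 100 bytes
```

The per-SID-frame overhead drops dramatically; the overall conversation saving is smaller (the paper's 14% for G.729B) because SID frames are only part of the traffic.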
Download

Paper Nr: 45
Title:

STREAMING LOW-DELAY VIDEO OVER AIMD TRANSPORT PROTOCOLS

Authors:

Ahmed A. El Al, Tarek Saadawi and Myung Lee

Abstract: In this paper, we present adaptation strategies for low-delay video streams over Additive-Increase Multiplicative-Decrease (AIMD) transport protocols, where we switch among several versions of the coded video to match the available network bandwidth accurately, and meet client delay constraints. By monitoring the application buffer at the server, we estimate the current and future server buffer drain delay, and derive the transmission rate to minimize client buffer starvation. We also show that the adaptation accuracy can be significantly improved by a simple scaling to transport protocol send-buffer size. The proposed mechanisms were implemented over Stream Control Transmission Protocol (SCTP) and evaluated through simulation and real Internet traces. Performance results show that the adaptation mechanism is responsive to bandwidth fluctuations, while ensuring that the client buffer does not underflow, and that the quality adaptation is smooth so that the impact on the perceptual quality at the client is minimal.
Download

Paper Nr: 51
Title:

HDL LIBRARY OF PROCESSING UNITS FOR GENERIC AND DVB-S2 LDPC DECODING

Authors:

Marco Gomes, Gabriel Falcão, Vitor Silva, Miguel Falcão and Pedro Faia

Abstract: This paper proposes an efficient HDL library of processing units for generic and DVB-S2 LDPC decoders, following a modular and automatic design approach. General-purpose, low-complexity and high-throughput bit-node and check-node functional models are developed. Both fully serial and parallel architecture versions are considered, as well as a dedicated functional unit for an array-processor LDPC decoder architecture targeting the DVB-S2 standard. Additionally, an automatic HDL code generator tool for arbitrary decoder architectures and LDPC codes, based on the proposed processing units and Matlab scripts, is described.
Download

Paper Nr: 52
Title:

DESIGN AND IMPLEMENTATION OF VIDEO ON DEMAND SERVICES OVER A PEER-TO-PEER MULTIOVERLAY NETWORK

Authors:

Jia-ming Chen, Jenq-shiou Leu, Hsin-wen Wei, Li-ping Tung, Yen-ting Chou and Wei-Kuan Shih

Abstract: Video-on-Demand (VoD) services using peer-to-peer (P2P) technologies benefit from balancing load among clients and maximizing their bandwidth utilization, reducing the burden on central video servers, which constitute a single point of failure. Conventional P2P techniques for realizing VoD services only consider data exchange between active peers in the same VoD session; they never consider inactive peers that have left the session but may still hold partial media content in their local storage. In this article, we propose a novel architecture, referred to as MegaDrop, that constructs a fully decentralized P2P overlay network for VoD streaming services based on a multioverlay concept. It not only takes the types of peers into consideration but also provides mechanisms for discovering nodes that may hold the desired media objects. Such a P2P-based scheme can distribute media among peers, allow peers to search for a specific media object over the entire network efficiently, and stream the media object from a group of peers. We employ a layered architecture consisting of four major tiers: Peer Discovery Layer, Content Lookup Layer, Media Streaming Layer, and Playback Control Layer. The evaluation results show that our architecture is particularly efficient for large media delivery and multiuser streaming sessions.
Download

Paper Nr: 92
Title:

FAST CONVERSION OF H.264/AVC INTEGER TRANSFORM COEFFICIENTS INTO DCT COEFFICIENTS

Authors:

Ricardo Marques, V. Silva, S. Faria, Antonio Navarro and P. Assuncao

Abstract: In this paper we propose a fast method to convert H.264/AVC 4x4 Integer Transform (IT) coefficients into standard Discrete Cosine Transform (DCT) coefficients for video transcoding applications. We derive the transcoding matrix for converting, simultaneously, in the transform domain, four 4x4 IT blocks into one 8x8 block of DCT coefficients. By exploiting the symmetry properties of the matrix, we show that the proposed conversion method requires fewer operations than its equivalent in the pixel domain. An integer matrix approximation is also proposed. The experimental results show that a negligible error is introduced, while the computational complexity can be significantly reduced.
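The single-operator transform-domain conversion can be sketched as follows. This toy version ignores H.264/AVC's quantisation scaling matrices (it uses the exact inverse of the core transform), so it illustrates the idea of one combined 8x8 conversion operator rather than reproducing the paper's matrix.

```python
import numpy as np

# H.264/AVC 4x4 forward core transform (without quantisation scaling)
C4 = np.array([[1, 1, 1, 1],
               [2, 1, -1, -2],
               [1, -1, -1, 1],
               [1, -2, 2, -1]], dtype=float)

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)
    D = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0] *= np.sqrt(1.0 / n)
    D[1:] *= np.sqrt(2.0 / n)
    return D

D8 = dct_matrix(8)
C4inv = np.linalg.inv(C4)  # exact inverse; H.264 folds scaling into quantisation
# Block-diagonal inverse IT applied to the 2x2 arrangement of 4x4 blocks,
# folded with the forward 8x8 DCT into a single conversion operator S.
B = np.block([[C4inv, np.zeros((4, 4))],
              [np.zeros((4, 4)), C4inv]])
S = D8 @ B

def it_blocks_to_dct(Y):
    """Y: 8x8 array holding four 4x4 IT blocks; returns the 8x8 DCT block."""
    return S @ Y @ S.T
```

Because the operation is separable, one matrix product on each side replaces the pixel-domain route (four inverse ITs, reassembly, one forward DCT).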
Download

Paper Nr: 96
Title:

TRAFFIC TRUNK PARAMETERS FOR VOICE TRANSPORT OVER MPLS

Authors:

Antonio Estepa, Rafael Estepa and Juan Manuel Vozmediano

Abstract: Access nodes in NGN are likely to transport voice traffic using MPLS Traffic Trunks. The traffic parameters describing a Traffic Trunk are essential for calculating the network resources to be allocated along the nodes belonging to its corresponding Label-Switched Path (LSP). This paper provides an analytical model to estimate the lower limit of the bandwidth that needs to be allocated to a TT loaded with a heterogeneous set of voice connections. Our model considers the effect of the Silence Insertion Descriptor (SID) frames that a number of VoIP codecs currently use. Additionally, two transport schemes are considered: VoIP and VoMPLS. The results, experimentally validated, quantify the benefits of VoMPLS over VoIP.
Download

Short Papers
Paper Nr: 13
Title:

CONGESTION CONTROL ACROSS A VIDEO-DOMINATED INTERNET TIGHT LINK

Authors:

Emmanuel Jammeh, Martin Fleury and Mohammed Ghanbari

Abstract: Existing congestion controllers have been designed with TCP traffic in mind. Changing traffic patterns on the Internet imply that some tight links will carry only UDP video streams. Three different congestion controllers (RAP, TFRC, and fuzzy-logic based), already successful in avoiding instability in current TCP-dominated internets, were tested across a tight link in which video traffic dominated. Congestion control is achieved either by modulating the sending rate in response to feedback of packet loss rates and/or round-trip delays (RAP/TFRC) or by a congestion level based on packet dispersion across a network path (fuzzy controller). The controllers were found to differ in the smoothness of the resulting video streams, with the fuzzy and TFRC controllers, in that order, producing the smoothest received video. Tests also demonstrated that, when controlled flows of different types compete across a tight link, the sending rate of TFRC can exceed the available bandwidth, resulting in excess packet loss and poor video quality at the receiver. The results show that fuzzy-logic control is more flexible when video streams dominate.
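TFRC's sending rate comes from the standard TCP throughput equation of RFC 3448; a direct transcription (variable names ours):

```python
from math import sqrt

def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """TCP-friendly sending rate in bytes/s (TFRC equation, RFC 3448).

    s: packet size (bytes), rtt: round-trip time (s),
    p: loss event rate, b: packets acked per ACK,
    t_rto: retransmission timeout, defaulting to 4 * rtt."""
    if t_rto is None:
        t_rto = 4 * rtt
    return s / (rtt * sqrt(2 * b * p / 3)
                + t_rto * 3 * sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
```

The rate falls as the loss event rate rises; the mismatch the abstract describes arises because this model of competing TCP traffic need not match the actual available bandwidth on a video-dominated link.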
Download

Paper Nr: 23
Title:

A COMPARATIVE STUDY OF STATE-OF-THE-ART ADAPTATION TECHNIQUES FOR SCALABLE MULTIMEDIA STREAMS

Authors:

Andreas Schorr, Franz J. Hauck, Bernhard Feiten and Ingo Wolf

Abstract: Stream adaptation is a key technology for enabling communication between heterogeneous multimedia devices and applications, possibly located in heterogeneous wired or wireless networks. Converting already compressed multimedia streams into a format suitable for a certain receiver terminal and network can be achieved by transcoding or by filtering of media streams. Transcoding allows more flexible adaptation operations but is in general a very CPU-intensive process. Therefore, scalable media formats have been developed, which allow more efficient adaptation of media streams through media filtering. Several filter techniques for specific media formats have been proposed and implemented during the last decade. Recently, the MPEG-21 Digital Item Adaptation standard has defined a set of new tools for multimedia adaptation. In this paper, we provide a comparative study of several adaptation techniques for scalable multimedia streams. We compare generic MPEG-21-based adaptation techniques with filter mechanisms for specific media formats with respect to the required processing resources and scalability. We also compare filter techniques with stream adaptation through transcoding. Moreover, we compare how adaptation of multiple media streams performs on systems with single-core and with multi-core processors.
Download

Paper Nr: 28
Title:

VIDEOCONFERENCE OVER IPV6 - IPv6 Networks Advanced Developments

Authors:

Carlos Friaças, Miguel Baptista, Mónica Domingues and Paulo Ferreira

Abstract: This document focuses on IPv6 support in the H.323 videoconference protocol and in the ConferenceXP architecture, the latter designed by Microsoft for developing collaborative and videoconference applications. Videoconference solutions that implement the analyzed protocols, such as GnomeMeeting and the ConferenceXP client (and adjacent services), are also presented. Guidelines for deploying a videoconference service with IPv6 support are thus provided.
Download

Paper Nr: 83
Title:

BALANCED RESOURCE SHARE MECHANISM IN OPTICAL NETWORKS

Authors:

Hyeon Park, Byung-Ho Yae, Dong-Hun Lee and Sang-Ha Kim

Abstract: Existing protection mechanisms for rapid failure recovery allocate a backup path that is merely SRLG-disjoint from the working path. These mechanisms suffer from low resource utilization because resources are not shared among the backup paths. To address this, Kini (Kini et al., 2002), Somdip (Somdip et al., 2001) and others have proposed mechanisms for sharing the resources of backup paths. Although these mechanisms improve the efficiency of bandwidth usage, they do not consider the unbalanced resource share among backup paths: backup paths can concentrate on specific links while idle resources elsewhere go unused, so overall resource utilization remains poor. We therefore propose a mechanism that enhances resource utilization by settling the unbalanced resource share while still recovering from simultaneous failures. We formulate the problem of minimizing the number of backup resources used (specifically, wavelengths) while taking the maximum link load into account, and we compare the existing mechanisms with ours in terms of spare resource capacity through simulation.
Download

Area 2 - Multimedia Signal Processing

Full Papers
Paper Nr: 33
Title:

CONVOLUTION KERNEL COMPENSATION APPLIED TO 1D AND 2D BLIND SOURCE SEPARATION

Authors:

Damjan Zazula, Aleš Holobar and Matjaž Divjak

Abstract: Many practical situations can be modelled with multiple-input multiple-output (MIMO) models. If the input sources are mutually orthogonal, several blind source separation methods can be used to reconstruct the sources and the model transfer channels. In this paper, we derive a new approach of this kind, based on the compensation of the model convolution kernel. It detects the triggering instants of individual sources and tolerates their non-orthogonalities and a high amount of additive noise, which qualifies the method for several signal and image analysis applications where other approaches fail. We explain how to implement the convolution kernel compensation (CKC) method in both the 1D and 2D cases; this unified approach enabled us to demonstrate its performance in two different experiments. The 1D version was applied to the decomposition of surface electromyograms (SEMG). Nine healthy males participated in tests with 5% and 10% maximum voluntary isometric contractions (MVC) of the biceps brachii muscle; we identified 3.4 ± 1.3 (mean ± standard deviation) and 6.2 ± 2.2 motor units (MUs) at 5% and 10% MVC, respectively. We also applied the 2D version of CKC to range imaging: on the Middlebury Stereo Vision reference set of images, our method found correct matches for 91.3 ± 12.1% of all pixels, while the obtained RMS disparity difference was 3.4 ± 2.5 pixels. These results are comparable to other ranging approaches, but our solution exhibits better robustness and reliability.
Download

Paper Nr: 53
Title:

DIRECTION BIASED SEARCH ALGORITHMS FOR FAST BLOCK MOTION ESTIMATION

Authors:

Niranjan Mulay

Abstract: Motion estimation (ME) is computationally the most challenging part of the video encoding process and has a direct impact on the speed and qualitative performance of the encoder. Consequently, many sub-optimal but faster ME algorithms have been developed to date. In particular, the Three Step Search (TSS) and Four Step Search (FSS) algorithms have become popular because of their ease of implementation. TSS is a uniformly spaced block matching algorithm, which performs better for large motion, whereas the New Three Step Search (NTSS) and FSS are center-biased algorithms that outperform TSS for smooth, correlated motion. Later, another center-biased technique, the Diamond Search (DS) algorithm, was introduced and shown to converge faster than FSS in smooth-motion scenarios. However, the performance of center-biased algorithms degrades on sequences with consistently large or uncorrelated motion, as they become susceptible to getting trapped in local minima near the center. In this paper, two novel ME algorithms, dual square search (DSS) and dual diamond search (DDS), are proposed to strike a balance between center-biased and uniformly spaced search techniques. The proposed algorithms delay the decision to shift the search center until candidates on both a coarse and a fine grid are evaluated. Moreover, they are designed to exploit the motion vector distribution found in most real-world video sequences by giving precedence first to candidates near the center, then to candidates in the horizontal and vertical directions, and finally to those in the diagonal directions. The performance of the proposed algorithms is compared with the TSS and FSS algorithms in terms of computational speed, motion compensation error and the compression achieved for various kinds of video sequences.
Tests on these sequences show that both algorithms can be substantially faster than TSS and FSS. The proposed ME algorithms promise a balanced tradeoff among speed, bit rate and quality for different kinds of motion sequences.
Download

Paper Nr: 57
Title:

SIGNAL DENOISING BASED ON PARAMETRIC HAAR-LIKE TRANSFORMS

Authors:

Susanna Minasyan, Karen Egiazarian, Jaakko Astola and David Guevorkian

Abstract: Orthogonal transforms have attracted considerable interest in signal denoising applications. Recently, Parametric Haar-like Transforms (PHTs) have been introduced and shown to be efficient in image denoising and compression. A PHT can be computed with a fast algorithm similar in structure to that of the classical fast Haar transform, and its matrix contains a predefined basis vector, called the generating vector, as its first row. A PHT may thus be adapted to the characteristics of the input signal, or to parts of it, by a proper selection of generating vectors. This possibility of adaptation can, in principle, be a significant source of performance improvement for transform-based signal processing algorithms. In this paper, the capability of parametric Haar-like transforms in 1-D signal denoising is explored. A new PHT-based post-processing algorithm for 1-D signal denoising is proposed, which may be combined with another denoising method to improve the quality of the output signal. In our experiments, a basic wavelet-thresholding denoising method was complemented with the proposed post-processing algorithm; simulation results show a significant performance improvement due to the proposed algorithm.
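The baseline wavelet-thresholding step that the proposed PHT post-processing complements can be sketched with a one-level orthonormal Haar transform and soft thresholding. This is a generic sketch of the baseline, not the PHT itself.

```python
import numpy as np

def haar_1level(x):
    """One-level orthonormal Haar analysis: approximation and detail."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def ihaar_1level(a, d):
    """Inverse of haar_1level (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def denoise(x, thr):
    """Soft-threshold the detail coefficients and reconstruct."""
    a, d = haar_1level(x)
    d = np.sign(d) * np.maximum(np.abs(d) - thr, 0.0)
    return ihaar_1level(a, d)
```

For a piecewise-constant signal the clean detail coefficients are nearly all zero, so shrinking them removes mostly noise and the reconstruction error drops.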
Download

Paper Nr: 69
Title:

ON-THE-FLY TIME SCALING FOR COMPRESSED AUDIO STREAMS

Authors:

Suzana B. Maranhão, Rogério Rodrigues and Luiz Soares

Abstract: Time scaling is a technique used to modify the presentation duration of a media object. This paper proposes an audio time-scaling algorithm focused on supporting applications that need: to maintain the original data format for storage or immediate presentation on any legacy audio player; to perform linear time scaling in real time, allowing the adjustment factor to vary along the audio presentation; and to perform time mark-up maintenance, that is, to compute new time values for originally marked audio time instants. The proposed algorithm is appropriate for applications that do not need a large variation of the adjustment factor. The paper also presents the integration with content rendering tools and an example of using these tools in a hypermedia presentation formatter.
Download

Paper Nr: 84
Title:

A SIMPLE AND COMPUTATIONALLY EFFICIENT ALGORITHM FOR REAL-TIME BLIND SOURCE SEPARATION OF SPEECH MIXTURES

Authors:

Tarig Ballal, Nedelko Grbic and Abbas Mohammed

Abstract: In this paper we exploit the amplitude diversity provided by two sensors to achieve blind separation of two speech sources. We propose a simple and highly computationally efficient method for separating sources that are W-disjoint orthogonal (W-DO), that is, sources whose time-frequency representations are disjoint sets. The Degenerate Unmixing and Estimation Technique (DUET), a powerful and efficient method that exploits the W-disjoint orthogonality property, requires extensive computations for maximum likelihood parameter learning. Our proposed method avoids all the computations required for parameter estimation by assuming that the sources are "cross high-low diverse" (CH-LD), an assumption that is explained later and that can be satisfied by exploiting the sensor settings/directions. With this assumption and the W-disjoint orthogonality property, two binary time-frequency masks that can extract the original sources from one of the two mixtures can be constructed directly from the amplitude ratios of the time-frequency points of the two mixtures. The method works very well when tested with both artificial and real mixtures. Its performance is comparable to DUET's, while it requires only 2% of the computations required by the DUET method. Moreover, it is free of the convergence problems that lead to poor SIR in the first parts of the signals. As with all binary masking approaches, the method suffers from artifacts that appear in the output signals.
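The amplitude-ratio masking idea can be illustrated in a stripped-down form: assign each frequency bin to whichever mixture is louder there and invert. Real implementations work on STFT frames; this whole-signal FFT version is only a sketch of the principle, not the paper's method.

```python
import numpy as np

def binary_mask_separate(m1, m2):
    """Separate two W-disjoint-orthogonal sources from two mixtures by
    assigning each frequency bin to the mixture that dominates it."""
    M1, M2 = np.fft.rfft(m1), np.fft.rfft(m2)
    mask = np.abs(M1) > np.abs(M2)  # bin dominated by source 1?
    s1 = np.fft.irfft(np.where(mask, M1, 0), len(m1))
    s2 = np.fft.irfft(np.where(mask, 0, M2), len(m2))
    return s1, s2
```

With spectrally disjoint sources and cross high-low amplitude diversity in the mixing, the binary mask recovers each source from the mixture in which it is strong.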
Download

Paper Nr: 95
Title:

IMPROVING MULTISCALE RECURRENT PATTERN IMAGE CODING WITH DEBLOCKING FILTERING

Authors:

Nuno Rodrigues, Eduardo Silva, Murilo Carvalho, Sérgio M. Faria and Vitor Silva

Abstract: The Multidimensional Multiscale Parser (MMP) algorithm is an image encoder that approximates image blocks using recurrent patterns, drawn from an adaptive dictionary, at different scales. The encoder performs well for a large range of image data, but images encoded with MMP suffer from blocking artifacts. This paper presents the design of a deblocking filter that improves the performance of MMP. Our research aims to increase the performance of MMP, particularly for smooth images, without causing quality losses for other image types, for which its performance is already up to 5 dB better than that of top transform-based encoders. For smooth images, the proposed filter introduces relevant perceptual quality gains by efficiently eliminating the blocking effects without introducing the usual blurring artifacts. Moreover, we show that, unlike traditional deblocking algorithms, the proposed method also improves the objective quality of the decoded image, achieving PSNR gains of up to about 0.3 dB. With such gains, MMP reaches a performance almost equivalent to that of state-of-the-art image encoders (equal to that of JPEG2000 at higher compression ratios) for smooth images, while maintaining its gains for non-smooth images. In fact, for all image types, the proposed method provides significant perceptual improvements without sacrificing PSNR performance.
Download

Short Papers
Paper Nr: 18
Title:

OPTIMAL POWER ALLOCATION IN A MIMO-OFDM TWISTED PAIR TRANSMISSION SYSTEM WITH FAR-END CROSSTALK

Authors:

Andreas Ahrens and Christoph Lange

Abstract: Crosstalk between neighbouring wire pairs is one of the major impairments in digital transmission via multipair copper cables, essentially limiting the transmission quality and the throughput of such cables. For high-rate transmission, the strong near-end crosstalk (NEXT) disturbance is often avoided or suppressed, and only far-end crosstalk (FEXT) remains. When FEXT is present, signal parts are transmitted via the FEXT paths from the transmitter to the receiver in addition to the direct transmission paths. Transmission schemes that take advantage of the signal parts transmitted via the FEXT paths are therefore of great practical interest. Here an SVD (singular-value decomposition) equalized MIMO-OFDM system is investigated, which is able to take advantage of the FEXT signal paths. Based on the Lagrange multiplier method, an optimal power allocation scheme is considered in order to reduce the overall bit-error rate at a fixed data rate and fixed QAM constellation sizes. An interesting combination of SVD equalization and power allocation arises, in which the transmit power is adapted not only to the subchannels but to the symbol amplitudes of the SVD-equalized data block. The results show that the exploitation of FEXT is important for wireline transmission systems, in particular with high couplings between neighbouring wire pairs, and that power allocation can take the different subcarriers into account.
Download

Paper Nr: 35
Title:

PROCESSING OF NON-STATIONARY SIGNAL USING LEVEL-CROSSING SAMPLING

Authors:

Modris Greitans

Abstract: The spectral characteristics of multimedia signals typically vary with time, so ideally the sampling density would follow the instantaneous bandwidth of the signal. The paper discusses the level-crossing sampling principle, which provides this capability for analog-to-digital conversion. As the captured samples are spaced non-uniformly, appropriate digital signal processing is required. The non-stationary signal is characterized by a time-frequency representation, and the classical approaches are inspected for their applicability to data obtained by level-crossing sampling. Several enhancements of the short-time Fourier transform approach are proposed, based on the idea of minimizing the reconstruction error not only at the sampling instants but also between them, with the same accuracy. Additional benefits are gained if the instantaneous spectral range of the analysis is matched to the local sampling density: artifacts are removed and the complexity of the calculations is decreased. The performance of the algorithms is demonstrated by simulations. The presented research can be attractive for clock-less designs, which are now receiving increasing interest and whose promising advantages can play a significant role in future electronics development.
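A minimal level-crossing sampler looks like this; linear interpolation of the crossing instants is our simplification (actual converters detect crossings in hardware), and the function name is an assumption.

```python
import numpy as np

def level_crossing_sample(t, x, levels):
    """Record a (time, level) pair whenever the signal crosses one of the
    reference levels; the crossing instant is located by linear
    interpolation between the two surrounding uniform samples."""
    samples = []
    for lv in levels:
        s = x - lv
        idx = np.where(s[:-1] * s[1:] < 0)[0]  # sign change => crossing
        tc = t[idx] + (t[idx + 1] - t[idx]) * s[idx] / (s[idx] - s[idx + 1])
        samples.extend(zip(tc, [lv] * len(tc)))
    samples.sort()
    return samples
```

The output is inherently non-uniform: a slowly varying stretch of signal yields few samples, while a high-frequency stretch crosses the levels often and is sampled densely, which is exactly the bandwidth-following behaviour the abstract describes.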
Download

Paper Nr: 37
Title:

SIDE INFORMATION INTERPOLATION WITH SUB-PEL MOTION COMPENSATION FOR WYNER-ZIV DECODER

Authors:

Sven Klomp, Yuri Vatis and Jörn Ostermann

Abstract: Using Distributed Video Coding (DVC), the complex task of exploiting the source statistics can be moved from the encoder to the decoder. Such a DVC decoder needs side information to exploit the statistics. In common DVC codecs, the side information is obtained by interpolating the current frame from already decoded frames. This paper proposes an interpolation technique for the side information that uses motion compensation with sub-pel accuracy, and compares different interpolation filters for calculating the sub-pel values. Using a six-tap Wiener filter, we observe a gain of up to 1.8 dB for the DVC coded frames.
Download

Paper Nr: 50
Title:

REAL-TIME IMAGE WAVELET CODING FOR LOW BIT RATE TRANSMISSION

Authors:

Gaoyong Luo

Abstract: Embedded coding for progressive image transmission has recently gained popularity in the image compression community. However, current progressive wavelet-based image coders tend to be complex and computationally intense, requiring large memory space. The encoding process usually sends information on the lowest-frequency wavelet coefficients first, so images compressed at very low bit rates are dominated by low-frequency information, while high-frequency components belonging to edges are lost, blurring signal features. This paper presents a new image coder for real-time transmission that employs edge preservation based on local variance analysis to improve the visual appearance and recognizability of compressed images. The analysis and compression are performed by dividing an image into blocks. A lifting wavelet filter bank is constructed for image decomposition and reconstruction, with the advantages of being computationally efficient and of minimizing boundary effects. A modified SPIHT algorithm, which uses more bits to encode the wavelet coefficients and transmits fewer bits in the sorting pass, reduces the correlation of the coefficients at scalable bit rates. Local variance estimation and edge strength measurement effectively determine the best bit allocation for each block to preserve local features. Experimental results demonstrate that the method performs well both visually and in terms of quantitative performance measures, and offers an error resilience feature that is evaluated using a simulated transmission channel with random errors.
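The lifting construction of a wavelet filter bank can be illustrated with its simplest instance, the Haar lifting scheme; the paper's filter bank is presumably longer, but the predict/update structure with perfect reconstruction is the shared principle.

```python
def haar_lift(x):
    """One Haar lifting step: predict the odd samples from the even ones,
    then update the even samples to form the running average."""
    even, odd = x[0::2], x[1::2]
    detail = [o - e for e, o in zip(even, odd)]          # predict step
    approx = [e + d / 2 for e, d in zip(even, detail)]   # update step
    return approx, detail

def haar_unlift(approx, detail):
    """Invert haar_lift by undoing update then predict."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [e + d for e, d in zip(even, detail)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out
```

Because each lifting step is inverted by simply reversing its operations, reconstruction is exact by construction, and the in-place structure is what makes lifting computationally cheap and memory-friendly.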
Download

Paper Nr: 98
Title:

SPEECH/MUSIC DISCRIMINATION BASED ON WAVELETS FOR BROADCAST PROGRAMS

Authors:

Emmanuel Didiot, Irina Illina, Odile Mella, Dominique Fohr and Jean-paul Haton

Abstract: Speech/music discrimination is a challenging research problem which significantly impacts Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We propose a decomposition of the audio signal based on wavelets, which allows a good analysis of non-stationary signals like speech or music, and compute different energy types in each frequency band obtained from the wavelet decomposition. Two class/non-class classifiers are used: one for speech/non-speech and one for music/non-music. On the broadcast test corpus, the proposed wavelet approach gives better results than the MFCC-based one; for instance, we obtain a significant relative error-rate improvement of 39% for the speech/music discrimination task.
Download

Paper Nr: 64
Title:

INTERVENANT CLASSIFICATION IN AN AUDIOVISUAL DOCUMENT

Authors:

Jeremy Philippeau, Julien Pinquier and Philippe Joly

Abstract: This document deals with the definition of a new descriptor for audiovisual document indexing: the intervenant. We focus on its audiovisual localization, that is, its place in an audiovisual sequence and its classification into three categories: IN, OUT or OFF. Based on a comparison of different analysis tools for both the audio and video modes, we define a set of automatically computable descriptors that are potentially relevant for classifying the intervenant localization. The classification decision is taken on the basis of transition modeling between classes.
Download

Area 3 - Multimedia Systems and Applications

Full Papers
Paper Nr: 26
Title:

MULTI-MODAL WEB-BROWSING - An Empirical Approach to Improve the Browsing Process of Internet Retrieved Results

Authors:

Dimitris Rigas and Antonio Ciuffreda

Abstract: This paper describes a survey and an experiment that were carried out to measure some usability aspects of a multi-modal interface for browsing documents retrieved from the Internet. An experimental platform, called AVBRO, was developed to serve as the basis for the experiments. This study investigates the use of audio-visual stimuli as part of a multi-modal interface to communicate the results retrieved from the Internet. The experiments were based on a set of Internet queries performed by an experimental and a control group of users. The experimental group performed Internet-based search operations using the AVBRO platform, and the control group used the Google search engine. Overall, the users in the experimental group performed better than those in the control group. This was particularly evident when users had to perform complex search queries with a large number of keywords (e.g. 4 to 5). The results demonstrated that the AVBRO platform provided the experimental group with additional feedback about the retrieved documents and helped users access the desired information by visiting fewer web pages, in effect improving the usability of browsing documents. A number of conclusions regarding the presentation and combination of different modalities were identified.
Download

Paper Nr: 34
Title:

EXTRACTING PERSONAL USER CONTEXT WITH A THREE-AXIS SENSOR MOUNTED ON A FREELY CARRIED CELL PHONE

Authors:

Toshiki Iso and Kenichi Yamazaki

Abstract: To realize ubiquitous services such as presence services and health care services, we propose an algorithm to extract “personal user context”, such as the user’s behavior; it processes information gathered by a three-axis accelerometer mounted on a cell phone. Our algorithm has two main functions: one is to extract feature vectors by analyzing sensor data in detail with wavelet packet decomposition; the other is to flexibly cluster personal user context by combining a self-organizing algorithm with Bayesian theory. A prototype that implements the algorithm was constructed. Experiments on the prototype show that the algorithm can identify personal user contexts such as walking, running, going up/down stairs, and walking fast with an accuracy of about 88%.
Download

Paper Nr: 41
Title:

A VIDEO CLASSIFICATION METHOD FOR USER-CENTERED STREAMING SERVICES

Authors:

Yuka Kato and Katsuya Hakozaki

Abstract: The present paper analyzes the relationship between video content and subjective video quality for user-centered streaming services. In this analysis, we conduct subjective assessments using various types of video programs, and propose a method of classifying video programs into a number of groups that are thought by a large majority of users to have the same video quality. Control achieving a high level of user satisfaction can then be performed by applying a different control method to each group obtained with the proposed method. In addition, we demonstrate the necessity of rate control according to video content by comparing a classification result based on vision parameters with the classification result based on the assessment results.
Download

Paper Nr: 44
Title:

A NETWORK TRAFFIC SCHEDULER FOR A VOD SERVER ON THE INTERNET

Authors:

Javier Balladini, Leandro Souza and Remo Suppi

Abstract: Most Video on Demand (VoD) systems were designed to work in dedicated networks. However, there are some approaches that provide VoD service in nondedicated, best-effort networks, though they adapt the media quality according to the available network bandwidth. Our research activities focus on VoD systems with high-quality service on nondedicated networks. We have designed and developed a network manager, to be integrated in the VoD server, that provides total network control, network state information, and adaptation of the transmission rate in a TCP-friendly way. The present work describes this network manager, named Network Traffic Scheduler (NTS), which incorporates a congestion control algorithm named "Enhanced Rate Adaptation Protocol" (ERAP). ERAP is an optimization of the well-known "Rate Adaptation Protocol" (RAP). While maintaining the basic behavior of RAP, ERAP increases the efficiency of the NTS by reducing the usage of server and network resources. These components have been extensively evaluated through simulations and real tests in which resource consumption and performance were measured. This paper presents the advantages of using ERAP instead of RAP in a VoD server, and the viability of integrating it within the NTS of a VoD server on nondedicated networks.
Download

Paper Nr: 48
Title:

SEARCHING MOVIES BASED ON USER DEFINED SEMANTIC EVENTS

Authors:

Bart Lehane, Noel E. O'Connor and Hyowon Lee

Abstract: The number, and size, of digital video databases is continuously growing. Unfortunately, most, if not all, of the video content in these databases is stored without any sort of indexing or analysis and without any associated metadata. If any of the videos do have metadata, it is usually the result of a manual annotation process rather than automatic indexing. Locating clips and browsing content is difficult, time consuming and generally inefficient. The task of managing a set of movies is particularly difficult given their innovative creation process and the individual style of directors. This paper proposes a method of searching video data in order to retrieve semantic events, thereby facilitating the management of video databases. An interface was created that allows users to perform searches using the proposed method. In order to assess the searching method, this interface was used to conduct a set of experiments in which users were timed completing a set of tasks using both the proposed searching method and an alternate, keyframe-based retrieval method. These experiments evaluate the searching method and demonstrate its versatility.
Download

Paper Nr: 60
Title:

ENHANCED INTERACTION FOR STREAMING MEDIA

Authors:

Wolfgang Hürst, Tobias Lauer and Rainer Müller

Abstract: Streaming is a popular and efficient way of web-based on-demand multimedia delivery. However, flexible methods of interaction and navigation, as required, for example, in learning applications, are very restricted with streamed contents. Using the example of recorded lectures, we point out the importance of such advanced interaction which is not possible with purely streamed media. A new delivery method based on a combination of streaming and download is proposed which can be realized with Rich Internet Applications. It combines the advantages of streaming delivery with navigational and interactive features that are usually known only from locally available media.
Download

Paper Nr: 76
Title:

AN E-LIBRARIAN SERVICE THAT YIELDS PERTINENT RESOURCES FROM A MULTIMEDIA KNOWLEDGE BASE

Authors:

Serge Linckels and Christoph Meinel

Abstract: In this paper we present an e-librarian service which is able to retrieve multimedia resources from a knowledge base in a more efficient way than by browsing through an index or by using a simple keyword search. We explored an approach that allows the user to formulate a complete question in natural language. Our background theory is composed of three steps. Firstly, there is the linguistic pre-processing of the user question. Secondly, there is the semantic interpretation of the user question into a logical and unambiguous form, i.e. an ALC terminology. The focus function resolves ambiguities in the question; it returns the best interpretation for a given word in the context of the complete user question. Thirdly, there is the generation of a semantic query and the retrieval of pertinent documents. We developed two prototypes: one about computer history (CHESt), and one about fractions in mathematics (MatES). We report on experiments with these prototypes that confirm the feasibility, the quality and the benefits of such an e-librarian service. Out of 229 different user questions, the system returned the right answer for 97% of the questions, and for nearly half of the questions it returned only one answer, the best one.
Download

Paper Nr: 80
Title:

REAL-TIME SIMULATION OF SOUND SOURCE OCCLUSION

Authors:

Christopher Share and Graham McAllister

Abstract: Sound source occlusion occurs when the direct path from a sound source to a listener is blocked by an intervening object. Currently, a variety of methods exist for modeling sound source occlusion. These include finite element and boundary element methods, as well as methods based on time-domain models of edge diffraction. At present, the high computational requirements of these methods preclude their use in real-time environments. In the case of real-time geometric room acoustic methods (e.g. the image method, ray tracing), the model of sound propagation employed makes it difficult to incorporate wave-related effects such as occlusion. As a result, these methods generally do not incorporate sound source occlusion. The lack of a suitable sound source occlusion method means that developers of real-time virtual environments (such as computer games) have generally either ignored this phenomenon or used rudimentary and perceptually implausible approximations. A potential solution to this problem is the use of shadow algorithms from computer graphics. These algorithms can provide a way to efficiently simulate sound source occlusion in real time and in a physically plausible manner. Two simulation prototypes are presented, one for fixed-position sound sources and another for moving sound sources.
Download

Paper Nr: 90
Title:

ROBUST CONTENT-BASED VIDEO WATERMARKING EXPLOITING MOTION ENTROPY MASKING EFFECT

Authors:

Amir Houmansadr, Hamed Pirsiavash and Shahrokh Ghaemmaghami

Abstract: A major class of image and video watermarking algorithms, i.e. content-based watermarking, is based on models of the Human Visual System (HVS) in order to adapt more efficiently to the local characteristics of the host signal. In this paper, a content-based video watermarking scheme is developed, and the entropy masking effect is employed to significantly improve the use of the HVS model. The entropy masking effect states that the human eye’s sensitivity decreases in high-entropy regions, i.e. regions with spatial or temporal complexity. The spatial entropy masking effect has been exploited in a number of previous works to enhance the robustness of image-adaptive watermarks. In the current research, we use temporal entropy masking as well to achieve higher performance in video watermarking. Experimental results show that more robust watermarked video sequences are produced when the temporal entropy masking effect is considered, while the watermarks remain subjectively imperceptible. The robustness enhancement is a function of the temporal and spatial complexity of the host video sequences.
Download

Paper Nr: 103
Title:

PROVIDING PHYSICAL SECURITY VIA VIDEO EVENT AWARENESS

Authors:

Dimitrios Georgakopoulos and Donald Baker

Abstract: The Video Event Awareness System (VEAS) analyzes surveillance video from thousands of video cameras and automatically detects complex events in near real-time—at pace with their input video streams. For events of interest to security personnel, VEAS generates and routes alerts and related video evidence to subscribing security personnel that facilitate decision making and timely response. In this paper we introduce VEAS’s novel publish/subscribe run-time system architecture and describe VEAS’s event detection approach. Event processing in VEAS is driven by user-authored awareness specifications that define patterns of inter-connected spatio-temporal event stream operators that consume and produce facility-specific events described in VEAS’s surveillance ontology. We describe how VEAS integrates and orchestrates continuous and tasked video analysis algorithms (e.g., for entity tracking and identification), how it fuses events from multiple sources and algorithms in an installation-specific entity model, how it can proactively seek additional information by tasking video analysis algorithms and security personnel to provide it, and how it deals with late arriving information due to out-of-band video analysis tasks and overhead. We use examples from the physical security domain, and discuss related and future work.
Download

Paper Nr: 105
Title:

A B-LEARNING APPROACH FOR ELECTRICAL ENGINEERING BASED ON WIRELESS ACCESS TO PEDAGOGICAL E-CONTENT

Authors:

Pedro Assunção, Carla Lopes and Rafael S. Caldeirinha

Abstract: This paper describes a novel pedagogical approach and its application to undergraduate Electrical Engineering courses. A b-learning model is proposed in which students are subject to different types of interaction and diverse learning experiences, characterised by the type of learning session and the pedagogical agent in use. These include face-to-face teaching sessions in the classroom, standalone e-learning sessions, working groups and supervised laboratory work. A wireless LAN infrastructure, part of the Portuguese Electronic University Project, is used for the e-learning component of the proposed approach. In practice, this is an application of the “anywhere, at anytime” concept to learning processes, extending classroom boundaries in both the space and availability domains. From a technical point of view, the paper addresses the information network infrastructure used and its quality of service in multimedia communications, as well as the e-content generation, its characteristics, requirements and possible adaptation to heterogeneous environments. A student survey was conducted to evaluate the technical and pedagogical aspects of the learning process using the proposed model. The results show this is a promising approach towards flexible learning in Higher Education Institutions; they also revealed that students tend to adapt rather slowly to emerging pedagogical approaches, though they are truly in favour of using IT in their own learning processes.
Download

Short Papers
Paper Nr: 43
Title:

SUBGROUP FEEDBACK FOR SOURCE-SPECIFIC MULTICAST

Authors:

Dan Komosny

Abstract: The recent deployment of IP-based TV and radio distribution requires one-to-many multicast instead of the traditional many-to-many data distribution. These large multimedia sessions usually rely on the Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP). Although one-to-many multicast offers the required communication, it does not support a multicast feedback channel for carrying the RTCP control messages. Therefore, unicast feedback channels from session members to the source are used to carry these messages. In this paper, we introduce subgroup feedback scenarios for source-specific multicast, which is built on the one-to-many philosophy. Our extensions are based on the subgroup feedback framework standardized in the IETF. We outline a possible implementation of the subgroup feedback using the receiver summary information (RSI) packet. A theoretical analysis of the RSI packet rate is also presented in the paper.
Download

Paper Nr: 61
Title:

SEGMENTING OF RECORDED LECTURE VIDEOS - The Algorithm VoiceSeg

Authors:

Stephan Repp and Christoph Meinel

Abstract: In the past decade, we have witnessed a dramatic increase in the availability of online academic lecture videos. There are technical problems in the use of recorded lectures for learning: the problem of easy access to the multimedia lecture video content and the problem of finding the semantically appropriate information quickly. The first step towards a semantic lecture browser is segmenting the large video corpus into smaller coherent units. The task of breaking documents into topically coherent subparts is called topic segmentation. In this paper, we present a segmentation algorithm for recorded lecture videos based on their imperfect transcripts. The recorded lectures are transcribed by out-of-the-box speech recognition software with an accuracy of approximately 70%-80%. The words, as well as a time stamp for each word, are stored in a database. This data acts as the input to our algorithm. We show that clustering similar words, generating vectors with the values from the clusters, and calculating the cosine measure of adjacent vectors leads to a better segmentation result than a standard algorithm.
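The boundary-detection step described above can be sketched as follows, assuming hypothetical word-count vectors over cluster vocabularies and an illustrative threshold; the paper's exact vector construction and threshold are not specified here:

```python
# Minimal sketch: cosine similarity of adjacent word-count vectors,
# with a topic boundary placed wherever similarity drops below a
# threshold. Vectors and threshold are illustrative assumptions.
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def boundaries(vectors, threshold=0.3):
    """Indices i such that a topic boundary lies between vectors[i] and vectors[i+1]."""
    return [i for i in range(len(vectors) - 1)
            if cosine(vectors[i], vectors[i + 1]) < threshold]

# Toy example: three adjacent transcript windows over a hypothetical
# four-cluster vocabulary.
windows = [
    [3, 1, 0, 0],  # mostly clusters A, B
    [2, 2, 0, 0],  # still A, B -> high similarity, no boundary
    [0, 0, 4, 1],  # switches to C, D -> boundary before this window
]
print(boundaries(windows))  # → [1]
```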
Download

Paper Nr: 73
Title:

RANDOMISED DYNAMIC TRAITOR TRACING

Authors:

Jarrod Trevathan and Wayne Read

Abstract: Dynamic traitor tracing schemes are used to trace the source of piracy in broadcast environments such as cable TV. Dynamic schemes divide content into a series of watermarked segments that are then broadcast. The broadcast provider can adapt the watermarks according to the pirate’s response and eventually trace him/her. As dynamic algorithms are deterministic, for a given set of inputs the tracing algorithm will execute exactly the same way each time. An adversary can use this knowledge to ensure that the tracing algorithm is forced into executing at its worst-case bound. In this paper we review dynamic traitor tracing schemes and describe why determinism is a problem. We amend several existing dynamic tracing algorithms by incorporating randomised decisions. This eliminates any advantage an adversary has in terms of the aforementioned attack, as he/she no longer knows exactly how the tracing algorithm will execute. Simulations show that the randomising modifications influence each dynamic algorithm to run at its average-case complexity in terms of tracing time. We provide an efficiency analysis of the amended algorithms and give some recommendations for reducing overhead.
Download

Paper Nr: 91
Title:

A TEMPORAL SYNCHRONIZATION MECHANISM FOR REAL-TIME DISTRIBUTED CONTINUOUS MEDIA

Authors:

Luis M. Rosales and Saul Eduardo Pomares Hernandez

Abstract: The preservation of temporal relations for real-time distributed continuous media is a key issue for emerging multimedia applications, such as Tele-Immersion and Tele-Engineering. Although several works try to model and execute distributed continuous media scenarios, they are far from resolving the problem. The present paper proposes a viable solution based on the identification of logical dependencies. Our solution considers two main components. First, it establishes a temporal synchronization model that expresses all possible temporal scenarios for continuous media according to their causal dependency constraints. The second component is an innovative synchronization mechanism that accomplishes the reproduction of continuous media according to its temporal specification. We note that the present work does not require previous knowledge of when or for how long the continuous media of a temporal scenario is executed.
Download

Paper Nr: 99
Title:

A COMPONENT-BASED SOFTWARE ARCHITECTURE FOR REALTIME AUDIO PROCESSING SYSTEMS

Authors:

Jarmo Hiipakka

Abstract: This paper describes a new software architecture for audio signal processing. The architecture was specifically designed with low-latency, low-delay realtime applications in mind. Additionally, the frequently used paradigm of dividing the functionality into components that all share the same interface was adopted. The paper presents a systematic approach to structuring the processing inside the components by dividing the functionality into two groups of functions: realtime functions and control functions. The implementation options are also outlined, with short descriptions of two existing implementations of the architecture. An algorithm example highlighting the benefits of the architecture concludes the paper.
Download

Paper Nr: 100
Title:

DEVELOPMENT OF VOICE-BASED MULTIMODAL USER INTERFACES

Authors:

Claudia P. Sena and Celso Santos

Abstract: In recent decades, the evolution of user interfaces has made visual interfaces the standard, with the keyboard and mouse as the input devices most used for human-computer interaction. Integrating voice as an input modality into visual-only interfaces could overcome many of the limitations and problems of current human-computer interaction. One of the major issues that remains is how to integrate voice input into a graphical interface application. In this paper, we introduce a development method for multimodal interfaces combining voice and visual input/output. In order to evaluate the proposed approach, a multimodal interface for a video application was implemented and analysed.
Download

Paper Nr: 36
Title:

SPEAKER’S GENDER IDENTIFICATION FOR HUMAN-ROBOT INTERACTION

Authors:

Kyungsook Bae, Keunchang Kwak and Soo-young Chi

Abstract: This paper is concerned with text-independent Speaker’s Gender Identification (GI) for Human-Robot Interaction (HRI). For this purpose, we perform speaker gender recognition based on a Gaussian Mixture Model (GMM) and use the robot platform called WEVER, a Ubiquitous Robotic Companion (URC) intelligent service robot developed at the Intelligent Robot Research Division of the Electronics and Telecommunications Research Institute (ETRI). Furthermore, we communicate with intelligent service robots through Korean-based spontaneous speech recognition and text-independent speaker gender identification to provide a suitable service, such as the selection of a preferable TV channel or music for the identified speaker’s gender. The experimental results obtained for the ETRI speaker database reveal that the approach presented in this paper yields good identification performance (94.9%) within 3 meters.
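As an illustration of likelihood-based gender classification, here is a drastically reduced sketch with a single 1-D Gaussian per class; the real system uses Gaussian mixtures over spectral features, and the pitch feature, means and deviations below are all hypothetical values:

```python
# Minimal sketch: pick the class whose Gaussian model assigns the
# observation the higher log-likelihood. A GMM generalizes this to a
# weighted sum of Gaussians over multidimensional feature vectors.
import math

def gaussian_loglik(x, mean, std):
    """Log-likelihood of scalar x under a 1-D Gaussian N(mean, std^2)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

# Hypothetical class models: (mean, std) of average pitch in Hz.
MODELS = {"male": (120.0, 20.0), "female": (210.0, 30.0)}

def classify(pitch):
    """Maximum-likelihood decision between the two class models."""
    return max(MODELS, key=lambda g: gaussian_loglik(pitch, *MODELS[g]))

print(classify(110.0))  # → male
print(classify(220.0))  # → female
```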
Download

Paper Nr: 63
Title:

APPLICATION OF DYNAMICALLY RECONFIGURABLE PROCESSORS IN DIGITAL SIGNAL PROCESSING

Authors:

Hrvoje Mlinaric, Mario Kovac and Josip Knezovic

Abstract: The paper describes a new approach to processor construction that combines a general-purpose processor and a program-reconfigurable device, as well as its application in digital signal processing. Significant flexibility and adaptability of such a processor are obtained through the possibility of varying the components of the processor architecture. A simple change of architecture enables easy adaptation to the various applications in which such processors are used. Furthermore, to achieve even greater functionality, dynamic adjustment of the processor is enabled by allowing the function of individual processor components to change without the need to turn the processor off. The functionality change itself is conducted in a single clock cycle, which allows for great flexibility, increases the functionality, and enables simple implementation in various applications. Such processor architectures are broadly used in embedded computer systems for various multimedia, encryption and digital signal applications.
Download

Paper Nr: 77
Title:

PERSONAL SOUND BROWSER - A Collection of Tools to Search, Analyze and Collect Audio Files in a LAN and in the Internet

Authors:

Sergio Cavaliere and Carmine Colucci

Abstract: In this paper we present a toolbox for searching for audio files on the Internet, in a Local Area Network, or on a single computer. The search serves both to analyze the collected files and to populate a multimedia archive for further use or analysis. Tools to interface with a multimedia database and to analyze files are also provided. The toolbox is intended to be open, in the sense that any user may customize it at will by adding proprietary tools and methods; it is freely distributed and open to contributions. This goal was achieved by building a Matlab toolbox, which, as is well known, results in an open environment that anybody may customize. Research in music and sound browsing, analysis and classification is a large and open field in which many different solutions have been proposed in the literature, and deciding which sound parameters are suited to a given kind of search or classification is still an open problem. We therefore provide an open environment of open tools which, as an advantage over other tools in the literature, starts from the very first stage of the process: searching and browsing directly on the Internet. The language used also allows, as a further benefit, straightforward prototyping of new tools. Interested researchers are kindly invited to email the authors for the distribution of the toolbox.
Download

Paper Nr: 85
Title:

TAG INTERFACE FOR PERVASIVE COMPUTING - Paper Tag Interface using Image Code

Authors:

Dong-chul Kim, Jong-hoon Seo, Cheolho Cheong and Tack-Don Han

Abstract: Recently, computing environments have moved into the pervasive computing age with the rapid growth of the Internet and the appearance of various mobile devices. This means that computers can offer a more convenient life by linking physical objects with digital information. With this advance in computing environments, several research efforts are in progress on tag interfaces that link various physical objects to digital information; these are based either on image codes, such as barcodes, or on wireless technologies, such as RFID. RFID leads to high costs for reading and writing, because tags have to be purchased in the quantities needed and dedicated hardware is required. An image code, however, can be printed on paper and decoded with a camera connected to a computer, making it convenient and inexpensive to maintain. In this paper, we develop encoding and decoding algorithms for an image code and apply them to a tag interface for accessing information more easily and quickly in a pervasive computing environment.
Download

Paper Nr: 86
Title:

iTV MODEL - An HCI Based Model for the Planning, Development and Evaluation of iTV Applications

Authors:

Alcina Prata, Nuno Guimarães, Piet Kommers and Teresa Chambel

Abstract: This document describes a Model for the Planning, Development and Evaluation of iTV Viewer/User Interfaces. The motivations for the development of the Model and what is new about it are explained. Also mentioned are the models, methodologies, theories, guidelines, heuristics, design patterns, processes, “steps” and “tips” that were combined in order to achieve the presented Model. Some conclusions are presented and future lines of research are pointed out.
Download

Paper Nr: 89
Title:

CONTENT-BASED VISUAL RETRIEVAL ON MULTIPLE FEATURES IN THE IMAGE DATABASES OBTAINED FROM DICOM FILES

Authors:

Liana Stanescu and Dumitru Burdescu

Abstract: The paper presents the results of experiments conducted on the content-based visual query process applied to color medical images extracted from the DICOM files produced by medical tools. The color feature was considered first, and the study involved several quantization methods (the HSV color space at 166 colors, the RGB color space at 64 colors, and the CIE-LUV color space at 512 colors) and several methods of computing the dissimilarity between the query and target images (the Euclidean distance, the histogram intersection, and the quadratic distance between histograms). The content-based visual query on the color texture feature was tested using two important texture detection methods: co-occurrence matrices and Gabor filters. Also, the accuracy of the color set back-projection in detecting color regions representing sick tissue in medical images was studied. The resulting statistics encourage the use of this algorithm for keeping track of a patient's evolution under a given treatment, with good performance in both quality and speed.
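One of the dissimilarity measures mentioned, histogram intersection, is simple to state in code. The bin counts below are toy values, not DICOM-derived histograms, and the normalization by the query's total count is one common convention:

```python
# Minimal sketch: histogram intersection between a query and a target
# color histogram, normalized by the query's total count so the result
# lies in [0, 1], with 1 meaning the target covers the query entirely.

def histogram_intersection(query, target):
    """Sum of bin-wise minima, normalized by the query's total count."""
    return sum(min(q, t) for q, t in zip(query, target)) / sum(query)

q = [10, 20, 30, 40]  # query image histogram (e.g. 4 color bins)
t = [20, 20, 20, 40]  # target image histogram
print(histogram_intersection(q, t))  # → 0.9
```

Ranking candidate images by this score (descending) yields the nearest matches for a content-based query on the color feature.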
Download