Publications catalog - books
Computer Vision, Graphics and Image Processing: 5th Indian Conference, ICVGIP 2006, Madurai, India, December 13-16, 2006, Proceedings
Prem K. Kalra ; Shmuel Peleg (eds.)
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Not available.
Availability
Detected institution | Publication year | Browse | Download | Request |
---|---|---|---|---|
Not detected | 2006 | SpringerLink | | |
Information
Resource type:
books
Print ISBN
978-3-540-68301-8
Electronic ISBN
978-3-540-68302-5
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2006
Publication rights information
© Springer-Verlag Berlin Heidelberg 2006
Table of contents
doi: 10.1007/11949619_61
An Integrated Approach for Downscaling MPEG Video
Sudhir Porwal; Jayanta Mukherjee
Digital video databases are widely available in compressed format. In many applications, such as video browsing, picture-in-picture, and video conferencing, data transfer at a lower bit rate is required, which in turn requires downscaling of the video before transmission. The conventional spatial-domain approach for downscaling video is computationally very expensive. The computation can be greatly reduced if downscaling and inverse motion compensation (IMC) are performed in the Discrete Cosine Transform (DCT) domain. There are many algorithms in the literature for performing IMC in the DCT domain. In this paper, we propose an efficient integrated technique to perform IMC and downscaling in the DCT domain. This new approach yields a significant reduction in computational complexity.
- Compression | Pp. 686-695
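The low-frequency truncation at the heart of DCT-domain downscaling can be illustrated in a few lines of NumPy/SciPy. This is a minimal sketch of the standard trick (keep the 4x4 low-frequency corner of each 8x8 DCT block and rescale), not the paper's integrated IMC-plus-downscaling algorithm; the function and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn

def downscale_block_dct(dct8):
    """Downscale one 8x8 block by 2 directly in the DCT domain: keep the
    4x4 low-frequency corner of the 8x8 DCT coefficients and rescale.
    The result is already the 4x4 DCT of the downscaled block."""
    return dct8[:4, :4] * 0.5          # 0.5 = sqrt(4*4 / (8*8)) for orthonormal DCT

# Quick check against the pixel domain for one random block.
block = np.random.rand(8, 8)
dct8 = dctn(block, norm='ortho')       # 8x8 DCT coefficients
small_dct = downscale_block_dct(dct8)  # 4x4 DCT of the downscaled block
small = idctn(small_dct, norm='ortho') # 4x4 spatial approximation of the downscaled block
```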
doi: 10.1007/11949619_62
DCT Domain Transcoding of H.264/AVC Video into MPEG-2 Video
Vasant Patil; Tummala Kalyani; Atul Bhartia; Rajeev Kumar; Jayanta Mukherjee
As the number of different video compression standards increases, there is a growing need for conversion between video formats coded in different standards. H.264/AVC is a newly emerging video coding standard that achieves better video quality at a lower bit rate than other standards. However, the standalone media players available in the market do not support H.264 video playback. In this paper, we present novel techniques that convert video pre-coded in the H.264/AVC standard to the MPEG-2 standard directly in the compressed domain. Experimental results show that the proposed approach can produce transcoded video with quality comparable to the pixel-domain approach at a significantly reduced cost.
- Compression | Pp. 696-707
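One building block of such compressed-domain transcoding is converting the coefficients of four 4x4 transform blocks into the coefficients of a single 8x8 DCT block without going back to pixels. The sketch below illustrates the idea using the orthonormal 4x4 DCT as a stand-in for the H.264 integer transform (the paper's actual kernels and scaling differ); the conversion matrices `A[i]` are assumed names and are precomputed once.

```python
import numpy as np
from scipy.fft import dct

def dct_matrix(n):
    # Orthonormal DCT-II matrix: dct_matrix(n) @ x computes the DCT of x.
    return dct(np.eye(n), norm='ortho', axis=0)

T4, T8 = dct_matrix(4), dct_matrix(8)
# A[i] maps a 4x4 coefficient block in row-block position i (0 = top half,
# 1 = bottom half) into the 8x8 DCT basis.
A = [T8[:, 0:4] @ T4.T, T8[:, 4:8] @ T4.T]

def four_4x4_to_8x8(C):
    """C[i][j] is the 4x4 DCT of the (i, j) quadrant of an 8x8 block.
    Returns the 8x8 DCT of the whole block, computed purely by matrix
    products on the coefficients (no explicit pixel reconstruction)."""
    return sum(A[i] @ C[i][j] @ A[j].T for i in range(2) for j in range(2))
```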
doi: 10.1007/11949619_63
Adaptive Scalable Wavelet Difference Reduction Method for Efficient Image Transmission
T. S. Bindulal; M. R. Kaimal
This paper presents a scalable image transmission scheme based on a wavelet-based coding technique that supports region-of-interest properties. The proposed scheme, scalable WDR (SWDR), is based on the wavelet difference reduction scheme; it progresses adaptively to produce images at different resolutions at any required bit rate, and it supports both spatial and SNR scalability. The method is developed for limited-bandwidth networks, where image quality and data compression are most important. Simulations are performed on medical images, satellite images, and standard test images such as Barbara and fingerprint images. The simulation results show that the proposed scheme is up to 20-40% better than other well-known scalable schemes, such as scalable SPIHT, in terms of signal-to-noise ratio (dB), and reduces execution time by around 40% across resolutions. The proposed scalable coding scheme thus becomes increasingly important.
- Compression | Pp. 708-717
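The "difference reduction" idea, coding the gaps between successive significant coefficients rather than their absolute positions, is easy to sketch. The pass below is a simplified illustration of one significance pass at a single threshold; the real WDR/SWDR coder adds binary reduction of the gap values, refinement passes, and the scalability machinery described above.

```python
import numpy as np

def significance_pass(coeffs, threshold):
    """Emit (gap, sign) pairs for coefficients significant at this threshold,
    where `gap` is the distance from the previously coded position in the
    1-based scan order, as in wavelet difference reduction."""
    out, last = [], 0
    for i, c in enumerate(coeffs.ravel(), start=1):
        if abs(c) >= threshold:
            out.append((i - last, 1 if c >= 0 else -1))
            last = i
    return out
```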
doi: 10.1007/11949619_64
GAP-RBF Based NR Image Quality Measurement for JPEG Coded Images
R. Venkatesh Babu; S. Suresh
In this paper, we present a growing and pruning radial basis function (GAP-RBF) based no-reference (NR) image quality model for JPEG-coded images. The quality of the images is estimated without reference to their original versions. The features for predicting the perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity, and background luminance. Image quality estimation involves computing the functional relationship between HVS features and subjective test scores. Here, the problem of quality estimation is transformed into a function approximation problem and solved using a GAP-RBF network, which uses a sequential learning algorithm to approximate the functional relationship. The computational complexity and memory requirements of the GAP-RBF algorithm are lower than those of batch learning algorithms. The GAP-RBF algorithm also finds a compact image quality model and does not require retraining when new image samples are presented. Experimental results show that the GAP-RBF image quality model emulates the mean opinion score (MOS). The subjective test results of the proposed metric are compared with a JPEG no-reference image quality index as well as the full-reference structural similarity image quality index, and the proposed metric is observed to outperform both.
- Compression | Pp. 718-727
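The function-approximation framing, mapping a handful of HVS features to a subjective score, can be illustrated with a plain Gaussian RBF network fitted in batch mode. This is only a stand-in: the paper's GAP-RBF grows and prunes hidden units with a sequential learning rule, which is not reproduced here, and the feature extraction itself is omitted. Names and hyperparameters are assumptions.

```python
import numpy as np

def fit_rbf(X, y, centres, width=1.0, reg=1e-3):
    """Batch least-squares fit of a Gaussian RBF network mapping feature
    vectors X (e.g. edge amplitude, edge length, background activity,
    background luminance) to subjective scores y.  Returns a predictor."""
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    Phi = np.exp(-d**2 / (2 * width**2))                       # design matrix
    w = np.linalg.solve(Phi.T @ Phi + reg * np.eye(len(centres)), Phi.T @ y)

    def predict(x):
        phi = np.exp(-np.linalg.norm(x - centres, axis=1)**2 / (2 * width**2))
        return float(phi @ w)
    return predict
```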
doi: 10.1007/11949619_65
A Novel Error Resilient Temporal Adjacency Based Adaptive Multiple State Video Coding over Error Prone Channels
M. Ragunathan; C. Mala
Video streaming applications are rapidly gaining interest in areas ranging from entertainment to e-learning. In practice, these applications suffer from inevitable losses in the transmission channels, so improving the quality of video streaming over error-prone channels is a challenging task. Multiple Description Coding (MDC) is a promising error-resilient coding scheme that sends two or more descriptions of the source to the receiver; the more descriptions received, the lower the reconstruction distortion at the receiver. Multiple State Video Coding (MSVC) is an MDC scheme based on frame-wise splitting of the video sequence into two or more sub-sequences. Each of these sub-sequences is encoded separately to generate descriptions, which can be decoded independently on reception. Basic MSVC separates the frames of a video into odd and even frames and sends each part over a different path. The drawbacks of basic MSVC, such as the lack of a meaningful basis for the frame-wise splitting, its inability to support adaptive streaming effectively, and its limited error resiliency, are brought out and discussed. To overcome these drawbacks and improve the quality of video streaming, this paper proposes a novel MSVC scheme based on the temporal adjacency between video frames. This temporal-adjacency-based splitting of the video stream into N sub-sequences also enables the proposed scheme to adapt effectively to varying bandwidths in heterogeneous environments. The simulation results show that the proposed scheme also outperforms the Single State Video Coding (SSVC) scheme in terms of the perceived quality of the reconstructed video sequence under various loss scenarios.
- Compression | Pp. 728-737
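The frame-wise splitting itself is simple to sketch: temporally adjacent frames are routed to different descriptions, which for N = 2 reduces to the classic odd/even MSVC split. The reassembly and concealment logic below is only hinted at; the actual encoder/decoder design is the paper's contribution, and the function names are assumptions.

```python
def split_into_descriptions(frames, n):
    """Round-robin, temporal-adjacency-based split of a frame list into n
    sub-sequences; each is encoded and transmitted independently."""
    return [frames[i::n] for i in range(n)]

def merge_descriptions(received):
    """Re-interleave the received sub-sequences into display order.  A lost
    description appears as None; a real decoder would conceal its frames
    from the temporally adjacent frames in the surviving descriptions."""
    out, streams = [], [s for s in received if s is not None]
    length = max(len(s) for s in streams)
    for i in range(length):
        for s in streams:
            if i < len(s):
                out.append(s[i])
    return out
```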
doi: 10.1007/11949619_66
Adaptive Data Hiding in Compressed Video Domain
Arijit Sur; Jayanta Mukherjee
In this paper we propose a new adaptive, block-based, compressed-domain data hiding scheme that can embed a relatively large number of secret bits in video without significant perceptual distortion. Macroblocks are selected for embedding on the basis of low inter-frame velocity; from this subset, the blocks with high prediction error are selected for embedding. The embedding is done by modifying the quantized DCT AC coefficients in the compressed domain. The number of coefficients (both zero and non-zero) used in embedding is adaptively determined using the relative strength of the prediction-error block. Experimental results show that this blind scheme can embed a relatively large number of bits without significantly degrading video quality with respect to the Human Visual System (HVS).
- Compression | Pp. 738-748
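A minimal sketch of the embedding step, assuming the quantized DCT coefficients of each candidate macroblock are available in scan order along with its motion magnitude and prediction-error strength. The selection thresholds, the parity rule, and the way the coefficient count scales with the error strength are illustrative assumptions, not the paper's exact criteria.

```python
import numpy as np

def embed_bits(blocks, velocity, pred_error, bits,
               vel_thresh=1.0, err_thresh=50.0):
    """blocks: list of 8x8 integer arrays of quantized DCT coefficients.
    Hides one bit per used AC coefficient by forcing its parity, only in
    blocks with low inter-frame velocity and high prediction error."""
    it = iter(bits)
    for k, block in enumerate(blocks):
        if velocity[k] > vel_thresh or pred_error[k] < err_thresh:
            continue                           # perceptually risky block: skip
        flat = block.reshape(-1)               # zig-zag scan order assumed
        n_use = min(flat.size - 1, int(pred_error[k] // err_thresh))
        for i in range(1, n_use + 1):          # skip the DC coefficient
            try:
                b = next(it)
            except StopIteration:
                return blocks                  # all payload bits embedded
            c = int(flat[i])
            if (c & 1) != b:                   # adjust parity, away from zero
                flat[i] = c + 1 if c >= 0 else c - 1
    return blocks
```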
doi: 10.1007/11949619_67
Learning Segmentation of Documents with Complex Scripts
K. S. Sesh Kumar; Anoop M. Namboodiri; C. V. Jawahar
Most state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds while assuming a simple script structure such as that of the Roman script. They perform poorly when used with Indian languages, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information from the script structure in the objective function of the cut. We show that this information can be learned automatically and adapted within a collection of documents (a book) and across collections to achieve accurate segmentation. We show results on Indian-language documents in Telugu script. The approach is also applicable to other languages with complex scripts, such as Bangla, Kannada, Malayalam, and Urdu.
- Document Processing/OCR | Pp. 749-760
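Posing segmentation as a graph cut can be illustrated with a tiny binary (text/background) s-t min-cut over a pixel grid. This generic formulation only shows where a learned script-structure prior would enter (in the unary and pairwise costs); it is not the paper's actual objective or graph construction, and all names are assumptions.

```python
import numpy as np
import networkx as nx

def graphcut_text_mask(cost_text, cost_bg, smoothness=1.0):
    """Binary labelling by s-t minimum cut.  cost_text[y, x] / cost_bg[y, x]
    are per-pixel costs of labelling a pixel text / background (where a
    learned script prior would be injected); `smoothness` penalises
    differing labels on 4-neighbours.  Returns a boolean text mask."""
    h, w = cost_text.shape
    G = nx.DiGraph()
    for y in range(h):
        for x in range(w):
            p = (y, x)
            G.add_edge('s', p, capacity=float(cost_bg[y, x]))    # paid if p labelled background
            G.add_edge(p, 't', capacity=float(cost_text[y, x]))  # paid if p labelled text
            for q in ((y, x + 1), (y + 1, x)):
                if q[0] < h and q[1] < w:
                    G.add_edge(p, q, capacity=smoothness)
                    G.add_edge(q, p, capacity=smoothness)
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    mask = np.zeros((h, w), dtype=bool)
    for p in source_side - {'s'}:
        mask[p] = True                    # source side = text label
    return mask
```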
doi: 10.1007/11949619_68
Machine Learning for Signature Verification
Harish Srinivasan; Sargur N. Srihari; Matthew J. Beal
Signature verification is a common task in forensic document analysis: determining whether a questioned signature matches known signature samples. From the viewpoint of automation, it can be viewed as a task that involves machine learning from a population of signatures. There are two types of learning to be accomplished. In the first, the training set consists of genuine signatures and forgeries from a general population; in the second, it consists of genuine signatures of the individual in a given case. The two learning tasks are called person-independent (or general) learning and person-dependent (or special) learning. General learning works from a population of genuine and forged signatures of several individuals, where the differences between genuine signatures and forgeries across all individuals are learnt. The general learning model allows a questioned signature to be compared to a single genuine signature. In special learning, a person’s signature is learnt from multiple samples of only that person’s signature, where within-person similarities are learnt. When a sufficient number of samples is available, special learning performs better than general learning (5% higher accuracy). With special learning, verification accuracy increases with the number of samples.
- Document Processing/OCR | Pp. 761-775
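The distinction between general and special learning can be sketched with a simple distance-threshold verifier over whatever signature features are used. The features, the distance, and the thresholding rules here are illustrative assumptions; the paper's actual models are richer.

```python
import numpy as np

def general_threshold(genuine_d, forgery_d):
    """Person-independent (general) learning: pick one threshold on
    (questioned, known) distances that best separates genuine pairs from
    forgery pairs pooled over many writers."""
    cands = np.sort(np.concatenate([genuine_d, forgery_d]))
    accs = [(np.mean(genuine_d <= t) + np.mean(forgery_d > t)) / 2 for t in cands]
    return float(cands[int(np.argmax(accs))])

def special_threshold(known_feats, k=2.0):
    """Person-dependent (special) learning: model within-person variability
    from the writer's own genuine samples only (mean pairwise distance
    plus k standard deviations)."""
    d = [np.linalg.norm(a - b) for i, a in enumerate(known_feats)
         for b in known_feats[i + 1:]]
    return float(np.mean(d) + k * np.std(d))

def verify(questioned, known_feats, threshold):
    """Accept if the questioned signature is close enough to a known sample."""
    dists = [np.linalg.norm(questioned - f) for f in known_feats]
    return min(dists) <= threshold
```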
doi: 10.1007/11949619_69
Text Localization and Extraction from Complex Gray Images
Farshad Nourbakhsh; Peeta Basa Pati; A. G. Ramakrishnan
We propose two texture-based approaches, one involving Gabor filters and the other employing log-polar wavelets, for separating text from non-text elements in a document image. Both proposed algorithms compute local energy at a set of information-rich points marked by the Harris corner detector. The advantage of this approach is that the local energy is calculated only at selected points rather than throughout the image, saving considerable computation time. The algorithms have been tested on a large set of scanned text pages, and the results are better than those of existing algorithms. Among the proposed schemes, the Gabor-filter-based scheme marginally outperforms the wavelet-based scheme.
- Document Processing/OCR | Pp. 776-785
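The "local energy at corner points" idea can be sketched with OpenCV: detect Harris corners, then sample the energy of a small Gabor filter bank only at those points. Filter parameters, the corner-selection quantile, and the final text/non-text decision rule are assumptions for illustration.

```python
import cv2
import numpy as np

def text_energy_at_corners(gray, n_orientations=4, corner_quantile=99.0):
    """Return (y, x, local Gabor energy) for Harris corner points of a
    grayscale page image; a classifier or threshold on the energy values
    would then separate text from non-text regions."""
    gray = np.float32(gray)
    harris = cv2.cornerHarris(gray, blockSize=3, ksize=3, k=0.04)
    ys, xs = np.where(harris > np.percentile(harris, corner_quantile))

    # Small bank of Gabor filters at evenly spaced orientations.
    responses = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kern = cv2.getGaborKernel((15, 15), sigma=3.0, theta=theta,
                                  lambd=8.0, gamma=0.5, psi=0)
        responses.append(cv2.filter2D(gray, cv2.CV_32F, kern) ** 2)
    energy = np.sqrt(sum(responses))

    # Energy is sampled only at the selected corner points, not everywhere.
    return list(zip(ys, xs, energy[ys, xs]))
```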
doi: 10.1007/11949619_70
OCR of Printed Telugu Text with High Recognition Accuracies
C. Vasantha Lakshmi; Ritu Jain; C. Patvardhan
Telugu is one of the oldest and most popular languages of India, spoken by more than 66 million people, mainly in South India. The development of Optical Character Recognition (OCR) systems for Telugu text is an area of current research.
OCR of Indian scripts is much more complicated than OCR of the Roman script because of the huge number of combinations of characters and modifiers. Basic symbols are identified as the unit of recognition in Telugu script, and edge histograms are used in a feature-based recognition scheme for these basic symbols. During recognition, it is observed that, in many cases, the recognizer incorrectly outputs a very similar-looking symbol. Special logic and algorithms based on simple structural features are developed to improve recognition accuracy considerably without much additional computational effort. It is shown that recognition accuracies of 98.5% can be achieved on laser-quality prints with this procedure.
- Document Processing/OCR | Pp. 786-795
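An edge-histogram feature for one basic symbol, and nearest-prototype recognition on top of it, can be sketched as below. The bin count, gradient operator, and nearest-neighbour matching are illustrative assumptions; the paper's recognizer and its structural-feature disambiguation logic are not reproduced here.

```python
import cv2
import numpy as np

def edge_histogram(glyph, n_bins=8):
    """Edge-orientation histogram feature for one basic-symbol image,
    weighted by gradient magnitude and normalised to sum to one."""
    gx = cv2.Sobel(np.float32(glyph), cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(np.float32(glyph), cv2.CV_32F, 0, 1, ksize=3)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # edge orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (hist.sum() + 1e-9)

def recognise(glyph, prototypes):
    """Nearest-prototype classification; `prototypes` maps a symbol label
    to its stored edge-histogram feature vector."""
    f = edge_histogram(glyph)
    return min(prototypes, key=lambda s: np.linalg.norm(prototypes[s] - f))
```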