Publications catalogue - books



Document Analysis Systems VII: 7th International Workshop, DAS 2006, Nelson, New Zealand, February 13-15, 2006, Proceedings

Horst Bunke; A. Lawrence Spitz (eds.)

Conference: 7th International Workshop on Document Analysis Systems (DAS). Nelson, New Zealand. February 13-15, 2006

Abstract/Description – provided by the publisher

Not available.

Keywords – provided by the publisher

Database Management; Pattern Recognition; Information Storage and Retrieval; Image Processing and Computer Vision; Simulation and Modeling; Computer Appl. in Administrative Data Processing

Availability

Detected institution: none. Year of publication: 2006. Online access: SpringerLink.

Information

Resource type:

books

Print ISBN

978-3-540-32140-8

Electronic ISBN

978-3-540-32157-6

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

2006
Publication rights information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

Language Identification in Degraded and Distorted Document Images

Shijian Lu; Chew Lim Tan; Weihua Huang

This paper presents a language identification technique that differentiates Latin-based languages in degraded and distorted document images. Unlike reported methods that transform word images through a character shape coding process, our method directly captures word shapes with the local extremum points and the horizontal intersection numbers, which are both tolerant of noise, character segmentation errors, and slight skew distortions. For each language studied, a word shape template and a word frequency template are first constructed based on the proposed word shape coding scheme. Identification is then accomplished based on the Bray-Curtis or Hamming distance between the word shape code of query images and the constructed word shape and frequency templates. Experiments show that the average identification rate across eight Latin-based languages exceeds 99%. ...

Keywords: Text Image; Query Image; Document Image; Text Line; Word Image.

- Session 7: Language and Script Identification | Pp. 232-242
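
As a purely illustrative aside, the matching step described in the abstract above can be made concrete with a short Python sketch; the histogram representation of word shape codes and the template structure here are assumptions, not details from the paper.

    import numpy as np

    def bray_curtis(u, v):
        """Bray-Curtis dissimilarity between two frequency histograms."""
        u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
        return np.abs(u - v).sum() / (u + v).sum()

    def identify_language(query_hist, templates):
        """Return the language whose template histogram is closest to the query."""
        return min(templates, key=lambda lang: bray_curtis(query_hist, templates[lang]))

    # Hypothetical word-shape-code histograms for two language templates.
    templates = {"english": np.array([5, 2, 9, 1]), "french": np.array([3, 6, 2, 7])}
    print(identify_language(np.array([4, 2, 8, 2]), templates))  # -> english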

Bangla/English Script Identification Based on Analysis of Connected Component Profiles

Lijun Zhou; Yue Lu; Chew Lim Tan

Script identification is required for a multilingual OCR system. In this paper, we present a novel and efficient technique for Bangla/English script identification, with application to the destination address blocks of Bangladesh envelope images. The proposed approach is based on the analysis of connected component profiles extracted from the destination address block images; it places no emphasis on the information provided by individual characters themselves and does not require any character or line segmentation. Experimental results demonstrate that the proposed technique is capable of identifying Bangla/English scripts on real Bangladesh postal images.

Keywords: Document Image; Text Line; Text Block; Handwritten Text; Postal Stamp.

- Session 7: Language and Script Identification | Pp. 243-254
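
The abstract leaves the exact profile features unspecified; the sketch below, using SciPy's connected component labelling, shows one plausible (assumed) way to collect simple per-component profiles from a binarized address block.

    import numpy as np
    from scipy import ndimage

    def component_profiles(binary_img):
        """Label connected components and collect per-component bounding-box
        profiles (height, width, aspect ratio); illustrative features only."""
        labels, _ = ndimage.label(binary_img)
        profiles = []
        for sl in ndimage.find_objects(labels):
            h = sl[0].stop - sl[0].start
            w = sl[1].stop - sl[1].start
            profiles.append((h, w, w / h))
        return np.array(profiles)

    img = np.zeros((20, 40), dtype=int)
    img[2:8, 3:6] = 1       # tall, narrow component
    img[10:13, 10:30] = 1   # wide, flat component
    print(component_profiles(img))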

Script Identification from Indian Documents

Gopal Datt Joshi; Saurabh Garg; Jayanthi Sivaswamy

Automatic identification of the script in a given document image facilitates many important applications, such as automatic archiving of multilingual documents, searching online archives of document images, and the selection of script-specific OCR in a multilingual environment. In this paper, we present a scheme to identify different Indian scripts from a document image. The scheme employs hierarchical classification using features consistent with human perception. These features are extracted from the responses of a multi-channel log-Gabor filter bank, designed at an optimal scale and multiple orientations. In the first stage, the classifier groups the scripts into five major classes using global features. At the next stage, a sub-classification is performed based on script-specific features. All features are extracted globally from a given text block, which avoids any complex segmentation of the document image into lines and characters. The scheme is thus efficient and can be used for many practical applications that require processing large volumes of data. It has been tested on 10 Indian scripts and found to be robust to skew introduced during scanning and relatively insensitive to changes in font size. The proposed system achieves an overall classification accuracy of 97.11% on a large test data set. These results serve to establish the utility of a global approach to script classification.

Keywords: Document Image; Text Line; Optimal Scale; Character Segmentation; Text Block.

- Session 7: Language and Script Identification | Pp. 255-267
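
To make the feature extraction stage concrete, here is a single-scale, multi-orientation log-Gabor energy sketch in NumPy; the parameter values are placeholders, not the optimal scale and orientations tuned in the paper.

    import numpy as np

    def log_gabor_energies(img, f0=0.1, sigma_f=0.55, n_orient=6):
        """Mean response magnitude of a one-scale, n_orient-orientation
        log-Gabor filter bank, applied in the frequency domain."""
        rows, cols = img.shape
        fy = np.fft.fftfreq(rows)[:, None]
        fx = np.fft.fftfreq(cols)[None, :]
        radius = np.hypot(fx, fy)
        radius[0, 0] = 1.0                              # avoid log(0) at DC
        theta = np.arctan2(fy, fx)
        radial = np.exp(-np.log(radius / f0) ** 2 / (2 * np.log(sigma_f) ** 2))
        radial[0, 0] = 0.0                              # suppress the DC term
        spectrum = np.fft.fft2(img)
        feats = []
        for k in range(n_orient):
            phi = k * np.pi / n_orient
            d = np.arctan2(np.sin(theta - phi), np.cos(theta - phi))
            angular = np.exp(-d ** 2 / (2 * (np.pi / n_orient) ** 2))
            response = np.fft.ifft2(spectrum * radial * angular)
            feats.append(np.abs(response).mean())       # one energy per channel
        return np.array(feats)

    stripes = np.zeros((64, 64)); stripes[::8, :] = 1.0
    print(log_gabor_energies(stripes).round(4))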

Finding the Best-Fit Bounding-Boxes

Bo Yuan; Leong Keong Kwoh; Chew Lim Tan

The bounding-box of a geometric shape in 2D is the rectangle with the smallest area in a given orientation (usually upright) that completely contains the shape. The best-fit bounding-box is the smallest bounding-box among all possible orientations for the same shape. In the context of document image analysis, the shapes can be characters (individual components) or paragraphs (component groups). This paper presents a search algorithm for the best-fit bounding-boxes of textual component groups, whose shapes are customarily rectangular in almost all languages. One application of best-fit bounding-boxes is skew estimation from the text blocks in document images. This approach is capable of multi-skew estimation and localization, and can process documents with sparse text regions. The University of Washington English Document Image Database (UW-I) is used to verify the skew estimation method directly and the proposed best-fit bounding-box algorithm indirectly.

Keywords: Document Image; Component Group; Text Block; Fiducial Point; Orientation Histogram.

- Session 7: Language and Script Identification | Pp. 268-279
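
The paper's actual search algorithm is not given in the abstract; the naive brute-force scan below is only a stand-in that makes the definition of the best-fit bounding-box, and its use for skew estimation, concrete.

    import numpy as np

    def best_fit_bbox(points, step_deg=0.5):
        """Scan orientations and return (min_area, angle_deg) of the
        smallest axis-aligned bounding box of the rotated point set."""
        pts = np.asarray(points, dtype=float)
        best_area, best_angle = np.inf, 0.0
        for deg in np.arange(0.0, 90.0, step_deg):      # box repeats every 90 deg
            a = np.radians(deg)
            rot = np.array([[np.cos(a), -np.sin(a)],
                            [np.sin(a),  np.cos(a)]])
            r = pts @ rot.T
            w, h = r.max(axis=0) - r.min(axis=0)
            if w * h < best_area:
                best_area, best_angle = w * h, deg
        return best_area, best_angle                    # angle estimates the skew

    # A 10x2 rectangle of points, rotated by 15 degrees.
    rect = np.array([(x, y) for x in range(11) for y in (0, 2)], dtype=float)
    a = np.radians(15.0)
    rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    print(best_fit_bbox(rect @ rot.T))                  # -> (~20.0, 75.0)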

Towards Versatile Document Analysis Systems

Henry S. Baird; Matthew R. Casey

The research goal of highly versatile document analysis systems, capable of performing useful functions on the great majority of document images, seems to be receding, even after decades of research. One family of nearly universally applicable capabilities includes document image content extraction tools able to locate regions containing handwriting, machine-print text, graphics, line art, logos, photographs, noise, etc. Solving this problem in its full generality requires coping with a vast diversity of document and image types. The severity of the methodological problems is suggested by the lack of agreement within the R&D community on even what is meant by a representative set of samples in this context. Even when this is agreed, it is often not clear how sufficiently large sets for training and testing can be collected and ground-truthed. Perhaps this can be alleviated by discovering a principled way to amplify sample sets using synthetic variations. We will then need classification methodologies capable of learning automatically from these huge sample sets in spite of their poorly parameterized (or unparameterizable) distributions. Perhaps fast expected-time approximate k-nearest neighbors classifiers are a good solution, even if they tend to require enormous data structures: hashed k-d trees seem promising. We discuss these issues and report recent progress towards their resolution.

Keywords: versatile document analysis systems, DAS methodology, document image content extraction, classification, k Nearest Neighbors, k-d trees, CART, spatial data structures, computational geometry, hashing.

Keywords: Training Sample; Document Image; Content Type; Radius Search; Synthetic Variation.

- Session 9: Systems and Performance Evaluation | Pp. 280-290
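
The abstract mentions hashed k-d trees; those are not in standard libraries, but SciPy's ordinary k-d tree with a nonzero `eps` gives approximate nearest neighbors and is enough to sketch the classification pattern (all data below is synthetic).

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    X = rng.random((100_000, 16))                # synthetic patch feature vectors
    labels = rng.integers(0, 5, size=len(X))     # 5 content types (print, photo, ...)

    tree = cKDTree(X)

    def classify(vec, k=5, eps=0.5):
        """Vote among approximate k nearest neighbors; eps > 0 trades
        exactness for query speed."""
        _, idx = tree.query(vec, k=k, eps=eps)
        return np.bincount(labels[idx]).argmax()

    print(classify(rng.random(16)))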

Exploratory Analysis System for Semi-structured Engineering Logs

Michael Flaster; Bruce Hillyer; Tin Kam Ho

Engineering diagnosis often involves analyzing complex records of system states printed to large, textual log files. Typically the logs are designed to accommodate the widest debugging needs, without a rigorous plan for formatting. As a result, critical quantities and flags are mixed with less important messages in a loose structure. Once the system is sealed, the log format cannot be changed, causing great difficulty for the technicians who need to understand the event correlations. We describe a modular system for analyzing such logs in which document analysis, report generation, and data exploration tools are factored into generic, reusable components and domain-dependent, isolated plug-ins. The system supports incremental, focused analysis of complicated symptoms with minimal programming effort and software installation. We discuss important concerns in the analysis of logs that set it apart from understanding natural language text or rigorously structured computer programs. We highlight the research challenges that would guide the development of a deep analysis system for many kinds of semi-structured documents.

Keywords: Document Image; Text Line; Optical Character Recognition; Text Block; Reusable Component.

- Session 9: Systems and Performance Evaluation | Pp. 291-301
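
As one hypothetical reading of the generic-engine-plus-plug-ins design, the sketch below separates a reusable scanning loop from domain-specific regex plug-ins; the field names and log format are invented for illustration.

    import re

    def make_plugin(name, pattern):
        """A domain-specific plug-in: a name plus a regex with one group."""
        return (name, re.compile(pattern))

    def analyze(lines, plugins):
        """Generic, reusable engine: route every line through every plug-in."""
        report = {name: [] for name, _ in plugins}
        for line in lines:
            for name, regex in plugins:
                m = regex.search(line)
                if m:
                    report[name].append(m.group(1))
        return report

    log = ["2006-01-01 temp=71.3 state=OK",
           "2006-01-02 temp=94.8 state=ALARM"]
    plugins = [make_plugin("temperature", r"temp=([\d.]+)"),
               make_plugin("state", r"state=(\w+)")]
    print(analyze(log, plugins))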

Ground Truth for Layout Analysis Performance Evaluation

A. Antonacopoulos; D. Karatzas; D. Bridson

Over the past two decades a significant number of layout analysis (page segmentation and region classification) approaches have been proposed in the literature. Each approach has been devised for and/or evaluated using (usually small) application-specific datasets. While the need for objective performance evaluation of layout analysis algorithms is evident, there does not exist a suitable dataset with ground truth that reflects the realities of everyday documents (widely varying layouts, complex entities, colour, noise etc.). The most significant impediment is the creation of accurate and flexible (in representation) ground truth, a task that is costly and must be carefully designed. This paper discusses the issues related to the design, representation and creation of ground truth in the context of a realistic dataset developed by the authors. The effectiveness of the ground truth discussed in this paper has been successfully shown in its use for two international page segmentation competitions (ICDAR2003 and ICDAR2005).

Keywords: Ground Truth; Document Image; Text Region; Document Type Definition; Connected Component Analysis.

- Session 9: Systems and Performance Evaluation | Pp. 302-311
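
Real layout ground truth, such as that used in the ICDAR competitions mentioned above, describes regions as polygons with rich attributes; the rectangle-and-overlap sketch below is only a simplified illustration of how such ground truth supports evaluation.

    from dataclasses import dataclass

    @dataclass
    class Region:
        x0: float; y0: float; x1: float; y1: float
        kind: str                      # e.g. "text", "graphic", "noise"

    def iou(a, b):
        """Intersection-over-union of two axis-aligned regions."""
        ix = max(0.0, min(a.x1, b.x1) - max(a.x0, b.x0))
        iy = max(0.0, min(a.y1, b.y1) - max(a.y0, b.y0))
        inter = ix * iy
        union = ((a.x1 - a.x0) * (a.y1 - a.y0)
                 + (b.x1 - b.x0) * (b.y1 - b.y0) - inter)
        return inter / union if union else 0.0

    def match_rate(ground_truth, detected, thresh=0.9):
        """Share of ground-truth regions matched by a same-type detection."""
        hits = sum(any(g.kind == d.kind and iou(g, d) >= thresh for d in detected)
                   for g in ground_truth)
        return hits / len(ground_truth)

    gt = [Region(0, 0, 100, 40, "text"), Region(0, 50, 100, 90, "graphic")]
    det = [Region(1, 0, 100, 41, "text")]
    print(match_rate(gt, det))         # -> 0.5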

On Benchmarking of Invoice Analysis Systems

Bertin Klein; Stefan Agne; Andreas Dengel

An approach is presented to guide the benchmarking of invoice analysis systems, a specific, applied subclass of document analysis systems. The state of the art in benchmarking document analysis systems is presented, based on the processing levels: Document Page Segmentation, Text Recognition, Document Classification, and Information Extraction. The restriction to invoices enables, and requires, a more purposeful, i.e. detailed, targeting of the benchmarking procedures (acquisition of ground truth data, system runs, comparison of data, condensation into meaningful numbers). The processing of invoices is therefore dissected, and the data structures involved are elicited and presented. These are provided as the building blocks of the actual benchmarking of invoice analysis systems.

Keywords: IEEE Computer Society; Document Image; Ground Truth Data; Private Person; Payment Data.

- Session 9: Systems and Performance Evaluation | Pp. 312-323
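
At the Information Extraction level, benchmarking reduces to comparing extracted fields against ground truth; the field names and scoring below are an illustrative assumption, not the paper's actual data structures.

    def extraction_scores(truth, extracted):
        """Field-level precision and recall for invoice information extraction."""
        correct = sum(1 for k, v in extracted.items() if truth.get(k) == v)
        precision = correct / len(extracted) if extracted else 0.0
        recall = correct / len(truth) if truth else 0.0
        return precision, recall

    truth = {"invoice_no": "4711", "total": "1999.00", "due_date": "2006-03-01"}
    found = {"invoice_no": "4711", "total": "1999.00"}
    print(extraction_scores(truth, found))   # -> (1.0, 0.666...)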

Semi-automatic Ground Truth Generation for Chart Image Recognition

Li Yang; Weihua Huang; Chew Lim Tan

While research on scientific chart recognition is being carried out, there is no suitable standard for evaluating the overall performance of chart recognition results. In this paper, a system for semi-automatic chart ground truth generation is introduced. Using the system, the user is able to extract multiple levels of ground truth data. The role of the user is to perform verification and correction and to input values where necessary. The system carries out automatic tasks such as text block detection and line detection, which effectively reduces the time needed to generate ground truth data compared to fully manual processing. We tested the system on 115 images. The images and the generated ground truth data are available to the public.

Keywords: Ground Truth; Feature Point; Text Component; Ground Truth Data; Line Detection.

- Session 9: Systems and Performance Evaluation | Pp. 324-335
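
One way to picture "multiple levels of ground truth" is a nested record per chart image; the schema below is hypothetical, not the system's actual format.

    import json

    record = {
        "image": "chart_001.png",                 # hypothetical file name
        "chart_type": "bar",
        "text_blocks": [                          # level 1: detected text
            {"bbox": [12, 4, 180, 22], "text": "Sales by quarter"},
        ],
        "axes": {                                 # level 2: chart structure
            "x": {"title": "Quarter"}, "y": {"title": "Units"},
        },
        "data": [                                 # level 3: recovered values
            {"label": "Q1", "value": 120}, {"label": "Q2", "value": 95},
        ],
    }
    print(json.dumps(record, indent=2))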

Efficient Word Retrieval by Means of SOM Clustering and PCA

Simone Marinai; Stefano Faini; Emanuele Marino; Giovanni Soda

We propose an approach for efficient word retrieval from printed documents belonging to digital libraries. The approach combines word image clustering (based on Self-Organizing Maps, SOM) with Principal Component Analysis. The combination of these methods allows us to efficiently retrieve matching words from large document collections without the need for a direct comparison of the query word with each indexed word.

Keywords: Principal Component Analysis; Digital Library; Word Image; Word Representation; Query Word.

- Session 10: Retrieval and Segmentation | Pp. 336-347
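
A toy version of the indexing pattern (PCA compression, SOM clustering, then search only the query's cluster) might look as follows; the SOM codebook here is merely sampled rather than trained, and all sizes are arbitrary.

    import numpy as np

    rng = np.random.default_rng(1)
    words = rng.random((5000, 64))        # stand-in word-image feature vectors

    # PCA via SVD: project onto the first 8 principal components.
    mean = words.mean(axis=0)
    _, _, Vt = np.linalg.svd(words - mean, full_matrices=False)
    P = Vt[:8].T
    coords = (words - mean) @ P

    # Stand-in SOM codebook (a real SOM would learn these prototypes).
    codebook = coords[rng.choice(len(coords), 25, replace=False)]
    cluster = np.argmin(((coords[:, None] - codebook) ** 2).sum(-1), axis=1)

    def retrieve(query_vec, k=5):
        """Map the query into PCA space, find its best-matching unit,
        and rank only the words assigned to that unit."""
        q = (query_vec - mean) @ P
        bmu = np.argmin(((codebook - q) ** 2).sum(-1))
        members = np.flatnonzero(cluster == bmu)
        dists = ((coords[members] - q) ** 2).sum(-1)
        return members[np.argsort(dists)[:k]]

    print(retrieve(words[42]))            # word 42 should rank first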