Lecture Series in Pattern Recognition
TITLE: Document Image Classification and Retrieval
SPEAKER: Prof. David Doermann (University of Maryland, College Park, US)
CHAIR: Prof. Chenglin Liu
TIME: September 24 (Tuesday), 2013, 14:30-16:30
VENUE: No.1 Conference Room (3rd floor), Intelligence Building
Traditional approaches to document retrieval focus on conversion to electronic text followed by indexing of the text content. Recently, some work in the community has focused on indexing document image content directly. In this talk, we will give an overview of work at Maryland on classification and indexing that scales to millions of documents. First, we present a learning-based approach for computing structural similarities among document images for unsupervised exploration of large document collections. The approach is based on multiple levels of content and structure. At the local level, a bag-of-visual-words model based on SURF features provides an effective way of computing content similarity. The document is then recursively partitioned, and a histogram of codewords is computed for each partition. Structural similarity is computed using a random forest classifier trained on these histogram features. We experiment with three diverse datasets of document images that vary in size, degree of structural similarity, and document type. Second, we present a scalable algorithm for segmentation-free content retrieval in document images. The contributions of this work include the use of SURF features for image passage retrieval, a novel indexing algorithm for efficient retrieval of SURF features, and a method to filter results using the orientation of local features and geometric constraints. Results demonstrate that logo, signature block, and stamp retrieval can be performed with high accuracy and scaled efficiently to large datasets.
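As a rough illustration of the partition-and-histogram step in the first approach above, the following Python sketch builds a feature vector by concatenating codeword histograms over successively finer partitions of a page. It is not the speaker's implementation: the 2x2 (quadtree-style) split, the function name partition_histograms, and the toy keypoints are assumptions made here for illustration. SURF extraction, codebook learning, and the random forest classifier are omitted; keypoints are assumed to arrive already quantized to codeword indices.

```python
from typing import List, Tuple

# A keypoint already quantized to a codeword: (x, y, codeword index).
Keypoint = Tuple[float, float, int]


def partition_histograms(
    keypoints: List[Keypoint],
    width: float,
    height: float,
    vocab_size: int,
    levels: int,
) -> List[int]:
    """Concatenate codeword histograms over a recursive 2x2 page partition.

    Level 0 is the whole page, level 1 splits it into 2x2 cells,
    level 2 into 4x4 cells, and so on (an assumed partitioning scheme).
    """
    features: List[int] = []
    for level in range(levels):
        cells = 2 ** level  # cells per side at this level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in keypoints:
            # Clamp so points on the right/bottom edge land in the last cell.
            col = min(int(x / width * cells), cells - 1)
            row = min(int(y / height * cells), cells - 1)
            hist[(row * cells + col) * vocab_size + word] += 1
        features.extend(hist)
    return features


# Toy page: four keypoints, one near each corner, vocabulary of 3 codewords.
page = [(10, 10, 0), (90, 10, 1), (10, 90, 1), (90, 90, 2)]
vector = partition_histograms(page, 100, 100, vocab_size=3, levels=2)
print(len(vector))  # 15: 3 bins for the whole page + 4 cells x 3 bins
```

Two pages with the same overall codeword counts but different layouts differ in the finer-level bins, which is what lets a classifier trained on such vectors pick up structural rather than purely content-based similarity.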
Dr. Doermann will be available to meet with students. He will also highlight the University of Maryland graduate program as part of his talk, so students considering graduate school in the US are encouraged to attend.
Dr. David Doermann is a senior research scientist in UMIACS. He received a B.Sc. degree in Computer Science and Mathematics from Bloomsburg University in 1987, and an M.Sc. degree in 1989 from the Department of Computer Science at the University of Maryland, College Park. He continued his studies in the Computer Vision Laboratory, where he earned a Ph.D. in 1993. Since 1993, he has served as co-director of the Laboratory for Language and Media Processing in the University of Maryland's Institute for Advanced Computer Studies and as an adjunct member of the graduate faculty.
His team of researchers focuses on topics related to document image analysis and multimedia information processing. In 2002, he received an Honorary Doctorate of Technology Sciences from the University of Oulu for his contributions to digital media processing and document analysis research. He is a founding co-editor of the International Journal on Document Analysis and Recognition, has been the General Chair or Co-Chair of over half a dozen international conferences and workshops, and was the General Chair of the International Conference on Document Analysis and Recognition (ICDAR) held in Washington, DC in 2013. He has over 30 journal publications and over 160 refereed conference papers.