National Laboratory of Pattern Recognition

Institute of Automation, Chinese Academy of Sciences

Academic Lectures

2013-10-15 An Information Extraction Approach to Next-Generation Speech Processing

Lecture Series in Pattern Recognition

TITLE: An Information Extraction Approach to Next-Generation Speech Processing

SPEAKER: Prof. Chin-Hui Lee (Georgia Institute of Technology)

CHAIR: Prof. Jianhua Tao

TIME: October 15 (Tuesday), 2013, 10:00 AM

VENUE: No. 1 Conference Room (3rd floor), Intelligence Building

ABSTRACT:

The field of automatic speech recognition (ASR) has enjoyed more than 30 years of technology advancement due to the extensive utilization of the hidden Markov model (HMM) framework and a concentrated effort by the community to make available a vast amount of language resources. However, the ASR problem is still far from being solved because not all information available in the speech knowledge hierarchy can be directly and effectively integrated into state-of-the-art systems to improve ASR performance and enhance system robustness. It is believed that some of the current knowledge insufficiency issues can be partially addressed by processing techniques that take advantage of the full set of acoustic and language information in speech. On the other hand, in human speech recognition (HSR) and spectrogram reading, we often determine the linguistic identity of a sound based on detected cues and evidence that exist at various levels of the speech knowledge hierarchy, ranging from acoustic phonetics to syntax and semantics. This calls for a bottom-up knowledge integration framework that links speech processing with information extraction by spotting speech cues with a bank of attribute detectors, weighing and combining acoustic evidence to form cognitive hypotheses, and verifying these hypotheses until a consistent recognition decision can be reached. The recently proposed ASAT (automatic speech attribute transcription) framework is an attempt to mimic some HSR capabilities with asynchronous speech event detection followed by bottom-up speech knowledge integration and verification. In the last few years, it has demonstrated potential and offered insights in detection-based speech processing and information extraction.

This presentation is intended to illustrate new possibilities for future speech processing by linking the analysis and processing of raw speech signals with the extraction of multiple layers of useful information. We will also demonstrate that the same methodology used in speech attribute detection and knowledge integration can be extended to extracting language information from heterogeneous media signals for multimedia event detection (MED) and multimedia event recounting (MER).

BIOGRAPHY:

Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Dr. Lee received the B.S. degree in Electrical Engineering from National Taiwan University, Taipei, in 1973, the M.S. degree in Engineering and Applied Science from Yale University, New Haven, in 1977, and the Ph.D. degree in Electrical Engineering with a minor in Statistics from the University of Washington, Seattle, in 1981.

NLPR, INSTITUTE OF AUTOMATION, CHINESE ACADEMY OF SCIENCES