TITLE: Word Sense Disambiguation for All Words without Hard Labor
SPEAKER: Hwee Tou Ng, Department of Computer Science, National University of Singapore
CHAIR: Prof. Chengqing Zong
TIME: 9:30am, February 5, 2010
VENUE: Coffee House, 13th floor
ABSTRACT:
While the most accurate word sense disambiguation systems are built using supervised learning from sense-tagged data, scaling them up to all words of a language has proved elusive, since preparing a sense-tagged corpus for all words of a language is time-consuming and labor-intensive.
In this talk, a completely automatic approach to scale up word sense disambiguation to all words of English is proposed and implemented. The approach relies on English-Chinese parallel corpora, English-Chinese bilingual dictionaries, and automatic methods of finding synonyms of Chinese words. No additional human sense annotations or word translations are needed.
A large-scale empirical evaluation was conducted on more than 29,000 noun tokens in English texts annotated in OntoNotes 2.0, based on its coarse-grained sense inventory. The evaluation results show that this approach achieves high accuracy, outperforming the first-sense baseline and coming close to a previously reported approach that requires manual effort to provide Chinese translations of English senses. This talk is based on joint work with Zhi Zhong.
BIOGRAPHY:
Dr. Hwee Tou Ng is an Associate Professor of Computer Science at the National University of Singapore, Program Co-chair (Computer Science Program) of the Singapore-MIT Alliance, and a Senior Faculty Member at the NUS Graduate School for Integrative Sciences and Engineering. He received a PhD in Computer Science from the University of Texas at Austin, USA. His research focuses on natural language processing and information retrieval. He has published papers in premier journals and conferences including Computational Linguistics, ACM TOIS, ACL, EMNLP, SIGIR, AAAI, and IJCAI. He is the Editor-in-Chief of ACM Transactions on Asian Language Information Processing (TALIP), and an editorial board member of the Journal of Artificial Intelligence Research (JAIR) and Natural Language Engineering. He also served as an editorial board member of the journal Computational Linguistics (2004-2006). He is an elected member of the ACL executive committee (2008-2010) and a steering committee member and former secretary of ACL SIGNLL. He was program co-chair of the EMNLP-2008, ACL-2005, and CoNLL-2004 conferences, and has served as area chair of the ACL, EMNLP, and SIGIR conferences and as session chair and program committee member of many past conferences including ACL, EMNLP, SIGIR, AAAI, and IJCAI.
ORGANIZER: National Laboratory of Pattern Recognition