Prof. Dr. Uwe Quasthoff, Universität Leipzig, Germany
Prof. Uwe Quasthoff works in the Natural Language Processing Group at the Department of Computer Science at Leipzig University in Germany. His main research topics are language-independent methods in Natural Language Processing, building very large corpora, and the structure of natural language. His research method is the analysis of large text corpora with statistical, pattern-based, and machine learning methods. The Leipzig Corpora Collection (http://corpora.uni-leipzig.de/) started in 1995 and now contains pre-processed text collections and monolingual dictionaries in more than 250 languages. The approach is language independent, hence the algorithms for further processing usually apply to a large group of languages. The analysis of word co-occurrence patterns was the starting point for machine learning used for semantic similarities at different granularities.
Speech Title: Corpora as a resource for IR
Knowledge about words is helpful for IR. Knowledge about single words, such as word frequencies, and knowledge about replacement candidates, such as inflected forms or synonyms, are used heavily. Syntactic and semantic relations between consecutive words are of interest for text understanding. POS tagging and syntactic parsing are the basis for a deeper semantic analysis with statistical methods.
The talk will give an overview of the complete pipeline of corpus building and exploration: crawling and preprocessing (language identification, sentence segmentation, tokenization, de-duplication, POS tagging, etc.), word co-occurrences and semantic similarities using word embeddings, and word and text classification problems.
As an approach to relation extraction and sentence understanding, so-called typical sentences are used: sentences of simple syntactic structure that are found repeatedly with rich lexical variability. The syntactic structures are selected by high frequency, and with large corpora they allow the use of word similarities to cluster such sentences into basic statements.
Prof. Tianrui Li, Southwest Jiaotong University, China
Tianrui Li received his B.S., M.S., and Ph.D. degrees from Southwest Jiaotong University, China, in 1992, 1995, and 2002, respectively. He was a Post-Doctoral Researcher at the Belgian Nuclear Research Centre (SCK•CEN), Belgium, from 2005 to 2006, and a visiting professor at Hasselt University, Belgium, in 2008, the University of Technology, Sydney, Australia, in 2009, and the University of Regina, Canada, in 2014. He is presently a Professor and the Director of the Key Lab of Cloud Computing and Intelligent Technique of Sichuan Province, Southwest Jiaotong University, China. Since 2000, he has co-edited 6 books, 10 special issues of international journals, and 15 proceedings, received 5 Chinese invention patents, and published over 240 research papers in refereed journals (e.g., IEEE TKDE, IEEE TEC, IEEE TFS, IEEE TIFS, IEEE ASLP, IEEE TIE, IEEE TC, IEEE TVT) and conferences (e.g., KDD, IJCAI, UbiComp). 3 papers were ESI Hot Papers and 12 papers were ESI Highly Cited Papers. His Google H-index is 32. He serves as the area editor of International Journal of Computational Intelligence Systems (SCI), editor of Knowledge-based Systems (SCI) and Information Fusion (SCI), etc. He is an IRSS fellow, a distinguished member of CCF, a senior member of ACM, IEEE, and CAAI, an ACM SIGKDD member, Chair of IEEE CIS Chengdu Chapter (2013-2018), Treasurer of ACM SIGKDD China Chapter, and CCF YOCSEF Chengdu Chair (2013-2014). He has trained over fifty graduate students (including 8 Post-Docs and 13 Doctors). Their employers include Microsoft Research Asia, Sichuan University, Baidu, Alibaba, Tencent, and Huawei. They have received 2 "Si Shi Yang Hua" Medals, Best Paper/Dissertation Awards 13 times, Champion of Sina Weibo Interaction-prediction at the Tianchi Big Data Competition (Bonus 200,000 RMB), and Second Place in the Social Influence Analysis Contest of the IJCAI-2016 Competitions.
More information will be released soon...