Prof. Dr. Uwe Quasthoff, Universität Leipzig, Germany
Prof. Uwe Quasthoff works in the Natural Language Processing Group at the Department of Computer Science at Leipzig University in Germany. His main research topics are language-independent methods in Natural Language Processing, building very large corpora, and the structure of natural language. His research method is the analysis of large text corpora with statistical and pattern-based as well as machine learning methods. The Leipzig Corpora Collection (http://corpora.uni-leipzig.de/) started in 1995 and now contains pre-processed text collections and monolingual dictionaries in more than 250 languages. The approach is language-independent, so the algorithms for further processing usually apply to a large group of languages. The analysis of word co-occurrence patterns was the starting point for machine learning used for semantic similarities at different granularities.
Speech Title: Corpora as a resource for IR
Knowledge about words is helpful for IR. Knowledge about single words, such as word frequencies, and knowledge about replacement candidates, such as inflected forms or synonyms, are used heavily. Syntactic and semantic relations between consecutive words are of interest for text understanding. POS tagging and syntactic parsing are the basis for a deeper semantic analysis with statistical methods.
The talk will give an overview of the complete pipeline of corpus building and exploration: crawling and preprocessing (language identification, sentence segmentation, tokenization, de-duplication, POS tagging, etc.), word co-occurrences and semantic similarities using word embeddings, and word and text classification problems.
As an approach to relation extraction and sentence understanding, so-called typical sentences are used: sentences of simple syntactic structure that are found repeatedly with rich lexical variability. The syntactic structures are selected by high frequency, and with large corpora they allow word similarities to be used to cluster such sentences into basic statements.
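To make some of the pipeline steps named in the abstract above concrete, here is a minimal sketch of sentence segmentation, tokenization, de-duplication and sentence-level word co-occurrence counting. It is an illustration only and not the Leipzig Corpora Collection's implementation: the function names, the regex-based segmentation and the sample text are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    # Naive segmentation: split after '.', '!' or '?' followed by whitespace.
    # (Assumption for illustration; real pipelines use language-specific rules.)
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; punctuation is dropped.
    return re.findall(r"\w+", sentence.lower())


def cooccurrences(text):
    """Return (word_freq, pair_freq), both counted in number of distinct sentences."""
    seen = set()           # normalized sentences already processed (de-duplication)
    word_freq = Counter()  # in how many sentences each word occurs
    pair_freq = Counter()  # in how many sentences each word pair occurs together
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        key = " ".join(tokens)
        if not tokens or key in seen:
            continue  # skip empty and duplicate sentences
        seen.add(key)
        types = sorted(set(tokens))  # count each word once per sentence
        word_freq.update(types)
        pair_freq.update(combinations(types, 2))
    return word_freq, pair_freq


# Hypothetical sample text; the third sentence is a duplicate of the first.
sample = "The cat sat. The dog sat. The cat sat. A cat ran!"
word_freq, pair_freq = cooccurrences(sample)
```

Pairs are stored as alphabetically sorted tuples, so `pair_freq[("sat", "the")]` is the number of distinct sentences containing both words. Raw counts like these would normally be passed to a significance measure before being interpreted.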
Prof. Tianrui Li, Southwest Jiaotong University, China
Tianrui Li received his B.S., M.S. and Ph.D. degrees from Southwest Jiaotong University, China in 1992, 1995 and 2002, respectively. He was a Post-Doctoral Researcher at the Belgian Nuclear Research Centre (SCK • CEN), Belgium from 2005 to 2006, and a visiting professor at Hasselt University, Belgium in 2008, the University of Technology, Sydney, Australia in 2009 and the University of Regina, Canada in 2014. He is presently a Professor and the Director of the Key Lab of Cloud Computing and Intelligent Technique of Sichuan Province, Southwest Jiaotong University, China. Since 2000, he has co-edited 6 books, 10 special issues of international journals and 15 proceedings, received 5 Chinese invention patents and published over 240 research papers in refereed journals (e.g., IEEE TKDE, IEEE TEC, IEEE TFS, IEEE TIFS, IEEE ASLP, IEEE TIE, IEEE TC, IEEE TVT) and conferences (e.g., KDD, IJCAI, UbiComp). Three of his papers were ESI Hot Papers and 12 were ESI Highly Cited Papers. His Google H-index is 32. He serves as area editor of International Journal of Computational Intelligence Systems (SCI) and editor of Knowledge-based Systems (SCI) and Information Fusion (SCI), etc. He is an IRSS fellow, a distinguished member of CCF, a senior member of ACM, IEEE and CAAI, a member of ACM SIGKDD, Chair of the IEEE CIS Chengdu Chapter (2013-2018), Treasurer of the ACM SIGKDD China Chapter and CCF YOCSEF Chengdu Chair (2013-2014). He has trained over fifty graduate students (including 8 postdocs and 13 doctoral graduates). Their employers include Microsoft Research Asia, Sichuan University, Baidu, Alibaba, Tencent and Huawei. They have received 2 "Si Shi Yang Hua" Medals and 13 Best Paper/Dissertation Awards, and have been Champion of the Sina Weibo Interaction-prediction contest at the Tianchi Big Data Competition (bonus 200,000 RMB) and Second Place in the Social Influence Analysis Contest of the IJCAI-2016 Competitions.
Speech Title: Data-Driven Intelligence: Challenges and our Solutions
Abstract: Data-Driven Intelligence has become a hot research topic in the area of information science. This talk first outlines the challenges of Data-Driven Intelligence and then presents our solutions, which cover the following aspects. 1) A hierarchical entropy-based approach is demonstrated to evaluate the effectiveness of data collection, the first step of Data-Driven Intelligence. 2) A multi-view-based method is illustrated for filling in missing data, the preprocessing step for Data-Driven Intelligence. 3) A unified framework is outlined for parallel large-scale feature selection to manage high-dimensional Big Data. 4) A MapReduce-based parallel method, together with three parallel strategies, is presented to compute rough set approximations for classification, which is a fundamental part of rough set-based data analysis, similar to frequent pattern mining in association rules. 5) Incremental learning-based approaches are shown for updating approximations and knowledge in dynamic data environments, e.g., under variation of objects, attributes or attribute values. These approaches improve computational efficiency by using previously acquired learning results to facilitate knowledge maintenance without re-running the original data mining algorithm. 6) A deep-learning-based model is developed to deal with multiple different sources of data.
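For readers unfamiliar with the rough set approximations mentioned in point 4, the following minimal sketch computes them using the standard textbook definitions: the lower approximation is the union of equivalence classes fully contained in the target set, and the upper approximation is the union of classes that intersect it. This is a sequential illustration, not the MapReduce-based method of the talk; the function name, the data layout and the toy decision table are assumptions.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Lower and upper approximation of `target` under the chosen attributes.

    objects:    dict mapping object id -> dict of attribute values
    attributes: attribute names defining the indiscernibility relation
    target:     set of object ids to approximate
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        classes[tuple(values[a] for a in attributes)].add(obj_id)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class certainly belongs to the target
            lower |= block
        if block & target:    # class possibly belongs to the target
            upper |= block
    return lower, upper


# Hypothetical toy decision table (not from the talk).
table = {
    1: {"fever": "yes", "cough": "yes"},
    2: {"fever": "yes", "cough": "yes"},
    3: {"fever": "yes", "cough": "no"},
    4: {"fever": "no", "cough": "no"},
    5: {"fever": "no", "cough": "no"},
    6: {"fever": "no", "cough": "yes"},
}
flu = {1, 3, 4}
lower, upper = approximations(table, ["fever", "cough"], flu)
```

In this toy table only object 3 is certainly in the target set, while objects 1 to 5 are possibly in it. The grouping step works on each object independently, which is the kind of computation that fits a map/reduce pattern; the specific parallel strategies of the talk are not given in the abstract.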
Assoc. Prof. Phayung Meesad
Dean, Faculty of Information Technology, King Mongkut's University of Technology North Bangkok, Thailand
Phayung Meesad received his Master of Science and Doctor of Philosophy degrees in Electrical Engineering from Oklahoma State University, Stillwater, USA in 1998 and 2002, respectively. He is now an Associate Professor and Dean of the Faculty of Information Technology, King Mongkut's University of Technology North Bangkok, Thailand. His research interests are in the areas of Data Mining, Data Science, Deep Learning, Time Series Analysis, and Natural Language Processing.
Speech Title: Deep Learning and Applications
Abstract: Deep Learning has been extremely successful in many fields such as image processing and natural language processing. Convolutional Neural Networks (CNNs) and Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) are probably the most searched-for architectures in the Deep Learning field. CNNs are popular in image processing, while LSTMs seem to play a major role in time series data and natural language processing. This talk gives a brief review of Deep Learning, focusing on CNNs and LSTMs as well as their applications.