Bulletin Board

IEEE Fellow Academic Lecture Series


Title: Data Mining and Its Application in Bioinformatics

Speaker: Associate Professor Xiaohua Hu (胡小华)

College of Information Science and Technology, Drexel University, USA

Time: February 24, 2009, 3:30-5:00 PM

Venue: Lecture Hall, 3rd Floor, Building A, School of Information


Abstract:

Despite an influx of molecular data in the form of sequences, structures, transcription profiles, etc., most of the protein interaction information relevant to cell biology research still exists only in the scientific literature, which is written in natural language that computers cannot easily manipulate. Automatically mining and extracting information from biomedical text holds the promise of consolidating large amounts of biological knowledge in computer-accessible form. In this talk, we present a novel approach, Bio-IEDM (Biomedical Information Extraction and Data Mining), that integrates text mining and predictive modeling to analyze biomolecular networks built from biomedical literature databases. Our method consists of two phases. In phase 1, we discuss an efficient semi-supervised learning approach that automatically extracts biological relationships, such as protein-protein and protein-gene interactions, from biomedical literature databases in order to construct the biomolecular network. In phase 2, we present a novel clustering algorithm that analyzes the biomolecular network graph to identify biologically meaningful subnetworks (communities). The clustering algorithm takes into account the characteristics of scale-free network graphs and is based on the local density of each vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. The experimental results indicate that our approach is very effective in extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing biomolecular networks.
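To give a rough feel for what clustering by local vertex density can look like, here is a minimal sketch. It is not the Bio-IEDM algorithm, which the abstract does not specify in detail. The density definition (fraction of possible edges present among a vertex and its neighbors), the growth rule, the `min_inside` threshold, and all function names are assumptions made purely for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_density(adj, v):
    """Fraction of possible edges present among v and its neighbors (assumed definition)."""
    nodes = adj[v] | {v}
    possible = len(nodes) * (len(nodes) - 1) / 2
    if possible == 0:
        return 0.0
    present = sum(1 for a, b in combinations(nodes, 2) if b in adj[a])
    return present / possible


def density_clusters(edges, min_inside=0.5):
    """Greedy clustering: seed at the densest vertex, then absorb border vertices.

    A border vertex joins a cluster when at least `min_inside` of its
    neighbors already belong to that cluster (hypothetical growth rule).
    """
    adj = build_adjacency(edges)
    density = {v: local_density(adj, v) for v in adj}
    unassigned = set(adj)
    clusters = []
    # Densest vertices seed first; ties are broken by name for determinism.
    for seed in sorted(adj, key=lambda v: (-density[v], str(v))):
        if seed not in unassigned:
            continue
        # Start from the seed and its still-unassigned neighbors.
        cluster = ({seed} | adj[seed]) & unassigned
        grew = True
        while grew:
            grew = False
            border = {w for u in cluster for w in adj[u]} & (unassigned - cluster)
            for w in sorted(border, key=str):
                if len(adj[w] & cluster) / len(adj[w]) >= min_inside:
                    cluster.add(w)
                    grew = True
        unassigned -= cluster
        clusters.append(cluster)
    return clusters


# Toy network: two 4-cliques joined by a single bridge edge D-E.
clique_1 = list(combinations("ABCD", 2))
clique_2 = list(combinations("EFGH", 2))
toy_edges = clique_1 + clique_2 + [("D", "E")]
for community in density_clusters(toy_edges):
    print(sorted(community))
```

On this toy graph the two cliques come out as separate communities, because the bridge vertices D and E each have most of their neighbors inside their own clique. Real interaction networks extracted from literature are far noisier, so the threshold would need tuning.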


Speaker: Associate Professor Xiaohua (Tony) Hu

College of Information Science and Technology, Drexel University

Philadelphia, USA


About the Speaker:

Xiaohua (Tony) Hu is currently an associate professor (granted early tenure in 2007) and the founding director of the data mining and bioinformatics lab at the College of Information Science and Technology, one of the best information science schools in the USA (ranked #1 in 1999 and #5 in 2006 and 2007 in information systems by U.S. News & World Report). He also currently serves as the IEEE Computer Society Bioinformatics and Biomedicine Steering Committee Chair and the IEEE Computational Intelligence Society Granular Computing Technical Committee Chair (2007-2008). Tony is a scientist, teacher and entrepreneur. He joined Drexel University in 2002 and founded the International Journal of Data Mining and Bioinformatics (SCI indexed) in 2006 and the International Journal of Granular Computing, Rough Sets and Intelligent Systems in 2008. Earlier, he worked as a research scientist at world-leading R&D centers such as Nortel Research Center, GTE Labs and HP Labs. In 2001, he founded DMW Software in Silicon Valley, California. His research ideas have been integrated into many commercial products and applications.


Tony’s current research interests are in biomedical literature data mining, bioinformatics, text mining, semantic web mining and reasoning, rough set theory and application, information extraction and information retrieval. He has published more than 160 peer-reviewed research papers in journals, conferences and books, including various IEEE/ACM Transactions (IEEE/ACM TCBB, IEEE TFS, IEEE TKDE, IEEE TITB, IEEE Computer), JIS, KAIS, CI, DKE, IJBRA, SIGKDD, IEEE ICDM, IEEE ICDE, SIGIR, ACM CIKM, IEEE BIBE, IEEE CIBCB, etc., and has co-edited 9 books/proceedings. He has received several prestigious awards, including the 2005 National Science Foundation (NSF) CAREER Award (the most prestigious NSF award for young faculty in the USA), the best paper award at the 2007 International Conference on Artificial Intelligence, the best paper award at the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, the 2007 IEEE Bioinformatics and Bioengineering Outstanding Contribution Award, the 2006 IEEE Granular Computing Outstanding Service Award, and the 2001 IEEE Data Mining Outstanding Service Award. He has also served as a program co-chair/conference co-chair of 14 international conferences/workshops and as a program committee member for more than 50 international conferences in the above areas. He is the founding editor-in-chief of the International Journal of Data Mining and Bioinformatics and an associate editor/editorial board member of four international journals (KAIS, IJDWM, IJSOI and JCIB). His research projects are funded by the National Science Foundation (NSF), the US Dept. of Education, and the PA Dept. of Health, and he has obtained more than US$3.8 million in research grants in the past 4 years as PI or Co-PI.


Tony has 8 years of solid industry R&D experience and has converted many original research ideas into research prototype systems and eventually into commercial products. In his Ph.D. thesis, entitled "Knowledge Discovery in Databases: An Attribute-Oriented Rough Set Approach", he introduced rough set theory to data mining research, developed an attribute-oriented rough set approach for data mining, and designed a research prototype system, DBROUGH, which was later successfully transferred to industry in Canada. From 1994 to 1998, he was a research scientist in data mining at the Nortel Network Research Center, GTE Labs (Verizon Labs), etc. He worked on many data mining projects for real-time telephone switch system diagnosis, data management, and wireless churn prediction. Among them, the CHAMP (CHurn Analysis, Modeling and Prediction) project was nominated for GTE’s highest technical achievement award in 1997. From 1998 to 2002, he designed and developed commercial data mining software at various start-up companies (KSP, Blue Martini Software); KSP was acquired by Exchange Applications for $52 million in April 2000. In 2001, Tony founded the company DMW Software (Data Mining and Warehousing) in Silicon Valley, California. He has successfully deployed several data mining products/systems at Fortune 100 companies such as Chase, Citibank, and Sprint for credit fraud detection, e-personalization and customer management.


Tony received his Ph.D. in Computer Science from the University of Regina, Canada, in 1995, his M.Sc. in Computer Science from Simon Fraser University, Canada, in 1992, his M.Eng. in Computer Engineering from the Institute of Computing Technology, Chinese Academy of Sciences, in 1988, and his B.Sc. (Software) from Wuhan University in 1985.


2009-03-16