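Title: An Application of a Dimension-Reduction Method to the Classification of Acute Leukemia Patients
Speaker: Dr. Jingjing Wu, Professor (University of Calgary)
Time: 2014-05-28, 10:00--11:00 a.m.
Venue: Statistics Laboratory, 6th floor, School of Mathematics and Statistics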
Abstract: Case-control studies are widely used in epidemiology to identify important indicators or risk factors. In this work, a two-sample semiparametric model, which is equivalent to the logistic regression model for case-control data, is proposed to describe the gene expression levels of acute lymphoblastic leukemia (ALL) patients and acute myeloid leukemia (AML) patients. Using a leukemia data set of 38 bone marrow samples (27 ALL, 11 AML), we propose minimum Hellinger distance estimation (MHDE) for the underlying semiparametric model and compare the results with those based on classical maximum likelihood estimation (MLE). Based on the MHDE and the MLE, Wald tests of significance are carried out to select marker genes. Further, using the idea of minimizing the sum of weighted misclassification rates, we develop a new classification rule based on the selected marker genes. In the training data set, 36 out of 38 patients are correctly classified. To test the proposed classification rule, an independent leukemia data set (20 ALL, 14 AML) is analyzed, and 31 out of 34 patients are correctly classified. This is joint work with Guoqiang Chen.
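As a rough illustration of what minimum Hellinger distance estimation means, the sketch below fits a toy binomial model by grid search. It is not the semiparametric estimator from the talk: the binomial family, the grid search, the 1/2 normalization of the distance, and the names `hellinger_sq`, `binomial_pmf`, and `mhde_binomial` are all assumptions made for this example.

```python
import math

def hellinger_sq(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Uses the 1/2 normalization, so the value lies in [0, 1].
    """
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def binomial_pmf(n, theta):
    """Probabilities of 0..n successes under Binomial(n, theta)."""
    return [math.comb(n, k) * theta**k * (1 - theta) ** (n - k)
            for k in range(n + 1)]

def mhde_binomial(empirical, n, grid_size=999):
    """Toy MHDE: pick the theta on a uniform grid whose Binomial(n, theta)
    pmf is closest to the empirical distribution in Hellinger distance."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda t: hellinger_sq(empirical, binomial_pmf(n, t)))

if __name__ == "__main__":
    # Synthetic input: an exact Binomial(4, 0.3) pmf stands in for the
    # empirical distribution, so the estimate should recover theta = 0.3.
    target = binomial_pmf(4, 0.3)
    print(mhde_binomial(target, 4))
```

MLE would instead maximize the likelihood of the observed counts; the abstract compares the two estimators under the semiparametric model rather than under a toy parametric family like this one.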
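About Professor Jingjing Wu: Jingjing Wu received a bachelor's degree in applied mathematics and software from Minzu University of China in 1999 and a master's degree in probability theory from Beijing Normal University in 2002, then went to the University of Alberta in Canada for doctoral studies and obtained a Ph.D. in statistics in 2008. The doctoral thesis was selected by the Statistical Society of Canada as the best Canadian doctoral thesis in probability and statistics for 2007. Professor Wu joined the University of Calgary in 2007 and was granted tenure in 2013. Professor Wu's current research focuses on statistical inference in semiparametric and nonparametric models and its applications in biostatistics, and has been funded continuously since 2008 by the Natural Sciences and Engineering Research Council of Canada (NSERC).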