
Academic Talk by Professor Hailin Sang of the University of Mississippi

Date: 2023-05-10

Speaker: Professor Hailin Sang (The University of Mississippi)

Time and Venue: May 11, 2023, 9:00-10:00, via Zoom: https://olemiss.zoom.us/my/hsang

Title: Double descent curve and error analysis of GAN models

Abstract: I plan to cover two parts of the theoretical study of deep neural networks. One is the double descent phenomenon recently discovered in the literature, and the other is the generalization error of generative adversarial networks (GANs).

Recent studies have observed a surprising behavior of the test error called the double descent phenomenon: as model complexity increases, the test error first decreases, then increases, and then decreases again. To study this, we worked with a two-layer neural network with a ReLU activation function, designed for binary classification under supervised learning. Our aim was to observe the double descent behavior of the test error and to find the mathematical theory behind it as the over-parameterization and under-parameterization ratios vary. We derived a closed-form expression for the test error of the model and a theorem identifying the parameters with optimal empirical loss as model complexity increases. Using these results, we proved the existence of the double descent phenomenon in our model under the square loss.
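To make the curve concrete, here is a minimal numerical sketch; it is an illustrative assumption, not the speaker's exact model (the data-generating process, the fixed random first layer, and the minimum-norm fit below are all choices made so the experiment runs in plain NumPy). The second layer of a two-layer ReLU network is fit by minimum-norm least squares under the square loss, and the test error is traced as the hidden width crosses the interpolation threshold.

# Illustrative sketch only, not the speaker's model: a two-layer ReLU network
# with a fixed random first layer and a minimum-norm least-squares second layer,
# evaluated on synthetic binary classification as the width varies.
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 20, 100, 2000

# Synthetic labels in {-1, +1} from a random linear teacher (an assumption).
w_star = rng.normal(size=d)
X_tr = rng.normal(size=(n_train, d))
X_te = rng.normal(size=(n_test, d))
y_tr = np.sign(X_tr @ w_star)
y_te = np.sign(X_te @ w_star)

def relu_features(X, W):
    # Hidden-layer activations of the two-layer ReLU network with first layer W.
    return np.maximum(X @ W, 0.0)

# Sweep the hidden width across the interpolation threshold (width == n_train).
for width in (10, 50, 90, 100, 110, 150, 300, 1000, 4000):
    W = rng.normal(size=(d, width)) / np.sqrt(d)   # fixed random first layer
    H_tr = relu_features(X_tr, W)
    H_te = relu_features(X_te, W)
    # Second layer: minimum-norm least-squares solution under the square loss.
    a = np.linalg.pinv(H_tr) @ y_tr
    test_mse = np.mean((H_te @ a - y_te) ** 2)
    print(f"width={width:5d}  width/n={width/n_train:5.2f}  test MSE={test_mse:.3f}")

With this setup the printed test error typically decreases, spikes near width = n_train, and decreases again as the width grows, which is the double descent shape described above.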

The generative adversarial network (GAN) is an important model developed in recent years for high-dimensional distribution learning. However, a general method is needed to understand its error bound and convergence rate. We study the convergence rate of the GAN error. The error bound is developed and explained through a class of functions built from the discriminator and generator neural networks. This new function class is uniformly bounded and of VC type, with a bounded envelope function and finite VC dimension, which makes it possible to apply Talagrand's inequality. Applying Talagrand's inequality, we obtain a tight convergence rate for the GAN error. The resulting bound generalizes existing error estimates for GANs, and we obtain a better convergence rate.
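To indicate how such a bound is typically organized (a schematic sketch under standard assumptions, not the speaker's precise statement), write $d_{\mathcal F}(\mu,\nu)=\sup_{f\in\mathcal F}|\mathbb{E}_\mu f-\mathbb{E}_\nu f|$ for the metric induced by the discriminator class $\mathcal F$, let $\hat\mu_n$ be the empirical distribution of the $n$ training samples, and let $\hat g$ minimize $d_{\mathcal F}(\hat\mu_n,\nu_g)$ over generators $g$ (ignoring optimization error). The triangle inequality gives

\[
d_{\mathcal F}(\mu,\nu_{\hat g}) \;\le\; \inf_{g} d_{\mathcal F}(\mu,\nu_{g}) \;+\; 2\, d_{\mathcal F}(\mu,\hat\mu_n),
\]

an approximation term for the generator class plus a statistical term. When the relevant function class is uniformly bounded and of VC type with a bounded envelope and finite VC dimension $V$, Talagrand's concentration inequality controls the statistical term $\sup_{f\in\mathcal F}\bigl|\mathbb{E}_\mu f-\tfrac{1}{n}\sum_{i=1}^{n}f(X_i)\bigr|$ at rate $O(\sqrt{V/n})$ up to logarithmic factors, with high probability, which is the source of the convergence rate for the GAN error.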


This talk is based on two joint projects with Chathurika Abeykoon and Mahmud Hasan.