[01009] A phase transition of various retention rules from multivariate analysis for big datasets.
Session Time & Room : 3D (Aug.23, 15:30-17:10) @E501
Type : Industrial Contributed Talk
Abstract : Estimating the number of significant components (or factors) in principal component analysis (or exploratory factor analysis) is essential for datasets in finance and biology. However, the default estimation method in statistical software behaves pathologically for big datasets. We analyze the phase transition of the default rule with respect to the intra-class correlation under various data-generation models, and introduce a more acceptable estimator based on random matrix theory for large sample correlation matrices. We also compare our rule with retention rules proposed to date.
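
The abstract names neither the default rule nor the random-matrix threshold, so the following Python sketch is illustrative only. It assumes the default is the eigenvalue-greater-than-one rule (often called the Kaiser-Guttman criterion) and uses the Marchenko-Pastur upper edge (1 + sqrt(p/n))^2 as a hypothetical random-matrix cutoff. The equi-correlated Gaussian model, the parameter values, and all function names are assumptions made for the example, not the authors' method.

```python
import numpy as np


def sample_correlation_eigenvalues(n, p, rho, rng):
    """Eigenvalues (descending) of the sample correlation matrix of an
    equi-correlated Gaussian sample: n observations, p variables, and
    pairwise population correlation rho (assumed toy model)."""
    common = rng.standard_normal((n, 1))   # one shared factor per observation
    noise = rng.standard_normal((n, p))    # independent noise per variable
    x = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * noise
    corr = np.corrcoef(x, rowvar=False)    # p x p sample correlation matrix
    return np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order


def retained_by_unit_threshold(eigs):
    """Assumed default rule: keep components whose eigenvalue exceeds 1."""
    return int(np.sum(eigs > 1.0))


def retained_by_mp_edge(eigs, n, p):
    """Hypothetical random-matrix rule: keep eigenvalues above the
    Marchenko-Pastur upper edge (1 + sqrt(p / n))**2."""
    edge = (1.0 + np.sqrt(p / n)) ** 2
    return int(np.sum(eigs > edge))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 2000, 400
    for rho in (0.3, 0.8):
        eigs = sample_correlation_eigenvalues(n, p, rho, rng)
        print(rho, retained_by_unit_threshold(eigs), retained_by_mp_edge(eigs, n, p))
```

In this toy setup, moving rho from 0.3 to 0.8 would be expected to change the unit-threshold count sharply while the edge-based count stays at one, which is one way to picture the kind of transition the abstract refers to.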