Registered Data

[01009] A phase transition of various retention rules from multivariate analysis for big datasets.

  • Session Time & Room : 3D (Aug.23, 15:30-17:10) @E501
  • Type : Industrial Contributed Talk
  • Abstract : Estimating the number of significant components (or factors) in principal component analysis (or exploratory factor analysis) is essential for datasets in finance and biology. However, the default estimation method in statistical software behaves pathologically for big datasets. We analyze the phase transition of the default rule with respect to the intra-class correlation of various data-generation models, and introduce a more acceptable estimation rule based on random matrix theory for large sample correlation matrices. We also compare our rule with retention rules proposed to date.
  • Classification : 60F15, 62H25
  • Format : Online Talk on Zoom
  • Author(s) :
    • Atina Husnaqilati (Mathematics Department, Universitas Gadjah Mada)
    • Yohji Akama (Mathematical Institute, Tohoku University)
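
The contrast described in the abstract can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not the authors' method. The abstract does not name the default rule; the sketch assumes it is the common "eigenvalue greater than one" (Guttman-Kaiser) criterion. It also assumes an equi-correlated Gaussian model as one possible data-generation model, and uses the Marchenko-Pastur upper edge (1 + sqrt(p/n))^2 as a stand-in threshold from random matrix theory. All function names are hypothetical.

```python
import numpy as np


def sample_correlation_eigenvalues(n, p, rho, seed=0):
    """Eigenvalues (descending) of the sample correlation matrix of n
    observations of p equi-correlated standard normal variables.

    Assumed model: x_ij = sqrt(rho) * f_i + sqrt(1 - rho) * e_ij, so every
    pair of variables has population correlation rho.
    """
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((n, 1))   # shared factor f_i
    noise = rng.standard_normal((n, p))    # idiosyncratic noise e_ij
    x = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * noise
    corr = np.corrcoef(x, rowvar=False)    # p x p sample correlation matrix
    return np.linalg.eigvalsh(corr)[::-1]


def count_above_one(eigs):
    """Assumed default rule: retain eigenvalues greater than 1."""
    return int(np.sum(eigs > 1.0))


def count_above_mp_edge(eigs, n, p):
    """Stand-in RMT rule: retain eigenvalues above the Marchenko-Pastur
    upper edge (1 + sqrt(p/n))^2 for a null correlation matrix."""
    edge = (1.0 + np.sqrt(p / n)) ** 2
    return int(np.sum(eigs > edge))


if __name__ == "__main__":
    n, p = 2000, 200
    for rho in (0.0, 0.1, 0.3, 0.5):
        eigs = sample_correlation_eigenvalues(n, p, rho)
        print(rho, count_above_one(eigs), count_above_mp_edge(eigs, n, p))
```

In this toy model, running the script for several values of rho shows how the count returned by the assumed default rule changes with the correlation, while the edge-based count stays small. The values of n, p, and rho are arbitrary choices for illustration.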