Publication | Open Access
A Large Scale Analysis of Logistic Regression: Asymptotic Performance and New Insights
Citations: 37
References: 9
Year: 2019
Unknown Venue
Engineering, Machine Learning, Regression Analysis, New Insights, Support Vector Machine, Classification Method, Data Science, Data Mining, Pattern Recognition, Logistic Regression Classifier, Statistics, Supervised Learning, Predictive Analytics, Knowledge Discovery, Computer Science, Statistical Learning Theory, Large Scale Analysis, High-dimensional Method, Logistic Regression, Statistical Inference, Separating Hyperplane, Semi-nonparametric Estimation
Logistic regression, one of the most popular machine learning methods for binary classification, has long been believed to be unbiased. In this paper, we consider the "hard" classification problem of separating high-dimensional Gaussian vectors, where the data dimension p and the sample size n are both large. Building on recent advances in random matrix theory (RMT) and high-dimensional statistics, we evaluate the asymptotic distribution of the logistic regression classifier and, consequently, derive the associated classification performance. This brings new insights into the internal mechanism of the logistic regression classifier, including a possible bias in the separating hyperplane, as well as into practical issues such as hyper-parameter tuning, thereby opening the door to novel RMT-inspired improvements.
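The setting described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' method: it assumes two Gaussian classes with means +mu and -mu and identity covariance, an L2 penalty `lam`, and plain gradient descent as the solver. The function name `simulate_logistic` and all of its parameters are hypothetical choices made for illustration. It reports the test accuracy of the fitted classifier, the cosine alignment between the learned weight vector and the true mean direction, and the Bayes-optimal accuracy for this assumed model.

```python
import math
import numpy as np


def simulate_logistic(n=400, p=200, mean_norm=1.0, lam=0.1,
                      steps=500, lr=0.5, n_test=2000, seed=0):
    """Fit L2-regularized logistic regression on a two-class Gaussian mixture.

    All modelling choices here (means +/-mu, identity covariance, gradient
    descent) are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(p)
    mu[0] = mean_norm  # assumed class means: +mu (label +1) and -mu (label -1)

    def sample(m):
        y = rng.choice([-1.0, 1.0], size=m)
        x = y[:, None] * mu + rng.standard_normal((m, p))
        return x, y

    x_train, y_train = sample(n)
    beta = np.zeros(p)
    for _ in range(steps):
        margins = y_train * (x_train @ beta)
        # sigmoid(-margin), written with tanh for numerical stability
        weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
        # gradient of mean(log(1 + exp(-margin))) + (lam / 2) * ||beta||^2
        grad = -(x_train * (y_train * weights)[:, None]).mean(axis=0) + lam * beta
        beta -= lr * grad

    x_test, y_test = sample(n_test)
    accuracy = float(np.mean(np.sign(x_test @ beta) == y_test))
    alignment = float(beta @ mu / (np.linalg.norm(beta) * np.linalg.norm(mu)))
    # For this assumed model the optimal rule is sign(mu . x), with accuracy Phi(||mu||).
    bayes_accuracy = 0.5 * (1.0 + math.erf(mean_norm / math.sqrt(2.0)))
    return {"accuracy": accuracy, "alignment": alignment,
            "bayes_accuracy": bayes_accuracy}


if __name__ == "__main__":
    print(simulate_logistic())
```

Varying `n`, `p`, and `lam` while keeping the ratio p/n fixed gives a rough empirical feel for how the learned hyperplane and the classification accuracy behave when both the dimension and the sample size are large.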