Comparison of statistical and machine learning methods in modelling of data with multicollinearity

Year: 2013 | Citations: 154 | References: 25

Abstract

Multicollinearity occurs in a dataset when the predictors are correlated with one another. Models derived from such data without a check for multicollinearity may lead to erroneous system analysis. This problem can be eliminated by selecting appropriate predictors from the dataset. Variable reduction methods such as B2, B4, VIF, KIF and factor analysis (FA) can be used to overcome it. Such methods are particularly useful in conjunction with modelling methods that do not automate variable selection, such as artificial neural networks (ANN) and fuzzy logic. The literature shows that this problem is well described in statistics but has received little attention in machine learning. In this paper, multicollinearity is examined through the estimation of body fat content. Commonly used statistical methods such as stepwise regression, radial basis function partial least squares, partial robust M-regression, ridge regression and principal component regressi...
