Publication | Open Access
Handling Skewed Data: A Comparison of Two Popular Methods
Year: 2020 | Citations: 50 | References: 10
Engineering, Data Preparation, Skewed Data, Data Science, Management, Data Integration, Biostatistics, Data Management, Log Transformation, Statistics, Medical Statistic, Knowledge Discovery, Multilevel Modeling, Data Manipulation, Generalized Linear Model, Functional Data Analysis, Data Treatment, Statistical Inference, Data Modeling
Researchers in biomedical and psychosocial fields routinely have to deal with skewed data. When comparing the means of two groups, the log transformation is a traditional technique for normalizing skewed data before applying the two-group t-test. An alternative that does not assume normality is the generalized linear model (GLM) combined with an appropriate link function. In this work, the two techniques are compared using Monte Carlo simulations, each consisting of many iterations that simulate two groups of skewed data from one of three sampling distributions: gamma, exponential, and beta. The two methods are then compared on Type I error rate, power, and the estimates of the mean differences. We conclude that the t-test with log transformation outperformed the GLM method for non-normal data that follow beta or gamma distributions, whereas the GLM method outperformed the t-test with log transformation for exponentially distributed data.
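The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation code, and every name in it (`log_t_rejects`, `gamma_glm_rejects`, `rejection_rates`) is hypothetical. It makes several assumptions that the abstract does not state: the data are gamma distributed, the GLM uses a gamma family with a log link, the GLM group effect is tested with a Wald z-test using the Pearson dispersion estimate, and the t-test uses a pooled variance with a tabulated critical value for 60 degrees of freedom (31 observations per group). With equal scale parameters the rejection rate estimates the Type I error; with unequal scales it estimates power.

```python
import math
import random
from statistics import NormalDist, fmean, variance

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% critical value, standard normal
T_CRIT_DF60 = 2.0003                  # two-sided 5% critical value, t with 60 df (table value)


def log_t_rejects(x, y, t_crit=T_CRIT_DF60):
    """Pooled-variance two-sample t-test on log-transformed data.

    The default t_crit is only valid when len(x) + len(y) - 2 == 60.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    nx, ny = len(lx), len(ly)
    pooled = ((nx - 1) * variance(lx) + (ny - 1) * variance(ly)) / (nx + ny - 2)
    t = (fmean(lx) - fmean(ly)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return abs(t) > t_crit


def gamma_glm_rejects(x, y, z_crit=Z_CRIT):
    """Wald test of the group effect in a gamma GLM with log link (assumed model).

    With one binary covariate the fitted means equal the group sample means,
    so the coefficient is log(mean_x / mean_y) and its variance is
    phi * (1/nx + 1/ny), where phi is the Pearson dispersion estimate.
    """
    nx, ny = len(x), len(y)
    mx, my = fmean(x), fmean(y)
    pearson = sum(((v - mx) / mx) ** 2 for v in x)
    pearson += sum(((v - my) / my) ** 2 for v in y)
    phi = pearson / (nx + ny - 2)
    z = (math.log(mx) - math.log(my)) / math.sqrt(phi * (1 / nx + 1 / ny))
    return abs(z) > z_crit


def rejection_rates(shape, scale_x, scale_y, n=31, iterations=2000, seed=0):
    """Return (log t-test rate, GLM rate) over simulated gamma samples."""
    rng = random.Random(seed)
    hits_t = hits_glm = 0
    for _ in range(iterations):
        x = [rng.gammavariate(shape, scale_x) for _ in range(n)]
        y = [rng.gammavariate(shape, scale_y) for _ in range(n)]
        hits_t += log_t_rejects(x, y)
        hits_glm += gamma_glm_rejects(x, y)
    return hits_t / iterations, hits_glm / iterations


if __name__ == "__main__":
    # Equal scales: both groups share one distribution (Type I error).
    print("Type I error (log t, GLM):", rejection_rates(2.0, 1.0, 1.0))
    # Unequal scales: the group means differ (power).
    print("Power        (log t, GLM):", rejection_rates(2.0, 1.0, 2.0))
```

Note that the two tests target different quantities: the t-test on logged data compares means of the logs, while the GLM compares the logs of the arithmetic means. A fuller study along the lines of the abstract would also draw from exponential and beta distributions and record the estimated mean differences, which this sketch omits.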