Fast-mRMR: Fast Minimum Redundancy Maximum Relevance Algorithm for High-Dimensional Big Data

Abstract

With the advent of large-scale problems, feature selection has become a fundamental preprocessing step to reduce input dimensionality. The minimum-redundancy-maximum-relevance (mRMR) selector is considered one of the most relevant methods for dimensionality reduction due to its high accuracy. However, it is a computationally expensive technique, sharply affected by the number of features. This paper presents fast-mRMR, an extension of mRMR, which tries to overcome this computational burden. Associated with fast-mRMR, we include a package with three implementations of this algorithm in several platforms, namely, CPU for sequential execution, GPU (graphics processing units) for parallel computing, and Apache Spark for distributed computing using big data technologies.

References

Page 1

	Year	Citations

Page 1