Publication | Closed Access

Optimized Conversion of Categorical and Numerical Features in Machine Learning Models

Year: 2021

Citations: 10

References: 12

Abstract

While some data have an explicit numerical form, many others, such as gender or nationality, are not naturally expressed as numbers and are referred to as categorical data. Machine learning algorithms therefore need a way to represent categorical information numerically before they can analyze it. Our project focuses on optimizing the conversion of categorical features to numerical form in order to maximize the effectiveness of various machine learning models. Among the methods evaluated, the wide and deep model was observed to be the most effective for datasets that contain high-cardinality features, compared with learned embeddings and one-hot encoding.
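
To make the contrast between two of the encodings named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`build_vocab`, `one_hot`, `init_embeddings`, `embed`) and the toy nationality codes are hypothetical, and the embedding table is only randomly initialised, whereas in a real model its values would be learned during training.

```python
import random


def build_vocab(values):
    """Map each distinct category to an integer index, in first-seen order."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab)
    return vocab


def one_hot(value, vocab):
    """Return a vector of length len(vocab) with a single 1 at the category's index."""
    vec = [0] * len(vocab)
    vec[vocab[value]] = 1
    return vec


def init_embeddings(vocab, dim, seed=0):
    """Create one dense vector of length `dim` per category.

    Assumption: values are random placeholders; a trained model would
    update them by gradient descent.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in vocab]


def embed(value, vocab, table):
    """Look up the dense vector for a category."""
    return table[vocab[value]]


# Toy categorical column (hypothetical data, four distinct categories).
nationalities = ["FR", "JP", "BR", "FR", "KE"]
vocab = build_vocab(nationalities)
table = init_embeddings(vocab, dim=2)

one_hot_jp = one_hot("JP", vocab)      # width grows with the number of categories
dense_jp = embed("JP", vocab, table)   # width is fixed at dim, whatever the cardinality

print(one_hot_jp)     # [0, 1, 0, 0]
print(len(dense_jp))  # 2
```

The one-hot vector has one slot per distinct category, so its width grows with cardinality, while the embedding keeps a fixed width chosen in advance. That difference is one reason high-cardinality features are often handled with dense representations.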
