Publication | Closed Access
Catboost-based Framework with Additional User Information for Social Media Popularity Prediction
Citations: 43
References: 1
Year: 2019
Unknown Venue
Engineering, Machine Learning, Social Medium Monitoring, Social Media Popularity, Additional User Information, Text Mining, Catboost Training, Natural Language Processing, Computational Social Science, Social Media, Information Retrieval, Data Science, Data Mining, Language Studies, Content Analysis, Social Medium Mining, Predictive Analytics, Knowledge Discovery, Computer Science, Social Media Prediction, Catboost-based Framework, Social Computing, Social Medium Data
In this paper, a Catboost-based framework is proposed to predict social media popularity. The framework consists of two components: feature representation and Catboost training. In the feature representation component, numerical features are used directly, while categorical features are converted into numerical features using Catboost's ordered target statistics. In addition, some additional user information is also tracked to enrich the feature space. In the other component, Catboost is adopted as the regression model, trained on post-related, user-related, and additional user information. Moreover, to make full use of the dataset for model training, a dataset augmentation strategy based on pseudo labels is proposed. This strategy involves two-stage training. In the first stage, a first-stage model is trained and used to assign pseudo labels to the test set. In the second stage, a final model is trained on a new training set that includes the original validation set and the pseudo-labeled test set. The proposed method achieved 2nd place on the leaderboard of the Grand Challenge of Social Media Prediction.
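The conversion of categorical features through ordered target statistics can be illustrated with a minimal sketch. It is a simplified, standard-library-only illustration of the general idea, not the paper's code or the Catboost library's internal implementation. The function name `ordered_target_statistics` and the parameters `prior`, `prior_weight`, and `order` are hypothetical names chosen for this example. Each row is encoded using only the targets of rows that precede it in a permutation, so a row's own target never leaks into its encoding.

```python
import random
from collections import defaultdict


def ordered_target_statistics(categories, targets, prior,
                              prior_weight=1.0, order=None, seed=0):
    """Encode a categorical column with ordered target statistics.

    Hypothetical helper for illustration only. For each row, the encoding is
    (sum of earlier targets in the same category + prior_weight * prior)
    / (number of earlier rows in the same category + prior_weight),
    where "earlier" refers to position in the permutation `order`.
    """
    n = len(categories)
    if order is None:
        # Assumption: a single random permutation; a real library may
        # use several permutations and its own choice of prior.
        order = list(range(n))
        random.Random(seed).shuffle(order)

    sums = defaultdict(float)   # running target sum per category
    counts = defaultdict(int)   # running row count per category
    encoded = [0.0] * n

    for i in order:
        c = categories[i]
        # Use only statistics accumulated from rows visited before row i.
        encoded[i] = (sums[c] + prior_weight * prior) / (counts[c] + prior_weight)
        # Only now add row i's own target to the running statistics.
        sums[c] += targets[i]
        counts[c] += 1
    return encoded


if __name__ == "__main__":
    cats = ["a", "a", "b"]
    ys = [2.0, 4.0, 6.0]
    # Fixed order so the result is reproducible; the prior is the mean target.
    print(ordered_target_statistics(cats, ys, prior=4.0, order=[0, 1, 2]))
```

With the fixed order above, the first row of each category falls back to the prior because no earlier rows of that category exist, and the second "a" row blends the prior with the single earlier target.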