Experimental Study with Real-world Data for Android App Security Analysis using Machine Learning

TLDR

Machine‑learning approaches for Android malware detection show promise, yet critical challenges in evaluation and design remain unaddressed. This study systematically investigates how these challenges affect detection performance and advocates for improved evaluation strategies and design choices. We built an experimentation framework that varies key parameters on a large market‑scale dataset of benign and malicious apps to answer the research questions. The experiments reveal that certain evaluation and design challenges significantly degrade the effectiveness of existing ML‑based malware detectors.

Abstract

Although Machine Learning (ML) based approaches have shown promise for Android malware detection, a set of critical challenges remains unaddressed. Some of these challenges concern the proper evaluation of a detection approach, while others relate to its design decisions. In this paper, we systematically study the impact of these challenges through a set of research questions (i.e., hypotheses). We design an experimentation framework in which we can reliably vary several parameters while evaluating ML-based Android malware detection approaches, and we use the experimental results to answer the research questions. We also demonstrate the impact of some of these challenges on existing ML-based approaches. The large, market-scale dataset of benign and malicious apps used in these experiments reflects the scale of real-world Android app security analysis. We envision that this study will encourage better evaluation strategies and better designs in future ML-based approaches for Android malware detection.
