The prevalence of the Android platform has attracted adversaries who craft malicious payloads for illegal profit. Such malicious artifacts are frequently reused and embedded in benign paid apps, which are then offered to victims as versions that have been cracked for free. To discover these fraudulent apps, app market administrators want an automated scanning process that maintains the health of the app ecosystem. However, conventional approaches cannot be applied efficiently because they lack a scalable, effective way to aggregate malware characteristics, and the vast number of apps significantly increases the analysis complexity. In this paper, we propose Petridish, which generates discriminative models against repacked malicious apps. These representative models of malicious semantics are progressively distilled from malign and benign samples and can then be used to detect further repacked malicious apps. Our experiment shows that, after two retraining rounds, Petridish achieved an average progressive detection improvement of 28 percentage points, from 63 to 91.2 percent, for the large families (those exceeding 38 test samples in size). With noise reduction, it achieved an 88 percent detection rate and a 1.7 percent false alarm rate. This characteristics aggregation approach will become critical in the age of app explosion.
- Repackaged apps