In recent years, telecommunication fraud has become more rampant internationally with the development of modern technology and global communication. Because of rapid growth in the volume of call logs, the task of fraudulent phone call detection is confronted with big data issues in real-world implementations. Although our previous work, FrauDetector, addressed this problem and achieved some promising results, it can be further enhanced because it focuses only on fraud detection accuracy, whereas the efficiency and scalability are not top priorities. Other known approaches for fraudulent call number detection suffer from long training times or cannot accurately detect fraudulent phone calls in real time. However, the learning process of FrauDetector is too time-consuming to support real-world application. Although we have attempted to accelerate the the learning process of FrauDetector by parallelization, the parallelized learning process, namely PFrauDetector, still cannot afford the computing cost. In this article, we propose a highly efficient incremental graph-mining-based fraudulent phone call detection approach, namely FrauDetector + , which can automati-8 cally label fraudulent phone numbers with a “fraud” tag a crucial prerequisite for distinguishing fraudulent phone call numbers from nonfraudulent ones. FrauDetector + initially generates smaller, more manageable subnetworks from original graph and performs a parallelized weighted HITS algorithm for a significant speed increase in the graph learning module. It adopts a novel aggregation approach to generate a trust (or experience) value for each phone number (or user) based on their respective local values. After the initial procedure, we can incrementally update the trust (or experience) value for each phone number (or user) while a new fraud phone number is identified. An efficient fraud-centric hash structure is constructed to support fast real-time detection of fraudulent phone numbers in the detection module. We conduct a comprehensive experimental study based on real datasets collected through an antifraud mobile application called Whoscall. The results demonstrate a significantly improved efficiency of our approach compared with FrauDetector as well as superior performance against other major classifier-based methods.
- Fraudulent phone call detection
- Incremental learning
- Parallelized weighted HITS algorithm
- Telecommunication fraud
- Trust value mining