Hospitals and other health care providers tend to become involved in exaggerated and fraudulent medical claims submitted to national insurance schemes. This study applies data mining techniques to detect fraudulent or abusive reporting by health care providers, using their invoices for diabetic outpatient services. The research is conducted in the context of Taiwan's National Health Insurance system. We compare the identification accuracy of three algorithms: logistic regression, neural networks, and classification trees. While all three are quite accurate, the classification tree model performs best, with an overall correct identification rate of 99%, followed by the neural network (96%) and the logistic regression model (92%).
- Medical insurance fraud; National health insurance; Diabetes mellitus; Data mining; Logistic regression; Neural networks; Classification trees
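
The comparison metric named in the abstract, the overall correct identification rate, can be read as the share of cases whose predicted label matches the actual label. The sketch below illustrates that reading under stated assumptions: the function name `correct_identification_rate`, the toy labels, and the three prediction lists are all hypothetical and invented for illustration. They are not the study's data, models, or results.

```python
def correct_identification_rate(actual, predicted):
    """Share of cases whose predicted label matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one labelled case is required")
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Toy labels invented for illustration only (1 = fraudulent/abusive, 0 = normal).
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from three models; these are NOT the study's outputs.
predictions = {
    "logistic regression": [1, 0, 1, 1, 0, 0, 0, 0, 1, 1],
    "neural network":      [1, 0, 0, 1, 0, 0, 0, 0, 1, 1],
    "classification tree": [1, 0, 0, 1, 0, 1, 0, 0, 1, 1],
}

# Score every model with the same metric so the rates are directly comparable.
rates = {
    name: correct_identification_rate(actual, predicted)
    for name, predicted in predictions.items()
}

# Report the models from most to least accurate.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.0%}")
```

An overall rate treats both kinds of error alike, so in a fraud setting it is usually worth reporting it alongside separate rates for the fraudulent and the normal class.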