Cancer studies based on secondary data analysis of Taiwan's National Health Insurance Research Database: A computational text analysis and visualization study
There has been a surge in academic publications based on secondary analyses of data from Taiwan's National Health Insurance claim records, and it has become a challenge to comprehend such a rapid expansion of the literature. Therefore, this study aimed to explore the conceptual content of National Health Insurance Research Database (NHIRD)-based cancer research, using the abstracts of articles indexed in PubMed between 2002 and 2015. PubMed was searched with terms including “National Health Insurance Research Database (NHIRD) AND Taiwan,” “Taiwan AND population-based,” and “Taiwan AND nationwide,” with the publication date limited to between 1997 and 2015. The retrieved articles were manually screened to retain only those that were cancer-related and based on secondary data analysis of the NHIRD. A total of 589 articles were selected for subsequent text mining using the R software. Among the 589 articles, the 5 most studied cancer types were breast (16.3%), lung (11.4%), colorectal (10.4%), liver (8.3%), and prostate (7.5%). The articles with the highest number of citations from PubMed Central articles were cited 92 times. The 3 most frequently occurring keywords in the abstracts of the 589 articles were cancer, patient, and risk, appearing 3670, 2535, and 1652 times, respectively. Analysis of key concepts indicated that the most common concepts were diabetes, survival, breast cancer, lung cancer, and colorectal cancer. In conclusion, in this study of 589 published articles based on secondary data analysis of the NHIRD and indexed by PubMed between 2002 and 2015, we found that while the risk factors of cancer, treatment of cancer, and survival of cancer patients were popular research topics, end-of-life cancer care issues were less studied. Further studies should explore these areas, since they are as important as treatment of the disease itself for many patients.
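The keyword-frequency step described above can be illustrated with a minimal sketch. The study itself used R, and its exact preprocessing is not stated here, so the Python below is only an assumption-laden illustration: the function name `keyword_counts`, the short stop-word list, and the two sample abstracts are all hypothetical, and no stemming is applied (so "patient" and "patients" are counted separately).

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list (not the one used in the study).
STOP_WORDS = {"the", "of", "and", "in", "a", "with", "was", "were", "to", "for"}


def keyword_counts(abstracts):
    """Count how often each non-stop-word token occurs across all abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lowercase and keep alphabetic runs only; punctuation and digits are dropped.
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


# Made-up sample abstracts, for illustration only.
sample_abstracts = [
    "Risk of breast cancer in patients with diabetes.",
    "Survival of lung cancer patients: a nationwide cohort.",
]

counts = keyword_counts(sample_abstracts)
top_keywords = counts.most_common(3)  # list of (keyword, frequency) pairs
```

Applied to the full set of abstracts, a ranking of this kind would yield the sort of keyword list reported above, although the actual counts depend on tokenization, stop-word, and stemming choices that this sketch does not reproduce.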