Purpose
Natural language processing (NLP) provides techniques for converting free text into a structured representation, which makes clinical text a potentially valuable source of information for improving clinical care and supporting research in the medical domain.1,2 Applied to radiology reports and biopsy results, NLP techniques enable automatic identification and extraction of information. Most patients with prostate disease undergo both radiological examination and biopsy. We therefore explored the relationship between prostate MR findings and prostate biopsy results, which, to our knowledge, has not been studied at other research centers.
Methods
The study was approved by the hospital, and the reports were obtained from patients who underwent prostate MR and biopsy between March 2015 and December 2016. The diagnostic sites in the MR reports and the corresponding prostate biopsy reports were collected. Each prostate biopsy report records the number of puncture needles and which needles showed indications of malignancy. Patients with no diagnostic site or no information about puncture needles were excluded. In total, 104 patients were included in our study. The specific process was as follows:
1. Extraction and normalization in MR reports: To extract and normalize the diagnostic sites in MR reports, we constructed a standard prostate anatomical site form with three levels (section, zone, and side). Using this form, we built a knowledge-based extraction method that converts every site description into a triple of Section, Zone, and Side.
2. Information extraction from biopsy reports: For prostate biopsy reports, we focused on the number of biopsy needles and which needles found tumor. A rule-based method was constructed to extract these two elements. We also extracted the biopsy diagnosis, such as prostate cancer, hyperplasia, inflammation, or no tumor. All analyses were conducted in patients with prostate cancer.
3. Correlation analysis: To analyze the correlation between the diagnostic sites in MR reports and the number of puncture needles, we related the number of puncture needles to the number of diagnostic sites, to the number of diagnostic sections, and to the number of diagnostic zones. The correlation of specific sites with the number of puncture needles and with which needles found tumor was generated using the same method.
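The three steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: the site vocabulary in SITE_FORM, the report wording ("Total needles: ...; positive needles: ..."), and the function names normalize_site, extract_biopsy, and zone_needle_table are all hypothetical, since the text does not list the actual site form or extraction rules.

```python
import re
from collections import Counter

# Hypothetical site vocabulary; the study's actual site form is not listed.
SITE_FORM = {
    "section": ("apex", "mid", "base"),
    "zone": ("peripheral zone", "transition zone", "central zone"),
    "side": ("left", "right", "bilateral"),
}


def normalize_site(description):
    """Map a free-text site description to a (section, zone, side) triple.

    A level with no matching term is returned as None.
    """
    text = description.lower()
    triple = []
    for level in ("section", "zone", "side"):
        found = None
        for term in SITE_FORM[level]:
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found = term
                break
        triple.append(found)
    return tuple(triple)


def extract_biopsy(report):
    """Return (needle_count, positive_needles) from an assumed report format.

    Assumed wording: "Total needles: 13; positive needles: 2, 5, 7".
    """
    total = re.search(r"total needles:\s*(\d+)", report, re.IGNORECASE)
    positive = re.search(r"positive needles:\s*([\d,\s]+)", report, re.IGNORECASE)
    needle_count = int(total.group(1)) if total else None
    positive_needles = (
        [int(n) for n in re.findall(r"\d+", positive.group(1))] if positive else []
    )
    return needle_count, positive_needles


def zone_needle_table(records):
    """Count patients per (zone, needle_count) pair.

    records: iterable of (list_of_site_triples, needle_count).
    """
    table = Counter()
    for sites, needle_count in records:
        # Count each zone once per patient, even if it appears in several sites.
        zones = {zone for _, zone, _ in sites if zone is not None}
        for zone in zones:
            table[(zone, needle_count)] += 1
    return table


if __name__ == "__main__":
    # Invented example inputs, not data from the study.
    sites = [normalize_site("Left peripheral zone at the apex")]
    count, positive = extract_biopsy("Total needles: 13; positive needles: 2, 5, 7")
    print(sites, count, positive)
    print(zone_needle_table([(sites, count)]))
```

A simple co-occurrence count is used here because the text describes relating sites to needle counts without naming a statistical model; a different measure could be substituted without changing the extraction steps.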
Results
Of the 104 patients in our study, 59 with prostate cancer, 6 with benign prostatic hyperplasia, 15 with inflammation, and 20 with no diagnosis of prostate cancer were included in the final results. The remaining patients were excluded because it was uncertain which needles found tumor. All analyses were restricted to patients with prostate cancer.
1. The results (Fig. 1) indicate that most MR reports contain only one or two sites and that the most common numbers of puncture needles are 12, 13, and 15.
2. The correlation of specific sites with the number of puncture needles and with which needles found tumor is shown in Fig. 2. We found that all patients with “Matrix zone” in the MR diagnosis had 13 puncture needles in biopsy.
Discussion
Information extraction using natural language processing (NLP) and a series of correlation analyses were conducted in this study, and our results show a few interesting findings. In particular, all patients with “Matrix zone” in the MR diagnosis had 13 puncture needles in biopsy. We think that the position of the Matrix zone in MR reports corresponds to the region of the 13 puncture needles in biopsy. The limited number of cases and the use of a single method may have influenced the results. In the future, more eligible patients could be collected, and we plan to refine our analysis using a suitable statistical model.
Conclusion
It is feasible to use natural language processing to explore the correlation between prostate MR findings and prostate biopsy.
References
1. Pons E, Braun LM, Hunink MG, et al. Natural Language Processing in Radiology: A Systematic Review. Radiology. 2016;279(2):329-343.
2. Cai T, Giannopoulos AA, Yu S, et al. Natural Language Processing Technologies in Radiology Research and Clinical Application. Radiographics. 2016;36(1):176-191.