Monday, May 20, 2024

Evaluating ChatGPT for structured data extraction from clinical notes

In a recent study published in npj Digital Medicine, researchers evaluated ChatGPT’s ability to extract structured data from unstructured clinical notes.

Study: A critical assessment of using ChatGPT for extracting structured data from clinical notes. Image Credit: TippaPatt /

AI in medicine

Large language models (LLMs), including Generative Pre-trained Transformer (GPT) artificial intelligence (AI) models like ChatGPT, are used in healthcare to improve patient-clinician communication.

Traditional natural language processing (NLP) approaches like deep learning require problem-specific annotations and model training. However, the scarcity of human-annotated data, combined with the cost of these models, makes such algorithms difficult to build.

Thus, LLMs like ChatGPT provide a viable alternative by relying on logical reasoning and knowledge to aid language processing.

About the study

In the present study, researchers created an LLM-based method for extracting structured data from clinical notes, converting unstructured text into structured, analyzable data. To this end, the GPT-3.5-turbo model was used, as it is associated with certain Artificial General Intelligence (AGI) capabilities.

An overview of the process and framework of using ChatGPT for structured data extraction from pathology reports. a Illustration of the use of the OpenAI API for batch queries of the ChatGPT service, applied to a substantial volume of clinical notes (pathology reports in our study). b A general framework for integrating ChatGPT into real-world applications.

A total of 1,026 lung tumor pathology reports and 191 pediatric osteosarcoma reports from the Cancer Digital Slide Archive (CDSA), which served as the training set, as well as The Cancer Genome Atlas (TCGA), which served as the testing set, were converted to text using R. The text data was subsequently analyzed using the OpenAI API, which extracted structured data based on specific prompts.

The ChatGPT API was used to perform batch queries, with prompt engineering used to call the GPT service. Post-processing involved parsing and cleaning the GPT output, evaluating GPT results against reference data, and obtaining feedback from domain experts. These processes aimed to extract information on TNM staging and histology type as structured attributes from unstructured pathology reports. Tasks assigned to ChatGPT included estimating targeted attributes, evaluating certainty levels, identifying key evidence, and generating a summary.
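The query-and-parse loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the prompt wording, the JSON field names (`pT`, `pN`, `histology`, `certainty`, `evidence`, `summary`), and the helper names `build_prompt` and `parse_response` are all assumptions, and the actual call to the model is left out so the example runs offline.

```python
import json

# Hypothetical field names; the study's actual output schema may differ.
EXPECTED_FIELDS = ("pT", "pN", "histology", "certainty", "evidence", "summary")


def build_prompt(report_text: str) -> str:
    """Assemble an instruction prompt for one pathology report (illustrative wording)."""
    return (
        "You are given a lung cancer pathology report. Using the AJCC 7th edition, "
        "estimate pT, pN and histology type, rate your certainty, quote the key "
        "evidence, and give a short summary. Answer as JSON with the keys: "
        + ", ".join(EXPECTED_FIELDS)
        + ".\n\nReport:\n"
        + report_text
    )


def parse_response(raw: str) -> dict:
    """Post-process a model reply: extract the JSON object and normalise its fields."""
    unknown = {field: "Unknown" for field in EXPECTED_FIELDS}
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return unknown  # no JSON object in the reply
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return unknown  # malformed JSON is treated as "no output"
    return {field: str(data.get(field, "Unknown")).strip() for field in EXPECTED_FIELDS}


# A made-up reply standing in for what the API might return.
reply = (
    'Here is the result: {"pT": "T1a", "pN": "N0", "histology": "adenocarcinoma", '
    '"certainty": "high", "evidence": "tumor 1.8 cm", "summary": "small node-negative tumor"}'
)
parsed = parse_response(reply)
print(parsed["pT"], parsed["pN"])
```

Treating unparseable replies as "Unknown" rather than raising keeps a batch run going and leaves those cases visible for later review.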

Of the 99 reports acquired from the CDSA database, 21 were excluded due to low scanning quality, near-empty content, or missing reports. This left a total of 78 genuine pathology reports used to develop the prompts. To assess model performance, 1,024 pathology reports were obtained from cBioPortal, 97 of which were eliminated because they overlapped with the training data.
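The overlap exclusion amounts to a set difference on report identifiers. The sketch below assumes each report carries a unique ID; the IDs and the function name `drop_overlap` are invented for illustration.

```python
def drop_overlap(test_ids, training_ids):
    """Return the test IDs that do not appear in the training set, preserving order."""
    seen = set(training_ids)
    return [report_id for report_id in test_ids if report_id not in seen]


# Invented identifiers, for illustration only.
training_ids = ["R-001", "R-002", "R-003"]
test_ids = ["R-002", "R-004", "R-005", "R-003"]
held_out = drop_overlap(test_ids, training_ids)
print(held_out)

# The counts reported in the article follow the same arithmetic.
prompt_development_reports = 99 - 21   # 78 reports kept from CDSA
remaining_test_reports = 1024 - 97     # reports left after removing overlap
print(prompt_development_reports, remaining_test_reports)
```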

ChatGPT was directed to use the seventh edition of the American Joint Committee on Cancer (AJCC) Cancer Staging Manual for reference. Data analyzed included primary tumor (pT) and lymph node (pN) staging, histological type, and tumor stage. The performance of ChatGPT was compared to that of a keyword search algorithm and a deep learning-based Named Entity Recognition (NER) approach.
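For a sense of what a keyword-search baseline looks like, here is a minimal sketch that scans a report for explicit pT and pN labels with regular expressions. The patterns and the function name `keyword_stage` are assumptions; the study's actual keyword algorithm is not described in this article.

```python
import re

# Match explicit labels such as "pT2a" or "pN1" (case-sensitive, illustrative only).
PT_PATTERN = re.compile(r"\bpT([0-4][ab]?|is|X)\b")
PN_PATTERN = re.compile(r"\bpN([0-3]|X)\b")


def keyword_stage(report_text: str) -> dict:
    """Return pT and pN if they are written out verbatim, else 'Unknown'."""
    pt = PT_PATTERN.search(report_text)
    pn = PN_PATTERN.search(report_text)
    return {
        "pT": ("T" + pt.group(1)) if pt else "Unknown",
        "pN": ("N" + pn.group(1)) if pn else "Unknown",
    }


explicit = keyword_stage("Pathologic stage: pT2a pN1.")
implicit = keyword_stage("Tumor measures 2.5 cm; no lymph nodes involved.")
print(explicit)
print(implicit)
```

The second report shows the limitation of this kind of baseline: when the stage must be inferred from a measurement rather than read off a label, a pure keyword match returns nothing.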

A detailed error analysis was carried out to identify the types and potential causes of misclassifications. The performance of GPT-3.5-turbo and GPT-4 was also compared.

Study findings

ChatGPT (GPT-3.5) achieved 89% accuracy in extracting pathological classifications from the lung tumor dataset, outperforming the keyword algorithm and the NER classifier; the three methods had accuracies of 0.9, 0.5, and 0.8, respectively. ChatGPT also accurately classified grades and margin status in osteosarcoma reports, with an accuracy rate of 98.6%.

Model performance was affected by the design of the instructional prompt, with most misclassifications due to a lack of specific pathology terminology and incorrect interpretation of the TNM staging guidelines. ChatGPT accurately extracted tumor information and used the AJCC staging guidelines to estimate tumor stage; however, it often applied incorrect rules to distinguish pT categories, such as interpreting a maximum tumor dimension of two centimeters as T2.
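To see why calling a two-centimeter tumor T2 is an error, consider the size thresholds of the AJCC seventh edition for lung tumors as we understand them: up to 2 cm is T1a, over 2 and up to 3 cm is T1b, over 3 and up to 5 cm is T2a, over 5 and up to 7 cm is T2b, and over 7 cm is T3. The sketch below encodes only this size rule. It is a deliberate simplification, since the real pT category also depends on invasion and other findings, and the function name `pt_from_size` is hypothetical.

```python
def pt_from_size(max_dim_cm: float) -> str:
    """Map maximum tumor dimension (cm) to a pT category using size alone.

    Simplified from the AJCC 7th edition lung criteria; invasion, location,
    and other findings that can raise the category are ignored here.
    """
    if max_dim_cm <= 0:
        raise ValueError("tumor dimension must be positive")
    if max_dim_cm <= 2.0:
        return "T1a"
    if max_dim_cm <= 3.0:
        return "T1b"
    if max_dim_cm <= 5.0:
        return "T2a"
    if max_dim_cm <= 7.0:
        return "T2b"
    return "T3"


for size in (2.0, 2.5, 3.1, 7.5):
    print(size, pt_from_size(size))
```

By size alone, a 2 cm tumor falls in T1a, so a T2 label would need support from some other finding in the report.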

In the osteosarcoma dataset, GPT-3.5 classified margin status and grades with accuracies of 100% and 98.6%, respectively. GPT-3.5 also performed consistently over time on the pediatric osteosarcoma dataset; however, it frequently misclassified pT, pN, histological type, and tumor stage.

Tumor stage classification performance was assessed using 744 instances with accurate reports and reference data. Among the misclassifications, 22 were due to error propagation, whereas 34 were due to incorrect rules. Assessing the classification performance of histological diagnosis using 762 instances showed that 17 cases were unknown or had no output, with a reported coverage rate of 0.96.
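Coverage and accuracy can be computed along the following lines. This sketch assumes coverage is the share of instances with a usable output and accuracy is measured only on those covered instances; the study may define either metric differently, and the labels below are toy data rather than study results.

```python
NO_OUTPUT = (None, "", "Unknown")


def coverage_and_accuracy(predictions, references):
    """Return (coverage, accuracy on covered instances) for paired label lists."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    answered = [(p, r) for p, r in zip(predictions, references) if p not in NO_OUTPUT]
    coverage = len(answered) / len(predictions) if predictions else 0.0
    accuracy = sum(p == r for p, r in answered) / len(answered) if answered else 0.0
    return coverage, accuracy


# Toy labels, not data from the study.
preds = ["adenocarcinoma", "Unknown", "squamous cell carcinoma", "adenocarcinoma", None]
refs = ["adenocarcinoma", "adenocarcinoma", "squamous cell carcinoma",
        "squamous cell carcinoma", "adenocarcinoma"]
coverage, accuracy = coverage_and_accuracy(preds, refs)
print(f"coverage={coverage:.2f} accuracy={accuracy:.2f}")
```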

The initial model evaluation and prompt-response review identified unusual instances, such as blank, improperly scanned, or missing report forms, which ChatGPT failed to detect in most cases. GPT-4-turbo outperformed the previous model in almost every category, improving performance by over 5%.


ChatGPT appears to be capable of handling large volumes of clinical notes to extract structured data without requiring extensive task-specific human annotation or model training. Taken together, the study findings highlight the potential of LLMs to convert unstructured healthcare information into organized representations, which may ultimately facilitate research and clinical decisions in the future.

Journal reference:

  • Huang, J., Yang, D.M., Rong, R., et al. (2024). A critical assessment of using ChatGPT for extracting structured data from clinical notes. npj Digital Medicine 7(106). doi:10.1038/s41746-024-01079-8

