TechTalk – Health Care Applications with Natural Language Processing
April 5, 2024 (Friday) 4:30-5:30pm
Unstructured documents often contain embedded structured data. Representing valuable structured information as tables is common in health care, finance, and many other domains. However, manually extracting structured information from documents typically costs a great deal of time and labor, which motivates the need for a system that automates the process. Once such tables have been extracted, the data can be used for a wide variety of purposes, such as question answering and downstream analytics. In this talk, we will discuss how to leverage groundbreaking pre-trained language models (e.g., BERT, ChatGPT) to develop tools for automated table extraction from various types of documents. We will present applications in cancer registry reporting, cancer care, and psychiatric hospitalization prediction.