Learning to Highlight Relevant Text in Binary Classified Documents
[Abstract] Answering questions like “has this person ever been treated for breast cancer?” is critical for the success of tasks such as clinical trial design, association analysis, and the documentation of mandatory discharge summaries. In this thesis, I argue that traditional machine learning approaches have had limited success in addressing this problem, and I present a better approach to answering these questions. To address the problem, I take a different approach that annotates key textual passages, which are then used to answer these questions. This approach is superior because it does not require going through the whole electronic medical record (EMR). This thesis is an attempt to understand how to model such annotations for an EMR. These annotations will help in answering questions that would otherwise require reading the whole text. In this thesis, I present an efficient inference algorithm for the existing “Word Label Regression” (WLR) model and extend it to extract more accurate key textual passages. The extended version of the algorithm explores how one can use language features such as punctuation to model annotations effectively.
[Publication date] [Publisher] Rice University
[Validity level] approximation [Subject classification]
[Keywords] [Timeliness]