A linear grammar approach for the analysis of mathematical documents
[Abstract] Many approaches have been proposed for the recognition of mathematical formulae, traditionally using the results of optical character recognition over scanned documents. However, optical character recognition generally performs poorly when presented with mathematics, making it difficult to accurately parse formulae. With the rapidly increasing number of natively digital documents, an alternative to optical character recognition is now available: analysing files directly instead of images. In this thesis, we explore such a method, analysing files in the ubiquitous Portable Document Format (PDF) directly and combining this with image analysis to produce the information necessary for the analysis of mathematical formulae and documents. We also revisit a method proposed in the 1960s for parsing handwritten mathematics, an extremely efficient yet impractical approach owing to its reliance on perfect input and precise character positioning. We heavily modify and extend this method, removing many of its restrictions, and use it in conjunction with the perfect input from the PDF analysis, yielding high-quality results which compare favourably with the leading scientific document analysis system.
[Publishing institution] University: University of Birmingham; Department: School of Computer Science
[Keywords] Q Science; QA Mathematics