Understanding Structured Documents with a Strong Layout
KTH, School of Computer Science and Communication (CSC).
2017 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This work focuses on named entity recognition (NER) in documents with a strong layout, using deep recurrent neural networks. Examples of such documents are receipts, invoices, forms and scientific papers; the latter are used in this research.

The problem of NER on structured documents is modeled in two different ways. First, it is modeled as sequence labeling, where every word or character has to be labeled as belonging to one of the entity classes. Second, it is modeled in a way that is typical for object detection in images: here the network outputs bounding boxes around words belonging to the same entity class.
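The sequence-labeling formulation above can be illustrated with a small sketch. Each word is assigned a per-token tag; the common BIO scheme shown here is one way to do this, and the example tokens and entity class are hypothetical, not taken from the thesis:

```python
# Per-token labeling with the BIO scheme: B- opens an entity span,
# I- continues it, and O marks tokens outside any entity.
def spans_to_bio(tokens, spans):
    """Convert (start, end, cls) token-index spans to BIO tags.
    Spans are half-open intervals [start, end) over token indices."""
    tags = ["O"] * len(tokens)
    for start, end, cls in spans:
        tags[start] = f"B-{cls}"
        for i in range(start + 1, end):
            tags[i] = f"I-{cls}"
    return tags

# Hypothetical receipt line: the amount "12.50 EUR" is a TOTAL entity.
tokens = ["Total", "amount", ":", "12.50", "EUR"]
spans = [(3, 5, "TOTAL")]
print(spans_to_bio(tokens, spans))
# → ['O', 'O', 'O', 'B-TOTAL', 'I-TOTAL']
```

A sequence-labeling network then simply predicts one such tag per input token, so the output sequence has the same length as the input.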

To perform this task successfully, not only the words themselves but also their locations are important. Multiple ways of encoding these locations have been investigated; using the position relative to the previous word has proven to be the most effective. Experiments have revealed that for sequence labeling it works best to split the documents into multiple smaller sequences of length 200 and to process these with two bi-directional stateful LSTM layers. In this model, the last hidden state of an LSTM is reused as the initial state for the next partial sequence of a document. This model achieves an average F1 score of 94.2% over all classes. The models that output bounding boxes do not perform as well as those for sequence labeling, but they are still promising.
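The relative position encoding described above can be sketched roughly as follows. This is a minimal illustration under assumed conventions (top-left corner coordinates, reading order); the exact feature set used in the thesis may differ:

```python
def relative_positions(boxes):
    """Encode each word's (x, y) position as an offset from the
    previous word's position; the first word is taken relative to
    the page origin. `boxes` is a list of (x, y) coordinates of
    word bounding-box corners, in reading order."""
    feats = []
    prev = (0, 0)
    for x, y in boxes:
        feats.append((x - prev[0], y - prev[1]))
        prev = (x, y)
    return feats

# Three words on a page: two on one line, the third starting a new
# line further down (hence a negative horizontal offset).
boxes = [(10, 20), (60, 20), (10, 40)]
print(relative_positions(boxes))
# → [(10, 20), (50, 0), (-50, 20)]
```

Encoding offsets rather than absolute coordinates makes the feature invariant to where on the page a text block starts, which is plausibly why it outperformed absolute positions in the experiments.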

Place, publisher, year, edition, pages
2017.
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-200658
OAI: oai:DiVA.org:kth-200658
DiVA, id: diva2:1069903
Presentation
2016-06-14, 16:00 (English)
Supervisors
Examiners
Available from: 2017-02-02. Created: 2017-02-02. Bibliographically approved.

Open Access in DiVA

fulltext (2012 kB)
File information
File name: FULLTEXT01.pdf
File size: 2012 kB
Checksum (SHA-512): 06b31afb3db761551b65505deae74ffd94f1238e30d5bf0223864dfba031c08e8dfcd7eed6571070bb99e87a86d6393ad5153d4c1fdc3dd1f120a703314a11b4
Type: fulltext
Mimetype: application/pdf
