Leveraging Dominant Language Image Tags for Automatic Image Annotation in Minor Languages
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2010 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Image annotations, often in the form of tags, are very useful for indexing large image collections. They provide an intuitive, human-centered way to search and browse images using text queries. However, tagging images manually is very time-consuming, so researchers have developed methods for automatic image tagging. These methods rely on a set of example images with tags to learn which images should be associated with which tags.

One thing that has been overlooked in these systems is that the example images and tags differ from language to language. Researchers have generally built automatic tagging systems only for English and have not considered the problems of building equally good systems in minor languages, where example images and tags are more difficult to obtain.

In this thesis we study how an automatic tagging system in Japanese compares to one in English. We find that the Japanese system performs worse, and based on this finding we improve its performance by leveraging the dominant English-language system. We compare automatic translation of the tags using a dictionary with our proposed translation matrix method, which estimates the translation of tags based on the co-occurrence of tags from different languages in images.
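
As a rough illustration of the co-occurrence idea, the following minimal Python sketch builds a translation matrix from images that carry tags in both languages. It is not the thesis's implementation: the abstract does not specify the heuristics used, so the row normalization, the function names `build_translation_matrix` and `translate`, and the toy data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_translation_matrix(images):
    """Estimate tag translations from co-occurrence counts.

    `images` is an iterable of (source_tags, target_tags) pairs, one pair
    per image. Normalizing each row so it sums to 1 is an assumption of
    this sketch, not a detail taken from the thesis.
    """
    counts = defaultdict(Counter)
    for source_tags, target_tags in images:
        for s in set(source_tags):
            for t in set(target_tags):
                counts[s][t] += 1  # s and t appear on the same image

    matrix = {}
    for s, row in counts.items():
        total = sum(row.values())
        matrix[s] = {t: c / total for t, c in row.items()}
    return matrix


def translate(tag, matrix):
    """Return the highest-weighted target tag, or None if the tag is unseen."""
    row = matrix.get(tag)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical toy data: English tags paired with Japanese tags per image.
images = [
    (["dog", "park"], ["犬", "公園"]),
    (["dog", "beach"], ["犬", "海"]),
    (["dog"], ["犬"]),
    (["cat"], ["猫"]),
]

matrix = build_translation_matrix(images)
best_for_dog = translate("dog", matrix)
```

In this toy data "dog" co-occurs with "犬" on three images and with each of the other tags on only one, so "犬" receives the largest weight in its row. A tag such as "park" that appears on a single image ties between its co-occurring tags, which shows why sparse example data in a minor language makes the estimate unreliable.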

We show that our proposed method, using very simple heuristics, performs about the same as a high-end machine translator in the case of automatic tagging systems. Several improvements remain to be made, but this work shows that the conceptual idea is strong, which gives reason to improve it further. The main contributions of our approach are the ability to translate words that a dictionary cannot interpret and the consideration of context when establishing a translation.

Place, publisher, year, edition, pages
2010.
Series
UPTEC IT, ISSN 1401-5749 ; 10 013
Identifiers
URN: urn:nbn:se:uu:diva-129446
OAI: oai:DiVA.org:uu-129446
DiVA, id: diva2:343751
Uppsok
Technology
Available from: 2010-08-16 Created: 2010-08-16 Last updated: 2010-08-16 Bibliographically approved

Open Access in DiVA

fulltext (1673 kB), 765 downloads
File information
File name: FULLTEXT01.pdf
File size: 1673 kB
Checksum (SHA-512): 5003f21dfabeb18fc929add2ae75cfd144ff8a60f2d88aa75de68faa8b0de9ed3a126383cd834ec1cabddd5b0743676d0d8542c7010c85392dca9cda83901ddf
Type: fulltext
Mimetype: application/pdf

Total: 765 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 742 hits