Self-supervised language grounding by active sensing combined with Internet acquired images and text
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2017 (English). In: Proceedings of the Fourth International Workshop on Recognition and Action for Scene Understanding (REACTS2017) / [ed] Jorge Dias, George Azzopardi, Rebeca Marf, Málaga: REACTS, 2017, p. 71-83. Conference paper, published paper (peer-reviewed)
Abstract [en]

For natural and efficient verbal communication between a robot and humans, the robot should be able to learn the names and appearances of new objects it encounters. In this paper we present a solution combining active sensing of images with text-based and image-based search on the Internet. The approach allows the robot to learn both the object name and how to recognise similar objects in the future, all self-supervised without human assistance. One part of the solution is a novel iterative method to determine the object name using image classification, acquisition of images from additional viewpoints, and Internet search. In this paper, the algorithmic part of the proposed solution is presented together with evaluations using manually acquired camera images, while Internet data was acquired through direct and reverse image search with Google, Bing, and Yandex. Classification with a multi-class SVM and with five different feature settings was evaluated. With five object classes, the best performing classifier used a combination of Pyramid of Histogram of Visual Words (PHOW) and Pyramid of Histogram of Oriented Gradient (PHOG) features, and reached a precision of 80% and a recall of 78%.
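The iterative name-determination idea described above — query reverse image search from one viewpoint, and if no label clearly dominates, acquire an additional viewpoint and query again — can be illustrated with a minimal sketch. This is not the paper's exact algorithm; the helper names, the majority-vote aggregation, and the confidence threshold are all illustrative assumptions.

```python
from collections import Counter

def aggregate_labels(label_lists):
    """Majority-vote aggregation of candidate labels returned by
    reverse image search across viewpoints (hypothetical helper)."""
    counts = Counter(label for labels in label_lists for label in labels)
    if not counts:
        return None, 0.0
    label, n = counts.most_common(1)[0]
    return label, n / sum(counts.values())

def determine_object_name(viewpoints, search_fn, threshold=0.6, max_views=5):
    """Iteratively query reverse image search from additional viewpoints
    until one label dominates (a sketch of the general idea, not the
    paper's exact method). `search_fn(view)` is assumed to return a list
    of candidate label strings for one viewpoint image."""
    label, conf = None, 0.0
    collected = []
    for view in viewpoints[:max_views]:
        collected.append(search_fn(view))
        label, conf = aggregate_labels(collected)
        if conf >= threshold:
            break  # one label dominates; stop acquiring new viewpoints
    return label, conf
```

With a simulated search function, ambiguous first-view results ("mug" vs "cup") are resolved after a second viewpoint, mirroring how additional views disambiguate the object name.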

Place, publisher, year, edition, pages
Málaga: REACTS, 2017. p. 71-83
HSV category
Identifiers
URN: urn:nbn:se:umu:diva-138290
ISBN: 978-84-608-8176-6 (print)
OAI: oai:DiVA.org:umu-138290
DiVA, id: diva2:1133829
Conference
Fourth International Workshop on Recognition and Action for Scene Understanding (REACTS2017), August 25, 2017, Ystad, Sweden
Available from: 2017-08-17 Created: 2017-08-17 Last updated: 2018-06-09 Bibliographically checked

Open Access in DiVA

fulltext (5286 kB), 29 downloads
File information
File: FULLTEXT01.pdf
File size: 5286 kB
Checksum (SHA-512):
cd5edd63e74155531c29f76d69a259dbbcd15557ec9560c596f9ac9bc018545bb3f23ffdf612b707cd372e30ccb4ce311e62ba5cc296140d1e16c6c2bc1ba5b7
Type: fulltext
Mimetype: application/pdf


Search in DiVA
By author/editor: Bensch, Suna; Hellström, Thomas
