A deep learning model for scene recognition
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
2019 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Scene recognition is an active research topic in image recognition. It deserves attention because it supports scene understanding and provides important contextual information for object recognition. Traditional approaches to scene recognition still have notable shortcomings, while in recent years deep learning methods based on convolutional neural networks (CNNs) have achieved state-of-the-art results in this area. This thesis constructs a model for scene recognition based on multi-layer CNN feature extraction and transfer learning. Because scene images often contain multiple objects, the convolutional layers of the network may hold useful local semantic information that is lost in the fully connected layers. Therefore, this thesis modifies the traditional CNN architecture, adopting an existing improvement that strengthens the convolutional-layer information and encoding those features with Fisher Vectors. The thesis then introduces transfer learning, drawing on knowledge from two different domains, scenes and objects, and combines the outputs of the two corresponding networks to achieve better results. Finally, the method is implemented in Python with PyTorch and applied to two well-known scene datasets, UIUC-Sports and Scene-15. Compared with the traditional AlexNet CNN architecture, the accuracy improves from 81% to 93% on UIUC-Sports and from 79% to 91% on Scene-15, showing that the proposed method performs well on scene recognition tasks.
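
The abstract outlines the pipeline only at a high level; the sketch below illustrates one way the described components could fit together in Python/PyTorch. It is not the thesis code: the scene-pretrained checkpoint path, the example image file, and the choice of 16 GMM components are assumptions for illustration, and in practice the GMM would be fitted on conv-layer descriptors pooled from many training images before the Fisher Vectors feed a linear classifier.

    # Minimal sketch of the pipeline described above (not the thesis code).
    # Assumptions: torchvision's ImageNet-pretrained AlexNet stands in for the
    # object network; the scene network is assumed to come from a hypothetical
    # checkpoint "scene_alexnet.pth"; "example_scene.jpg" is a placeholder file.
    import numpy as np
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from PIL import Image
    from sklearn.mixture import GaussianMixture

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Object network: AlexNet pretrained on ImageNet (ships with torchvision).
    object_net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).to(device).eval()

    # Scene network: assumed to be an AlexNet fine-tuned on scene data; the
    # checkpoint path below is hypothetical.
    scene_net = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).to(device).eval()
    # scene_net.load_state_dict(torch.load("scene_alexnet.pth", map_location=device))

    preprocess = T.Compose([
        T.Resize(256), T.CenterCrop(224), T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def conv_descriptors(net, img):
        """Last conv feature map as local descriptors: AlexNet's `features`
        output is (1, 256, 6, 6), i.e. 36 descriptors of dimension 256."""
        x = preprocess(img).unsqueeze(0).to(device)
        with torch.no_grad():
            fmap = net.features(x)                         # (1, C, H, W)
        c = fmap.shape[1]
        return fmap.squeeze(0).reshape(c, -1).T.cpu().numpy()   # (H*W, C)

    def fc_features(net, img):
        """Penultimate fully connected activations (4096-d) for late fusion."""
        x = preprocess(img).unsqueeze(0).to(device)
        with torch.no_grad():
            feats = net.avgpool(net.features(x)).flatten(1)
            feats = net.classifier[:-1](feats)             # drop final class layer
        return feats.squeeze(0).cpu().numpy()

    def fisher_vector(desc, gmm):
        """Fisher Vector (gradients w.r.t. means and variances) for a
        diagonal-covariance GMM, with power and L2 normalisation."""
        n, _ = desc.shape
        q = gmm.predict_proba(desc)                        # (n, K) posteriors
        mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_
        diff = (desc[:, None, :] - mu[None]) / np.sqrt(var)[None]       # (n, K, D)
        g_mu = (q[..., None] * diff).sum(0) / (n * np.sqrt(w)[:, None])
        g_sig = (q[..., None] * (diff ** 2 - 1)).sum(0) / (n * np.sqrt(2 * w)[:, None])
        fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])
        fv = np.sign(fv) * np.sqrt(np.abs(fv))             # power normalisation
        return fv / (np.linalg.norm(fv) + 1e-12)

    # Encode one image with both networks and fuse the representations.
    img = Image.open("example_scene.jpg").convert("RGB")
    gmm = GaussianMixture(n_components=16, covariance_type="diag", random_state=0)
    gmm.fit(conv_descriptors(object_net, img))   # in practice: fit on many training images
    fused = np.concatenate([
        fisher_vector(conv_descriptors(object_net, img), gmm),
        fc_features(object_net, img),
        fc_features(scene_net, img),
    ])
    # `fused` would then feed a linear classifier (e.g. an SVM) on UIUC-Sports
    # or Scene-15.

The fusion shown here is a plain concatenation of the Fisher Vector and the two networks' fully connected features; the exact combination scheme and hyperparameters used in the thesis are described in the full text.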

Place, publisher, year, edition, pages
2019. p. 43
Keywords [en]
Scene recognition, CNN, convolutional supervised, Fisher Vector, transfer learning
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:miun:diva-36491
Local ID: DT-V19-G3-013
OAI: oai:DiVA.org:miun-36491
DiVA id: diva2:1330963
Subject / course
Computer Engineering DT1
Available from: 2019-06-26 Created: 2019-06-26 Last updated: 2019-06-26 Bibliographically approved

Open Access in DiVA

fulltext (1412 kB), 14 downloads
File information
File name: FULLTEXT01.pdf
File size: 1412 kB
Checksum (SHA-512): 481bff6073d4c6a52320aabc9474a1b387016ebee1924311828222bb6e5693db5339e8ecba8407f2c321ec72884528f7172cf2e67d744c1ad24d244276ce7aa1
Type: fulltext
Mimetype: application/pdf

By author/editor
Meng, Zhaoxin
By organisation
Department of Information Systems and Technology