Leerec: A scalable product recommendation engine suitable for transaction data.
Mid Sweden University, Faculty of Science, Technology and Media, Department of Information Systems and Technology.
2018 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

We are currently living in the Internet of Things (IoT) era, in which devices are connected to the Internet and communicate with each other. Each year the number of devices increases rapidly, which results in rapid growth of the data that is generated. This large amount of data is sometimes called Big Data, and it is generated from different sources, such as log data of user behavior. These log files can be collected and analyzed in different ways, for example to create product recommendations. Product recommendations have been around since the late 1990s, when the amount of data collected was not at the same level as it is today. The aim of this thesis has been to investigate methods for processing data and creating product recommendations, to see how well they are adapted to Big Data. This has been accomplished through three theory studies: how to process user events, how to make the product recommendation algorithm called collaborative filtering scalable, and finally how to convert implicit feedback to explicit feedback (ratings).
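
As an illustration of the last point, converting implicit feedback to explicit ratings, the following is a minimal sketch and not the scheme used in the thesis. It assumes that each log event is a (user, product) purchase pair and that a rating can be derived by scaling a user's purchase count for a product against that user's most-purchased product. The function name events_to_ratings, the 1 to 5 scale and the linear scaling are all assumptions made for the example.

```python
from collections import Counter


def events_to_ratings(events, max_rating=5):
    """Turn (user, product) purchase events into explicit ratings.

    Hypothetical scheme: a user's most-purchased product receives
    max_rating, and other products are scaled linearly by purchase count.
    """
    # Count how many times each user bought each product.
    counts = Counter(events)

    # Largest purchase count per user, used to normalise that user's counts.
    user_max = {}
    for (user, _product), n in counts.items():
        user_max[user] = max(user_max.get(user, 0), n)

    ratings = {}
    for (user, product), n in counts.items():
        # n / user_max[user] lies in (0, 1], so the rating lies in (1, max_rating].
        ratings[(user, product)] = 1 + (max_rating - 1) * n / user_max[user]
    return ratings


events = [("u1", "p1"), ("u1", "p1"), ("u1", "p2"), ("u2", "p3")]
ratings = events_to_ratings(events)
```

With this mapping a user who bought only one product once still gives it the top rating, which is one reason a real system might prefer a different normalisation.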

This resulted in a recommendation engine with Apache Spark as the data processing system, which had three functions: reading multiple log files and concatenating them for each month, parsing the log files of user events to create explicit ratings from the transactions, and creating four types of recommendations. The NoSQL database MongoDB was chosen to store the different types of product recommendations that were created. To make the recommendations in the recommendation engine and the database accessible, a REST API was implemented that can be used by any third party. What can be concluded from the results of this thesis work is that the implemented system is partially scalable. Apache Spark was scalable for concatenating files, for parsing them and creating ratings, and for creating the recommendations using the ALS method. However, MongoDB was shown not to be scalable when managing more than 100 concurrent requests. Future work involves distributing the recommendation engine over a multi-node cluster to utilize the parallelization of Apache Spark. Other recommendations include considering other NoSQL databases that might be more scalable than MongoDB.
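
The ALS (alternating least squares) method named above can be illustrated with a toy example. The sketch below is not the Apache Spark implementation used in the thesis. It is a pure-Python, rank-1 version written only to show the alternating idea: hold the item factors fixed and solve for each user factor in closed form, then swap roles. The function name als_rank1, the regularisation value and the sample ratings are assumptions made for the example.

```python
import random


def als_rank1(ratings, n_iters=10, reg=0.1, seed=0):
    """Rank-1 alternating least squares on {(user, item): rating}.

    Minimises sum (r - x[u] * y[i])**2 + reg * (sum x**2 + sum y**2)
    over the observed ratings only.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    x = {u: rng.uniform(0.5, 1.5) for u in users}  # user factors
    y = {i: rng.uniform(0.5, 1.5) for i in items}  # item factors

    def loss():
        err = sum((r - x[u] * y[i]) ** 2 for (u, i), r in ratings.items())
        penalty = sum(v * v for v in x.values()) + sum(v * v for v in y.values())
        return err + reg * penalty

    losses = [loss()]
    for _ in range(n_iters):
        # Item factors fixed: each user factor has a closed-form optimum.
        for u in users:
            obs = [(i, r) for (uu, i), r in ratings.items() if uu == u]
            x[u] = sum(r * y[i] for i, r in obs) / (reg + sum(y[i] ** 2 for i, _ in obs))
        # User factors fixed: each item factor has a closed-form optimum.
        for i in items:
            obs = [(u, r) for (u, ii), r in ratings.items() if ii == i]
            y[i] = sum(r * x[u] for u, r in obs) / (reg + sum(x[u] ** 2 for u, _ in obs))
        losses.append(loss())
    return x, y, losses


sample = {("u1", "p1"): 5.0, ("u1", "p2"): 3.0, ("u2", "p1"): 4.0, ("u2", "p3"): 1.0}
user_f, item_f, losses = als_rank1(sample)
# A predicted rating for an unseen pair is the product of the two factors.
predicted = user_f["u2"] * item_f["p2"]
```

Because each half-step exactly minimises the objective over one set of factors, the recorded loss never increases from one iteration to the next. Production implementations use more than one latent factor and distribute the per-user and per-item solves across workers.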

Place, publisher, year, edition, pages
2018. p. 97
Keywords [en]
Collaborative filtering, log processing, event, Alternating Least Squares
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:miun:diva-33941
Local ID: DT-V18-A2-001
OAI: oai:DiVA.org:miun-33941
DiVA, id: diva2:1227105
Subject / course
Computer Engineering DT1
Educational program
Master of Science in Engineering - Computer Engineering TDTEA 300 higher education credits
Available from: 2018-06-27 Created: 2018-06-27 Last updated: 2018-06-27 Bibliographically approved

Open Access in DiVA

File information
File name: FULLTEXT01.pdf
File size: 2729 kB
Checksum SHA-512:
77c036c9a55b3e500f90f46cc9bacd95d89214e6d51baf15b1fd49c459461d3e5effbf33e9a1b31759c51e024e83a21b7674783de0790db4ba0cf4a42d9e8bfa
Type: fulltext
Mimetype: application/pdf

Author
Flodin, Anton
