A general framework for scraping newspaper websites
Linnaeus University, Faculty of Technology, Department of Computer Science.
2016 (English)
Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Abstract [en]

Data streaming is now one of the approaches most widely used by websites and applications to supply end users with the latest articles and news. Because many news websites and companies are founded every day, the systems that collect this data must be flexible, and it must be easy to add a new website to track. The main goal of this project is to investigate two frameworks in which implementing a robot for a given website takes an acceptable amount of time. The task is challenging for two reasons. First, it aims to optimize a framework, which means achieving the same result with less effort. Second, the frameworks will ultimately be used by professors and students, so quality and robustness play a major role. To address this challenge, two different types of news websites were investigated, and from this process the approximate time needed to implement a single robot was determined. With that baseline in mind, the new frameworks were implemented with the goal of reducing the time needed to implement a new web robot. The result is two general frameworks, one for each of the two types of website, in which implementing a robot requires considerably less time and effort. The implementation time for a new robot was reduced from 18 hours to approximately 4 hours.

Place, publisher, year, edition, pages
2016, p. 40
National Category
Software Engineering
Identifiers
URN: urn:nbn:se:lnu:diva-59044
OAI: oai:DiVA.org:lnu-59044
DiVA, id: diva2:1056964
Educational program
Datavetenskap, kandidatprogram, 60 hp
Presentation
2016-05-30, 14:00 (English)
Supervisors
Examiners
Available from: 2016-12-16
Created: 2016-12-15
Last updated: 2018-01-13
Bibliographically approved

Open Access in DiVA

fulltext (1098 kB), 208 downloads
File information
File name: FULLTEXT01.pdf
File size: 1098 kB
Checksum SHA-512:
8d2dd3982ad830415e5fe61e512ba542c517909ab51619a72ba7502953fb59921b51828cac6494424406ff61fe38781ff34cfa3287e37cb1f3081e945b21d6c2
Type: fulltext
Mimetype: application/pdf

Author
Tasim, Taner