Crawling Online Social Networks
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering. ORCID iD: 0000-0003-3219-9598
Blekinge Institute of Technology, School of Computing. ORCID iD: 0000-0002-9316-4842
Blekinge Institute of Technology, Faculty of Computing, Department of Computer Science and Engineering.
2015 (English) In: Second European Network Intelligence Conference (ENIC 2015), IEEE Computer Society, 2015, 9-16 p. Conference paper, Published paper (Refereed)
Abstract [en]

Researchers put a tremendous amount of time and effort into crawling information from online social networks. With the variety and vast amount of information shared on online social networks today, different crawlers have been designed to capture several types of information. We have developed a novel crawler called SINCE. This crawler differs significantly from other existing crawlers in terms of efficiency and crawling depth. We retrieve all interactions related to every single post. In addition, we are able to understand interaction dynamics, which supports informed decisions on what content to re-crawl in order to get the most recent snapshot of interactions. Finally, we evaluate our crawler against other existing crawlers in terms of completeness and efficiency. Over the last few years we have crawled public communities on Facebook, resulting in over 500 million unique Facebook users, 50 million posts, 500 million comments and over 6 billion likes.

Place, publisher, year, edition, pages
IEEE Computer Society, 2015. 9-16 p.
Keyword [en]
Crawlers;Facebook;Feeds;Informatics;Sampling methods;Silicon compounds;crawling;mining;online social media;online social networks
National Category
Computer Science
Identifiers
URN: urn:nbn:se:bth-10993; DOI: 10.1109/ENIC.2015.10; ISI: 000375081700002; OAI: oai:DiVA.org:bth-10993; DiVA: diva2:899462
Conference
Second European Network Intelligence Conference (ENIC)
Available from: 2016-02-02 Created: 2015-11-20 Last updated: 2017-11-15 Bibliographically approved
In thesis
1. Human Interactions on Online Social Media: Collecting and Analyzing Social Interaction Networks
2018 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Online social media, such as Facebook, Twitter, and LinkedIn, provide users with services that enable them to interact both globally and instantly. Social media interactions follow a constantly growing pattern that requires selection mechanisms to find and analyze interesting data. These interactions can then be modeled as interaction networks, which enable network-based and graph-based methods to model and understand users' behaviors on social media. These methods could also benefit the field of complex networks in terms of finding initial seeds in the information cascade model. This thesis aims to investigate how to efficiently collect user-generated content and interactions from online social media sites. A novel method for data collection, developed through exploratory research that includes prototyping, is presented as part of the research results in this thesis.

Analysis of social data requires data that covers all the interactions in a given domain, which has proven difficult to handle in previous work. An additional contribution of this research is a novel crawling method that extracts all social interactions from Facebook. Over the last few years, we have collected 280 million posts from public pages on Facebook using this crawling method. The collected posts include 35 billion likes and 5 billion comments from 700 million users. The data collection is the largest research dataset of social interactions on Facebook, enabling further and more accurate research in the area of social network analysis.

With the extracted data, it is possible to illustrate interactions between different users who are not necessarily connected. Methods using the same data to identify and cluster different opinions in online communities have also been developed and evaluated. Furthermore, a proposed method is used and validated for finding appropriate seeds for information cascade analyses and for identifying influential users. The conducted research indicates that the data mining approach of association rule learning can be used successfully to identify influential users with high accuracy. In addition, the same method can also be used for identifying seeds in an information cascade setting, with no significant difference from other network-based methods. Finally, the privacy-related consequences of posting online are an important area for users to consider. Mitigating privacy risks contributes to a secure environment, and methods to protect user privacy are therefore presented.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Tekniska Högskola, 2018
Series
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 1
Keyword
Social Media, Social Networks, Crawling, Complex Networks, Information Cascade, Seed Selection, Privacy
National Category
Computer Science
Identifiers
URN: urn:nbn:se:bth-15503; ISBN: 978-91-7295-344-4
Public defence
2017-01-15, J1650, Karlskrona, 13:00 (English)
Available from: 2017-11-23 Created: 2017-11-15 Last updated: 2017-12-06 Bibliographically approved

Open Access in DiVA

fulltext (1776 kB), 447 downloads
File information
File name: FULLTEXT01.pdf; File size: 1776 kB; Checksum (SHA-512): 1f01c758801081efcec359d2386d92957f6af679f5241d9d86855301b80b3a5e40bc3b491da949674449468b1b30fc88ea7c9a272e3f071417626229f08e981a
Type: fulltext; Mimetype: application/pdf

Search in DiVA

By author/editor
Erlandsson, Fredrik; Boldt, Martin; Johnson, Henric
By organisation
Department of Computer Science and Engineering; School of Computing
Computer Science
