Techniques for analyzing digital environments from a security perspective
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. ORCID iD: 0000-0001-6553-4319
2019 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The development of the Internet and social media has exploded in the last couple of years. Digital environments such as social media and discussion forums provide an effective method of communication and are used by various groups in our societies. For example, violent extremist groups use social media platforms for recruiting, training, and communicating with their followers, supporters, and donors. Analyzing social media is an important task for law enforcement agencies in order to detect activity and individuals that might pose a threat to the security of society.

In this thesis, a set of technologies that can be used to analyze digital environments from a security perspective is presented. Due to the nature of the problems studied, the research is interdisciplinary, and knowledge from terrorism research, psychology, and computer science is required. The research is divided into three themes, each of which summarizes the research done in a specific area.

The first theme focuses on analyzing digital environments and phenomena and consists of three studies. The first study examines the possibility of detecting propaganda from the Islamic State on Twitter. The second study focuses on identifying references to a narrative containing xenophobic and conspiratorial stereotypes in immigration-critical alternative media. In the third study, we define a set of linguistic features that we view as markers of radicalization.

A group consists of a set of individuals, and in some cases individuals might pose a threat to the security of society. The second theme focuses on the risk assessment of individuals based on their written communication. We use different technologies, including machine learning, to explore the possibility of detecting potential lone offenders. Our risk assessment approach is implemented in the tool PRAT (Profile Risk Assessment Tool).

Internet users can use different aliases when they communicate, since this offers a degree of anonymity. In the third theme, we present a set of techniques that can be used to identify users with multiple aliases. Our research focuses on solving two different problems: author identification and alias matching. The technologies that we use are based on the idea that each author has a fairly unique writing style and that we can construct a writeprint that represents the author. In a similar manner, we also use information about when a user communicates to create a timeprint. By combining the writeprint and the timeprint, we can obtain a set of powerful features that can be used to identify users with multiple aliases.
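To make the writeprint and timeprint idea concrete, the following is a minimal sketch and not the method used in the thesis. It assumes a timeprint can be approximated by the distribution of posts over the hours of the day and a writeprint by character trigram frequencies. The function names, the cosine similarity measure, and the `weight` parameter are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def timeprint(timestamps):
    """Return a 24-bin distribution of posting activity by hour of day."""
    counts = Counter(ts.hour for ts in timestamps)
    total = sum(counts.values())
    return [counts.get(h, 0) / total if total else 0.0 for h in range(24)]


def writeprint(text, n=3):
    """Return relative frequencies of character n-grams (a simple stylometric proxy)."""
    text = text.lower()
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def alias_similarity(posts_a, posts_b, weight=0.5):
    """Combine writeprint and timeprint similarity for two aliases.

    Each argument is a list of (datetime, text) pairs. `weight` is an
    illustrative mixing parameter, not a value taken from the thesis.
    """
    wp_a = writeprint(" ".join(text for _, text in posts_a))
    wp_b = writeprint(" ".join(text for _, text in posts_b))
    tp_a = dict(enumerate(timeprint([ts for ts, _ in posts_a])))
    tp_b = dict(enumerate(timeprint([ts for ts, _ in posts_b])))
    return weight * cosine(wp_a, wp_b) + (1 - weight) * cosine(tp_a, tp_b)
```

Two aliases that post at similar hours and in a similar style receive a score close to 1, while aliases with no overlap in either respect receive a score of 0. A real system would use a much richer feature set and a trained classifier rather than a fixed weighting.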

To ensure that the technologies can be used in real scenarios, we have implemented and tested the techniques on data from social media. Several of the results are promising, but more studies are needed to determine how well they work in reality.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2019. p. 64
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 1786
Keywords [en]
digital communities, machine learning, text analysis, linguistic features, linguistic analysis, warning behaviors, Internet, social media, extremism, terrorism, psychological state, author identification, alias matching
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-379605
ISBN: 978-91-513-0605-6 (print)
OAI: oai:DiVA.org:uu-379605
DiVA, id: diva2:1298150
Public defence
2019-05-17, Room 2446, ITC, Lägerhyddsvägen 2, Uppsala, 10:15 (English)
Opponent
Supervisors
Available from: 2019-04-24 Created: 2019-03-22 Last updated: 2019-06-18
List of papers
1. A Machine Learning Approach Towards Detecting Extreme Adopters in Digital Communities
2017 (English) In: 2017 28th International Workshop on Database and Expert Systems Applications (DEXA) / [ed] Tjoa, AM; Wagner, RR, IEEE, 2017, p. 1-5. Conference paper, Published paper (Other academic)
Abstract [en]

In this study we try to identify extreme adopters on a discussion forum using machine learning. An extreme adopter is a user who has adopted a high level of community-specific jargon and can therefore be seen as a user with a high degree of identification with the community. The dataset that we consider consists of a Swedish xenophobic discussion forum, where we use a machine learning approach to identify extreme adopters using a number of linguistic features that are independent of the dataset and the community. The results indicate that it is possible to separate these extreme adopters from the rest of the discussants on the discussion forum with more than 80% accuracy. Since the linguistic features that we use are highly domain independent, the results indicate that it may be possible to use these kinds of techniques to identify extreme adopters within other communities as well.

Place, publisher, year, edition, pages
IEEE, 2017
Series
International Workshop on Database and Expert Systems Applications-DEXA, ISSN 1529-4188
Keywords
Discussion forums, Support vector machines, Pragmatics, Manuals, Radio frequency, Electronic mail, Social network services
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-351187 (URN)
10.1109/DEXA.2017.17 (DOI)
000426078300001 ()
978-1-5386-1051-0 (ISBN)
Conference
28th International Workshop on Database and Expert Systems Applications (DEXA), AUG 28-31, 2017, Lyon3 Univ, Lyon, FRANCE
Available from: 2018-05-23 Created: 2018-05-23 Last updated: 2019-03-22 Bibliographically approved
2. Identifying warning behaviors of violent lone offenders in written communication
2016 (English) In: Proc. 16th ICDM Workshops, IEEE Computer Society, 2016, p. 1053-1060. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE Computer Society, 2016
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-306943 (URN)
10.1109/ICDMW.2016.0152 (DOI)
978-1-5090-5910-2 (ISBN)
Conference
ICDM Workshop on Social Media and Risk, SOMERIS 2016, December 12, Barcelona, Spain
Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22 Bibliographically approved
3. Automatic detection of xenophobic narratives: A case study on Swedish alternative media
2016 (English) In: Proc. 14th International Conference on Intelligence and Security Informatics, IEEE, 2016, p. 121-126. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2016
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-306903 (URN)
10.1109/ISI.2016.7745454 (DOI)
000390129600021 ()
978-1-5090-3865-7 (ISBN)
Conference
ISI 2016, September 28–30, Tucson, AZ
Available from: 2016-11-17 Created: 2016-11-04 Last updated: 2019-03-22 Bibliographically approved
4. Linguistic analysis of lone offender manifestos
2016 (English) In: Proc. 4th International Conference on Cybercrime and Computer Forensics, IEEE, 2016. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
IEEE, 2016
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-306941 (URN)
10.1109/ICCCF.2016.7740427 (DOI)
000390123800007 ()
978-1-5090-6096-2 (ISBN)
Conference
ICCCF 2016, June 12–14, Vancouver, Canada
Available from: 2016-11-17 Created: 2016-11-07 Last updated: 2019-03-22 Bibliographically approved
5. Detecting multipliers of jihadism on twitter
2015 (English) In: Proc. 15th ICDM Workshops, IEEE Computer Society, 2015, p. 954-960. Conference paper, Published paper (Refereed)
Abstract [en]

Detecting terrorist-related content on social media is a problem for law enforcement agencies due to the large amount of information that is available. In this paper we describe a first step towards automatically classifying Twitter user accounts (tweeps) as supporters of jihadist groups who disseminate propaganda content online. We use a machine learning approach with two sets of features: data-dependent features and data-independent features. The data-dependent features are heavily influenced by the specific dataset, while the data-independent features are independent of the dataset and can be used on other datasets with similar results. By using this approach we hope that our method can serve as a baseline for classifying violent extremist content from different kinds of sources, since data-dependent features from various domains can be added.
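The split into data-dependent and data-independent features can be illustrated with a small sketch. The abstract does not list the features actually used, so everything below is an assumption made for illustration: frequent hashtags learned from the training data stand in for data-dependent features, and function-word and punctuation rates stand in for data-independent ones. All names are hypothetical.

```python
from collections import Counter
import re

# Assumed, illustrative feature choices; not the feature lists of the paper.
FUNCTION_WORDS = ("the", "and", "of", "to", "in")


def tokenize(text):
    """Split text into lowercase tokens, keeping a leading # or @."""
    return re.findall(r"[#@]?\w+", text.lower())


def fit_data_dependent_vocab(tweets, top_k=3):
    """Learn the most frequent hashtags in the training data (dataset-specific)."""
    counts = Counter(
        tok for tweet in tweets for tok in tokenize(tweet) if tok.startswith("#")
    )
    return [tag for tag, _ in counts.most_common(top_k)]


def data_dependent_features(text, vocab):
    """Relative frequency of each learned hashtag in the text."""
    tokens = tokenize(text)
    n = len(tokens) or 1
    return [tokens.count(tag) / n for tag in vocab]


def data_independent_features(text):
    """Function-word rates plus a punctuation rate; no training data needed."""
    tokens = tokenize(text)
    n = len(tokens) or 1
    function_word_rates = [tokens.count(w) / n for w in FUNCTION_WORDS]
    punctuation_rate = sum(text.count(c) for c in "!?.,") / (len(text) or 1)
    return function_word_rates + [punctuation_rate]


def feature_vector(text, vocab):
    """Concatenate both feature groups into one vector for a classifier."""
    return data_dependent_features(text, vocab) + data_independent_features(text)
```

Moving to a new domain under this scheme would mean refitting only the vocabulary with `fit_data_dependent_vocab`, while the data-independent part is reused unchanged.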

Place, publisher, year, edition, pages
IEEE Computer Society, 2015
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-272243 (URN)
10.1109/ICDMW.2015.9 (DOI)
000380556700127 ()
9781467384926 (ISBN)
External cooperation:
Conference
ICDM Workshop on Intelligence and Security Informatics, ISI-ICDM 2015, November 14, Atlantic City, NJ
Available from: 2015-11-14 Created: 2016-01-12 Last updated: 2019-03-22 Bibliographically approved
6. Detecting multiple aliases in social media
2013 (English) In: Proc. 5th International Conference on Advances in Social Networks Analysis and Mining, New York: ACM Press, 2013, p. 1004-1011. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
New York: ACM Press, 2013
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-216568 (URN)
10.1145/2492517.2500261 (DOI)
978-1-4503-2240-9 (ISBN)
Conference
ASONAM 2013, August 25-29, Niagara Falls, Canada
Funder
Vinnova
Available from: 2013-08-29 Created: 2014-01-23 Last updated: 2019-03-22 Bibliographically approved
7. Timeprints for identifying social media users with multiple aliases
2015 (English) In: Security Informatics, ISSN 2190-8532, Vol. 4, p. 7:1-11, article id 7. Article in journal (Refereed) Published
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-272242 (URN)
10.1186/s13388-015-0022-z (DOI)
Available from: 2015-09-24 Created: 2016-01-12 Last updated: 2019-03-22 Bibliographically approved
8. Multi-domain alias matching using machine learning
2016 (English) In: Proc. 3rd European Network Intelligence Conference, IEEE, 2016, p. 77-84. Conference paper, Published paper (Refereed)
Abstract [en]

We describe a methodology for linking aliases belonging to the same individual based on a user's writing style (stylometric features extracted from the user-generated content) and her time patterns (time-based features extracted from the publishing times of the user-generated content). While most previous research on social media identity linkage relies on matching usernames, our methodology can also be used for users who actively try to choose dissimilar usernames when creating their aliases. In our experiments on a discussion forum dataset and a Twitter dataset, we evaluate the performance of three different classifiers. We use the best classifier (AdaBoost) to evaluate how well it works on different datasets using different features. Experiments show that combining stylometric and time-based features yields good results on our synthetic datasets, and a small-scale evaluation on real-world blog data confirms these results, yielding a precision of over 95%. The use of emotion-related and Twitter-related features has no significant impact on the results.

Place, publisher, year, edition, pages
IEEE, 2016
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:uu:diva-306944 (URN)
10.1109/ENIC.2016.019 (DOI)
000399097600011 ()
9781509034550 (ISBN)
Conference
ENIC 2016, September 5–7, Wroclaw, Poland
Available from: 2017-02-02 Created: 2016-11-07 Last updated: 2019-03-22 Bibliographically approved
9. Assessment of risk in written communication: Introducing the Profile Risk Assessment Tool (PRAT)
2018 (English) Report (Other academic)
Place, publisher, year, edition, pages
Belgium: EUROPOL, 2018. p. 24
National Category
Engineering and Technology
Identifiers
urn:nbn:se:uu:diva-367346 (URN)
Note

This paper was presented at the 2nd European Counter-Terrorism Centre (ECTC) Advisory Group conference, 17-18 April 2018, at Europol Headquarters, The Hague.

Available from: 2018-11-30 Created: 2018-11-30 Last updated: 2019-03-22 Bibliographically approved
10. Linguistic markers of a radicalized mind-set among extreme adopters
2017 (English) In: Proc. 10th ACM International Conference on Web Search and Data Mining, New York: ACM Press, 2017, p. 823-824. Conference paper, Published paper (Refereed)
Abstract [en]

The words that we use when communicating in social media can reveal how we relate to ourselves and to others. For instance, within many online communities, the degree of adaptation to a community-specific jargon can serve as a marker of identification with the community. In this paper we single out a group of so-called extreme adopters of community-specific jargon from the whole group of users of a Swedish discussion forum devoted to the topics of immigration and integration. The forum is characterized by a certain xenophobic jargon, and we hypothesize that extreme adopters of this jargon also exhibit certain linguistic features that we view as markers of a radicalized mind-set. We use a Swedish translation of LIWC (Linguistic Inquiry and Word Count) and find that the group of extreme adopters differs significantly from the whole group of forum users regarding six out of seven linguistic markers of a radicalized mind-set.
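A dictionary-based analysis of the LIWC kind reports, for each category, the percentage of words in a text that belong to that category's word list. The sketch below shows that calculation only. The real LIWC dictionary is licensed and far larger, so the two categories and their word lists here are hypothetical placeholders and are not the seven markers studied in the paper.

```python
import re

# Hypothetical mini-dictionary for illustration only.
CATEGORIES = {
    "third_person_plural": {"they", "them", "their"},
    "negative_emotion": {"hate", "angry", "afraid"},
}


def category_rates(text, categories=CATEGORIES):
    """Return, per category, the percentage of words found in its word list."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # letter sequences only
    total = len(words)
    if total == 0:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(word in vocab for word in words) / total
        for name, vocab in categories.items()
    }
```

Rates computed this way for each user could then be compared between two groups with a standard significance test. The choice of test is not stated in the abstract.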

Place, publisher, year, edition, pages
New York: ACM Press, 2017
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-379919 (URN)
10.1145/3018661.3022760 (DOI)
978-1-4503-4675-7 (ISBN)
Conference
WSDM 2017, 1st International Workshop on Cyber Deviance Detection
Available from: 2017-02-02 Created: 2019-03-21 Last updated: 2019-04-08 Bibliographically approved

Open Access in DiVA

fulltext (584 kB), 93 downloads
File information
File name: FULLTEXT01.pdf
File size: 584 kB
Checksum (SHA-512):
5f98ca3cfb1323530bb324f1b9ee4aeb6c0b2552ffacb4e4509b6d9871b89820dd85bf8765d31ccee0ccecc58af8da30e94df5b4e750cd5e3385271c4cbd1286
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Shrestha, Amendra
By organisation
Computer Systems; Division of Computer Systems
Engineering and Technology
