Abstract Syntax Tree Analysis for Plagiarism Detection
Linköping University, Department of Computer and Information Science. Linköping University, The Institute of Technology.
2012 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Analys av abstrakta syntaxträd för detektion av plagiat (Swedish)
Abstract [en]

Today, universities rely heavily on systems for detecting plagiarism in students’ essays and reports. Code submissions, however, require specific tools. A number of approaches to finding plagiarisms in code have already been tried, including techniques based on comparing textual transformations of code, token strings, parse trees and graph representations. In this master’s thesis, a new system, cojac, is presented which combines textual, tree and graph techniques to detect a broad spectrum of plagiarism attempts. The system finds plagiarisms in C, C++ and Ada source files. This thesis discusses the method used for obtaining parse trees from the source code and the abstract syntax tree analysis. For comparison of syntax trees, we generate sets of fingerprints, digest forms of trees, which makes the comparison algorithm more scalable. To evaluate the method, a set of benchmark files containing plagiarism scenarios has been constructed and analyzed both by our system and by Moss, another available system for plagiarism detection in code. The results show that our abstract syntax tree analysis can effectively detect plagiarisms such as changing the format of the code and renaming identifiers, and that it is at least as effective as Moss for detecting plagiarisms of these kinds.

Place, publisher, year, edition, pages
2012. 107 p.
Keyword [en]
abstract syntax tree, plagiarism, clone detection, fingerprint
National Category
Computer Science
Identifiers
URN: urn:nbn:se:liu:diva-80888
ISRN: LIU-IDA/LITH-EX-A--12/043–SE
OAI: oai:DiVA.org:liu-80888
DiVA: diva2:548974
Subject / course
Computer and information science at the Institute of Technology
Presentation
2012-08-20, Donald Knuth, Linköpings Universitet, Linköping, 13:00 (Swedish)
Uppsok
Technology
Available from: 2012-09-14
Created: 2012-09-03
Last updated: 2012-09-14
Bibliographically approved

Open Access in DiVA

fulltext (1744 kB), 933 downloads
File information
File name: FULLTEXT01.pdf
File size: 1744 kB
Checksum SHA-512:
4bf0bbfc18d1741d3ea346ec61e390506958c3495dfe7db45556a91c870503d616efeff93f8471ec038b3ae1ecbe0d7e91205d6e109c133a6ced062aaf3d1eeb
Type: fulltext
Mimetype: application/pdf

Author
Nilsson, Erik