Accurate Adware Detection using Opcode Sequence Extraction
Blekinge Institute of Technology, School of Computing. 2011 (English). Conference paper (Refereed), Published.
Adware represents a possible threat to the security and privacy of computer users. Traditional signature-based and heuristic-based methods have not proven successful at detecting this type of software. This paper presents an adware detection approach based on applying data mining to disassembled code. The main contributions of the paper are a large, publicly available adware data set, an accurate adware detection algorithm, and an extensive empirical evaluation of several candidate machine learning techniques that can be used in conjunction with the algorithm. We extracted sequences of opcodes from adware and benign software and then applied feature selection, using different configurations, to obtain 63 data sets. Six data mining algorithms were evaluated on these data sets in order to find an efficient and accurate detector. Our experimental results show that the proposed approach can accurately detect both novel and known adware instances, even though the binary difference between adware and legitimate software is usually small.
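The pipeline described in the abstract (opcode sequence extraction followed by feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: this record does not state the sequence length or the selection measure used in the paper, so the n-gram size `n`, the document-frequency ranking, the helper names, and the opcode mnemonics below are all assumptions chosen for illustration.

```python
from collections import Counter

def opcode_ngrams(opcodes, n):
    """Return all contiguous opcode sequences of length n (assumed n-gram model)."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def select_features(samples, n, top_k):
    """Keep the top_k n-grams ranked by document frequency.

    Document frequency is a stand-in selection measure (an assumption);
    ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for opcodes in samples:
        # Count each n-gram at most once per sample.
        doc_freq.update(set(opcode_ngrams(opcodes, n)))
    return sorted(doc_freq, key=lambda gram: (-doc_freq[gram], gram))[:top_k]

def vectorize(opcodes, features, n):
    """Encode one sample as a binary presence vector over the selected n-grams."""
    present = set(opcode_ngrams(opcodes, n))
    return [1 if gram in present else 0 for gram in features]

# Hypothetical opcode streams, standing in for disassembled binaries.
adware_sample = ["push", "mov", "call", "pop", "ret"]
benign_sample = ["push", "mov", "add", "pop", "ret"]

features = select_features([adware_sample, benign_sample], n=2, top_k=3)
adware_vector = vectorize(adware_sample, features, n=2)
benign_vector = vectorize(benign_sample, features, n=2)
```

Vectors of this kind, paired with adware/benign labels, are what a classifier would be trained on. Varying `n`, `top_k`, and the selection measure is one plausible way to arrive at multiple data set configurations of the sort the abstract mentions.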
Place, publisher, year, edition, pages
Vienna: IEEE Press, 2011.
Keywords
Data Mining, Adware Detection, Binary Classification, Static Analysis, Disassembly, Instruction Sequences
Identifiers
URN: urn:nbn:se:bth-7462
DOI: 10.1109/ARES.2011.35
Local ID: oai:bth.se:forskinfo596323F8D63E0D5DC12578FD004443B0
ISBN: 978-0-7695-4485-4/11
OAI: oai:DiVA.org:bth-7462
DiVA: diva2:835084
Sixth International Conference on Availability, Reliability and Security