Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Decision procedures for path feasibility of string-manipulating programs with complex operations
Birkbeck University of London, UK.
Royal Holloway University of London, UK.
University of Oxford, UK.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. (Embedded Systems)ORCID iD: 0000-0002-2733-7098
Show others and affiliations
2019 (English)In: Proceedings of the ACM on Programming Languages, ISSN 2475-1421, Vol. 3, p. 49:1-49:30Article in journal (Refereed) Published
Abstract [en]

The design and implementation of decision procedures for checking path feasibility in string-manipulating programs is an important problem, with such applications as symbolic execution of programs with strings and automated detection of cross-site scripting (XSS) vulnerabilities in web applications. A (symbolic) path is given as a finite sequence of assignments and assertions (i.e. without loops), and checking its feasibility amounts to determining the existence of inputs that yield a successful execution. Modern programming languages (e.g. JavaScript, PHP, and Python) support many complex string operations, and strings are also often implicitly modified during a computation in some intricate fashion (e.g. by some autoescaping mechanisms).

In this paper we provide two general semantic conditions which together ensure the decidability of path feasibility: (1) each assertion admits regular monadic decomposition (i.e. is an effectively recognisable relation), and (2) each assignment uses a (possibly nondeterministic) function whose inverse relation preserves regularity. We show that the semantic conditions are expressive since they are satisfied by a multitude of string operations including concatenation, one-way and two-way finite-state transducers, replaceall functions (where the replacement string could contain variables), string-reverse functions, regular-expression matching, and some (restricted) forms of letter-counting/length functions. The semantic conditions also strictly subsume existing decidable string theories (e.g. straight-line fragments, and acyclic logics), and most existing benchmarks (e.g. most of Kaluza’s, and all of SLOG’s, Stranger’s, and SLOTH’s benchmarks). Our semantic conditions also yield a conceptually simple decision procedure, as well as an extensible architecture of a string solver in that a user may easily incorporate his/her own string functions into the solver by simply providing code for the pre-image computation without worrying about other parts of the solver. Despite these, the semantic conditions are unfortunately too general to provide a fast and complete decision procedure. We provide strong theoretical evidence for this in the form of complexity results. To rectify this problem, we propose two solutions. Our main solution is to allow only partial string functions (i.e., prohibit nondeterminism) in condition (2). This restriction is satisfied in many cases in practice, and yields decision procedures that are effective in both theory and practice. Whenever nondeterministic functions are still needed (e.g. the string function split), our second solution is to provide a syntactic fragment that provides a support of nondeterministic functions, and operations like one-way transducers, replaceall (with constant replacement string), the string-reverse function, concatenation, and regular-expression matching. We show that this fragment can be reduced to an existing solver SLOTH that exploits fast model checking algorithms like IC3.

We provide an efficient implementation of our decision procedure (assuming our first solution above, i.e., deterministic partial string functions) in a new string solver OSTRICH. Our implementation provides built-in support for concatenation, reverse, functional transducers (FFT), and replaceall and provides a framework for extensibility to support further string functions. We demonstrate the efficacy of our new solver against other competitive solvers.

Place, publisher, year, edition, pages
New York, NY, USA, 2019. Vol. 3, p. 49:1-49:30
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:uu:diva-398239DOI: 10.1145/3290362OAI: oai:DiVA.org:uu-398239DiVA, id: diva2:1375065
Conference
46th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2019). Sun 13 - Sat 19 January 2019 ,Cascais, Portugal
Funder
Swedish Research Council, 2014-5484Swedish Foundation for Strategic Research , RIT17-0011Available from: 2019-12-04 Created: 2019-12-04 Last updated: 2020-01-31Bibliographically approved

Open Access in DiVA

fulltext(717 kB)16 downloads
File information
File name FULLTEXT01.pdfFile size 717 kBChecksum SHA-512
694040b2ac857fc5588d71a0000d3836762539d1cab334f4dc254f9275f6b27c07152ed6bb2f0abce721965336e8acd2afffd312eae6a5a360391dcacf871fab
Type fulltextMimetype application/pdf

Other links

Publisher's full texthttps://dl.acm.org/citation.cfm?doid=3302515.3290362

Search in DiVA

By author/editor
Rümmer, Philipp
By organisation
Computer Systems
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 16 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

doi
urn-nbn

Altmetric score

doi
urn-nbn
Total: 19 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf