Recent years have seen an increased interest in and availability of many different kinds of corpora. These range from small but carefully annotated treebanks to large parallel corpora and very large monolingual corpora for big data research.
It remains a challenge to offer flexible and powerful query tools for multilayer annotations of small corpora. For large corpora, query tools must also scale, both in processing speed and in how they report results through statistical information and visualization options. This becomes evident, for example, with very large corpora (such as complete Wikipedia corpora) or multi-parallel corpora (such as Europarl or JRC Acquis).
The QueryVis workshop has gathered researchers who develop and evaluate new corpus query and visualization tools for linguistics, language technology and related disciplines. The papers focus on the design of query languages, and on various new visualization options for monolingual and parallel corpora, both for written and spoken language.
We hope that QueryVis will stimulate discussions and trigger new ideas for the workshop participants and any reader of the proceedings. The preparation of the workshop and the reviewing of the submissions have already been an inspiring experience.
All papers were peer-reviewed by three program committee members. We would like to thank all reviewers and contributors for their work and for sharing their thoughts and experiences with us.
Let us all join forces to make corpus exploration a rewarding, entertaining, and exciting experience that will grant us ever new insights into language and thought.
Linköping: Linköping University Electronic Press, Linköpings universitet, 2015. 36 p.