  • 101.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Proceedings of the workshop on New Text: Wikis and blogs and other dynamic text sources. 2006. Conference proceedings (editor) (Refereed)
    Abstract [en]

    Proceedings from the EACL workshop on New Text. The proceedings contain 11 papers on various aspects of analysing blogs, wikis and other dynamic co-authored text sources.

  • 102.
    Karlgren, Jussi
    KTH, School of Electrical Engineering and Computer Science (EECS), Theoretical Computer Science, TCS.
    Regulation of Unpredictable Effects of Decision Making Systems is Non-trivial. 2018. In: 50 Years of Law and IT: The Swedish Law and Informatics Research Institute 1968-2018 / [ed] Peter Wahlgren, Stockholm: The Stockholm University Law Faculty, 2018, p. 127-132. Chapter in book (Other academic)
    Abstract [en]

    Technical advances are rapidly delegating decision making in new arenas of human activity to information systems through the application of new classification mechanisms from machine learning research. How to manage technology-induced change and its effects through legislative systems in order to encourage and support behaviour and activities which are desirable and beneficial to the public good and dissuade from such which are not is non-trivial. In general, legislation to cover new technical advances will be based on existing technology and existing practice. This may seem a reasonable basis to build from and adds legitimacy to regulation and its application, but regulation of technology too often stumbles at the balancing line between understanding and promoting future change productively and protecting past practice. This paper argues that more thought must be put into the aims of regulatory activities.

  • 103.
    Karlgren, Jussi
    Reply to Fraser and Wrigley or Definitely Not The Last Word On Language Varieties. 1994. In: Interacting with computers, ISSN 0953-5438, E-ISSN 1873-7951, Vol. 6, no 1, p. 109-110. Article in journal (Refereed)
  • 104.
    Karlgren, Jussi
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Språket avslöjar hur vi röstar. 2014. In: Språktidningen, ISSN 1654-5028, no 6, p. 16-22. Article in journal (Other (popular science, discussion, etc.))
    Abstract [en, translated from sv]

    What does the political opinion landscape look like? One can of course ask the voters. But it may be better to look at what they write. A computer program is now on the trail of the voters' sympathies.

  • 105.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Språkliga aspekter i textsökningssystem - forskningsfrågor ingen arbetar med? 1999. In: Proceedings of the 12th Nordic Conference of Computational Linguistics, 1999, 1. Conference paper (Refereed)
    Abstract [en]

    Information access systems based on standard mechanisms can be improved. Not because of any obvious drawbacks in the mechanisms themselves: they provide consistent and stable results, with variation from system to system surprisingly small; the reason to continue work is that the stable results are not only consistent but consistently mediocre. This paper claims linguistic research has an important role to play in the future of information access.

  • 106.
    Karlgren, Jussi
    Stockholm University, SICS.
    Stylistic Experiments for Information Retrieval. 2000. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Information retrieval systems are built to handle texts as topical items: texts are tabulated by occurrence frequencies of content words in them, under the assumption that text topic is reasonably well modeled by content word occurrence. But texts have several interesting characteristics beyond topic. The experiments described in this text investigate stylistic variation. Roughly put, style is the difference between two ways of saying the same thing, and systematic stylistic variation can be used to characterize the genre of documents. These experiments investigate if stylistic information is distinguishable using simple language engineering methods, and if in that case this type of information can be used to improve information retrieval systems.

    A first set of experiments shows that simple measures of stylistic variation can be used to distinguish genres from each other quite adequately; how well depends on what the genres in question are.

    A second set of experiments evaluates the utility of stylistic measures for the purposes of information retrieval, to identify common characteristics of relevant and non-relevant documents. The conclusion is that the requests for information as typically expressed to retrieval systems are too terse and inspecific for non-topical information to improve retrieval results. Systems for information access need to be designed from the beginning to handle richer information about the texts and documents at hand: information about stylistic variation cannot easily be added to an existing system.

    A third set of experiments explores how an interactive system can be designed to incorporate stylistic information in the interface between user and system. These experiments resulted in the design of an interface for categorizing retrieval results by genre, and displaying the retrieval results using this categorization. This interface is integrated into a prototype for retrieving information from the World Wide Web.

  • 107.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS. Stockholm University.
    Stylistic Experiments for Information Retrieval. 2000. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Information retrieval systems are built to handle texts as topical items: texts are tabulated by occurrence frequencies of content words in them, under the assumption that text topic is reasonably well modeled by content word occurrence. But texts have several interesting characteristics beyond topic. The experiments described in this text investigate stylistic variation. Roughly put, style is the difference between two ways of saying the same thing, and systematic stylistic variation can be used to characterize the genre of documents. These experiments investigate if stylistic information is distinguishable using simple language engineering methods, and if in that case this type of information can be used to improve information retrieval systems. A first set of experiments shows that simple measures of stylistic variation can be used to distinguish genres from each other quite adequately; how well depends on what the genres in question are. A second set of experiments evaluates the utility of stylistic measures for the purposes of information retrieval, to identify common characteristics of relevant and non-relevant documents. The conclusion is that the requests for information as typically expressed to retrieval systems are too terse and inspecific for non-topical information to improve retrieval results. Systems for information access need to be designed from the beginning to handle richer information about the texts and documents at hand: information about stylistic variation cannot easily be added to an existing system. A third set of experiments explores how an interactive system can be designed to incorporate stylistic information in the interface between user and system. These experiments resulted in the design of an interface for categorizing retrieval results by genre, and displaying the retrieval results using this categorization. This interface is integrated into a prototype for retrieving information from the World Wide Web.

  • 108.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Stylistic Experiments in Information Retrieval. 1999. In: Natural Language Information Retrieval / [ed] Strzalkowski, Tomek, Springer, 1999, 6. Chapter in book (Refereed)
    Abstract [en]

    A discussion on various experiments to utilize stylistic variation among texts for information retrieval purposes.

  • 109.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Stylistic Variation in an Information Retrieval Experiment. 1996. Conference paper (Refereed)
    Abstract [en]

    Texts exhibit considerable stylistic variation. This paper reports an experiment where a corpus of documents (N = 75 000) is analyzed using various simple stylistic metrics. A subset (n = 1000) of the corpus has been previously assessed to be relevant for answering given information retrieval queries. The experiment shows that this subset differs significantly from the rest of the corpus in terms of the stylistic metrics studied.

  • 110.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Sublanguages and Registers -- A Note On Terminology. 1993. In: Interacting with computers, ISSN 0953-5438, E-ISSN 1873-7951, Vol. 5, p. 348-350. Article in journal (Refereed)
    Abstract [en]

    The term sublanguage from mathematical linguistics confuses interaction researchers and leads them to believe that implementing natural language interfaces is easier than it is. The term register from sociolinguistics is proposed instead.

  • 111.
    Karlgren, Jussi
    Natural Language Processing Group, SICS.
    Sublanguages and Registers: A Note On Terminology. 1993. In: Interacting with computers, ISSN 0953-5438, E-ISSN 1873-7951, Vol. 5, no 3, p. 348-350. Article in journal (Refereed)
    Abstract [en]

    The term sublanguage from mathematical linguistics confuses interaction researchers and leads them to believe that implementing natural language interfaces is easier than it is. The term register from sociolinguistics is proposed instead.

  • 112.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Sökteknologi och personlig integritet. 2008. In: Tekniken bakom språket, Stockholm: Norstedts akademiska, 2008, 1, p. 20. Chapter in book (Refereed)
    Abstract [en, translated from sv]

    The technology for searching for information on the net is becoming ever more advanced. That is of course good in many ways. But it also means that search engines can find out more and more about us in order to adapt search results and advertisements to our interests. Do we want that? And can we be sure that the data is not also used for other, more dubious purposes? It is a question of who should have power over the information. Jussi Karlgren describes the situation, the technology behind it, and the possibilities we have for protecting our privacy online.

  • 113.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    T som i text. 2007. In: Tidens Tecken, Santérus Förlag, 2007, 1, p. 7. Chapter in book (Refereed)
    Abstract [en, translated from sv]

    Over the past decades, information technology has given us new methods for creating, publishing, and spreading information, both as text and in other forms, more widely and at lower cost than ever before. It is not only the conditions of production and the mode of distribution that are changing: the relative positions of author and readership are changing too. Will this affect how we understand text, what text is, and how we handle information?

  • 114.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Textual Stylistic Variation: Choices, Genres and Individuals. 2010. In: Structure of Style, Springer Verlag, 2010, 8, p. 129-142. Chapter in book (Refereed)
    Abstract [en]

    This chapter argues for more informed target metrics for the statistical processing of stylistic variation in text collections. Much as operationalized relevance proved a useful goal to strive for in information retrieval, research in textual stylistics, whether application oriented or philologically inclined, needs goals formulated in terms of pertinence, relevance, and utility, notions that agree with reader experience of text. Differences readers are aware of are mostly based on utility, not on textual characteristics per se. Mostly, readers report stylistic differences in terms of genres. Genres, while vague and undefined, are well-established and talked about: very early on, readers learn to distinguish genres. This chapter discusses variation given by genre, and contrasts it to variation occasioned by individual choice.

  • 115.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    The CHORUS gap analysis on user-centered methodology for design and evaluation of multi-media information access systems. 2008. Conference paper (Refereed)
    Abstract [en]

    CHORUS is a Coordination Action, a specific type of project funded by the European commission under its research programmes, intended to bring together research projects with common goals, in the field of search technologies for digital audio-visual content, one of the strategic objectives of the current research frame program. CHORUS coordinates a number of research projects in the general area of audio-visual and multi-media information access and management. The most important single contribution of the CHORUS work plan will be to provide a survey of the field and a roadmap with a gap analysis for the realisation of viable audio-visual search engines by European partners. This is done by several means. CHORUS organises Think-Tanks with industrial participation, focussed workshops to treat specific questions, and more general conferences for academic discussions. CHORUS is now in its final phase, and is currently preparing its final report together with a final conference to mark its publication.

  • 116.
    Karlgren, Jussi
    KTH, Superseded Departments, Computer and Systems Sciences, DSV. Stockholms universitet.
    The Interaction of Discourse Modality and User Expectations in Human-Computer Dialog. 1992. Licentiate thesis, monograph (Other academic)
    Abstract [en]

    This study discusses the behavior of people towards natural language interfaces. It draws parallels to the behavior of people towards other people, and discusses how far these parallels can be stretched. A small experimental study of users performing tasks using a natural language interface to a database is presented, and the results related to the discussion.

    The main points made are

    1) that new modalities like the one used in typical human computer interaction - written interactive communication - are problematic for new users, from lack of conventions; and

    2) that users' attitudes towards computers and of the system's linguistic and other competence shape much of the interaction, and that these attitudes change, and that thus the important factor to take into account in system design is not what the initial attitudes are but rather what the process of changing them is and how to utilize the process of change to teach the user the system language and interaction modality.

  • 117.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    The Use Case Perspective for Single Query Information Access. 2011. Conference paper (Refereed)
    Abstract [en]

    The "entertain me!" workshop is intended to discuss information access for a complex task based on a single query. Such scenarios may occur for many reasons; a framework for a systematic discussion of differences and likenesses based on the notion of a use case is proposed.

  • 118.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    The whys and wherefores for studying textual genre computationally. 2005. In: Style and Meaning in Language, Art, Music, and Design: Papers from the AAAI Fall Symposium / [ed] Shlomo Argamon, Shlomo Dubnov, and Julie Jupp, 2005, 1. Conference paper (Refereed)
    Abstract [en]

    This brief paper gives an example of statistical stylistic experimentation and argues for more informed measures of variation and choice and more informed measures of readership analysis to be able to posit dimensions of textual variation usefully.

  • 119.
    Karlgren, Jussi
    RISE, Swedish ICT, SICS.
    Word spaces as input to categorisation of attitude. 2010. Conference paper (Refereed)
    Abstract [en]

    Report of underwhelming experiments at NTCIR-8, MOAT track.

  • 120.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Argamon, Shlomo
    Shanahan, James G.
    Stylistic Analysis of Text for Information Access. 2005. Report (Other academic)
  • 121.
    Karlgren, Jussi
    et al.
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Bohman, Martin
    Ekgren, Ariel
    Isheden, Gabriel
    Kullmann, Emelie
    Nilsson, David
    Semantic Topology. 2014. In: Proceedings of the 23rd ACM international conference on Conference on information & knowledge management (CIKM '14), New York: Association for Computing Machinery (ACM), 2014, p. 1939-1942. Conference paper (Refereed)
    Abstract [en]

    A reasonable requirement (among many others) for a lexical or semantic component in an information system is that it should be able to learn incrementally from the linguistic data it is exposed to, that it can distinguish between the topical impact of various terms, and that it knows if it knows stuff or not.

    We work with a specific representation framework – semantic spaces – which well accommodates the first requirement; in this short paper, we investigate the global qualities of semantic spaces by a topological procedure – mapper – which gives an indication of topical density of the space; we examine the local context of terms of interest in the semantic space using another topologically inspired approach which gives an indication of the neighbourhood of the terms of interest. Our aim is to be able to establish the qualities of the semantic space under consideration without resorting to inspection of the data used to build it.

  • 122.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Bretan, Ivan
    Frost, Niklas
    Jonsson, Lars
    Interaction Models, Reference, and Interactivity for Speech Interfaces to Virtual Environments. 1995. In: Proceedings of 2nd Eurographics Workshop on Virtual Environments --- Realism and Real Time, Monte Carlo, Springer Verlag, 1995, 7. Conference paper (Refereed)
    Abstract [en]

    The enhancement of a virtual reality environment with a speech interface is described. Some areas where the virtual reality environment benefits from the spoken modality are identified as well as some where the interpretation of natural language utterances benefits from being situated in a highly structured environment. The issue of interaction metaphors for this configuration of interface modalities is investigated.

  • 123.
    Karlgren, Jussi
    et al.
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS. Gavagai, Sweden.
    Callin, Jimmy
    Collins-Thompson, Kevyn
    Gyllensten, Amaru Cuba
    Ekgren, Ariel
    Jurgens, David
    Korhonen, Anna
    Olsson, Fredrik
    Sahlgren, Magnus
    Schütze, Hinrich
    Evaluating learning language representations. 2015. Conference paper (Refereed)
    Abstract [en]

    Machine learning offers significant benefits for systems that process and understand natural language: (a) lower maintenance and upkeep costs than when using manually-constructed resources, (b) easier portability to new domains, tasks, or languages, and (c) robust and timely adaptation to situation-specific settings. However, the behaviour of an adaptive system is less predictable than when using an edited, stable resource, which makes quality control a continuous issue. This paper proposes an evaluation benchmark for measuring the quality, coverage, and stability of a natural language system as it learns word meaning. Inspired by existing tests for human vocabulary learning, we outline measures for the quality of semantic word representations, such as when learning word embeddings or other distributed representations. These measures highlight differences between the types of underlying learning processes as systems ingest progressively more data.

  • 124.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Clough, Paul
    Gonzalo, Julio
    Multilingual Interactive Experiments with Flickr. 2006. In: ERCIM News, no 66. Article in journal (Refereed)
    Abstract [en]

    The Cross-Lingual Evaluation Forum (CLEF) in 2006 will feature a track on interactive image retrieval from dynamic target data taken from the popular Flickr photo-sharing service. In the past, interactive tracks at CLEF have addressed applications such as information retrieval and question answering. This year however, the focus has turned to text-based image retrieval from Flickr.

  • 125.
    Karlgren, Jussi
    et al.
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Cutting, Douglass
    Recognizing Text Genres with Simple Metrics Using Discriminant Analysis. 1994. In: Proceedings of the 15th International Conference on Computational Linguistics, 1994, Vol. 2, p. 1071-1075. Conference paper (Refereed)
    Abstract [en]

    A simple method for categorizing texts into pre-determined text genre categories using the statistical standard technique of discriminant analysis is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific for a certain corpus or information stream, and combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.

  • 126.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Cutting, Douglass
    Recognizing Text Genres with Simple Metrics Using Discriminant Analysis. 1994. Conference paper (Refereed)
    Abstract [en]

    A simple method for categorizing texts into pre-determined text genre categories using the statistical standard technique of discriminant analysis is demonstrated with application to the Brown corpus. Discriminant analysis makes it possible to use a large number of parameters that may be specific for a certain corpus or information stream, and combine them into a small number of functions, with the parameters weighted on the basis of how useful they are for discriminating text genres. An application to information retrieval is discussed.

  • 127.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Dalianis, Hercules
    Jongejan, Bart
    Experiments to investigate the connection between case distribution and topical relevance of search terms. 2008. In: Proceedings of the 8th international conference on Language Resources and Evaluation, LREC'08, 2008, 1, p. 5. Conference paper (Refereed)
    Abstract [en]

    We have performed a set of experiments made to investigate the utility of morphological analysis to improve retrieval of documents written in languages with relatively large morphological variation in a practical commercial setting, using the SiteSeeker search system developed and marketed by Euroling AB. The objective of the experiments was to evaluate different lemmatisers and stemmers to determine which would be the most practical for the task at hand: highly interactive, relatively high precision web searches in commercial customer-oriented document collections. This paper gives an overview of some of the results for Finnish and German, and describes specifically one experiment designed to investigate the case distribution of nouns in a highly inflectional language (Finnish) and the topicality of the nouns in target texts. We find that topical nouns taken from queries are distributed differently over relevant and non-relevant documents depending on their grammatical case.

  • 128.
    Karlgren, Jussi
    et al.
    KTH, School of Computer Science and Communication (CSC), Theoretical Computer Science, TCS.
    Ericsson, Linus
    Semantic Space Models for Profiling Reputation of Corporate Entities. 2013. In: CLEF 2013 Evaluation Labs and Workshop: Online Working Notes, CLEF, 2013. Conference paper (Refereed)
    Abstract [en]

    Gavagai used its commercially available system for the filtering and polarity tasks in the evaluation campaign for online reputation management systems at CLEF 2013. The system is built for large scale analysis of streaming text and, as part of the services Gavagai provides, it measures the public attitude vis-à-vis targets of interest. This mechanism, with no adjustment for this specific task, was used for polarisation, and the experiments performed this year were to test a number of settings for testing how an attitude might be learned from the data rather than given by editorial intervention.

  • 129.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Authors, Genre, and Linguistic Convention. 2007. In: Proceedings from the SIGIR Workshop on Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection, 2007, 1, p. 5. Conference paper (Refereed)
  • 130.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Where Attitudinal Expressions Get Their Attitude. 2005. In: Computing Attitude and Affect in Text, Dordrecht: Springer, 2005, 1, Vol. 20. Chapter in book (Refereed)
  • 131.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Franzén, Kristofer
    RISE, Swedish ICT, SICS.
    Clough, Paul
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Mizzaro, Stefano
    Sanderson, Mark
    Reading between the lines: attitudinal expressions in text. 2004. Conference paper (Refereed)
    Abstract [en]

    This paper describes how a proposed project will research the expression of attitude, affect, and sentiment in text in order to automatically identify and extract such expressions. The project starting points are a set of hypotheses:
    + There are syntactic and lexical markers in text such that attitudinal information can be harvested using them;
    + Players, or discourse referents, in text are one such crucial marker for modeling topicality in general and attitudinal information flow in particular;
    + Attitudes in texts are dependent on text type and domain;
    + Attitudinal information can be applied in the development of practical tools for information access, among other application areas;
    + An extended notion of relevance will afford us an empirical evaluation model for our theories and experiments.

  • 132.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Friesek, Madlen
    Gäde, Maria
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Järvelin, Anni
    RISE, Swedish ICT, SICS.
    Lupu, Mihai
    Müller, Henning
    Petras, Vivian
    Stiller, Juliane
    Initial specification of the evaluation tasks "Use cases to bridge validation and benchmarking" PROMISE Deliverable 2.1. 2011. Other (Other academic)
    Abstract [en]

    Evaluation of multimedia and multilingual information access systems needs to be performed from a usage oriented perspective. This document outlines use cases from the three use case domains of the PROMISE project and gives some initial pointers to how their respective characteristics can be extrapolated to determine and guide evaluation activities, both with respect to benchmarking and to validation of the usage hypotheses. The use cases will be developed further during the course of the evaluation activities and workshops projected to occur in coming CLEF conferences.

  • 133.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Täckström, Oscar
    RISE, Swedish ICT, SICS.
    SICS at NTCIR-7 MOAT: constructions represented in parallel with lexical items. 2008. In: Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies, 2008, 1, p. 4, p. 237-240. Conference paper (Refereed)
    Abstract [en]

    This paper describes experiments to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also on structural features of the text as represented by presence of function words (in other approaches often removed by stop lists) and by presence of constructional features (typically disregarded by most other analyses). In our analysis, following a constructional grammatical framework, structural features are treated similarly to vocabulary features. Our result gives us reason to conclude - provisionally, until more empirical verification experiments can be performed - that:
    * Linguistic structural information does help in establishing whether a sentence is opinionated or not; whereas
    * Linguistic information of this specific type does not help in distinguishing sentences of differing polarity.

  • 134.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Eriksson, Gunnar
    RISE, Swedish ICT, SICS.
    Täckström, Oscar
    RISE, Swedish ICT, SICS.
    Sahlgren, Magnus
    RISE - Research Institutes of Sweden, ICT, SICS.
    Between Bags and Trees - Constructional Patterns in Text Used for Attitude Identification. 2010. Conference paper (Refereed)
    Abstract [en]

    This paper describes experiments to use non-terminological information to find attitudinal expressions in written English text. The experiments are based on an analysis of text with respect to not only the vocabulary of content terms present in it (which most other approaches use as a basis for analysis) but also with respect to presence of structural features of the text represented by constructional features (typically disregarded by most other analyses). In our analysis, following a construction grammar framework, structural features are treated as occurrences, similarly to the treatment of vocabulary features. The constructional features in play are chosen to potentially signify opinion but are not specific to negative or positive expressions. The framework is used to classify clauses, headlines, and sentences from three different shared collections of attitudinal data. We find that constructional features transfer well across different text collections and that the information couched in them integrates easily with a vocabulary based approach, yielding improvements in classification without complicating the application end of the processing framework.

  • 135.
    Karlgren, Jussi
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Theoretical Computer Science, TCS. Gavagai, Stockholm, Sweden.
    Esposito, L.
    Gratton, C.
    Kanerva, P.
    Authorship profiling without using topical information: Notebook for PAN at CLEF 2018. 2018. In: CLEF 2018 Working Notes, CEUR-WS, 2018, Vol. 2125. Conference paper (Refereed)
    Abstract [en]

    This paper describes an experiment made for the PAN 2018 shared task on author profiling. The task is to distinguish female from male authors of microblog posts published on Twitter using no extraneous information except what is in the posts; this experiment focusses on using non-topical information from the posts, rather than gender differences in referential content.

  • 136.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Fahlén, Lennart
    RISE, Swedish ICT, SICS.
    Wallberg, Anders
    RISE - Research Institutes of Sweden, ICT, SICS.
    Hansson, Pär
    RISE, Swedish ICT, SICS, Software and Systems Engineering Laboratory.
    Ståhl, Olov
    RISE, Swedish ICT, SICS.
    Söderberg, Jonas
    RISE, Swedish ICT, SICS, Software and Systems Engineering Laboratory.
    Åkesson, Karl-Petter
    RISE, Swedish ICT, SICS.
    Socially intelligent interfaces for increased energy awareness in the home2008In: The Internet of Things, Springer , 2008, 2, , p. 13p. 263-275Chapter in book (Refereed)
    Abstract [en]

    This paper describes how home appliances might be enhanced to improve user awareness of energy usage. Households wish to lead comfortable and manageable lives. Balancing this reasonable desire with the environmental and political goal of reducing electricity usage is a challenge that we claim is best met through the design of interfaces that allow users better control of their usage and unobtrusively inform them of the actions of their peers. A set of design principles along these lines is formulated in this paper. We have built a fully functional prototype home appliance with a socially aware interface to signal the aggregate usage of the user's peer group according to these principles, and present the prototype in the paper.

  • 137.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Franzén, Djuna
    Johnson, Åsa
    Studying collaboration and annotation as factors in achieving trust in electronic documents2009Report (Other academic)
    Abstract [sv]

    This report presents two master student graduation studies on trust and collaboration when searching for information on the Internet. The studies were done as part of the SLIM project with Swedish Law and Informatics Research Institute, Faculty of Law, Stockholm University, financed by The Bank of Sweden Tercentenary Foundation (Stiftelsen Riksbankens Jubileumsfond).

  • 138.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Gambäck, Björn
    RISE, Swedish ICT, SICS.
    Kanerva, Pentti
    RISE, Swedish ICT, SICS.
    Acquiring (and Using) Linguistic (and World) Knowledge for Information Access2002 (ed. 1)Book (Refereed)
    Abstract [en]

    Information access tasks need flexible text understanding. While full text understanding remains a distant and possibly unattainable goal, to deliver better information access performance we must advance content analysis beyond the simple algorithms used today, and the dynamic nature of both information needs and information sources will make a flexible model or set of models a necessity. Models must either be adaptive or easily adapted by some form of low-cost intervention; and they must support incremental knowledge build-up. The first requirement involves acquisition of information from unstructured data; the second involves defining an inspectable and transparent model and developing an understanding of knowledge-intensive interaction. Text understanding needs a theory. Knowledge modeling, semantics, or ontology construction are areas marked by the absence of significant consensus either in points of theory or scope of application. Even the terminology and success criteria of the somewhat overlapping fields are fragmented. Some approaches to content modeling lay claim to psychological realism, others to inspectability; some are portable, others transparent; some are robust, others logically sound; some efficient, others scalable. Information access tasks give focus to modeling. It is too much to hope for a set of standards to emerge from the intellectually fairly volatile and fragmented area of semantics or cognitive modeling. But in our application areas, namely those in the general field of information access, external success criteria are better established. Compromise from theoretical underpinnings in the name of performance.

  • 139.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Gambäck, Björn
    RISE, Swedish ICT, SICS.
    Kanerva, Pentti
    RISE, Swedish ICT, SICS.
    Notes from AAAI Spring Symposium on Acquiring (and Using) Linguistic (and World) Knowledge for Information Access2002Collection (editor) (Refereed)
  • 140.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Gambäck, Björn
    RISE, Swedish ICT, SICS.
    Rayner, Manny
    RISE, Swedish ICT, SICS.
    Samuelsson, Christer
    RISE, Swedish ICT, SICS.
    Spoken Language Translator: First-Year Report1994Report (Refereed)
    Abstract [en]

    This document is the first-year report for a project whose long-term goal is the construction of a practically useful system capable of translating continuous spoken language within a restricted domain. The main deliverable resulting from the first year is a prototype, the Spoken Language Translator (SLT), which can translate queries from spoken English to spoken Swedish in the domain of air travel planning. The system was developed by SRI International, the Swedish Institute of Computer Science, and Telia Research AB. Most of it is constructed from previously existing pieces of software, which have been adapted for use in the speech translation task with as few changes as possible. The main components are connected together in a pipelined sequence as follows. The input signal is processed by SRI's DECIPHER(TM), a speaker-independent continuous speech recognition system. It produces a set of speech hypotheses which is passed to the English-language processor, the SRI Core Language Engine (CLE), a general natural-language processing system. The CLE grammar associates each speech hypothesis with a set of possible logical-form-like representations, typically producing 5 to 50 logical forms per hypothesis. A preference component is then used to give each of them a numerical score reflecting its linguistic plausibility. When the preference component has made its choice, the highest-scoring logical form is passed to the transfer component, which uses a set of simple non-deterministic recursive pattern-matching rules to rewrite it into a set of possible corresponding Swedish representations. The preference component is now invoked again, to select the most plausible transferred logical form. The result is fed to a second copy of the CLE, which uses a Swedish-language grammar and lexicon developed at SICS to convert the form into a Swedish string and an associated syntax tree. Finally, the string and tree are passed to the Telia Prophon speech synthesizer, which utilizes polyphone synthesis to produce the spoken Swedish utterance. The system's current performance figures, measured on previously unseen test data, are as follows. For sentences of length 12 words and under, 65% of all utterances are such that the top-scoring speech hypothesis is an acceptable one. If the speech hypothesis is correct, then a translation is produced in 80% of the cases; and 90% of all translations produced are acceptable. Nearly all incorrect translations are incorrect due to their containing errors in grammar or naturalness of expression, with errors due to divergence in meaning between the source and target sentences accounting for less than 1% of all translations. Making fairly conservative extrapolations from the current SLT prototype, we believe that simply continuing the basic development strategy could within three to five years produce an enhanced version, which recognized about 90% of the short sentences (12 words or less) in a specific domain, and produced acceptable translations for about 95-97% of the sentences correctly recognized. Since the greater part of the system's knowledge would reside in domain-independent grammars and lexicons, it would be possible to port it to new domains with a fairly modest expenditure of effort.

  • 141.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Gambäck, Björn
    RISE, Swedish ICT, SICS.
    Samuelsson, Christer
    RISE, Swedish ICT, SICS.
    Clustering sentences1993Conference paper (Refereed)
    Abstract [en]

    The paper describes an experiment on a set of translated sentences obtained from a large group of informants. We discuss the question of transfer equivalence, noting that several target-language translations of a given source-language sentence will be more or less equivalent. Different equivalence classes should form clusters in the set of translated sentences. The main topic of the paper is to examine how these clusters can be found: we consider, and discard as inappropriate, several different methods of examining the sentence set, including traditional syntactic analysis, finding the most likely translation with statistical methods, and simple string distance measures.

  • 142.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Gonzalo, Julio
    Interactive Image Retrieval2010In: ImageCLEF --- Experimental Evaluation in Visual Information Retrieval, Springer , 2010, 16, p. 117-138Chapter in book (Refereed)
    Abstract [en]

    Information retrieval access research is based on evaluation as the main vehicle of research: benchmarking procedures are regularly pursued by all contributors to the field. But benchmarking is only one half of evaluation: to validate the results the evaluation must include the study of user behaviour while performing tasks for which the system under consideration is intended. Designing and performing such studies systematically on research systems is a challenge, breaking the mould on how benchmarking evaluation can be performed and how results can be perceived. This is the research question of interactive information retrieval. The question of evaluation has also come to the fore through application moving from exclusively treating topic-oriented text to including other media, most notably images. This development challenges many of the underlying assumptions of topical text retrieval, and requires new evaluation frameworks, not unrelated to the questions raised by interactive study. This chapter describes how the interactive track of the Cross-Language Evaluation Forum (iCLEF) has addressed some of those theoretical and practical challenges.

  • 143.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Continued experiments on cross-language relevance assessment2003In: Comparative Evaluation of Multilingual Information Access Systems: 4th CLEF workshop: Revised Selected Papers, 2003, 1Conference paper (Refereed)
    Abstract [en]

    An experiment on how users assess document usefulness for an information access task in their native language (Swedish) versus a language they have near-native competence in (English). Results show that relevance assessment in a foreign language takes more time and is prone to errors compared to assessment in the reader’s first language.

  • 144.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    Cross-language relevance assessment2003In: Advances in Cross-Language Information Retrieval, Third Workshop of the Cross-Language Evaluation Forum (CLEF), 2003, 1Conference paper (Refereed)
    Abstract [en]

    An experiment on how users assess relevance in a foreign language they know well is reported. Results show that relevance assessment in a foreign language takes more time and is prone to errors compared to assessment in the reader’s first language. The results are related to task and context and an enhanced methodology for performing context-sensitive studies is reported.

  • 145.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Hansen, Preben
    RISE, Swedish ICT, SICS.
    SICS at iCLEF 2002: cross-language relevance assessment and task context2003In: Advances in Cross-Language Information Retrieval: Third Workshop of the Cross-Language Evaluation Forum, CLEF 2002: Revised Papers, 2003, 1, , p. 9Conference paper (Refereed)
    Abstract [en]

    An experiment on how users assess relevance in a foreign language they know well is reported. Results show that relevance assessment in a foreign language takes more time and is prone to errors compared to assessment in the reader’s first language. The results are related to task and context and an enhanced methodology for performing context-sensitive studies is reported.

  • 146.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Holst, Anders
    RISE, Swedish ICT, SICS, Decisions, Networks and Analytics lab.
    Sahlgren, Magnus
    RISE - Research Institutes of Sweden, ICT, SICS.
    Filaments of Meaning in Word Space2008Conference paper (Refereed)
    Abstract [en]

    Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character than the global space and that this structure shows potential to be exploited for further semantic analysis using methods for local analysis of vector space structure rather than globally scoped methods typically in use today such as singular value decomposition or principal component analysis.

  • 147.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Höök, Kristina
    RISE, Swedish ICT, SICS.
    Lantz, Ann
    Palme, Jacob
    Pargman, Daniel
    The glass box user model for filtering1994Report (Other academic)
    Abstract [en]

    The first requirement on an interactive system in a domain such as information filtering is to be an interface to knowledge, rather than just a knowledgeable interface. We borrow the computation instruction metaphor of a system as "a black box in a glass box" as a means to conceptualize the problem of giving a user control over the actions of an interactive system. The application domain we work in is that of information filtering. In the "black box", we hide complex knowledge of the domain objects such as facts and assumptions about text genre identification, while the "glass box", which is what the user sees, only shows the neat top level knowledge of the domain conceptual categories such as e.g. categorization rules.

  • 148.
    Karlgren, Jussi
    et al.
    Höök, Kristina
    Lantz, Ann
    KTH, Superseded Departments, Numerical Analysis and Computer Science, NADA.
    Palme, Jacob
    Pargman, Daniel
    The glass box user model for filtering1994In: / [ed] A. Kobsa and D. Litman, 1994Conference paper (Refereed)
    Abstract [en]

    The first requirement on an interactive system in a domain such as information filtering is to be an interface to knowledge, rather than just a knowledgeable interface. We borrow the computation instruction metaphor of a system as "a black box in a glass box" as a means to conceptualize the problem of giving a user control over the actions of an interactive system. The application domain we work in is that of information filtering. In the "black box", we hide complex knowledge of the domain objects such as facts and assumptions about text genre identification, while the "glass box", which is what the user sees, only shows the neat top level knowledge of the domain conceptual categories such as e.g. categorization rules.

  • 149.
    Karlgren, Jussi
    et al.
    RISE, Swedish ICT, SICS.
    Järvinen, Timo
    Foreground and background text in retrieval2002Conference paper (Refereed)
    Abstract [en]

    Our hypothesis is that certain clauses have foreground functions in text, while other clauses have background functions and that these functions are expressed or reflected in the syntactic structure of the clause. Presumably these clauses will have differing utility for automatic approaches to text understanding; a summarization system might want to utilize background clauses to capture commonalities between numbers of documents while an indexing system might use foreground clauses in order to capture specific characteristics of a certain document.

  • 150.
    Karlgren, Jussi
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Theoretical Computer Science, TCS.
    Kanerva, P.
    High-dimensional distributed semantic spaces for utterances2019In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 25, no 4, p. 503-517Article in journal (Refereed)
    Abstract [en]

    High-dimensional distributed semantic spaces have proven useful and effective for aggregating and processing visual, auditory and lexical information for many tasks related to human-generated data. Human language makes use of a large and varying number of features, lexical and constructional items as well as contextual and discourse-specific data of various types, which all interact to represent various aspects of communicative information. Some of these features are mostly local and useful for the organisation of, for example, argument structure of a predication; others are persistent over the course of a discourse and necessary for achieving a reasonable level of understanding of the content. This paper describes a model for high-dimensional representation for utterance and text-level data including features such as constructions or contextual data, based on a mathematically principled and behaviourally plausible approach to representing linguistic information. The implementation of the representation is a straightforward extension of Random Indexing models previously used for lexical linguistic items. The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality, which is computationally habitable, and which is suitable as a bridge between symbolic representations such as dependency analysis and continuous representations used, for example, in classifiers or further machine-learning approaches. This is achieved with operations on vectors that constitute a powerful computational algebra, accompanied by an associative memory for the vectors. The paper provides a technical overview of the framework and a worked-through implemented example of how it can be applied to various types of linguistic features.
