Digitala Vetenskapliga Arkivet

  • 1.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics. University of Tartu, Estonia.
    Gilmartin, Emer
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Lippus, Pärtel
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Breath holds in chat and chunk phases of multiparty casual conversation. 2020. In: Proceedings of Speech Prosody 2020, 2020, p. 779-783. Conference paper (Refereed)
  • 2.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Lippus, Pärtel
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Creak in the respiratory cycle. 2018. In: Proceedings of Interspeech 2018 / [ed] B. Yegnanarayana, The International Speech Communication Association (ISCA), 2018, p. 1408-1412. Conference paper (Refereed)
    Abstract [en]

    Creakiness is a well-known turn-taking cue and has been observed to systematically accompany phrase and turn ends in several languages. In Estonian, creaky voice is frequently used by all speakers without any obvious evidence for its systematic use as a turn-taking cue. Rather, it signals a lack of prominence and is favored by lengthening and later timing in phrases. In this paper, we analyze the occurrence of creak with respect to properties of the respiratory cycle. We show that creak is more likely to accompany longer exhalations. Furthermore, the results suggest there is little difference in lung volume values regardless of the presence of creak, indicating that creaky voice might be employed to preserve air over the course of longer utterances. We discuss the results in connection to processes of speech planning in spontaneous speech.

  • 3.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Backchannels and breathing. 2014. In: Proceedings from FONETIK 2014: Stockholm, June 9-11, 2014 / [ed] Mattias Heldner, Stockholm: Department of Linguistics, Stockholm University, 2014, p. 47-52. Conference paper (Other academic)
    Abstract [en]

    The present study investigated the timing of backchannel onsets within speaker’s own and dialogue partner’s breathing cycle in two spontaneous conversations in Estonian. Results indicate that backchannels are mainly produced near the beginning, but also in the second half of the speaker’s exhalation phase. A similar tendency was observed in short non-backchannel utterances, indicating that timing of backchannels might be determined by their duration rather than their pragmatic function. By contrast, longer non-backchannel utterances were initiated almost exclusively right at the beginning of the exhalation. As expected, backchannels in the conversation partner’s breathing cycle occurred predominantly towards the end of the exhalation or at the beginning of the inhalation. 

  • 4.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics. University of Tartu, Estonia.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Breath holds in spontaneous speech. 2019. In: Eesti ja soome-ugri keeleteaduse ajakiri, ISSN 1736-8987, E-ISSN 2228-1339, Vol. 10, no 1, p. 13-34. Article in journal (Refereed)
    Abstract [en]

    This article provides a first quantitative overview of the timing and volume-related properties of breath holds in spontaneous conversations. Firstly, we investigate breath holds based on their position within the coinciding respiratory interval amplitude. Secondly, we investigate breath holds based on their timing within the respiratory intervals and in relation to communicative activity following breath holds. We hypothesise that breath holds occur in different regions of the lung capacity range and at different times during the respiratory phase, depending on the conversational and physiological activity following breath holds. The results suggest there is not only considerable variation in both the time and lung capacity scales, but detectable differences are also present in breath holding characteristics involving laughter and speech preparation, while breath holds coinciding with swallowing are difficult to separate from the rest of the data based on temporal and volume information alone.

  • 5.
    Aare, Kätlin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Inhalation amplitude and turn-taking in spontaneous Estonian conversations. 2015. In: Proceedings from Fonetik 2015 Lund, June 8-10, 2015 / [ed] Malin Svensson Lundmark, Gilbert Ambrazaitis, Joost van de Weijer, Lund: Lund University, 2015, p. 1-5. Conference paper (Other academic)
    Abstract [en]

    This study explores the relationship between inhalation amplitude and turn management in four spontaneous multiparty conversations in Estonian, each approximately 20 minutes long. The main focus of interest is whether inhalation amplitude is greater before turn onset than in the following inhalations within the same speaking turn. The results show that inhalations directly before turn onset are greater in amplitude than those later in the turn. The difference seems to be realized by ending the inhalation at a greater lung volume value, whereas the initial lung volume before inhalation onset remains roughly the same across a single turn. The findings suggest that the increased inhalation amplitude could function as a cue for claiming the conversational floor.

  • 6. Bruggeman, Anna
    et al.
    Schade, Leoni
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Beware of the individual: Evaluating prominence perception in spontaneous speech. 2022. In: Proceedings of Speech Prosody 2022, 2022. Conference paper (Refereed)
    Abstract [en]

    Much of the existing research on prominence perception has focused on read speech in American English and German. The present paper presents two experiments that build on and extend insights from these studies in two ways. Firstly, we elicit prominence judgments on spontaneous speech. Secondly, we investigate gradient rather than binary prominence judgments by introducing a finger tapping task. We additionally provide a within-participant comparison of gradient prominence results with binary prominence judgments to evaluate their correspondence. Our results show that participants exhibit different success rates in tapping the prominence pattern of spontaneous data, but generally tapping results correlate well with binary prominence judgments within individuals. Random forest analysis of the acoustic parameters involved shows that pitch accentuation and duration play important roles in both binary judgments and prominence tapping patterns. We can also confirm earlier findings from read speech that differences exist between participants in the relative importance rankings of various signal and systematic properties.

  • 7.
    Buschmeier, Hendrik
    et al.
    Bielefeld University, Germany.
    Malisz, Zofia
    Bielefeld University, Germany.
    Skubisz, Joanna
    Bielefeld University, Germany.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics. Bielefeld University, Germany.
    Wachsmuth, Ipke
    Bielefeld University, Germany.
    Kopp, Stefan
    Bielefeld University, Germany.
    Wagner, Petra
    Bielefeld University, Germany.
    ALICO: A multimodal corpus for the study of active listening. 2014. In: Proceedings of LREC 2014, 2014, p. 3638-3643. Conference paper (Refereed)
    Abstract [en]

    The Active Listening Corpus (ALICO) is a multimodal database of spontaneous dyadic conversations with diverse speech and gestural annotations of both dialogue partners. The annotations consist of short feedback expression transcription with corresponding communicative function interpretation as well as segmentation of interpausal units, words, rhythmic prominence intervals and vowel-to-vowel intervals. Additionally, ALICO contains head gesture annotation of both interlocutors. The corpus contributes to research on spontaneous human–human interaction, on functional relations between modalities, and timing variability in dialogue. It also provides data that differentiates between distracted and attentive listeners. We describe the main characteristics of the corpus and present the most important results obtained from analyses in recent years.

  • 8.
    Cortes, Elisabet Eir
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Šimko, Juraj
    Articulatory Consequences of Vocal Effort Elicitation Method. 2018. In: Proceedings of Interspeech 2018 / [ed] B. Yegnanarayana, The International Speech Communication Association (ISCA), 2018, p. 1521-1525. Conference paper (Refereed)
    Abstract [en]

    Articulatory features from two datasets, Slovak and Swedish, were compared to see whether different methods of eliciting loud speech (ambient noise vs. visually presented loudness target) result in different articulatory behavior. The features studied were temporal and kinematic characteristics of lip separation within the closing and opening gestures of bilabial consonants, and of the tongue body movement from /i/ to /a/ through a bilabial consonant. The results indicate larger hyper-articulation in the speech elicited with visually presented target. While individual articulatory strategies are evident, the speaker groups agree on increasing the kinematic features consistently within each gesture in response to the increased vocal effort. Another concerted strategy is keeping the tongue response considerably smaller than that of the lips, presumably to preserve acoustic prerequisites necessary for the adequate vowel identity. While the method of visually presented loudness target elicits larger span of vocal effort, the two elicitation methods achieve comparable consistency per loudness conditions.

  • 9.
    Edlund, Jens
    et al.
    KTH Speech, Music and Hearing.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Catching wind of multiparty conversation. 2014. In: Proceedings of Multimodal Corpora: Combining applied and basic research targets (MMC 2014) / [ed] Jens Edlund, Dirk Heylen, Patrizia Paggio, Reykjavik, Iceland: European Language Resources Association, 2014, p. 35-36. Chapter in book (Other academic)
    Abstract [en]

    The paper describes the design of a novel multimodal corpus of spontaneous multiparty conversations in Swedish. The corpus is collected with the primary goal of investigating the role of breathing and its perceptual cues for interactive control of interaction. Physiological correlates of breathing are captured by means of respiratory belts, which measure changes in cross-sectional area of the rib cage and the abdomen. Additionally, auditory and visual correlates of breathing are recorded in parallel to the actual conversations. The corpus allows studying respiratory mechanisms underlying organisation of spontaneous conversation, especially in connection with turn management. As such, it is a valuable resource both for fundamental research and speech technology applications.

  • 10.
    Edlund, Jens
    et al.
    KTH Speech, Music and Hearing.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Is breathing prosody? 2014. In: International Symposium on Prosody to Commemorate Gösta Bruce, Lund: Lund University, 2014. Conference paper (Other academic)
    Abstract [en]

    Even though we may not be aware of it, much breathing in face-to-face conversation is both clearly audible and visible. Consequently, it has been suggested that respiratory activity is used in the joint coordination of conversational flow. For instance, it has been claimed that inhalation is an interactionally salient cue to speech initiation, that exhalation is a turn-yielding device, and that breath holding is a marker of turn incompleteness (e.g. Local & Kelly, 1986; Schegloff, 1996). So far, however, few studies have addressed the interactional aspects of breathing (one notable exception is McFarland, 2001). In this poster, we will describe our ongoing efforts to fill this gap. We will present the design of a novel corpus of respiratory activity in spontaneous multiparty face-to-face conversations in Swedish. The corpus will contain physiological measurements relevant to breathing, high-quality audio, and video. Minimally, the corpus will be annotated with interactional events derived from voice activity detection and (semi-)automatically detected inhalation and exhalation events in the respiratory data. We will also present initial analyses of the material collected. The question is whether breathing is prosody and thus relevant to this symposium. What we do know is that the turn-taking phenomena that are of particular interest to us are closely (almost by definition) related to several prosodic phenomena, and in particular to those associated with prosodic phrasing, grouping and boundaries. Thus, we will learn more about respiratory activity in phrasing (and the like) through analyses of breathing in conversation.

    References
    Local, John K., & Kelly, John. (1986). Projection and 'silences': Notes on phonetic and conversational structure. Human Studies, 9, 185-204.
    McFarland, David H. (2001). Respiratory markers of conversational interaction. Journal of Speech, Language, and Hearing Research, 44, 128-143.
    Schegloff, E. A. (1996). Turn organization: One intersection of grammar and interaction. In E. Ochs, E. A. Schegloff & S. A. Thompson (Eds.), Interaction and Grammar (pp. 52-133), Cambridge: Cambridge University Press.

  • 11.
    F. Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    When a Dog is a Cat and How it Changes Your Pupil Size: Pupil Dilation in Response to Information Mismatch. 2017. In: Proceedings of Interspeech 2017 / [ed] Francisco Lacerda, David House, Mattias Heldner, Joakim Gustafson, Sofia Strömbergsson, Marcin Włodarczak, 2017, p. 674-678. Conference paper (Refereed)
    Abstract [en]

    In the present study, we investigate pupil dilation as a measure of lexical retrieval. We captured pupil size changes in reaction to a match or a mismatch between a picture and an auditorily presented word in 120 trials presented to ten native speakers of Swedish. In each trial a picture was displayed for six seconds, and 2.5 seconds into the trial the word was played through loudspeakers. The picture and the word were matching in half of the trials, and all stimuli were common high-frequency monosyllabic Swedish words. The difference in pupil diameter trajectories across the two conditions was analyzed with Functional Data Analysis. In line with the expectations, the results indicate greater dilation in the mismatch condition starting from around 800 ms after the stimulus onset. Given that similar processes were observed in brain imaging studies, pupil dilation measurements seem to provide an appropriate tool to reveal lexical retrieval. The results suggest that pupillometry could be a viable alternative to existing methods in the field of speech and language processing, for instance across different ages and clinical groups.

  • 12.
    Forssén Renner, Lena
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The surprised pupil: New perspectives in semantic processing research. 2016. In: ISSBD 2016, 2016. Conference paper (Refereed)
    Abstract [en]

    In the research on semantic processing and brain activity, the N400-paradigm has been long known to reflect a reaction to unexpected events, for instance the incongruence between visual and verbal information when subjects are presented with a picture and a mismatching word. In the present study, we investigate whether an N400-like reaction to unexpected events can be captured with pupillometry. While earlier research has firmly established a connection between changes in pupil diameter and arousal, the findings have not been so far extended to the domain of semantic processing. Consequently, we measured pupil size change in reaction to a match or a mismatch between a picture and an auditorily presented word. We presented 120 trials to ten native speakers of Swedish. In each trial a picture was displayed for six seconds, and 2.5 seconds into the trial the word was played through loudspeakers. The picture and the word were matching in half of the trials, and all stimuli were common high-frequency monosyllabic Swedish words. For the analysis, the baseline pupil size at the sound playback onset was compared against the maximum pupil size in the following time window of 3.5 seconds. The results show a statistically significant difference (t(746)=-2.8, p < 0.01) between the conditions. In line with the hypothesis, the pupil was observed to dilate more in the incongruent condition (on average by 0.03 mm). While the results are preliminary, they suggest that pupillometry could be a viable alternative to existing methods in the field of language processing, for instance across different ages and clinical groups. In the future, we intend to validate the results on a larger sample of participants as well as expand the analysis with a functional analysis accounting for temporal changes in the data. This will allow locating temporal regions of greatest differences between the conditions.

  • 13. Gilmartin, Emer
    et al.
    Aare, Kätlin
    Stockholm University, Faculty of Humanities, Department of Linguistics. University of Tartu, Estonia.
    O'Reilly, Maria
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Between and within speaker transitions in multiparty conversation. 2020. In: Proceedings of Speech Prosody, 2020, p. 799-803. Conference paper (Refereed)
    Abstract [en]

    Casual conversation proceeds as a series of contributions from participants, either speaking in the clear or in overlap. The pattern of who is speaking or not (the conversational floor state) changes constantly throughout a conversation. We examine the nature and frequency of these state changes or transitions in multiparty talk, which may involve more complicated floor state transitions than dyadic interactions. We contrast within and between speaker transitions, analyzing the evolution of the conversational floor state from a stretch of single party speech in the clear to the next stretch of single party speech in the clear by the original or a different speaker. We investigate the effect of applying a minimum duration of single party speech in the clear to the incoming speaker’s production, finding substantial differences in how transitions are categorized. Over 40% of the transitions categorized as between or within speaker change category depending on whether a minimum duration is applied to the following stretch of single party speech.

  • 14.
    Gilmartin, Emer
    et al.
    ADAPT Centre, Trinity College Dublin, Ireland.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Getting from A to B: Complexities of turn change and retention in conversation. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 3457-3461. Conference paper (Refereed)
  • 15. Gilmartin, Emer
    et al.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Getting from A to B: Exploring floor state transitions in conversation. 2021. In: Proceedings of SemDial 2021, 2021. Conference paper (Other academic)
  • 16. Hammarsten, Jonna
    et al.
    Harris, Roxanne
    Henriksson, Nilla
    Pano, Isabelle
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Temporal aspects of breathing and turn-taking in Swedish multiparty conversations. 2015. In: Proceedings from Fonetik 2015 / [ed] Malin Svensson Lundmark, Gilbert Ambrazaitis, Joost van de Weijer, Lund: Centre for Languages and Literature, 2015, p. 47-50. Conference paper (Other academic)
    Abstract [en]

    Interlocutors use various signals to make conversations flow smoothly. Recent research has shown that respiration is one of the signals used to indicate the intention to start speaking. In this study, we investigate whether inhalation duration and speech onset delay within one’s own turn differ from when a new turn is initiated. Respiratory activity was recorded in two three-party conversations using Respiratory Inductance Plethysmography. Inhalations were categorised depending on whether they coincided with within-speaker silences or with between-speaker silences. Results showed that within-turn inhalation durations were shorter than inhalations preceding new turns. Similarly, speech onset delays were shorter within turns than before new turns. Both these results suggest that speakers ‘speed up’ preparation for speech inside turns, probably to indicate that they intend to continue.

  • 17.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Carlsson, Denise
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Does lung volume size affect respiratory rate and utterance duration? 2019. In: Proceedings from Fonetik 2019, 2019, p. 97-102. Conference paper (Other academic)
    Abstract [en]

    This study explored whether lung volume size affects respiratory rate and utterance duration. The lung capacity of four women and four men was estimated with a digital spirometer. These subjects subsequently read a nonsense text aloud while their respiratory movements were registered with a Respiratory Inductance Plethysmography (RIP) system. Utterance durations were measured from the speech recordings, and respiratory cycle durations and respiratory rates were measured from the RIP recordings. This experiment did not show any relationship between lung volume size and respiratory rate or utterance duration.

  • 18.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Riad, Tomas
    Stockholm University, Faculty of Humanities, Department of Swedish Language and Multilingualism, Scandinavian Languages. Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Sundberg, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Zora, Hatice
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pride and prominence. 2021. In: Working Papers in Linguistics: Proceedings of Fonetik 2021, 2021, p. 1-6. Conference paper (Other academic)
    Abstract [en]

    Given the importance of the entire voice source in prominence expression, this paper aims to explore whether the word accent distinction can be defined by the voice quality dynamics moving beyond the tonal movements. To this end, a list of word accent pairs in Central Swedish was recorded and analysed based on a set of acoustic features extracted from the accelerometer signal. The results indicate that the tonal movements are indeed accompanied by the voice quality dynamics such as intensity, periodicity, harmonic richness and spectral tilt, and suggest that these parameters might contribute to the perception of one vs. two peaks associated with the word accent distinction in this regional variant of Swedish. These results, although based on limited data, are of crucial importance for the designation of voice quality variation as a prosodic feature per se.

  • 19.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wagner, Petra
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Deep throat as a source of information. 2018. In: Proceedings Fonetik 2018 / [ed] Åsa Abelin, Yasuko Nagano-Madsen, Gothenburg: University of Gothenburg, 2018, p. 33-38. Conference paper (Other academic)
    Abstract [en]

    In this pilot study we explore the signal from an accelerometer placed on the tracheal wall (below the glottis) for obtaining robust voice quality estimates. We investigate smoothed cepstral peak prominence, H1-H2 and alpha ratio for distinguishing between breathy, modal and pressed phonation across six (sustained) vowel qualities produced by four speakers and including a systematic variation of pitch. We show that throat signal spectra are unaffected by vocal tract resonances, F0 and speaker variation while retaining sensitivity to voice quality dynamics. We conclude that the throat signal is a promising tool for studying communicative functions of voice prosody in speech communication.

  • 20.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Beňuš, Štefan
    Gravano, Agustín
    Voice Quality as a Turn-Taking Cue. 2019. In: Proceedings of Interspeech 2019 / [ed] Gernot Kubin, Zdravko Kačič, The International Speech Communication Association (ISCA), 2019, p. 4165-4169. Conference paper (Refereed)
    Abstract [en]

    This work revisits the idea that voice quality dynamics (VQ) contributes to conveying pragmatic distinctions, with two case studies to further test this idea. First, we explore VQ as a turn-taking cue, and then as a cue for distinguishing between different functions of affirmative cue words. We employ acoustic VQ measures claimed to be better suited for continuous speech than those in our own previous work. Both cases indicate that the degree of periodicity (as measured by CPPS) is indeed relevant in the production of the different pragmatic functions. In particular, turn-yielding is characterized by lower periodicity, sometimes accompanied by presence of creaky voice. Periodicity also distinguishes between backchannels, agreements and acknowledgements.

  • 21.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Is breathing silence? 2016. In: Proceedings of Fonetik 2016 / [ed] Jens Edlund, Stockholm: KTH Royal Institute of Technology, 2016, p. 35-38. Conference paper (Other academic)
    Abstract [en]

    This paper investigates whether inhalation noises are treated as silences in speech communication. A perception experiment revealed differences in pause detection thresholds for breathing pauses and silent pauses. This in turn indicates that breathing pauses are treated differently by the perceptual system, and could potentially carry a communicative function. 

  • 22.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Pitch Slope and End Point as Turn-Taking Cues in Swedish. 2015. In: Proceedings of the 18th International Congress of Phonetic Sciences / [ed] Maria Wolters, Judy Livingstone, Bernie Beattie, Rachel Smith, Mike MacMahon, Jane Stuart-Smith, Jim Scobbie, Glasgow: University of Glasgow, 2015. Conference paper (Refereed)
    Abstract [en]

    This paper examines the relevance of parameters related to slope and end-point of pitch segments for indicating turn-taking intentions in Swedish. Perceptually motivated stylization in Prosogram was used to characterize the last pitch segment in talkspurts involved in floor-keeping and turn-yielding events. The results suggest a limited contribution of pitch pattern direction and position of its endpoint in the speaker’s pitch range to signaling turn-taking intentions in Swedish.

    Download full text (pdf)
    Pitch slope and End Point as Turn-Taking Cues in Swedish
  • 23.
    Heldner, Mattias
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Branderud, Peter
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Stark, Johan
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    The RespTrack system. 2019. In: 1st International Seminar on the Foundations of Speech: BREATHING, PAUSING, AND THE VOICE, 1st–3rd December 2019 in Sønderborg, Denmark: Conference Proceedings, 2019, p. 16-18. Conference paper (Refereed)
    Abstract [en]

    This paper describes the RespTrack system for measuring and real-time monitoring of respiratory movements. RespTrack was developed in the Phonetics Laboratory at Stockholm University and the authors have been using it extensively for research for the past five years. Here, we describe briefly the underlying techniques, calibration, digitization as well as recent developments of the system. The presentation at SEFOS 2019 will also include a live demonstration of the system.

    Download full text (pdf)
    fulltext
  • 24.
    Kirkland, Ambika
    et al.
    KTH Royal Institute of Technology.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Gustafson, Joakim
    KTH Royal Institute of Technology.
    Székely, Éva
    KTH Royal Institute of Technology.
    Evaluating the impact of disfluencies on the perception of speaker competence using neural speech synthesis. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 550-554. Conference paper (Refereed)
    Download full text (pdf)
    fulltext
  • 25. Kirkland, Ambika
    et al.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Gustafson, Joakim
    Szekely, Eva
    Perception of smiling voice in spontaneous speech synthesis. 2021. In: Proceedings of Speech Synthesis Workshop (SSW11), 2021. Conference paper (Refereed)
    Abstract [en]

    Smiling during speech production has been shown to result in perceptible acoustic differences compared to non-smiling speech. However, there is a scarcity of research on the perception of “smiling voice” in synthesized spontaneous speech. In this study, we used a sequence-to-sequence neural text-to-speech system built on conversational data to produce utterances with the characteristics of spontaneous speech. Segments of speech following laughter, and the same utterances not preceded by laughter, were compared in a perceptual experiment after removing laughter and/or breaths from the beginning of the utterance to determine whether participants perceive the utterances preceded by laughter as sounding as if they were produced while smiling. The results showed that participants identified the post-laughter speech as smiling at a rate significantly greater than chance. Furthermore, the effect of content (positive/neutral/negative) was investigated. These results show that laughter, a spontaneous, non-elicited phenomenon in our model’s training data, can be used to synthesize expressive speech with the perceptual characteristics of smiling.

  • 26.
    Lacerda, Francisco
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    House, David
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Gustafson, Joakim
    Strömbergsson, Sofia
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Interspeech 2017: Situated interaction: Book of abstracts. 2017. Conference proceedings (editor) (Refereed)
  • 27.
    Lameris, Harm
    et al.
    KTH Royal Institute of Technology.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Gustafson, Joakim
    KTH Royal Institute of Technology.
    Székely, Éva
    KTH Royal Institute of Technology.
    Neural speech synthesis with controllable creaky voice style. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 3141-3145. Conference paper (Refereed)
    Download full text (pdf)
    fulltext
  • 28.
    Laskowski, Kornel
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics. Voci Technologies, Inc., USA.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    A Scalable Method for Quantifying the Role of Pitch in Conversational Turn-Taking. 2019. In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue: Proceedings of the Conference, Association for Computational Linguistics, 2019, p. 284-292. Conference paper (Refereed)
    Abstract [en]

    Pitch has long been held as an important signalling channel when planning and deploying speech in conversation, and myriad studies have been undertaken to determine the extent to which it actually plays this role. Unfortunately, these studies have required considerable human investment in data preparation and analysis, and have therefore often been limited to a handful of specific conversational contexts. The current article proposes a framework which addresses these limitations, by enabling a scalable, quantitative characterization of the role of pitch throughout an entire conversation, requiring only the raw signal and speech activity references. The framework is evaluated on the Switchboard dialogue corpus. Experiments indicate that pitch trajectories of both parties are predictive of their incipient speech activity; that pitch should be expressed on a logarithmic scale and Z-normalized, as well as accompanied by a binary voicing variable; and that only the most recent 400 ms of the pitch trajectory are useful in incipient speech activity prediction.

    Download full text (pdf)
    fulltext
  • 29.
    Ludusan, Bogdan
    et al.
    Bielefeld University.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Exploring the role of formant frequencies in the classification of phonation type. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 1726-1730. Conference paper (Refereed)
    Download full text (pdf)
    fulltext
  • 30. Ludusan, Bogdan
    et al.
    Wagner, Petra
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Cue interaction in the perception of prosodic prominence: The role of voice quality. 2021. In: Proceedings of Interspeech 2021, 2021. Conference paper (Refereed)
    Abstract [en]

    Voice quality is an important dimension in human communication, used to mark a variety of phenomena in speech, including prosodic prominence. Even though numerous studies have shown that speakers modify their voice quality parameters for marking prosodic prominence, the impact of these modifications on perceived prominence is less studied. Our investigation looks at the effect of a well-known measure of voice quality, cepstral peak prominence (CPP), on syllabic prominence ratings given by both naive and expert listeners. Employing read speech materials in German, we quantify the role of CPP alone and in combination with other acoustic cues marking prominence, namely intensity, duration and fundamental frequency. While CPP, by itself, had a significant effect on the perceived prominence for most of the listeners, when used in conjunction with the other cues, its impact was reduced. Moreover, when assessing the importance of each of these four cues for determining the perceived prominence score we found important individual variation, as well as differences between naive and expert listeners.

  • 31. Malisz, Zofia
    et al.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Buschmeier, Hendrik
    Skubisz, Joanna
    Kopp, Stefan
    Wagner, Petra
    The ALICO corpus: analysing the active listener. 2016. In: Language resources and evaluation, ISSN 1574-020X, E-ISSN 1574-0218, Vol. 50, no 2, p. 411-442. Article in journal (Refereed)
    Abstract [en]

    The Active Listening Corpus (ALICO) is a multimodal data set of spontaneous dyadic conversations in German with diverse speech and gestural annotations of both dialogue partners. The annotations consist of short feedback expression transcriptions with corresponding communicative function interpretations as well as segmentations of interpausal units, words, rhythmic prominence intervals and vowel-to-vowel intervals. Additionally, ALICO contains head gesture annotations of both interlocutors. The corpus contributes to research on spontaneous human–human interaction, on functional relations between modalities, and timing variability in dialogue. It also provides data that differentiates between distracted and attentive listeners. We describe the main characteristics of the corpus and briefly present the most important results obtained from analyses in recent years.

  • 32. Suni, Antti
    et al.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Vainio, Martti
    Šimko, Juraj
    Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings. 2019. In: Proceedings of Interspeech 2019 / [ed] Gernot Kubin, Zdravko Kačič, The International Speech Communication Association (ISCA), 2019, p. 2538-2542. Conference paper (Refereed)
    Abstract [en]

    We present a methodology for assessing similarities and differences between language varieties and dialects in terms of prosodic characteristics. A multi-speaker, multi-dialect WaveNet network is trained on a low sample-rate signal retaining only prosodic characteristics of the original speech. The network is conditioned on labels related to speakers’ region or dialect. The resulting conditioning embeddings are subsequently used as multi-dimensional characteristics of different language varieties, with results consistent with dialectological studies. The method and results are illustrated on the Swedia 2000 corpus of Swedish dialectal variation.

    Download full text (pdf)
    fulltext
  • 33. Ward, Nigel
    et al.
    Kirkland, Ambika
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Székely, Éva
    Two pragmatic functions of breathy voice in American English conversation. 2022. In: Proceedings of Speech Prosody 2022, 2022. Conference paper (Refereed)
    Abstract [en]

    Although the paralinguistic and phonological significance of breathy voice is well known, its pragmatic roles have been little studied. We report a systematic exploration of the pragmatic functions of breathy voice in American English, using a small corpus of casual conversations, using the Cepstral Peak Prominence Smoothed measure as an indicator of breathy voice, and using a common workflow to find prosodic constructions and identify their meanings. We found two prosodic constructions involving breathy voice. The first involves a short region of breathy voice in the midst of a region of low pitch, functioning to mark self-directed speech. The second involves breathy voice over several seconds, combined with a moment of wider pitch range leading to a high pitch over about a second, functioning to mark an attempt to establish common ground. These interpretations were confirmed by a perception experiment.

  • 34.
    Wikse Barrow, Carla
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Strömbergsson, Sofia
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Individual variation in the realisation and contrast of Swedish children’s word-initial voiceless fricatives. 2024. Article in journal (Refereed)
  • 35.
    Wikse Barrow, Carla
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Strömbergsson, Sofia
    Karolinska Institutet.
    Variability in Swedish voiceless fricative contrasts. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 813-817. Conference paper (Refereed)
    Download full text (pdf)
    fulltext
  • 36.
    Wikse Barrow, Carla
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Włodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Thörn, Lisa
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Static and dynamic spectral characteristics of Swedish voiceless fricatives. 2022. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 152, no 5, p. 2588-2600. Article in journal (Refereed)
    Abstract [en]

    Descriptions of the acoustic characteristics of Swedish voiceless fricatives are scarce and are limited to static measures derived from the speech of a small number of speakers. The current study provides an updated acoustic description of the static (spectral, temporal, and intensity) characteristics of word-initial voiceless fricatives in Central Standard Swedish. In addition, temporal variation of spectral centre of gravity is modelled using a generalized additive mixed model. Results show that fricatives were differentiated in terms of spectral properties, duration, and intensity level, such that sibilant fricatives were generally longer and more intense than non-sibilant fricatives. Spectral centre of gravity differentiated between all places of articulation apart from labio-dental /f/. Gender differences were found for centre of gravity in /s/ but overall, sex/gender differences were small. Dynamic analyses revealed differences in curvature as well as overall level of spectral centre of gravity across the duration of the fricative, associated with place of articulation and mediated by vowel context, fricative duration, and speaker specific patterns. The results from the present study are valuable for future cross-linguistic research, and as reference for investigations concerning children's acquisition of Swedish voiceless fricatives.

  • 37.
    Wlodarczak, Marcin
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    RespInPeace: Toolkit for processing respiratory belt data. 2019. In: Proceedings of Fonetik 2019, 2019, p. 115-118. Conference paper (Other academic)
    Abstract [en]

    RespInPeace is a Python toolkit for processing respiratory data collected using Respiratory Inductance Plethysmography (RIP). It provides methods for signal normalisation, calibration, parametrisation as well as for detection of respiratory events, such as inhalations, exhalations and breath holds. The paper gives a short overview of the most important functions of the program.

    Download full text (pdf)
    fulltext
  • 38.
    Wlodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Breathing in Conversation. 2020. In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 11, article id 575566. Article in journal (Refereed)
    Abstract [en]

    This work revisits the problem of breathing cues used for management of speaking turns in multiparty casual conversation. We propose a new categorization of turn-taking events which combines the criterion of speaker change with whether the original speaker inhales before producing the next talkspurt. We demonstrate that the latter criterion could be potentially used as a good proxy for pragmatic completeness of the previous utterance (and, by extension, of the interruptive character of the incoming speech). We also present evidence that breath holds are used in reaction to incoming talk rather than as a turn-holding cue. In addition to analysing dimensions which are routinely omitted in studies of interactional functions of breathing (exhalations, presence of overlapping speech, breath holds), the present study also looks at patterns of breath holds in silent breathing and shows that breath holds are sometimes produced toward the beginning (and toward the top) of silent exhalations, potentially indicating an abandoned intention to take the turn. We claim that the breathing signal can thus be successfully used for uncovering hidden turn-taking events, which are otherwise obscured by silence-based representations of interaction.

  • 39.
    Wlodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory Constraints in Verbal and Non-verbal Communication. 2017. In: Frontiers in Psychology, E-ISSN 1664-1078, Vol. 8, article id 708. Article in journal (Refereed)
    Abstract [en]

    In the present paper we address the old question of respiratory planning in speech production. We recast the problem in terms of speakers' communicative goals and propose that speakers try to minimize respiratory effort in line with the H&H theory. We analyze respiratory cycles coinciding with no speech (i.e., silence), short verbal feedback expressions (SFEs) as well as longer vocalizations in terms of parameters of the respiratory cycle and find little evidence for respiratory planning in feedback production. We also investigate timing of speech and SFEs in the exhalation and contrast it with nods. We find that while speech is strongly tied to the exhalation onset, SFEs are distributed much more uniformly throughout the exhalation and are often produced on residual air. Given that nods, which do not have any respiratory constraints, tend to be more frequent toward the end of an exhalation, we propose a mechanism whereby respiratory patterns are determined by the trade-off between speakers' communicative goals and respiratory constraints.

  • 40.
    Wlodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics.
    Bruggeman, Anna
    Bielefeld University.
    Wagner, Petra
    Bielefeld University.
    Voice quality dynamics of turn-taking events in Swedish and German. 2023. In: Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS), 2023, p. 3477-3481. Conference paper (Refereed)
    Download full text (pdf)
    fulltext
  • 41.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Gilmartin, Emer
    Speaker transition patterns in three-party conversation: Evidence from English, Estonian and Swedish. 2021. In: Proceedings of Interspeech 2021, 2021. Conference paper (Refereed)
    Abstract [en]

    During conversation, speakers hold and relinquish the floor, resulting in turn yield and retention. We examine these phenomena in three-party conversations in English, Swedish, and Estonian. We define within- and between-speaker transitions in terms of shorter intervals of speech, silence and overlap bounded by stretches of one-party speech longer than 1 second by the same or different speakers. This method gives us insights into how turn change and retention proceed, revealing that the majority of speaker transitions are more complex and involve more intermediate activity than a single silence or overlap. We examine the composition of within and between transitions in terms of number of speakers involved, incidence and proportion of solo speech, silence and overlap. We derive the most common within- and between-speaker transitions in the three languages, finding evidence of striking commonalities in how the floor is managed. Our findings suggest that current models of turn-taking used in dialogue technology could be extended using these results to more accurately reflect the realities of human-human dialogue.

  • 42.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Breathing in conversation — what we’ve learned. 2019. In: 1st International Seminar on the Foundations of Speech: BREATHING, PAUSING, AND THE VOICE, 1st–3rd December 2019 in Sønderborg, Denmark: Conference Proceedings / [ed] Niebuhr, Oliver; Neitsch, Jana; Berger, Stephanie; Fischer, Kerstin; Michalsky, Jan; Eisenberger, Selina; Jelínek, Matouš, 2019, p. 13-15. Conference paper (Refereed)
    Abstract [en]

    In this paper, we provide an overview of selected findings on interactional aspects of breathing in multiparty conversation, accumulated largely over the course of a four-year research project Breathing in conversation, carried out at the Department of Linguistics, Stockholm University. In particular, we focus on results demonstrating the contribution of the respiratory signal to prediction of imminent speech activity, as well as on turn-holding and turn-yielding cues.

    Download full text (pdf)
    fulltext
  • 43.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Capturing respiratory sounds with throat microphones. 2017. In: Nordic Prosody: Proceedings of the XIIth Conference, Trondheim 2016 / [ed] Jardar Eggesbö Abrahamsen, Jacques Koreman, Wim van Dommelen, Peter Lang Publishing Group, 2017, p. 181-190. Conference paper (Refereed)
    Abstract [en]

    This paper presents the results of a pilot study using throat microphones for recording respiratory sounds. We demonstrate that inhalation noises are louder before longer stretches of speech than before shorter utterances (< 1 s) and in silent breathing. We thus replicate the results from our earlier study which used close-talking head-mounted microphones, without the associated data loss due to cross-talk. We also show that inhalations are louder within than before a speaking turn. Hence, the study provides another piece of evidence in favour of communicative functions of respiratory noises serving as potential turn-taking (for instance, turn-holding) cues. 

    Download full text (pdf)
    fulltext
  • 44.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Contribution of voice quality to prediction of turn-taking events. 2022. In: Proceedings of Speech Prosody 2022 / [ed] S. Frota, M. Cruz, & M. Vigário, 2022, p. 485-489. Conference paper (Refereed)
    Abstract [en]

    This paper evaluates the contribution of acoustic voice quality measures to prediction of upcoming floor change and retention. In order to minimize the influence of vocal tract resonances, the measures were calculated from miniature accelerometers attached to the tracheal wall. Overall, speaker changes accompanied by silence were characterized by lower periodicity and steeper spectral slope than turn-holds and speaker changes involving overlapping speech. When used on their own, voice quality features contributed to prediction of turn-taking category; this was particularly true of smoothed cepstral peak prominence (CPPS). At the same time, their importance was limited when used in combination with fundamental frequency and intensity, especially compared to the joint effect of these two predictors.

    Download full text (pdf)
    fulltext
  • 45.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Exhalatory turn-taking cues. 2018. In: Proceedings 9th International Conference on Speech Prosody 2018 / [ed] Katarzyna Klessa, Jolanta Bachan, Agnieszka Wagner, Maciej Karpiński, Daniel Śledziński, Poznań, Poland: The International Speech Communication Association (ISCA), 2018, p. 334-338. Conference paper (Refereed)
    Abstract [en]

    The paper is a study of kinematic features of the exhalation which signal that the speaker is done speaking and wants to yield the turn. We demonstrate that the single most prominent feature is the presence of inhalation directly following the exhalation. However, several features of the exhalation itself are also found to significantly distinguish between turn holds and yields, such as slower exhalation rate and higher lung level at exhalation onset. The results complement the existing body of evidence on respiratory turn-taking cues, which has so far involved mainly inhalatory features. We also show that respiration allows discovering pause interruptions, thus allowing access to unrealised turn-taking intentions.

  • 46.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory belts and whistles: A preliminary study of breathing acoustics for turn-taking. 2016. In: Proceedings of Interspeech 2016 / [ed] Nelson Morgan, International Speech Communication Association, 2016, p. 510-514. Conference paper (Refereed)
    Abstract [en]

    This paper presents first results on using acoustic intensity of inhalations as a cue to speech initiation in spontaneous multiparty conversations. We demonstrate that inhalation intensity significantly differentiates between cycles coinciding with no speech activity, shorter (< 1 s) and longer stretches of speech. While the model fit is relatively weak, it is comparable to the fit of a model using kinematic features collected with Respiratory Inductance Plethysmography. We also show that incorporating both kinematic and acoustic features further improves the model. Given the ease of capturing breath acoustics, we consider the results to be a promising first step towards studying communicative functions of respiratory sounds. We discuss possible extensions to the data collection procedure with a view to improving predictive power of the model.

    Download full text (pdf)
    fulltext
  • 47.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory Properties of Backchannels in Spontaneous Multiparty Conversation. 2015. In: Proceedings of the 18th International Congress of Phonetic Sciences / [ed] Maria Wolters, Judy Livingstone, Bernie Beattie, Rachel Smith, Mike MacMahon, Jane Stuart-Smith, Jim Scobbie, Glasgow: University of Glasgow, 2015. Conference paper (Refereed)
    Abstract [en]

    In this paper we report on first results of a newly started project focussing on interactional functions of breathing in spontaneous multiparty conversation. Specifically, we investigate respiratory patterns associated with backchannels (short feedback expressions), and compare them with breathing cycles observed during longer stretches of speech or while listening to an interlocutor’s speech. Overall, inhalations preceding backchannels were found to resemble those in quiet breathing to a large degree. The results are discussed in terms of temporal organisation and respiratory planning in these utterances.

    Download full text (pdf)
    fulltext
  • 48.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Respiratory turn-taking cues. 2016. In: Proceedings of Interspeech 2016 / [ed] Nelson Morgan, The International Speech Communication Association (ISCA), 2016, p. 1275-1279. Conference paper (Refereed)
    Abstract [en]

    This paper investigates to what extent breathing can be used as a cue to turn-taking behaviour. The paper improves on existing accounts by considering all possible transitions between speaker states (silent, speaking, backchanneling) and by not relying on global speaker models. Instead, all features (including breathing range and resting expiratory level) are estimated in an incremental fashion using the left-hand context. We identify several inhalatory features relevant to turn-management, and assess the fit of models with these features as predictors of turn-taking behaviour.

    Download full text (pdf)
    fulltext
  • 49.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Turn-taking in conversation from the larynx down. 2021. In: The Role of the Current Speaker in Conversational Turn Taking – Theoretical, Experimental, and Corpus Linguistic Perspectives on Speaker Contributions to Aligned Turn-Timing / [ed] Barthel, Mathias, 2021. Conference paper (Refereed)
    Abstract [en]

    In this talk, we will give an overview of some of our results, both old and new, about respiratory and phonatory turn-taking cues. Both of these aspects of turn coordination are rarely addressed in literature, which focuses primarily on its articulatory and prosodic characteristics.

    In the respiratory part of the presentation, we will discuss a new categorisation of turn-taking events which combines the criterion of speaker change with whether the original speaker inhales before producing the next talkspurt. We will demonstrate that the latter criterion could be potentially used as a proxy for pragmatic completeness of the previous utterance (and, by extension, of the interruptive character of the incoming speech). Specifically, respiratory properties of silences accompanied by speaker change in which the original speaker continues talking without breathing in are similar to those in within-speaker, turn-holding silences. We will also present evidence that the likelihood of speaker change is higher during pauses accompanied by a respiratory hold, suggesting that breath holds are used in reaction to incoming talk rather than as a turn-holding cue. In addition to analysing dimensions which are routinely omitted in studies of interactional functions of breathing (exhalations, presence of overlapping speech, breath holds), we will analyse patterns of breath holds in silent breathing and show that breath holds are sometimes produced towards the beginning (and towards the top) of silent exhalations, potentially indicating an abandoned intention to take the turn. We claim that the breathing signal can thus be successfully used for uncovering hidden turn-taking events, which are otherwise obscured by silence-based representations of interaction.

    Moving up from the lungs to the larynx, in the second part of the talk we will focus on our ongoing work on voice quality variation in spontaneous interactions, a topic which has received little attention so far, not least because of the technical difficulties associated with recording phonation in continuous speech. In order to circumvent these problems, we are using miniature accelerometers attached to the skin of the tracheal wall below the glottis (“throat microphones”). The method, which has been used for some time in ambulatory postoperative voice monitoring, provides a good approximation of the voice source without the need for glottal inverse-filtering. We will demonstrate that the accelerometer signal can be successfully used to differentiate between voice qualities in isolated vowels while being unaffected by vocal tract resonances, fo and speaker variation. We will also present some preliminary results comparing several voice quality measures in speech intervals preceding silences accompanied by speaker change or followed by more speech from the same person. We demonstrate that utterances ending in speaker changes are characterised by lower periodicity and higher rates of creaky voice. The findings are thus consistent with the “trailing-off” character of these silences, as suggested in the literature.

  • 50.
    Włodarczak, Marcin
    et al.
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Heldner, Mattias
    Stockholm University, Faculty of Humanities, Department of Linguistics, Phonetics.
    Edlund, Jens
    Breathing in Conversation: An Unwritten History. 2015. In: Proceedings of the 2nd European and the 5th Nordic Symposium on Multimodal Communication / [ed] Kristiina Jokinen, Martin Vels, Linköping, 2015, p. 107-112. Conference paper (Refereed)
    Abstract [en]

    This paper attempts to draw the attention of the multimodal communication research community to what we consider a long overdue topic, namely respiratory activity in conversation. We submit that a turn towards spontaneous interaction is a natural extension of the recent interest in speech breathing, and is likely to offer valuable insights into the mechanisms underlying the organisation of interaction and collaborative human action in general, as well as to advance existing speech technology applications. Particular focus is placed on the role of breathing as a perceptually and interactionally salient turn-taking cue. We also present the recording setup developed in the Phonetics Laboratory at Stockholm University with the aim of studying communicative functions of physiological and audio-visual breathing correlates in spontaneous multiparty interactions.

    Download full text (pdf)
    fulltext