Digitala Vetenskapliga Arkivet

1 - 9 of 9
  • 1.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Modeling Music: Studies of Music Transcription, Music Perception and Music Production, 2018. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [sv]

    This thesis presents ten studies within three important subfields of Music Information Retrieval (MIR), a research field focused on extracting information from music. Part A targets music transcription, Part B music perception, and Part C music production. A concluding part discusses the machine learning methodology and looks ahead (Part D).

    Part A presents systems that can transcribe music with respect to rhythm and polyphonic pitch. The first two publications describe methods for estimating the tempo and the positions of beats in audio recordings of music. A method for computing the most prominent periodicity (the “cepstroid”) is described, as well as how it can be used to guide the applied machine learning systems. The system for polyphonic pitch estimation can identify both sounding tones and note onsets and offsets. This system is pitch-invariant as well as invariant to variations over time within sounding tones. The transcription systems are trained to predict several aspects of music in a hierarchical structure. The transcription results are the best reported in evaluations on several different datasets.

    Part B focuses on perceptual features of music. These can be predicted to model fundamental aspects of perception, but they can also be used as representations in models that attempt to classify higher-level music parameters. Models are presented that predict the perceived speed and the perceived performed dynamics with high accuracy. Averaged ratings from around 20 listeners serve as target values during training and evaluation.

    Part C explores aspects related to music production. The first study analyzes variations in average spectrum between popular music tracks. The analysis shows that the level of the percussive instruments is an important factor for the spectral distribution; the data suggest that this level is more useful than genre labels for predicting the spectrum. The second study in Part C deals with music composition. An algorithmic composition program is presented, in which relevant music parameters are joined together in a hierarchical structure. A listening test was carried out to demonstrate the validity of the program and to examine the effect of certain parameters.

    The thesis concludes with Part D, which places the developed machine learning methodology in a wider context and proposes new methods for generalizing rhythm prediction. The first study discusses deep learning systems that predict various aspects of music in a hierarchical structure. Relevant concepts are presented together with suggestions for future implementations. The second study proposes a tempo-invariant method for processing the log-frequency domain of rhythm signals with convolutional neural networks. The proposed architecture can make use of magnitude, the relative phase between rhythm channels, and the original phase from the frequency transform to address several important problems related to rhythm.

    Download full text (pdf)
    Introduction and Summary of Dissertation: Modeling Music - Anders Elowsson
  • 2.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Polyphonic Pitch Tracking with Deep Layered Learning. Manuscript (preprint) (Other academic)
    Abstract [en]

    This paper presents a polyphonic pitch tracking system able to extract both framewise and note-based estimates from audio. The system uses six artificial neural networks in a deep layered learning setup. First, cascading networks are applied to a spectrogram for framewise fundamental frequency (f0) estimation. A sparse receptive field is learned by the first network and then used for weight-sharing throughout the system. The f0 activations are connected across time to extract pitch ridges. These ridges define a framework, within which subsequent networks perform tone-shift-invariant onset and offset detection. The networks convolve the pitch ridges across time, using as input, e.g., variations of latent representations from the f0 estimation networks, defined as the “neural flux.” Finally, incorrect tentative notes are removed one by one in an iterative procedure that allows a network to classify notes within an accurate context. The system was evaluated on four public test sets: MAPS, Bach10, TRIOS, and the MIREX Woodwind quintet, and achieved state-of-the-art results on all four datasets. It performs well across all subtasks: f0, pitched onset, and pitched offset tracking.
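    The full system described above uses six neural networks; as a rough illustration of just one step, the linking of framewise f0 activations into pitch ridges, here is a minimal sketch in Python. The greedy linking rule, the threshold and the bin tolerance are assumptions for illustration only, not the paper's implementation.

    import numpy as np

    def extract_ridges(activations, act_threshold=0.5, max_bin_jump=1):
        """activations: (n_frames, n_pitch_bins) array of f0 activations in [0, 1].
        Returns a list of ridges, each a list of (frame, pitch_bin) tuples."""
        finished = []        # ridges that have ended
        open_ridges = []     # ridges that reached the previous frame
        for t, frame in enumerate(activations):
            # peaks: bins above threshold that are local maxima in this frame
            peaks = [b for b in np.flatnonzero(frame > act_threshold)
                     if (b == 0 or frame[b] >= frame[b - 1])
                     and (b == len(frame) - 1 or frame[b] >= frame[b + 1])]
            next_open, used = [], set()
            for ridge in open_ridges:
                last_bin = ridge[-1][1]
                # continue the ridge with the closest unused peak within tolerance
                cands = [b for b in peaks
                         if b not in used and abs(b - last_bin) <= max_bin_jump]
                if cands:
                    b = min(cands, key=lambda c: abs(c - last_bin))
                    ridge.append((t, b))
                    used.add(b)
                    next_open.append(ridge)
                else:
                    finished.append(ridge)     # no continuation: the ridge ends
            for b in peaks:                    # unused peaks start new ridges
                if b not in used:
                    next_open.append([(t, b)])
            open_ridges = next_open
        return finished + open_ridges

    # Example: two concurrent tones, one steady and one drifting upward
    acts = np.zeros((5, 12))
    acts[:, 3] = 0.9
    acts[np.arange(5), np.arange(5) + 7] = 0.8
    print(len(extract_ridges(acts)))           # -> 2 ridges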

  • 3.
    Elowsson, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Tempo-Invariant Processing of Rhythm with Convolutional Neural Networks. Manuscript (preprint) (Other academic)
    Abstract [en]

    Rhythm patterns can be performed with a wide variation of tempi. This presents a challenge for many music information retrieval (MIR) systems; ideally, perceptually similar rhythms should be represented and processed similarly, regardless of the specific tempo at which they were performed. Several recent systems for tempo estimation, beat tracking, and downbeat tracking have therefore sought to process rhythm in a tempo-invariant way, often by sampling input vectors according to a precomputed pulse level. This paper describes how a log-frequency representation of rhythm-related activations instead can promote tempo invariance when processed with convolutional neural networks. The strategy incorporates invariance at a fundamental level and can be useful for most tasks related to rhythm processing. Different methods are described, relying on magnitude, phase relationships of different rhythm channels, as well as raw phase information. Several variations are explored to provide direction for future implementations.
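    As a rough illustration of the core idea, the sketch below resamples an autocorrelation of an onset-strength envelope onto logarithmically spaced lags, so that a change of tempo becomes (approximately) a shift along the axis, which convolutional layers can handle shift-invariantly. The signal, lag range and interpolation are illustrative assumptions and do not reproduce the paper's architecture.

    import numpy as np

    def log_lag_profile(onset_strength, sr=100.0, lag_min=0.2, lag_max=2.0, n_bins=64):
        """Autocorrelate an onset-strength envelope (frame rate sr, in Hz) and
        resample the result onto logarithmically spaced lags in seconds."""
        x = onset_strength - onset_strength.mean()
        ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # non-negative lags only
        ac = ac / ac[0] if ac[0] != 0 else ac               # normalise to lag 0
        lags_lin = np.arange(len(ac)) / sr                  # lag of each sample, s
        lags_log = np.geomspace(lag_min, lag_max, n_bins)   # log-spaced lag grid
        return lags_log, np.interp(lags_log, lags_lin, ac)

    # Two click tracks, one twice as fast: on the log-lag axis their profiles
    # are related (approximately) by a constant shift rather than a rescaling.
    t = np.arange(0, 20, 0.01)
    slow = (np.sin(2 * np.pi * t / 1.0) > 0.99).astype(float)   # ~1.0 s period
    fast = (np.sin(2 * np.pi * t / 0.5) > 0.99).astype(float)   # ~0.5 s period
    _, p_slow = log_lag_profile(slow)
    _, p_fast = log_lag_profile(fast)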

  • 4.
    Lã, Filipa M.B.
    et al.
    University of Distance-Learning, Madrid, Spain.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Flow ball-assisted training: immediate effects on vocal fold contacting, 2019. In: Pan-European Voice Conference 2019 / [ed] Jenny Iwarsson, Stine Løvind Thorsen, University of Copenhagen, 2019, pp. 50-51. Conference paper (Refereed)
    Abstract [en]

    Background: The flow ball is a device that creates a static backpressure in the vocal tract while providing real-time visual feedback of airflow. A ball height of 0 to 10 cm corresponds to airflows of 0.2 to 0.4 L/s. These high airflows with low transglottal pressure correspond to low flow resistances, similar to those obtained when phonating into straws of 3.7 mm diameter and 2.8 cm length. Objectives: To investigate whether there are immediate effects of flow ball-assisted training on vocal fold contact. Methods: Ten singers (five males and five females) performed a messa di voce at different pitches over one octave in three different conditions: before, during and after phonating with a flow ball. For all conditions, both audio and electrolaryngographic (ELG) signals were recorded simultaneously using a Laryngograph microprocessor. The vocal fold contact quotient Qci (the area under the normalized EGG cycle) and dEGGmaxN (the normalized maximum rate of change of vocal fold contact area) were obtained for all EGG cycles, using the FonaDyn system. We also introduce a compound metric Ic, the ‘index of contact’ [Qci × log10(dEGGmaxN)], with the property that it goes to zero at no contact. It combines information from both Qci and dEGGmaxN and is thus comparable across subjects. The intra-subject means of all three metrics were computed and visualized by colour-coding over the fo-SPL plane, in cells of 1 semitone × 1 dB. Results: Overall, the use of flow ball-assisted phonation had a small yet significant effect on overall vocal fold contact across the whole messa di voce exercise. Larger effects were evident locally, i.e., in parts of the voice range. Comparing the pre- and post-flow-ball conditions, there were differences in Qci and/or dEGGmaxN. These differences were generally larger in male than in female voices. Ic typically decreased after flow ball use, for males but not for females. Conclusion: Flow ball-assisted training seems to modify vocal fold contacting gestures, especially in male singers.
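    A minimal sketch of how the compound metric described above could be computed per EGG cycle and mapped to 1 semitone × 1 dB cells of the fo-SPL plane. The cycle normalisation, the reference fo and the synthetic test cycle are assumptions; FonaDyn's actual processing is not reproduced here.

    import numpy as np

    def index_of_contact(egg_cycle):
        """egg_cycle: one period of the (amplitude-normalised) EGG contact waveform,
        with the period treated as having unit length."""
        qci = float(np.mean(egg_cycle))                  # area under the normalised cycle
        degg_max_n = float(np.max(np.diff(egg_cycle)) * len(egg_cycle))  # normalised peak slope
        return qci * np.log10(degg_max_n)                # the compound Ic from the abstract

    def fo_spl_cell(fo_hz, spl_db, fo_ref=110.0):
        """Map (fo, SPL) to a 1-semitone x 1-dB cell index (fo_ref is an assumption)."""
        return int(round(12.0 * np.log2(fo_hz / fo_ref))), int(round(spl_db))

    # One synthetic EGG-like cycle as a stand-in for measured data
    cycle = (0.5 * (1 - np.cos(2 * np.pi * np.linspace(0, 1, 200, endpoint=False)))) ** 2
    print(index_of_contact(cycle), fo_spl_cell(220.0, 72.0))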

  • 5.
    Pabon, Peter
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH, Music Acoustics. Royal Conservatoire, The Hague, Netherlands.
    Ternström, Sten
    KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Feature maps of the acoustic spectrum of the voice, 2020. In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588, Vol. 34, no. 1, pp. 161.e1-161.e26. Article in journal (Refereed)
    Abstract [en]

    The change in the spectrum of sustained /a/ vowels was mapped over the voice range from low to high fundamental frequency and low to high sound pressure level (SPL), in the form of the so-called voice range profile (VRP). In each interval of one semitone and one decibel, narrowband spectra were averaged both within and across subjects. The subjects were groups of 7 male and 12 female singing students, as well as a group of 16 untrained female voices. For each individual and also for each group, pairs of VRP recordings were made, with stringent separation of the modal/chest and falsetto/head registers. Maps are presented of eight scalar metrics, each of which was chosen to quantify a particular feature of the voice spectrum, over fundamental frequency and SPL. Metrics 1 and 2 chart the role of the fundamental in relation to the rest of the spectrum. Metrics 3 and 4 are used to explore the role of resonances in relation to SPL. Metrics 5 and 6 address the distribution of high frequency energy, while metrics 7 and 8 seek to describe the distribution of energy at the low end of the voice spectrum.

    Several examples are observed of phenomena that are difficult to predict from linear source-filter theory, and of the voice source being less uniform over the voice range than is conventionally assumed. These include a high-frequency band-limiting at high SPL and an unexpected persistence of the second harmonic at low SPL. The two voice registers give rise to clearly different maps. Only a few effects of training were observed, in the low frequency end below 2 kHz. The results are of potential interest in voice analysis, voice synthesis and for new insights into the voice production mechanism.
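    As a loose illustration of the kind of scalar spectrum metrics described above, the sketch below computes two features from a single narrowband spectrum: the level of the fundamental relative to the rest of the spectrum, and the fraction of energy above a cutoff. The exact definitions, bandwidths and cutoffs used in the paper are not reproduced; these are assumptions.

    import numpy as np

    def spectrum_features(signal, sr, fo_hz, hi_cutoff_hz=2000.0):
        """Two example features of a narrowband spectrum: fundamental level relative
        to the rest of the spectrum (dB), and fraction of energy above a cutoff."""
        spec = np.abs(np.fft.rfft(signal * np.hanning(len(signal)))) ** 2
        freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
        fund = spec[(freqs > 0.5 * fo_hz) & (freqs < 1.5 * fo_hz)].sum()
        rest = spec[freqs >= 1.5 * fo_hz].sum()
        high = spec[freqs >= hi_cutoff_hz].sum()
        return 10 * np.log10(fund / rest), high / spec.sum()

    # Synthetic sustained-vowel-like tone: fo = 220 Hz with decaying harmonics
    sr, fo = 16000, 220.0
    t = np.arange(0, 0.5, 1 / sr)
    tone = sum((0.7 ** k) * np.sin(2 * np.pi * k * fo * t) for k in range(1, 20))
    print(spectrum_features(tone, sr, fo))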

  • 6. Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Acoustic comparison of soprano solo and choir singing, 1987. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 82, no. 3, pp. 830-836. Article in journal (Refereed)
    Abstract [en]

    Five soprano singers were recorded while singing similar texts in both choir and solo modes of performance. A comparison of long-term-average spectra of similar passages in both modes indicates that subjects used different tactics to achieve somewhat higher concentrations of energy in the 2- to 4-kHz range when singing in the solo mode. It is likely that this effect resulted, at least in part, from a slight change of the voice source from choir to solo singing. The subjects used slightly more vibrato when singing in the solo mode.

  • 7. Rossing, T D
    et al.
    Sundberg, Johan
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Ternström, Sten
    Acoustic comparison of voice use in solo and choir singing, 1986. In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 79, no. 6, pp. 1975-1981. Article in journal (Refereed)
    Abstract [en]

    An experiment was carried out in which eight bass/baritone singers were recorded while singing in both choral and solo modes. Together with their own voice, they heard the sound of the rest of the choir and a piano accompaniment, respectively. The recordings were analyzed in several ways, including computation of long-time-average spectra for each passage, analysis of the sound levels in the frequency ranges corresponding to the fundamental and the "singer's formant," and a comparison of the sung levels with the levels heard by the singers. Matching pairs of vowels in the two modes were inverse filtered to determine the voice source spectra and formant frequencies for comparison. Differences in both phonation and articulation between the two modes were observed. Subjects generally sang with more power in the singer's formant region in the solo mode and with more power in the fundamental region in the choral mode. Most singers used a reduced frequency distance between the third and fifth formants for increasing the power in the singer's formant range, while the difference in the fundamental was mostly a voice source effect. In a choral singing mode, subjects usually adjusted their voice levels to the levels they heard from the other singers, whereas in a solo singing mode the level sung depended much less on the level of an accompaniment.
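    A rough sketch of the long-term-average spectrum (LTAS) and band-level analysis discussed in this and the previous abstract. Window length, overlap and band edges are assumptions chosen only to show the general idea, not the papers' exact analysis.

    import numpy as np

    def ltas(signal, sr, win=2048, hop=1024):
        """Long-term-average spectrum: average power spectrum of overlapping windows."""
        w = np.hanning(win)
        frames = [signal[i:i + win] * w for i in range(0, len(signal) - win, hop)]
        power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
        return np.fft.rfftfreq(win, 1.0 / sr), power.mean(axis=0)

    def band_level_db(freqs, power, lo, hi):
        """Level (dB, arbitrary reference) of the energy between lo and hi Hz."""
        return 10 * np.log10(power[(freqs >= lo) & (freqs < hi)].sum())

    # Synthetic voice-like signal: harmonics of 110 Hz with decaying amplitudes
    sr = 16000
    t = np.arange(0, 1.0, 1 / sr)
    voice = sum((0.8 ** k) * np.sin(2 * np.pi * k * 110 * t) for k in range(1, 40))
    freqs, power = ltas(voice, sr)
    # Compare a 2-4 kHz "singer's formant" band with a band around the fundamental
    print(band_level_db(freqs, power, 2000, 4000) - band_level_db(freqs, power, 50, 300))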

  • 8.
    Ternström, Sten
    KTH, Former Departments (before 2005), Speech Transmission and Music Acoustics. KTH, Former Departments (before 2005), Speech, Music and Hearing. KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Speech, Music and Hearing, TMH, Music Acoustics.
    Choir acoustics: an overview of scientific research published to date, 2003. In: International Journal of Research in Choral Singing, Vol. 1, no. 1, pp. 3-12. Article in journal (Refereed)
    Abstract [en]

    Choir acoustics is but one facet of choir-related research, yet it is one of the most tangible. Several aspects of sound can be measured objectively, and such results can be related to known properties of voices, rooms, ears and musical scores. What follows is essentially an update of the literature overview in my Ph.D. dissertation from 1989 of empirical investigations known to me that deal specifically with the acoustics of choirs, vocal groups, or choir singers. This compilation of sources is no doubt incomplete in certain respects; nevertheless, it will hopefully prove to be useful for researchers and others interested in choir acoustics.

    Download full text (pdf)
    http://www.speech.kth.se/prod/publications/files/qpsr/2002/2002_43_1_001-008.pdf
  • 9.
    Ternström, Sten
    et al.
    KTH, School of Computer Science and Communication (CSC), Speech, Music and Hearing, TMH, Music Acoustics.
    Nordmark, Jan
    KTH, School of Electrical Engineering and Computer Science (EECS), Speech, Music and Hearing, TMH, Music Acoustics.
    Intonation preferences for major thirds with non-beating ensemble sounds, 1996. In: Proc. of Nordic Acoustical Meeting: NAM'96, Helsinki, 1996, pp. 359-365, article id F2. Conference paper (Refereed)
    Abstract [en]

    The frequency ratios, or intervals, of the twelve-tone scale can be mathematically defined in several slightly different ways, each of which may be more or less appropriate in different musical contexts. For maximum mobility in musical key, instruments of our time with fixed tuning are typically tuned in equal temperament, except for performances of early music or avant-garde contemporary music. Some contend that pure intonation, being free of beats, is more natural, and would be preferred in instruments with variable tuning. The sound of choirs is such that beats are very unlikely to serve as cues for intonation. Choral performers have access to variable tuning, yet have not been shown to prefer pure intonation. The difference between alternative intonation schemes is largest for the major third interval. Choral directors and other musically expert subjects were asked to adjust to their preference the intonation of 20 major third intervals in synthetic ensemble sounds. The preferred size of the major third was 395.4 cents, with intra-subject averages ranging from 388 to 407 cents.

    Download full text (pdf)
    fulltext
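    The interval sizes in the abstract above are expressed in cents, where an interval with frequency ratio r measures 1200 · log2(r) cents. The short sketch below checks the reference points: the equal-tempered major third (400 cents), the pure major third 5/4 (about 386.3 cents), and the frequency ratio corresponding to the preferred 395.4 cents.

    import math

    def cents(ratio):
        """Interval size in cents for a frequency ratio."""
        return 1200.0 * math.log2(ratio)

    def ratio(cents_value):
        """Frequency ratio corresponding to an interval size in cents."""
        return 2.0 ** (cents_value / 1200.0)

    print(cents(2 ** (4 / 12)))   # equal-tempered major third: 400.0 cents
    print(cents(5 / 4))           # pure (just) major third: ~386.3 cents
    print(ratio(395.4))           # preferred mean from the study: ratio ~1.257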