Studies in Corpora and Idioms: Getting the cat out of the bag
Minugh, David
Stockholm University, Faculty of Humanities, Department of English. ORCID iD: 0000-0002-6481-1975
2014 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

“Idiomatic” expressions, usually called “idioms”, such as a dime a dozen, a busman’s holiday, or to have bats in your belfry are a curious part of any language: they usually have a fixed lexical (why a busman?) and structural composition (only dime and dozen in direct conjunction mean ‘common, ordinary’), can be semantically obscure (why bats?), yet are widely recognized in the speech community, in spite of being so rare that only large corpora can provide us with access to sufficient empirical data on their use.

In this compilation thesis, four published studies focusing on idioms in corpora are presented. Study 1 details the creation of and data in the author’s medium-sized corpus from 1999, the 3.7 million word Coll corpus of online university student newspapers, with comparisons to data from standard corpora of the time. Study 2 examines the extent to which recognized idioms are to be found in the Coll corpus and how they can be varied. Study 3 draws upon the British National Corpus and a series of British and American newspaper corpora to see how idioms may be “anchored” in their contexts, primarily by the device of premodification via an adjective appropriate to the context, not to the idiom. Study 4 examines idiom-usage patterns in the Time Magazine corpus, focusing on possible aspects of diachronic change over the near-century Time represents.

The introductory compilation chapter places and discusses these studies in their contexts of contemporary idiom and corpus research; building on these studies, it provides two specific examples of potential ways forward in idiom research: an examination of the idioms used in a specific subgenre of newspapers (editorials), and a detailed suggestion for teachers about how to examine multiple facets of a specific modern idiom (the glass ceiling) in the classroom. Finally, a summing-up includes suggestions for further research, particularly at the level of the patterning of individual idioms, rather than treating them as a homogeneous phenomenon.

Place, publisher, year, edition, pages
Stockholm: Department of English, Stockholm University, 2014. 217 p.
Keyword [en]
Coll corpus, corpora, corpus creation, idioms, idiom variation, idiom-breaking, online newspapers, student newspapers, college newspapers
National Category
Specific Languages
Research subject
English
Identifiers
URN: urn:nbn:se:su:diva-18029
ISBN: 978-91-7447-975-1 (print)
OAI: oai:DiVA.org:su-18029
DiVA: diva2:740770
Public defence
2014-10-11, Lecture Hall 7 D, Universitetsvägen 10 D, Stockholm, 10:00 (English)
Opponent
Supervisors
Available from: 2014-09-18 Created: 2007-10-16 Last updated: 2014-12-16 Bibliographically approved
List of papers
1. The Coll Corpus: Towards a corpus of web-based college student newspapers
2002 (English) In: New Frontiers of Corpus Research: Papers from the 21st International Conference on English Language Research on Computerized Corpora, Amsterdam: Rodopi, 2002, 71-90 p. Chapter in book (Other academic)
Abstract [en]

Unlike major English-language corpora hitherto released, on-line college student newspapers provide an unexplored record from much younger writers. In these newspapers, 20-year-olds address their peers in a situation that largely parallels standard newspaper writing as regards formal correctness and time pressure. Nearly unconstrained by outside intervention or house style sheets, they deal with a range of university student interests, including creative writing.

This preliminary version of the Coll Corpus consists of one issue each of nearly all 300-plus college and university newspapers available on the Web as of spring 1999, with a total of 3.88 million words. Although AmE dominates, the resultant geographical distribution is relatively well matched to actual population ratios. In its present form, the corpus already allows exploration of numerous lexical and semantic features along temporal and geographic dimensions. Given its on-line accessibility, future versions should be easily expandable by several orders of magnitude.

Place, publisher, year, edition, pages
Amsterdam: Rodopi, 2002
Keyword
corpus linguistics, corpora, electronic newspapers, Internet, web newspapers
National Category
Specific Languages
Identifiers
urn:nbn:se:su:diva-131850 (URN)
90-420-1237-4 (ISBN)
Available from: 2007-10-16 Created: 2007-10-16 Last updated: 2014-08-27 Bibliographically approved
2. The College Idiom: Idioms in the COLL Corpus
2008 (English) In: ICAME Journal: Computers in English Linguistics, ISSN 0801-5775, no 32, 115-39 p. Article in journal (Refereed) Published
Abstract [en]

As with much of vocabulary, idioms in the stricter sense appear to be acquired continually throughout one’s lifetime. Since most of the material in current large-scale corpora comes from writers well out of their teens, the 3.7 M word COLL corpus of college student online newspapers from Australia, the British Isles, New Zealand, North America and South Africa (Minugh 2002) provides one of the few already-compiled sources of writing by 20-year-olds, and thus is an interesting starting point for an investigation of which idioms are in use in the writing of university students in the English-speaking world when they address their peers. Using the idioms specified in the Collins COBUILD Dictionary of Idioms as our starting point, the COLL corpus will be examined for use of idioms. Specific questions to investigate include which idioms occur, their geographic and subgenre distribution, their positions in the texts and their textual functions. Idiom-breaking, i.e. playful variation, may also be expected to occur in this particular genre, and the corpus can provide an indication of how prevalent this is, as well.

Keyword
corpus linguistics, idioms, online newspapers
National Category
Specific Languages
Identifiers
urn:nbn:se:su:diva-17053 (URN)
Available from: 2009-01-05 Created: 2009-01-05 Last updated: 2014-08-26 Bibliographically approved
3. The filling in the sandwich: internal modification of idioms
2007 (English) In: Corpus Linguistics 25 Years on, Rodopi, Amsterdam, 2007, 205-224 p. Chapter in book (Refereed)
Abstract [en]

Idiomatic expressions—defined as (relatively) fixed and semantically opaque units such as 'a one-horse town' or 'buy the farm' (= ‘die’)—are basically self-contained, but can be “anchored” in the discourse at hand via e.g. post-modification: "A great many people thought that the pendulum of permissiveness had swung too far." But internal expansion is also possible: "These dangers are being swept under the risk-factor rug." Using the BNC and newspaper CDs as corpora of sufficient size (approximately 300 million words in all), the patterns and frequency of such anchoring internal expansions in contemporary English are investigated, and compared with those for alternative formulations and the simplex form. Anchoring internal expansion is found to be generally possible, and occasionally inventive, but usually infrequent (with exceptions such as 'not have a leg to stand on'); anchoring the idiom via exemplification in a following clause is a primary discourse alternative.

Place, publisher, year, edition, pages
Rodopi, Amsterdam, 2007
Keyword
idioms, corpus linguistics, FEI, internal modification
National Category
Specific Languages
Identifiers
urn:nbn:se:su:diva-17854 (URN)
978-90-420-2195-2 (ISBN)
Available from: 2007-10-16 Created: 2007-10-16 Last updated: 2014-08-26 Bibliographically approved
4. Is Time A’Changin’?: A diachronic investigation of the idioms used in Time
2008 (English) In: Selected Papers from the 2006 and 2007 Stockholm Metaphor Festival / [ed] Nils-Lennart Johannesson & David C. Minugh, Stockholm: Acta Universitatis Stockholmiensis, 2008, 111-130 p. Chapter in book (Other academic)
Abstract [en]

A newly-available net-based corpus of 105 million words of written American English (Time Magazine, 1923–2006, at http://corpus.byu.edu/time) was investigated for the occurrence and diachronic distribution of various types of ‘pure’ idioms such as be raining cats and dogs. Idioms from the Collins COBUILD Dictionary of Idioms (2002 (1995)) were selected for four types of variation and change. Group 1, the 46 idioms labeled ‘old-fashioned’, proved to be noticeably more common before 1970. Group 2, several constructions of the type as scarce as X, exhibited considerably more variation than in more diversified corpora such as the British National Corpus. Group 3, Biblically-derived idioms, were generally less common after 1960, but with the lowest frequencies in the 1930s. The frequencies for the final group, 32 idioms focusing on deception, were relatively constant from the 1950s on, with an interesting dip in the 1970s. Changes in editorial policies may possibly have influenced these results. While not of sufficient magnitude for detailed studies of individual items over time, the Time corpus clearly is sufficient to provide us with a great deal of data and numerous valuable insights into the use of these idioms.

Place, publisher, year, edition, pages
Stockholm: Acta Universitatis Stockholmiensis, 2008
Series
Stockholm studies in English, ISSN 0346-6272 ; 103
Keyword
idiom, corpus, language change, variation, American English
National Category
Specific Languages
Research subject
English
Identifiers
urn:nbn:se:su:diva-106876 (URN)
Available from: 2014-08-26 Created: 2014-08-26 Last updated: 2014-08-27 Bibliographically approved

By author/editor
Minugh, David