Birkbeck spelling error corpus / Roger Mitton

  • A collection of computer-readable corpora of English spelling errors

Mitton, Roger, 1946- (ed.)


Distributed by the University of Oxford under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.




Editorial Practice

This is a legacy resource. It is likely to be difficult to use, and is now mainly of interest for the study of the history of text encoding or digital humanities.

Encoding format: Plain text

LC keywords

English language -- Orthography and spelling -- Glossaries, vocabularies, etc.
Anthologies -- Great Britain -- 20th century

  • designation: Text data
  • size: 38 files (ca. 1.8 MB in total)

Source Description

Birkbeck spelling error corpus. Mitton, Roger, 1946-. [London]: s.n., [1985].

Note: Contents:

Part One. Native Speakers.

  • CHES -- Misspellings of thirty words by about 200 ten-year-old children in England and Wales. These children were a random sample from the several thousand tested in the Child Health and Education Survey in 1980.
  • NFER -- Misspellings made in two short dictations by about 80 adult literacy students in England and Wales, a sample from a survey conducted by the National Foundation for Educational Research in 1978–79.
  • PERIN -- Three sets of material collected by Dr Dolores Perin in research conducted in the late 1970s: 1) A dictation and a piece of constrained writing from 36 secondary-school leavers and from six adult-literacy students; 2) Misspellings from free writing by six adult-literacy students; 3) A spelling test of forty words from about 170 students in London secondary schools.
  • PETERS -- Two samples of the material collected by Dr Margaret Peters in primary and secondary schools in Cambridge in the 1960s: 1) Misspellings from about 150 children in spelling tests and dictations at the ages of 9, 10 and 11, and in a spelling test and a piece of free writing at age 15 (i.e. each child at four different ages); 2) Misspellings from short compositions by over 900 fifteen-year-olds.
  • ASHFORD -- The complete text of ‘The Young Visiters’ (sic) by Daisy Ashford, a short novel written by a nine-year-old in Victorian England. The text contains many of her original misspellings.
  • FAWTHROP -- Two files supplied in computer-readable form by David Fawthrop, collected in the course of his research into spelling correction at Bradford University: 1) A compilation of four collections of American spelling errors, already in published form; 2) A collection of misspellings from the writing of three British people, all of whom considered themselves to be poor spellers.
  • GATES -- The most common misspellings of 3876 words taken from spelling tests given to schoolchildren in New York City in the 1930s. This material was taken from the book ‘A List of Spelling Difficulties in 3876 Words’ by Arthur I. Gates.
  • HOLBROOK -- Passages taken from the creative writing of about 20 children in a British secondary school in the 1960s, published (with their original misspellings) in ‘English for the Rejected’ by David Holbrook.
  • SHEFFIELD -- A list of about 380 misspellings, mostly keying errors, taken from typewritten or computer-terminal input, collected from staff and students in the Department of Information Studies of Sheffield University by Angell, Freund and Willett as part of a piece of research into spelling correction.
  • WING -- Misspellings (mainly handwriting slips) from essays written by 40 candidates in the Cambridge University Entrance Examinations in 1976. This corpus was collected and put into computer-readable form by Wing and Baddeley and is described in their paper ‘Spelling errors in handwriting: a corpus and a distributional analysis’.
  • MASTERS -- Misspellings of about 260 words made in spelling tests by 600 students in Iowa in the 1920s (200 8th graders, 200 high-school seniors and 200 college seniors), collected by H.V. Masters for his PhD research. This is the full set of data that he analysed, not just the examples he included in his published report.
  • UPWARD -- Misspellings taken from answers to a questionnaire completed by about 160 15-year-olds in Nottingham. The material was supplied by Chris Upward of the Dept of Modern Languages, Aston University.

Part Two. Non-native Speakers.

  • EXAMS -- Misspellings taken from scripts submitted in English examinations by overseas students in 18 countries, with 50 or 100 scripts from each country. The scripts were made available for this exercise by the University of Cambridge Local Examinations Syndicate and the University of London Schools Examinations Board.
  • ABO -- Misspellings collected from Finnish-speaking Finns and Swedish-speaking Finns in a series of tests conducted in the 1970s by Dr Rolf Palmberg and Dr Hakan Ringbom of the Abo Akademi, Finland.
  • APPLING -- Two files of data (APPLING1 and APPLING2) collected by students of the Applied Linguistics Department of Birkbeck College in the course of their work as teachers of English.
  • BLOOR -- Errors taken from a corpus of written English by 12 Algerian students. The file was supplied by Meriel Bloor of the Department of Modern Languages, Aston University.
  • SUOMI -- Errors taken from test papers written by 60 Finnish speakers and 45 Swedish speakers aged 15–16 years. The data formed part of a thesis submitted by Riitta Suomi at the Abo Akademi, Finland.
  • TELEMARK -- Errors taken from examination papers written by 145 advanced Norwegian students of English at Telemark College, Norway. They were recorded in a study by Dr Nils Rottingen.
  • TESDELL -- Errors from 56 students taking an English Language Proficiency Test at Iowa State University in 1981–82. The material was collected by Lee S. Tesdell.


