The Hong Kong South China Morning Post Corpus


Benson, Phil (ed.)


Use of this resource is restricted in some manner. Usually this means that it is available for non-commercial use only with prior permission of the depositor and on condition that this header is included in its entirety with any copy distributed.



Editorial Practice

Encoding format: COCOA

OTA keywords

Linguistic corpora

LC keywords

Linguistic analysis (Linguistics)
Newspapers -- Language
Chinese newspapers -- China -- 20th century

  • Designation: Text data
  • Size: 41 files, ca. 7 MB

Source Description

The Hong Kong South China Morning Post Corpus. Beczak, Thaddeus. South China Morning Post. Hong Kong: 1993.

Note: The Hong Kong South China Morning Post Corpus consists of 2874 Hong Kong and China news reports originally published in the South China Morning Post, Hong Kong's leading English-language daily newspaper by circulation. The reports were published between February 1992 and March 1992. In total, the corpus contains more than 1 million running words. The reports in the corpus are not a complete set of items for this period, and they are not listed in any particular order in the files. The corpus has been produced solely as a large sample of text for linguistic analysis.


