The Lancaster Corpus of Mandarin Chinese

Title

The Lancaster Corpus of Mandarin Chinese [Electronic resource]

Editor McEnery, A.M. (ed.); Xiao, Richard (ed.)
Availability This resource is freely available for download.
Languages

Chinese

Editorial Practice

Encoding format: XML

OTA keywords Linguistic corpora
Corpus
LC keywords

Componential analysis (Linguistics)
Linguistic analysis (Linguistics)
Chinese language--Modern Chinese, 1919-

Extent
  • designation: Text data
  • size: 30 files : ca. 42.8 MB
Creation Date 2004
Notes

Mode of access: Online. OTA website

Title proper taken from AHDS Catalogue Form

The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora of modern British and American English. It is suitable both for monolingual research into modern Mandarin Chinese and for cross-linguistic contrast of Chinese with British and American English. The corpus samples 15 written text categories, including news, literary texts, academic prose and official documents, published in the People's Republic of China in the early 1990s. LCMC uses the same sampling frame and sampling period as FLOB and FROWN. The corpus is encoded in Unicode (UTF-8) and marked up in XML.
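Because the files are UTF-8 XML, they can be read with any standard XML parser. The following is a minimal sketch in Python, not a description of the corpus's actual schema: the element and attribute names in the sample (text, s, w, POS) and the helper count_tokens are illustrative assumptions and should be checked against the downloaded files.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the tag and attribute names are assumptions
# made for illustration, not taken from the catalogue record.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<text ID="A" TYPE="news">
  <s n="0001"><w POS="r">我们</w><w POS="v">学习</w><w POS="n">汉语</w></s>
  <s n="0002"><w POS="r">他们</w><w POS="v">学习</w><w POS="n">英语</w></s>
</text>
""".encode("utf-8")


def count_tokens(source, token_tag="w"):
    """Count the text content of every <token_tag> element.

    `source` is a file path or a binary file object containing XML.
    """
    tree = ET.parse(source)
    counts = Counter()
    for elem in tree.getroot().iter(token_tag):
        if elem.text:
            counts[elem.text] += 1
    return counts


# Parse the in-memory sample; a real run would pass a path to a corpus file.
counts = count_tokens(io.BytesIO(SAMPLE))
for token, freq in counts.most_common():
    print(token, freq)
```

To use this on a downloaded file, pass its path to count_tokens and adjust token_tag to whatever element the files actually use for word tokens.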

www.ling.lancs.ac.uk/corplang/lcmc