The Lancaster Corpus of Mandarin Chinese

Title

The Lancaster Corpus of Mandarin Chinese [Electronic resource]

Editor

McEnery, A.M. (ed.); Xiao, Richard (ed.)

Availability

Distributed by the University of Oxford under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Download: zip

Languages

Mandarin Chinese

Editorial Practice

Encoding format: XML

OTA keywords

Linguistic corpora
Corpus

LC keywords

Componential analysis (Linguistics)
Linguistic analysis (Linguistics)
Chinese language--Modern Chinese, 1919-

Extent

  • designation: Text data
  • size: 30 files : ca. 42.8 MB

Creation Date

2004

Source Description

no source record

Notes

Title proper taken from AHDS Catalogue Form

The Lancaster Corpus of Mandarin Chinese (LCMC) is designed as a Chinese match for the FLOB and FROWN corpora of modern British and American English. It is suitable both for monolingual research into modern Mandarin Chinese and for cross-linguistic contrast of Chinese with British and American English. The corpus samples 15 written text categories, including news, literary texts, academic prose and official documents, published in the P.R. China in the early 1990s. LCMC uses the same sampling frame and sampling period as FLOB and FROWN. The corpus is encoded in Unicode (UTF-8) and marked up in XML.
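
Because the corpus is UTF-8 encoded XML, a first look at a file can be taken with a standard XML parser. The following is a minimal sketch in Python, not part of the corpus distribution: it counts how often each element name occurs in one document without assuming anything about the LCMC markup scheme. The function name `count_tags` and the miniature sample document, including its element names, are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_bytes: bytes) -> Counter:
    """Count how often each element name occurs in one XML document."""
    # Parsing from bytes lets the parser handle the encoding itself;
    # without an XML declaration it assumes UTF-8.
    root = ET.fromstring(xml_bytes)
    # root.iter() walks the root element and all of its descendants.
    return Counter(elem.tag for elem in root.iter())


# Hypothetical miniature document; these element names are made up
# for the example and are NOT taken from the LCMC files.
sample = "<text><p><s><w>语料库</w><w>研究</w></s></p></text>".encode("utf-8")

tag_counts = count_tags(sample)
for tag, n in sorted(tag_counts.items()):
    print(tag, n)
```

For the downloaded archive, the same function could be applied to the bytes of each file, for example as read with Python's standard `zipfile` module. The names of the files inside the archive are not stated in this record.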

www.ling.lancs.ac.uk/corplang/lcmc