British Academic Spoken English corpus
| Title | British Academic Spoken English corpus |
| Author | |
| Availability | Access to this resource is restricted; you must apply for approval to obtain a copy. |
| Languages | English |
| Editorial Practice | Encoding format: TEI XML |
| OTA keywords | Linguistic corpora Corpus |
| LC keywords | |
| Extent | |
| Creation Date | The original recordings were made between 1999 and 2005. During this period, recordings were stored on computer and converted to MP3 format. The transcription work was conducted between 2000 and 2006, and the mark-up was added between 2003 and 2006. |
| Source Description | |
| Notes |
The BASE corpus consists of 160 lectures and 39 seminars recorded in a variety of university departments. Holdings are distributed across four broad disciplinary groups, each represented by 40 lectures and 10 seminars: Arts and Humanities, Social Studies and Sciences, Physical Sciences, and Life and Medical Sciences. The lectures and seminars have been transcribed and annotated using a system devised in accordance with the TEI Guidelines. A DTD file, 'base.dtd', must be kept in the same folder as the corpus files. The transcription and mark-up conventions are described in the 'BASE manual' document (PDF), and the holdings are described in the Excel spreadsheet 'BASE corpus holdings.xls'. The entire corpus contains 1.6 million tokens, and the files hold the transcripts of nearly 200 hours of recording.
Nesi, H. and H. Basturkmen (2006) 'Lexical bundles and discourse signalling in academic lectures'. International Journal of Corpus Linguistics 11(3), 147-168.
Thompson, P. (2006) 'A corpus perspective on the lexis of lectures, with a focus on Economics lectures'. In K. Hyland and M. Bondi (eds) Academic Discourse Across Disciplines. Bern: Peter Lang, pp. 253-270.
Nesi, H. (2002) 'An English spoken academic word list'. In A. Braasch and C. Povlsen (eds) Proceedings of the Tenth EURALEX International Congress. Copenhagen: Center for Sprogteknologi.
Nesi, H. (2001) 'A corpus based analysis of academic lectures across disciplines'. In J. Cotterill and A. Ife (eds) Language Across Boundaries. London: Continuum Press.
Also available at: http://www2.warwick.ac.uk/fac/soc/celte/research/base/ |
