Electronic language resources in Oxford

1. Information for Oxford Users

The Corpus Linguistics course is held each year in Hilary Term at OUCS, Thursdays 12:30 to 13:30 - more details here.

OUCS is helping to co-ordinate access to electronic language resources across the University of Oxford via a small working group with participation from several departments. Please get in touch if you want to get involved! We have termly meetings to discuss resources and projects, usually on the Friday of week 4. The next meeting will be held at OUCS on 17th May 2012, and the new Oxford-based BNCWeb service will be demonstrated.

2. Available online to Oxford users

Below is a list of some of the resources for which groups within Oxford have licences, and to which students and staff have access.

The University of Oxford holds Linguistic Data Consortium (LDC) licences for 2008, 2009, 2010 and 2013. Thanks to OUP, who paid for the 2009 licence in full for the University, to ComLab, who are paying for the 2010 licence, and to the Phonetics Laboratory for 2013. The following resources have been downloaded from the LDC and are now available online from IT Services for Oxford users. Consult the LDC catalogue for the full list of what is available, and if there is something there that you are interested in, get in touch with Martin Wynne (martin.wynne at oucs.ox.ac.uk). Please note that you are bound by the terms and conditions of the user agreements associated with each of these resources, which can be found on the LDC website.

Coming soon:

  • LDC2008T18 New York Times Annotated Corpus (3.3 Gb)

3. Other corpus resources in Oxford

We are also assembling a list of corpora, copies of which may be available in Oxford, but under a variety of different licensing and access arrangements. Please get in touch to add to the list. For these resources, contact Martin Wynne unless otherwise stated.

  • BNC XML version, BNC Baby (sampler on one CD)
  • Corpus of Spoken Dutch
  • IPI-PAN corpus of Polish
  • COLT Corpus of London Teenagers' Speech
  • Gesprochenes Jiddisch Textzeugen einer Europäisch-jüdischen Kultur
  • ICAME corpus collection
  • East meets West: a compendium of multilingual resources (the TELRI CD, parallel aligned corpora in many European languages)

Other corpus linguistics resources:

5. Support services in Oxford

OUCS can also provide advice and consultancy to members of the University on issues relating to:

  • Discovering, creating and using digital literary and linguistic resources
  • Licensing issues relating to digital literary and linguistic resources
  • Connecting your project or resources with national and international infrastructures
  • Writing the Technical Appendix of an AHRC Research Grant application
  • Planning a digital project in the humanities
  • Digital preservation of research data
  • Electronic Text Encoding
  • Corpus linguistics

See the research support pages at OUCS for more information.

Oxford University Library Services can advise on many aspects of discovery and use of linguistic resources.