Resources

Tools for Corpus Studies

• AntConc: Corpus analysis toolkit. Keywords: wordlists, concordancer, keywords. Platforms: Linux, Mac.
• CLAN: A tool for searching and analyzing child language data in the CHAT transcription format. Keywords: search, wordlists, collocation, child language, CHILDES.
• Corpus Presenter: Tree tagger and corpus analysis software. Keywords: wordlists, parsing, concordancer, visualization. Platforms: Windows.
• Corpus tools: An R package for managing, querying, and analyzing texts. Keywords: text analysis.
• ELAN: Transcription and annotation of sound or video files. Keywords: transcription, annotation. Platforms: Linux, Mac, Windows.
• English Grammar Profiler: A CEFR grammar profiler for ESL/EFL. Keywords: grammar, parsing, CEFR, ESL, EFL.
• LEXA: A complex lemmatizer. Keywords: lexis, lemmatizer.
• NVIVO: Commercial computer-assisted qualitative data analysis software.

Useful Websites for Corpus Studies
• https://corpus-analysis.com/

• ACL The Association for Computational Linguistics.
• Michael Barlow's Corpus Linguistics Page
• A comprehensive site with information on Corpus Linguistics and many good links.
• Concordances and Corpora Tutorial. By C. Ball.
• CTI: The Computers in Teaching Initiative Centre for Textual Studies.
• English Language Corpora and Corpus resources
• General list found at the BNC site.
• ICAME The International Computer Archive of Modern English
• collects and distributes "information on English language material available for computer processing and on linguistic research completed or in progress on the material". Holds an archive of English text corpora in machine-readable form.
• LDC The Linguistic Data Consortium (University of Pennsylvania)
• "creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes".
• Annotated list of resources on statistical natural language processing and corpus-based computational linguistics. By Christopher Manning.
• Systematic Dictionary of Corpus Linguistics
• "an attempt to group, systemize, define and explain the basic English terms in Corpus linguistics and relative fields" 

Project Sites

• AMALGAM
• Automatic Mapping among Lexico-Grammatical Annotation Models (Leeds).
• Bank of English
• The Canterbury Tales Project
• CETH
• Center for Electronic Texts in the Humanities
• ETAP
• Swedish project creating and annotating a parallel corpus for the recognition of translation equivalents
• Gutenberg Project Home Page.
• ICE - International Corpus of English Project
• MULTEXT
• Multilingual Text Tools and Corpora.
• PEDANT
• Parallel Texts in Göteborg.
• TELRI
• Funded by the European Commission to create a network and supply resources for use within NLP.
• Textcorpora und Erschliessungswerkzeuge (University of Stuttgart).
• "Textual Corpora and Tools for their exploration".

Research centers/groups

• CCALAS
• Centre for Computer Analysis of Language and Speech (Leeds).
• CECL The Center for English Corpus Linguistics (Louvain, Belgium). Research on computer learner corpora.
• Cobuild/Bank of English.
• CONTRAST (Contrastive linguistic studies and translation).
• Projects "with one common denominator: the use of bilingual text corpora as empirical material".
• CSLU
• Center for Spoken Language Understanding, Oregon, USA.
• ELSNET
• European Network in Language and Speech.
• TOSCA Research Group
• (University of Nijmegen).
• UCREL
• University Center for Computer Corpus Research on Language.

© Copyright 2023 Minhaj University Lahore - All Rights Reserved