Thursday, 12 June 2008

250,00,00,000 Words in the English Language!


How many words are there in the English language? Ask this simple question and you will surely get a variety of answers. A recent survey says that there are 250,00,00,000 (2.5 billion) words! All these words are found in a corpus.

What is a corpus?

A corpus is a collection of texts of written (or spoken) language presented in electronic form. It provides the evidence of how language is used in real situations, from which lexicographers can write accurate and meaningful dictionary entries.
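To make the idea concrete, here is a minimal sketch of how word counts can be drawn from a corpus. The three sentences, the `tokenize` helper and its simple regular expression are all made up for illustration; they are not how any of the corpora named here are built. The sketch also shows that the size of a corpus counts running words (every occurrence), which is much larger than the number of distinct words it contains.

```python
import re
from collections import Counter

# A made-up, three-sentence "corpus" (assumption: for illustration only;
# real corpora hold many millions of words drawn from many texts).
corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "A cat and a dog met on the mat.",
]


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Flatten every text in the corpus into one list of word tokens.
tokens = [word for text in corpus for word in tokenize(text)]

# Count how often each distinct word occurs.
frequencies = Counter(tokens)

running_words = len(tokens)        # every occurrence counts
distinct_words = len(frequencies)  # each different word counts once

print(running_words)               # 21
print(distinct_words)              # 10
print(frequencies.most_common(2))  # [('the', 5), ('on', 3)]
```

Even in this toy example the 21 running words reduce to only 10 distinct words, and frequency counts of this kind are one sort of evidence a lexicographer can take from a corpus.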

The Oxford English Corpus

The Oxford English Corpus is at the heart of dictionary-making in Oxford in the 21st century and ensures that we can track and record the very latest developments in language today. The Oxford English Corpus is central to the process and to Oxford’s £35 million research programme - the largest language research programme in the world.
The Oxford English Corpus is a text corpus of the English language used by the makers of the Oxford English Dictionary and by Oxford University Press’s language research programme. It is the largest corpus of its kind, containing over two billion words.

Brigham Young University Corpus of American English

The freely-available 360+ million word BYU Corpus of American English is the only large corpus of American English currently available, and the only publicly-available corpus of American English to contain a wide array of texts from a number of genres. In addition, since new texts will be added at least two times each year (20 million new words each year), it will serve as a unique linguistic history of American English since 1990.


COBUILD Corpus

COBUILD (Collins Birmingham University International Language Database) is a British research facility set up at the University of Birmingham in 1980 and funded by Collins publishers. The Bank of English is the name of the COBUILD corpus, a collection of English texts. The corpus totals 525 million words.

The British National Corpus

The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources.

Longman Learners’ Corpus

Longman Learners’ Corpus is a 10-million-word computerized database made up entirely of language written by students of English.

1 comment:

"கருவெளி" said...

It's really amazing to know that the English language has this many words. Thank you once again for introducing new things (like the corpus) to us.