Common English words

Vivian Cook

A crucial aspect of much study of words is their frequency: how often they occur. Often the question is not whether a word exists but how often it is used; immunosurveillance may be in the OED, but it does not occur once in the BNC. Modern computers have made establishing word frequency quite easy, as this example shows. The technique is to assemble a ‘corpus’ of English texts from books and other sources, now usually running to hundreds of millions of words, and then to search it for occurrences of a word or phrase with a program called a concordancer, which usually shows a span of words before and after the target word. Corpora range from the British National Corpus (BNC) to the Collins Birmingham University International Language Database (COBUILD), but it is easy to construct your own and to search it with an easily obtainable concordancer such as WordSmith. Even Google can be used: feed in immunosurveillance and it lists 58,900 pages; feed in phone and it lists 933 million. Of course this counts only pages, not the words themselves, as a word may be used many times on a single page.
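The counting and concordancing described above can be sketched in a few lines of Python. This is only a minimal illustration, not how WordSmith or the BNC tools actually work: the tiny sample text, the crude regular-expression tokeniser and the function names tokenize, word_frequencies and concordance are all assumptions made for the example.

```python
import re
from collections import Counter

# A tiny stand-in for a corpus; a real one would run to millions of words.
SAMPLE_TEXT = (
    "The cat sat on the mat. The dog saw the cat, "
    "and the cat saw the dog."
)


def tokenize(text):
    """Lower-case the text and split it into word tokens (a crude tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each word form occurs in the token list."""
    return Counter(tokens)


def concordance(tokens, target, span=2):
    """Return each occurrence of target with up to `span` words either side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, i - span):i])
            right = " ".join(tokens[i + 1:i + 1 + span])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


tokens = tokenize(SAMPLE_TEXT)
frequencies = word_frequencies(tokens)

# The most frequent word forms, commonest first.
print(frequencies.most_common(3))

# Every occurrence of the target word in its immediate context.
for line in concordance(tokens, "cat"):
    print(line)
```

Even in this toy text the commonest word is the, and the concordance lines show the target word with its neighbours, which is what a real concordancer does on a far larger scale.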

One interesting thing is that there is very little difference between different sources over which words are most frequent. The following lists compare the most frequent words from the BNC, a wide-ranging source, from the writing of seven-year-old children, from the narrative parts of Jane Austen’s novels and from Japanese learners of English.

 

Rank   BNC    7-year-olds’ writing    Jane Austen    Japanese learners
1.     the    and                     the            I
2.     of     the                     to             to
3.     and    a                       and            the
4.     a      I                       of             you
5.     in     to                      a              and
6.     to     was                     her            a
7.     it     it                      I              my
8.     is     he                      was            in
9.     was    we                      in             it
10.    I      in                      it             for

As can be seen, there is very little difference between these lists: the is in the top three for all of them; and, a, to, in, I and it are in all the lists, and was is in all but one. Whoever you are, whatever you are writing about, you are going to be using the same highly frequent words. Yet your own guess at the three commonest words in English probably did not consist entirely of structure words. Structure words like of and the glue the nouns and verbs of English together (see Content and Structure Words). The top 100 words are all structure words bar four – time, people, new and way.
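How far the four lists overlap can be checked mechanically. The sketch below simply copies the top-ten lists from the table above and counts in how many of them each word appears; the names TOP_TEN, list_counts, in_every_list and in_all_but_one are invented for the illustration.

```python
from collections import Counter

# Top-ten lists copied from the table above, in rank order.
TOP_TEN = {
    "BNC": ["the", "of", "and", "a", "in", "to", "it", "is", "was", "I"],
    "7-year-olds' writing": ["and", "the", "a", "I", "to", "was", "it", "he", "we", "in"],
    "Jane Austen": ["the", "to", "and", "of", "a", "her", "I", "was", "in", "it"],
    "Japanese learners": ["I", "to", "the", "you", "and", "a", "my", "in", "it", "for"],
}

# Count in how many of the lists each word appears (set() guards against repeats).
list_counts = Counter(word for words in TOP_TEN.values() for word in set(words))

# Words shared by every list, and words missing from exactly one list.
in_every_list = sorted(w for w, n in list_counts.items() if n == len(TOP_TEN))
in_all_but_one = sorted(w for w, n in list_counts.items() if n == len(TOP_TEN) - 1)

print("In every list:", in_every_list)
print("In all but one:", in_all_but_one)
```

The same approach would work for longer lists, such as the top 100 words from each source, provided the lists are drawn up in the same way.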

Here are the most common content words from the BNC:

 

Rank   Nouns         Verbs    Adjectives
1.     time          say      new
2.     people        know     good
3.     way           get      old
4.     year          go       different
5.     government    see      local
6.     day           make     small
7.     man           think    great
8.     world         take     social
9.     work          come     important
10.    life          use      national

Again it is unlikely that you had all of these right. Off-the-cuff guesses might include man and day, but who would have guessed government and world? These frequent words are quite different from the vocabulary used, say, in teaching English to non-native speakers, which tends to start from concrete, visualisable words like train and banana rather than abstractions like year or work.

Main source: British National Corpus (BNC)


Formerly at homepage.ntlworld.com/vivian.c, which is defunct