Frequencies for English Punctuation Marks
Vivian Cook
The figures below are based on a writing system corpus some 459 thousand words long. The corpus includes three novels of different types (276 thousand words), selections of articles from two newspapers (55 thousand), one bureaucratic report (94 thousand), and assorted academic papers on language topics (34 thousand). More information is in Cook, V.J. (2013), ‘Standard punctuation and the punctuation of the street’, in M. Pawlak and L. Aronin (eds.), Essential Topics in Applied Linguistics and Multilingualism, Springer International Publishing Switzerland, 267-290.
Score per 1000 running words |
Mark | Average |
. full stop | 65.3 |
, comma | 61.6 |
; semi-colon | 3.2 |
: colon | 3.4 |
! exclamation | 3.3 |
? question | 5.6 |
’ apostrophe/single quotation | 24.3 |
“ double quotation | 26.7 |
- hyphen | 15.3 |
TOTAL | 208.7 |
score = number of occurrences of the mark divided by total words in 1000s, with .05 rounded up |
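The score calculation described above can be sketched in Python. This is only an illustration under stated assumptions, not the procedure actually used for the corpus: the function name `scores_per_thousand` is hypothetical, "running words" are taken to be whitespace-separated tokens, and the set of marks counted is simply the one listed in the table. `Decimal` with `ROUND_HALF_UP` is used so that .05 is rounded up, which Python's built-in `round` does not guarantee.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP

# Assumption: the marks counted are those listed in the table above.
MARKS = ".,;:!?’“-"


def scores_per_thousand(text: str, marks: str = MARKS) -> dict[str, Decimal]:
    """Return occurrences of each mark per 1000 running words, to one decimal place."""
    # Assumption: a "running word" is any whitespace-separated token.
    n_words = len(text.split())
    if n_words == 0:
        raise ValueError("text contains no words")

    counts = Counter(ch for ch in text if ch in marks)
    thousands = Decimal(n_words) / Decimal(1000)

    return {
        mark: (Decimal(counts[mark]) / thousands).quantize(
            Decimal("0.1"), rounding=ROUND_HALF_UP  # .05 is rounded up
        )
        for mark in marks
    }


if __name__ == "__main__":
    sample = "one, two. " * 1000  # 2000 words, 1000 commas, 1000 full stops
    for mark, score in scores_per_thousand(sample).items():
        print(mark, score)
```

A different definition of "word" (for example, treating hyphenated compounds as two words) would change the denominator and therefore every score, so the tokenisation rule needs to be fixed before scores from different texts are compared.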
Punctuation Mark Percentages (Source: Meyer's (1987) analysis of the Brown Corpus) |
Commas | 47% |
Full stops | 45% |
Dashes | 2% |
Parentheses | 2% |
Semicolons | 2% |
Question marks | 1% |
Colons | 1% |
Exclamation marks | 1% |
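To set the per-1000-word scores from the first table beside percentages of this kind, each score can be divided by the total of all scores. The Python sketch below does that with the figures copied from the first table; the function name `percentage_shares` is hypothetical. The two tables cover different sets of marks (the first counts apostrophes, quotation marks and hyphens; the second counts dashes and parentheses), so the resulting shares are only a rough point of comparison.

```python
# Scores per 1000 running words, copied from the first table above.
SCORES = {
    "full stop": 65.3,
    "comma": 61.6,
    "semi-colon": 3.2,
    "colon": 3.4,
    "exclamation": 3.3,
    "question": 5.6,
    "apostrophe/single quotation": 24.3,
    "double quotation": 26.7,
    "hyphen": 15.3,
}


def percentage_shares(scores: dict[str, float]) -> dict[str, float]:
    """Convert per-1000-word scores into each mark's percentage of all marks counted."""
    total = sum(scores.values())
    return {name: 100 * value / total for name, value in scores.items()}


if __name__ == "__main__":
    for name, share in percentage_shares(SCORES).items():
        print(f"{name}: {share:.1f}%")
```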
Frequencies as Google N-grams
NB: commas are missing because Ngrams uses them as dividers.
Frequencies for comma distribution |
Elements in a series (words, phrases, clauses etc) | 20.3% |
Sentence-initial elements (words, phrases, clauses etc) | 20.2% |
Sentence-final elements (phrases, clauses) | 5.0% |
Non-restrictive phrases or clauses | 17.3% |
Appositives | 26.1% |
Interrupters | 6.6% |
Quotations | 4.5% |
Source: Bayraktar et al. (1998), based on the Wall Street Journal |
Google Ngram historical frequencies for English 1500-2000 AD
Charts: colon (:), double quotation (" "), question mark (?), exclamation mark (!), semi-colon (;)