Background to the English Writing System
Vivian Cook

My background chapter for V. Cook & D. Ryan (eds.) (2016) The Routledge Handbook of the English Writing System

... This chapter introduces some of the background issues in the study of the English writing system. It is intended as a brief overview, looking at some of the basic ideas that contributors assume readers are familiar with. In addition it draws attention to some aspects which have not been treated at length in our contributions, such as the material nature of writing.

General Issues of Writing Systems and English

The relationship of spoken and written language

Linguists have typically maintained that written language is a representation of spoken language, seen in such classic statements as de Saussure’s 'Language and writing are two distinct systems of signs; the second exists for the sole purpose of representing the first' (de Saussure 1916, trans. Baskin 1960: 23). If writing is completely dependent on speech, there is little reason for it to receive attention in its own right.

From its origins, written language has, however, been far more than speech written down. According to Samuel Butler (1908):

The written symbol extends infinitely, as regards time and space, the range within which one mind can communicate with another; it gives the writer's mind a life limited by the duration of ink, paper, and readers, as against that of his flesh and blood body.

While linguists such as Daniels (1996) still emphasise the primacy of speech, many today increasingly claim that speech and writing exist largely in their own right rather than one being subordinated to the other: ‘The sound system and the writing system are the two modes of expression by which the lexicogrammar of a language is represented’ (Halliday and Matthiessen 2013: 7). Speech can indeed be written down; writing can be read aloud. But speech is only written down for specialised purposes, like court transcripts; writing is read aloud in a few limited situations, like news-reading from teleprompters. The writing system is an alternative form of representation for the language, not the language itself, and is not subservient to the phonology, though related to it. The important relationship is between writing and language, not between writing and speech.

Crucial differences between speech and writing arise from their dissimilar engagement with time. Unless recorded, speech is gone the moment after it is said; writing is available effectively until it physically disintegrates or disappears. Among other things, writing is an external physical memory system for recording and for planning - to avoid the vagaries and limitations of our internal memory systems, exemplified par excellence in diaries and desktop calendars. As St Augustine (397, II, 5) put it, ‘because words pass away as soon as they strike upon the air, and last no longer than their sound, men have by means of letters formed signs of words’.

Types of writing system

According to Perfetti (1999: 168), a writing system ‘determines in a general way how written units connect with units of language’. The same language can be represented by more than one writing system: English can be represented in Braille, shorthand or written Morse code. The complementary terms to ‘writing system’ are ‘script’ and ‘orthography’. A script is the actual physical symbols of the writing system, for instance Roman or Cyrillic alphabets; an orthography is the rules for using a script in a particular writing system, that is to say how the symbols spell out words etc. However these terms vary considerably in meaning between writers. Sproat (2000: 25) for example uses ‘the terms “orthography” and “writing system” interchangeably’.

The classification of writing systems has a long and chequered history; for recent accounts see Borgwaldt and Joyce (2013) and Rogers (2005). The main issue is how to reconcile the two levels of language that written symbols correspond to: on the one hand, items in the lexicon, whether words or morphemes, in what are called morphographic or meaning-based systems; on the other hand, sounds in the phonology, whether syllables (Japanese kana), phonemes (Italian) or consonant phonemes only (Arabic), in what are called phonographic or sound-based systems.

Lexical correspondences for individual symbols occur only on the fringes of the English system. On a standard keyboard, < £ $ % & @ # > all have particular meanings corresponding to English words or phrases: pound, dollar, per cent, ampersand, at, hash. (Conventionally angle brackets < > are used to enclose written forms, just as strokes / / and square brackets [ ] enclose spoken forms, though this usage is by no means standard to all writing system researchers and does not suit all the chapters in this volume.) None of these symbols corresponds directly to the pronunciation: you either know that < & > corresponds to ampersand /æmpəsænd/ or you don’t. As we will see, English also involves lexical correspondences for individual words such as yacht and of, which cannot be handled as straightforward phonological correspondences.

English text messaging provides some examples of phonological correspondence at the syllable level: in C u 4 t, the symbols <C>, <u>, <4> and <t> correspond to spoken syllables rather than individual sounds. However, most phonological correspondences in English connect letters or combinations of letters to distinctive sounds of the language, i.e. phonemes.

In the seminal paper by Katz and Frost (1992: 71), a writing system in which each symbol corresponds to a particular sound of the language, and, vice versa, each sound corresponds to a symbol, is called ‘transparent’ or ‘shallow’. In transparent systems such as Turkish, contrasted with English in literacy education in Chapter 16, every letter can be read aloud and every sound can be spelled according to a set of one-to-one correspondences between letters and sounds. Transparency is relative rather than absolute - Turkish is more transparent than English - and transparency can be estimated in various ways (Neef and Balestra 2013).

Full transparency implies that the relationship between sounds and letters is isomorphic: one letter one sound, one sound one letter. English is a long way from meeting this ideal; as William Salesbury already pointed out in 1547, ‘You cannot fail to know that in English they do not read and pronounce every word literally and fully as it is written’. Two letters may correspond to a single sound, <sh> to /ʃ/ in sharp; three letters to one sound, <pph> to /f/ in sapphire; or a single letter to two sounds, <x> to /ks/ in tax. Some so-called ‘silent’ letters have no direct link to sounds in particular words, say the <h> in hour or the <l> in salmon compared to those in houri /h/ and salmonella /sælmənelə/.

Conversely a writing system ‘in which the letter-phoneme relation is substantially equivocal’ is called ‘opaque’ or ‘deep’ (Katz and Frost 1992: 71). It would be impossible to work out the pronunciation of the following words through correspondence rules: hour /aʊə/, Leicester /lestə/, lieutenant (in British English /leftenənt/), ptarmigan /tɑːmɪgən/, colonel /kɜːnl/, reveille /rɪvælɪ/ and hiccough /hɪkʌp/ (though, since 1950, more likely to be spelled hiccup, according to Ngram Viewer).

Rather than one-to-one correspondences, English has many alternatives in both directions. The single vowel letter <a> for example corresponds to at least five different English sounds: /æ/ bad, /ɑː/ bath, /ə/ about, /e/ many, /ɒ/ cauliflower. In reverse the spoken diphthong /eɪ/ corresponds to twelve different spellings: <a> lake, <au> gauge, <et> ballet, <ai> aid, <er> foyer, <ay> stay, <é> café, <ea> steak, <eigh> weigh, <ée> matinée, <ae> sundae, <ey> they.

While the English writing system is mainly phonological, it is far from transparent. If the goal of a writing system is to represent the sheer sounds of the language as faithfully as possible then the English system is highly inefficient. But, if written language represents language rather than speech, English may be representing other levels of language than the sounds. The <g>s in sign and malign for instance seem redundant, unnecessary ‘silent’ letters - if we are looking for a direct sound correspondence. But these <g>s connect to families of words in which the <g> is not silent, such as signify and malignant. Without the <g>, these underlying links in our minds would not have been activated, supporting the concept of spelling representing something deeper than surface phonology, one of the threads that runs through the chapters of this book.

Dual routes in reading

So far the English writing system has been treated in terms of language as ‘an abstract external entity’, as described in grammars and dictionaries (Cook 2010): ‘the English language’. But the writing system is also part of language in the senses of ‘the possession of a community’ and of ‘the knowledge in the mind of an individual’, in which the writing system is not an external entity but an internal system in the mind of the user. Using the writing system involves processing written and spoken information and relying on memory processes to retain the information for shorter or longer periods.

The continuum between lexical and phonological writing systems outlined above parallels two ways of processing the text, visually for meaning and phonologically for sound: ‘Reading theorists have reached unanimity concerning the existence in the human reading system of two separate procedures for reading aloud - that is, dual routes from print to speech’ (Coltheart 2005: 23). Different models have now been developed within this broad dual-route architecture, including connectionist models such as Seidenberg and McClelland (1989) and the Dual Route Cascaded Model (Coltheart et al 2001).

Following the lexical route means seeing the word blossom, recognising it visually as a whole word <blossom>, finding its pronunciation /blɒsəm/ in a mental lexicon along with its meaning and then saying it aloud. Following the phonological route means seeing a word such as blossom, recognising the letters <b l o s s o m>, converting them into the corresponding phonemes /blɒsəm/, and then saying them aloud: the meaning is available by matching in the mental lexicon but is not necessary for reading aloud.

The two routes thus involve alternative ways of processing text. Following the lexical route, a reader can recognise the word <through> and look it up in their mental dictionary as a whole to retrieve the meaning without knowing its pronunciation. The phonological route, however, involves ‘assembling phonology from a word’s component letters’ (Katz and Frost 1992: 71). A reader using the phonological route can recognise the letters of salad as <salad> and apply the correspondence rules to get /sæləd/, without knowing its meaning.
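The contrast between the two routes can be sketched as a toy program. This is only an illustration under stated assumptions, not one of the published dual-route models: the two-word lexicon, the small rule table and both function names are hypothetical, and the transcriptions are simplified Southern British values.

```python
# Toy dual-route reader. LEXICON, RULES and both function names are
# illustrative assumptions, not part of any published model.

# Lexical route: whole written words stored with pronunciation and meaning.
LEXICON = {
    "colonel": ("kɜːnl", "army officer"),
    "blossom": ("blɒsəm", "flower of a tree"),
}

# Phonological route: a few grapheme-to-phoneme correspondences.
RULES = {"ss": "s", "sh": "ʃ", "b": "b", "l": "l", "o": "ɒ", "s": "s",
         "m": "m", "a": "æ", "r": "r", "t": "t", "d": "d"}


def lexical_route(word):
    """Look the whole word up; returns (pronunciation, meaning) or None."""
    return LEXICON.get(word)


def phonological_route(word):
    """Assemble a pronunciation from letters; no meaning is involved."""
    phonemes = []
    i = 0
    while i < len(word):
        # Prefer a two-letter grapheme such as <sh> over a single letter.
        for size in (2, 1):
            chunk = word[i:i + size]
            if len(chunk) == size and chunk in RULES:
                phonemes.append(RULES[chunk])
                i += size
                break
        else:
            raise ValueError(f"no correspondence rule for {word[i]!r}")
    return "".join(phonemes)
```

In this sketch a non-word such as shart gets a pronunciation from the rules but nothing from the lexicon, whereas colonel is retrieved only as a stored whole. The rule table is far too crude for real use: it gives salad a full second vowel, for instance, because it knows nothing about stress.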

Both routes are used by readers of English, some words being processed entirely through the phonological route, some through the lexical. The advantage of the phonological route is that it can apply to unknown words. English newsreaders demonstrate this when they read foreign politicians’ names aloud, say Ukraine's energy minister, Volodymyr Demchyshyn, China’s president, Xi Jinping, or the Zulu king, Goodwill Zwelithini kaBhekuzulu. Any reader has the ability to deal with non-words, whether conforming to English spelling, like broave or shart, or not, like *qish or *rawh. Companies are confident that people can pronounce their non-English names, Aviva, Skandia, AXA, Ansva and Kwelm, to take just insurance groups.

Some English words like colonel and Wednesday are necessarily processed through the lexical route since their spoken correspondences are virtually unique, namely /kɜːnl/ and /wenzdi/. The spellings of these words have to be learned individually, not through letter to sound correspondence rules. Seidenberg (1992) claims that the lexical route is used for the most frequent 200 words of English, dense with function words like the, where and of - the only word in which <f> corresponds to /v/.

Those who believe that the only valid route is phonological exclaim at the sheer number of exceptions to correspondence rules. Those who believe that English readers use both routes regard these exceptions as separate lexical entries, not dauntingly large in number compared to the 1945 characters Japanese children have to learn at primary school or the 40,000 or so in a dictionary of ‘traditional’ Chinese. Methods for teaching reading tend to favour one route or the other, whether the ‘look and say’ method’s emphasis on the lexical route or the ‘phonics’ method’s concentration on the phonological route. Chapters 16 and 26 both draw attention to the need for children to be taught both routes. Different forms of dyslexia are indeed associated with one or the other route; some children who have initially mastered the phonological route find it impossible to go on to the lexical route, vital for efficient silent reading (Frith 1985). Aphasia too can affect either route (Funnell 1983).

The phoneme and the writing system

Most writing research has discussed phonology in terms of phonemes - the minimum sound units that distinguish one word from another, say /tent/ tent distinguished from /dent/ dent by the /t~d/ contrast, or red /red/ from rod /rɒd/ by the /e~ɒ/ contrast. In this view, speech consists of strings of discrete contrasting phonemes, rather than being a continuous stream of sound: /skɪm/ is a temporal sequence of four phonemes /s/, /k/, /ɪ/ and /m/, parallel to the visual sequence of four letters in <skim>: both speech and writing can be chopped up into discrete contrasting segments that occur one after the other, whether phonemes or letters.

Treating speech as a string of phonemes does not, however, account for the discontinuous elements that occur in speech or writing, called ‘split digraphs’ in Brooks (2015: 6). The difference in pronunciation between <note> /nəʊt/ and <not> /nɒt/ is shown by the final <e>, which has no direct sound correspondence; <o_e> can be analysed as a single unit split by an intervening consonant <t>, with the correspondence /əʊ/, as Albrow (1972) argues, and the same applies to <a_e> <hat/hate>, <e_e> <gen/gene>, <i_e> <writ/write>, and <u_e> <cut/cute>, to which Brooks (2015: 432) adds <y_e> <byte>. This is sometimes called ‘The fairy e rule’ - ‘Fairy E waves her wand and makes the vowel in front say its name’, an often used rule of thumb in primary schools - or, as Hart (1569: 33) perhaps first put it, ‘for the quantitie of the preceeding vowell’.
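The written side of the split digraph can be picked out mechanically. The following is a minimal sketch, assuming the pattern is simply a vowel letter, one consonant letter and a word-final <e>; the pattern and the function name are hypothetical, not taken from Brooks or Albrow.

```python
import re

# Hypothetical helper: finds a word-final split digraph of the form
# vowel letter + one consonant letter + <e>, as in <note> or <byte>.
SPLIT_DIGRAPH = re.compile(r"([aeiouy])([b-df-hj-np-tv-xz])e$")


def split_digraph(word):
    """Return e.g. 'o_e' for 'note', or None when the pattern is absent."""
    match = SPLIT_DIGRAPH.search(word.lower())
    if match is None:
        return None
    return f"{match.group(1)}_e"
```

Called on note, hate and byte this returns 'o_e', 'a_e' and 'y_e', and on not it returns None. Being purely orthographic, it also matches words such as have, where the final <e> does not signal the ‘long’ vowel, so it identifies candidates rather than pronunciations.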

The differences between phonemes come down to the distinctive features that make them up, such as voice, which distinguishes voiced /d/ dent from unvoiced /t/ tent, or continuant, which inter alia distinguishes plosive /t/ tin from fricative /θ/ thin, or lateral, which distinguishes /l/ lip from /r/ rip. ‘These distinctive features occur in lumps or bundles, each of which we call a phoneme’ (Bloomfield 1933: 79); a table of distinctive features for English consonants is given in Chapter 4. Writing systems research cannot confine itself to the phoneme, indispensable as the term may be as an overall label.

The Properties of the English Writing System

Any writing system has many elements. While most discussion of the English writing system concentrates on the sound-letter correspondences, these are only part of the whole system.


English text is normally read from left-to-right in rows from top-to-bottom of the page, unlike Arabic (right-to-left) or traditional Japanese (columns). In some circumstances, English is written in columns top-to-bottom, or sideways top-to-bottom. English books are read by turning pages from right-to-left, while the reverse happens in Japanese, traditionally written vertically; the sequence of reading speech balloons in English comics is left-to-right, in Japanese the opposite. English letters also face in a particular direction (Treiman and Kessler 2013): <b> is not <p> (top-bottom inversion) and <d> is not <b> (left-right inversion), a problem for some dyslexics.

Script and letters

English now uses a Roman alphabet of 26 letters, in lowercase and capitals, plus italic lowercase and capitals, called by Gill (1931) different alphabets. The differences between these forms of the alphabet form a useful resource for the English writing system: a capital letter may mark a grammatical difference between a proper name and a common noun <Hall> versus <hall>; italics may show emphasis <He’s the expert on glottochronology>; and so on. Historically the forms of the alphabet were used separately, i.e. for complete texts rather than combined for contrastive purposes, as described in Tschichold (1928: 79). The range of forms has been amplified comparatively recently by the addition of bold and small caps (Bringhurst 2005), all now available through word processing programs rather than restricted to type-setters. Chapter 25 discusses how letters are used in printed texts.

Capital letters are written within virtual squares, easily discernible in say <Newcastle> in the Times New Roman typeface or indeed <NEWCASTLE> in the Keys typeface. Printed English since Victorian times has been heavily influenced by the classical Roman letters carved in stone on inscriptions such as Trajan’s column in AD 114 rather than other forms of Roman letter (Gray 1960). The modern inscription in Fig 1.1 brings out their use of serifs - ‘The broadening of triangular forms at the terminals of letters’ (Hill 2010: 186), seen at the tips of the bottom and top strokes of <A> and <T> etc - and the varying width of the line, as seen in the <S> and <N>.

To many the capital letter is somehow the prototypical form of a letter: official forms demand to be ‘printed’ in capitals or ‘block’ capitals; amateur handwritten notices tend to be all capitals; modern text art chiefly uses capitals, as seen in Art and Text (Beech, Harris and Hill 2009), as do balloons in strip cartoons. Yet, if anything, capitals are harder to read than lowercase as their square shapes make them more difficult to differentiate. The letters for UK motorway signs for instance were based on research that demonstrated the most legible signs from a speeding car combined capitals and lowercase (Kinneir 1980).

The lowercase or ‘minuscule’ letters on the other hand have ascenders above the line as in <d> and <b> and descenders below the line as in <p> and <y>. They are derived from cursive letters written quickly by hand with brushes or quills rather than laboriously carved on stone with chisels. While capitals are sometimes called ‘big’ and lowercase ‘little’, size is relevant only to the few lowercase letters that have similar shapes to their capital versions, say <c/C  j/J  o/O  s/S  x/X>. The shapes of most capital/lowercase pairs are quite distinct, as in <a/A  h/H   q/Q  t/T>.

Italic letters have a distinctive slant < a l p v f g k n >; the italic letter < a > has a closed form compared to regular open <a>. Modern uses of italics are for emphasis <I do not believe in ectoplasm>, for stage directions in plays <Exit, pursued by a bear>, for citing book titles <The Decline and Fall of the Roman Empire>, and so on.

At the start of printing in Europe in the fifteenth century, printed letters were derived from handwritten forms; Chapter 25 describes the complex relationship between the two ever since. Printed texts are produced by machine in as many copies as are needed - they are reproducible; lettering is produced by the individual’s hand, usually as a single copy. According to the type designer Fred Smeijers (2011: 19), ‘There are just three kinds of letters: written, drawn or lettered’. Writing produced by a brush or pen is individually done by hand and is effectively unique; a signwriter may pride themselves that their work can be identified as theirs (Lewery 1989). Written and drawn letters are not created in standardised forms but can also be specially made for a unique occasion, as in stone monuments, handwritten letters, doctors’ prescriptions and a thousand more. While we are used now to the jargon of fonts and typefaces derived from print, these non-print letterforms need a different descriptive language, for example minuscule, cursive and ductus, ‘used of all aspects of the actual writing of letter forms’ (Roberts 2005: 7). However, the sheer adaptability of computers has to some extent blurred this distinction.

English letters over time

The English alphabet has remained substantially the same since the Old English of the 9th century, a handful of letters being gained, a handful lost, even if the older forms of the letters themselves may be hard for a reader to recognise; their history is documented in Chapters 6 and 8. Some Old English letters no longer exist in modern English, in particular <æ> (ash) seen in <fæder> (father), <ð> (eth) in <eorðan> (earth), and <þ> (thorn) in <þin> (thine). By Wycliff’s Bible in 1395, <ð> and <þ> had been replaced by <th> as in <erthe> and <thi>, and <ȝ> (yogh) had replaced some <g>s as in <forȝue> (forgive), supplanted by <gh> in <right> etc in the sixteenth century. Some letters that are now distinct were variants of the same letter in the early English alphabets. <heven> and <vs> are <heuen> and <us> in the 1611 version in the Book of Common Prayer. Only after the mid-seventeenth century are these pairs of letters firmly distinct.

A curious letter that came and went is the long <s>, i.e. <ſ>. The 1739 Book of Common Prayer has single <s> <againſt>, doubled <treſpaſſes> and an ungainly combination with <s> <treſpaſs>. By the end of the eighteenth century its day was past, as described in Chapter 8.

Graphemes and correspondence units

The most important unit for the phonological correspondence rules of English is not the individual letter as such but the letter or combination of letters that corresponds with a particular sound, and vice versa. This correspondence unit is often known as a ‘grapheme’, by analogy with the phoneme: a grapheme is ‘any minimal letter string used in correspondences’ (Carney 1994: xxvii). The term ‘allograph’ is sometimes used to refer to alternative forms of the same letter, by analogy with allophone; for example <g> and <g> are distinctive allographs of <g>; the human eye will accept a wide range of shapes as the same lowercase letter, say <>, to take some popular fonts. Sometimes allographs may become distinct graphemes, as with <J/I> and <U/V>.

The term ‘grapheme’ is, however, not unproblematic. Many reject it as forcing the writing system to be analysed in terms of phonology rather than independently. Venezky (1970) treats grapheme as a synonym for letter and prefers to call the unit for stating correspondences a ‘relational unit’; his list of consonant relational units is given in the box. Albrow (1972) prefers the term ‘orthographic symbol’. Many researchers produce exhaustive lists of English graphemes, Brooks (2015) having 89 in his main system, plus 195 others, illustrated in the box. Recent discussion of the grapheme can be found in Altmann (2008).


Relational Units for English (Venezky 1970)
Major Units: b c ch ck d dg f g gh h j k l m n p ph q r rh s sh t tch th u v x w wh xs y z
Minor Units: kh sch gn

English Graphemes (Brooks 2015: 255-257)

Main System: b bb c ce ch ci ck d dd dg dge ed f ff g ge gg h j k l le ll m mm n ng nn p ph pp q r rr s se sh si ss ssi t tch th ti tt v ve w wh x y z zz
Others: bh bd bp bt bu bv + 189 more

Above the phoneme comes the syllable. Chapter 4 analyses the spoken syllable in terms of two or more levels of structure: a syllable /bæg/ consists of an onset /b/ and a rime /æg/; the rime in turn subdivides into a nucleus /æ/ and a coda /g/. Children are believed to acquire the spelling of onset and rime separately, and then to separate the letters corresponding to the rime gradually: they first learn say bag as <b> and <ag> and then separate <ag> into <a> and <g> (Goswami 1999). Children often acquire a consonant cluster as a unit rather than as separate phonemes, spelling street as set and screams as sceem (Treiman and Kessler 2014).
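The two-step division of a written syllable can be illustrated with a short sketch. It assumes, simplistically, that the onset is everything before the first vowel letter and that the nucleus is the following run of vowel letters; the function names are hypothetical and the code works on letters, not on the spoken syllable analysed in Chapter 4.

```python
VOWELS = "aeiou"


def onset_and_rime(syllable):
    """Split a written syllable at its first vowel letter: bag -> ('b', 'ag')."""
    for i, letter in enumerate(syllable):
        if letter in VOWELS:
            return syllable[:i], syllable[i:]
    return syllable, ""  # no vowel letter found


def nucleus_and_coda(rime):
    """Split a rime after its run of vowel letters: ag -> ('a', 'g')."""
    i = 0
    while i < len(rime) and rime[i] in VOWELS:
        i += 1
    return rime[:i], rime[i:]
```

Applied to bag, the first function yields <b> and <ag>, and the second then divides <ag> into <a> and <g>, mirroring the order of acquisition described above.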

As an opaque writing system, English has complex rules for linking speech to writing and vice versa, as much of the rest of this book demonstrates. Carney (1994) for example details thirteen correspondence rules for the letter <g> with /g/ guide, /dʒ/ contagious, etc, and 41 correspondence rules for the letter <o> with /əʊ/ goat, /ɔː/ floor, etc. As the English language has never had a body to dictate what the forms of the language should be, unlike say l’Académie française for French, English spelling ‘rules’ are descriptions of how words behave in English. They emerge out of our writing because that is how English writing works, not because some authority tells us what to do. This does not prevent popular discussion taking the rules of spelling to be commandments engraved on stone by an unnameable, unchallengeable authority.

The principle of invariance

One implicit assumption about the modern English writing system is that a word is always spelled in the same way, regardless of its sound correspondence: scissors has to be spelled <scissors> not <sizerz>, even if the latter corresponds more accurately to its pronunciation /sɪzəz/. A written word is seen as fixed and unchanging. A limited dispensation from invariance is afforded to proper names, as in Vivian, Vyvyan, Vivien, and Vivienne (with a gender difference between the first two and last two in British English); the possessor of a name can insist on how it is spelled or said, say Keynes /keɪnz/ for the economist or Menzies /mɪŋgɪs/ or C.J. Cherryh with final silent <h> for the novelist.

This insistence on invariance is comparatively new in English, and is often at odds with consistent letter to sound correspondence rules. When a language is spoken with multiple accents, the spelling cannot both reflect how a word is said and always be the same. Middle English is famous for its variable spellings, not only across dialects but also within the writing of the same individual; much the same was true of Older Scots, as discussed in Chapter 19. Looking at citations for scissors in the Oxford English Dictionary (2015), in the fifteenth century the spellings included cysars, cysurs, cysour, cisours, sesours, sisours, sisoures, scisors, and sysowre; there were around ten variant spellings per century until 1700; even the twentieth century had five: cissers, cithors, scissors, sissors, sizzers. The modern spelling <scissors>, though first found in 1484, was only one of the variants for many centuries; between about 1750 and 1820 scissars was the most popular form according to Ngram Viewer. The <sc> spelling is one of the many examples of English erroneously adopting a spelling based on Latin, in this case treating it as coming from scissor ‘a cutter of cloth’ rather than from cisoria ‘cutting instrument’. Other historically inaccurate re-spellings include the <s> of island added to Middle English iland/éaland on the belief that it was derived from French (isle) rather than from an Old English word for ‘water’ eag still seen in Anglesey and ait.

At some point then English spelling fixed on the spelling of individual words, rather than relying on general sound/letter correspondences, probably through the word-based attempts by the great dictionaries of Johnson (1755), Webster (1828) and others to lay down a fixed form of the language. Once the spelling of a word is set, any deviation is a mistake and a solecism, even when the pronunciation is obvious. Most popular discussion of spelling concerns invariance: using anything but the accepted spelling of a word is a sign of lack of education and carelessness and a betrayal of the English language, according to many highly literate English people: ‘Spelling is one of the outward and visible marks of a disciplined mind’ (Kilpatrick 1988). Competitions like the annual Scripps National Spelling Bee in the USA concentrate on the invariant spelling of infrequent words like cypseline, pyrrhuloxia and scherenschnitte (words in the 2015 competition, absent from the 100 million running words of the British National Corpus). It is very dangerous to your social prestige and employment prospects to spell paid as <payed>, to confuse compliment and complement, or their, there and they’re, or to forget how many <c>s and <m>s there are in accommodate. British newspapers attacked the then Prime Minister Gordon Brown in 2009 for sending a handwritten letter of condolence to a dead soldier’s mother that she felt was disrespectful because he misspelled Mrs Janes as <Mrs James>, greatest as <greatst>, your as <you> and colleagues as <colleagus>, possibly more due to his poor eyesight and illegible handwriting than lack of respect.

The extent of the problem that English-speaking children have with invariant spelling can be gauged from Peters’ (1970) test of 967 ten-year-olds’ spelling of saucer; only 47 per cent were right; the most popular mistake was sauser, followed by sorser, suacer and sacer, and so on down to 126 one-off spellings such as scorceri and suarser; Chapter 29 discusses children’s misspellings of scissors.

The importance of spelling for teaching children to read and write is emphasised in Chapter 13. Spelling words correctly is a vital part of literacy education; children will be marked down at examinations and when submitting applications for college or jobs if they make many spelling mistakes. According to Kreiner et al (2002), a writer who makes more than two per cent of spelling errors seems poor or unintelligent. Nevertheless using alternative spellings to the usual letter/sound correspondence rules is very much a feature of English, as discussed in Chapter 3.

Orthographic regularities

One of the principles of English spelling proposed by Venezky (1999: 6) is ‘Letter distribution is capriciously limited’. Orthographic units are not free to occur anywhere in the word or syllable. For instance <k> and <ck> both correspond to the sound /k/ yet <ck> only occurs at the end of syllables as in back and tick, never at the beginning; there are no English words *ckab or *ckit, though these are perfectly pronounceable as /kæb/ cab and /kɪt/ kit.

Such rules are called variously ‘orthographic constraints’ (Treiman 1993) and ‘orthographic regularities’ (Cook 2004a). Correspondence units that start words but do not end them include: <wr> write, <wh> whom, <ch> cheat, <j> jug and <rh> rhesus. Those that finish words but do not start them include <tch> match and <ng> ring. Double consonants occur freely both within words, officer, adder and at the end of words, gruff, odd, but cannot occur at the beginning, *ffame, *ddont.

An efficient user of the English writing system must know not only the standard spelling correspondences and the particular spellings of many individual words but also the orthographic regularities about where letters may occur. Treiman (1993) found that the eight-year-olds she tested had already substantially mastered these regularities, being able to tell that beff is a possible English word but *ffeb is not, though both are equally acceptable phonologically. It is interesting just how odd the words look that break these rules: *ckall is unambiguously /kɔːl/ call, *dgell /dʒel/ gel, *farh /fɑː/ far, yet they look completely strange and unEnglish.
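Positional constraints of this kind lend themselves to a simple check. The sketch below encodes only a subset of the regularities mentioned in this section, as hypothetical lookup tables; the names are illustrative, and a fuller treatment would need many more units as well as exceptions for loanwords such as llama.

```python
# A subset of the positional constraints described above, encoded as
# hypothetical lookup tables; the names are illustrative only.
NOT_WORD_INITIAL = ("ck", "tch", "ng")
NOT_WORD_FINAL = ("wr", "wh", "j", "rh")
VOWEL_LETTERS = set("aeiou")


def violates_regularities(word):
    """Return a list of the constraints that a letter string breaks."""
    word = word.lower()
    problems = []
    for unit in NOT_WORD_INITIAL:
        if word.startswith(unit):
            problems.append(f"<{unit}> cannot start a word")
    for unit in NOT_WORD_FINAL:
        if word.endswith(unit):
            problems.append(f"<{unit}> cannot end a word")
    # Doubled consonant letters may not begin a word: *ffeb, *ddont.
    if len(word) >= 2 and word[0] == word[1] and word[0] not in VOWEL_LETTERS:
        problems.append("double consonant cannot start a word")
    return problems
```

On this sketch beff passes with an empty list, while *ffeb, *ckab and *farh are each flagged for one violation, matching the judgements reported for Treiman’s eight-year-olds.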

The spelling of inflectional morphemes

The past tense ending in regular English verbs is nowadays typically spelled <ed> but has three spoken forms: /t/ looked, /d/ opened and /ɪd/ waded. The inflection <ed> conveys the meaning of ‘past’ but does not correspond to the actual pronunciation, which is predictable from its phonetic context. The exact sound/letter correspondences are neglected in favour of a morpheme correspondence.
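How the spoken form follows from the phonetic context can be sketched as a three-way choice on the last phoneme of the verb stem. This follows the usual textbook statement of the rule rather than anything specific to this chapter; the function name and the inventory of voiceless consonants are assumptions made for the example.

```python
# Voiceless consonants of English other than /t/ (an assumed inventory
# using IPA symbols; /tʃ/ is treated as a single final phoneme).
VOICELESS = {"p", "k", "f", "θ", "s", "ʃ", "tʃ", "h"}


def past_tense_sound(final_phoneme):
    """Predict the spoken form of regular <ed> from the stem's last phoneme."""
    if final_phoneme in {"t", "d"}:
        return "ɪd"   # waded, wanted
    if final_phoneme in VOICELESS:
        return "t"    # looked, missed
    return "d"        # opened, played: voiced consonants and vowels
```

The written form stays <ed> in all three cases, which is the sense in which the spelling tracks the morpheme rather than the sound.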

So the <ed> in content words is different from the <ed> in verbal inflections; the adjectives learned /lɜːnɪd/ and blessed /blesɪd/ for example contrast with the past tense verbs learned /lɜːnt/ or /lɜːnd/ and blessed /blest/. Before the eighteenth century, <ed> tended to be used for stressed adjectives, as in the historically related form <learnd>. Since about 1700 the spelling <ed> in inflections has not corresponded unambiguously to a single pronunciation (see Cook 2004b for an account of the different past tense rules for speech and writing). These morpheme-related aspects of spelling support the argument that English graphemes connect to the English language at different levels of language.

The possessive inflection <s> shows another relationship in that the apostrophe has traditionally distinguished between plural <s> and possessive <s>, as in tigers/tiger’s, and between singular and plural possession through position, tigers’/tiger’s. However in many public texts today <’s> is a common way of showing plural, known as the greengrocer’s apostrophe. A street sign advertises tours of Colchester’s Church’s; a university notice directs one to the LECTURE THEATRE’S; a newspaper headline proclaims DRIVER’S SMASHED UP MY FLAT - to the horror of those who regard the apostrophe as the last line of defence against the barbarians at the gate. Chapter 8 discusses the history of the apostrophe.

The accents used as correspondence norms in the English Writing System

A system that combines sound/letter correspondences with invariant spellings has to select a single reference point out of the many different English accents across the globe. British English has traditionally been described in terms of Received Pronunciation (RP): ‘the regionless upper-class and upper-middle-class accent of British - mainly English - English’ (Trudgill 2003: 114). Cruttenden (2014) has adopted General British (GB) to include other countries than England, allowing for variation under the heading of Regional General British. Kruse in Chapter 10 here uses Southern British Standard (SBS) and North American Standard (NAS). The norm for discussing letter/sound correspondences in spelling research in England has mostly been taken to be SBS/RP/GB, as in say Carney (1994) and Brooks (2015), not very different from Puttenham’s (1589: 144) choice of ‘the vsuall speach of the Court, and that of London and the shires lying about London within lx miles, and not much aboue’. For the USA, Venezky (1970) relies on the General American (GenAm) accent of Kenyon and Knott (1951), which ‘corresponds to the layman’s perception of an American accent without marked regional characteristics’ (Wells 1982: 470).

Choosing one accent as the correspondence standard creates the problem that any chosen accent is not used by all speakers. Trudgill claims that approximately three per cent of British speakers speak RP (Trudgill 2001), a tiny minority in the UK, compared say to the Northern accents used by half the population of England (Wells 1982). RP is furthermore a status accent, the one to which the speech of politicians and newsreaders tends to gravitate. In a survey, the accent called Queen’s English, a lay equivalent to RP, was evaluated highest of 34 accents for prestige (Coupland and Bishop 2007).

The choice of a correspondence standard is then a social decision, as Coulmas points out in Chapter 17, rather than one based on considerations about how many speakers use it, its general comprehensibility or ease of learning etc. Choosing an accent such as RP or GenAm ignores the accents of most native users of English, whether Brooklyn, Geordie or Toronto, and indeed the multifarious accents of second language users. The present book is rich with varieties of English such as Scots (Chapter 19) and Irish (Chapter 20), as well as English used by speakers of other languages (Chapters 22 and 24).

One kind of variation in English spelling is captured as American versus British, say - American style first - feces/faeces, plow/plough and traveler/traveller. Cummings (1988: 26) points out that ‘the differences between American and, say British English spelling are quite modest’, amounting to a few hundred words. His Chapter 18 here shows how American and British spelling differ not just from one another but from dictionary to dictionary. These two main varieties broadly extend to the rest of the English-using world. Take the word labour in on-line English language newspapers around the world. Hardly surprisingly, British-style labour is found in Canada, Thailand, New Zealand, India and Nigeria, American-style labor in Israel, Korea, Singapore, Japan and the Philippines. Australia distinguishes labour from the Labor Party; the Australian Government Style Manual generally recommends ‘what is often thought of as British rather than American practice’ (Peters and Delbridge 1989: 129). Despite this, color appears in Australian newspapers two and a half times as often as colour (Peters and Delbridge 1989).


Edinburgh: One ay the things thit concerned us maist wis the fact thit ye couldnae really relax in his company, especially if he'd hud a bevvy. Irvine Welsh

South Florida: You ain't been used tuh knockin' round and doin' fuh yo'self, Mis' Starks. You been well taken keer of, you needs a man. Zora Neale Hurston

Geordie (Newcastle upon Tyne): Me nyem it’s Billy Oliver, Iv Benwell Town aw dwell; And aw’s a cliver chep, aw’s shure, Tho’ aw de say’d mysel. Bill Oliver’s Ramble, 1842

Some writers try to convey regional dialect accent through spelling, illustrated in the box; people assert their local identity in print. Showing dialect through non-standard spelling is nevertheless a double-edged weapon. Readers with other accents may struggle to get through a few lines of Geordie poetry or Uncle Remus. Dialect speakers may be pleased to see their accent reflected in writing: in Nippers, a series of readers for children (1968-1974), Leila Berg tried to cover all the variations of the child’s word for ‘mother’, mum, mummy, mom, mam, etc. Or they may feel stereotyped as yokels who cannot spell properly. Jaffe (2000) shows the delicate balance in the representation of black American English where, the more accurate the portrayal of the accent in writing, the more it stigmatises the speaker as uneducated.

Showing people’s actual pronunciation through non-standard spelling needs to be distinguished from the traditional literary convention through which novelists can indicate a person has a dialect accent by using non-standard spellings that correspond to standard speech, called ‘eye-dialect’, as argued in Chapter 21. To take some eye-dialect examples: unstressed vowels can be shown with <er> for /ə/, as in <fer> for <for>, <ter> for <to> and <yer> for <you>; alternative non-standard spellings can be given for the standard pronunciation as in <wot> for <what>, /wɒt/ in both cases, <luv> for <love> /lʌv/, <mister> for <Mr> /mɪstə/. These reflect the typical RP pronunciation, but not the accepted spelling. A novelist can label a character as dialectal, uneducated or uncouth through eye-dialect without showing any actual difference in accent from the standard.

The invariance of word spelling creates a dilemma for those spelling reformers who want to make spelling better reflect pronunciation. If this means choosing one ‘standard’ out of all the accents available, much of the spelling will be opaque for many readers. If it means having different spellings for each dialect, a reader could only read texts with ease that were written in their own dialect. And, if it means adhering to the pronunciation at one moment of time, Dr Johnson (1755) pointed out ‘some have endeavoured to accommodate orthography better to the pronunciation, without considering that this is to measure by a shadow, to take that for a model or standard which is changing while they apply it’: spelling would have to keep up with changes in pronunciation. In all three cases many people would be disadvantaged. The argument for spelling reform in Chapter 11 in fact favours simplification rather than adaptation to pronunciation.

The accent chosen also has implications for the teaching of reading. Perhaps the majority of English-speaking children are taught letter/sound correspondences that are not based on their own accent, nor always those of their teachers. Correspondences based on RP force them in effect to learn the phonology of another dialect. Essex children have spellings such as <woo> for <wall> and <fevr> for <feather>, revealing their local realisation of final /l/ as a vowel /u:/ and of /ð/ as /v/ (Bromley 2002). Children who speak the Hoosier dialect in Indiana spell <when> as <win> and <pen> as <pin>, showing their local /i/ pronunciation (Treiman 1993). In RP, words like muffin and rocket have /ɪ/ in the second syllable, in Australian English /ə/; Australian children aged 6-8 spell <muffin> as <muffen> ten times as often as UK children, while UK children spell <rocket> as <rockit> three times as often as Australian children (Kemp 2009), developed further in Chapter 13.

Children who do not speak the target accent can have specific problems with some spellings. This shows up in children’s difficulties with the presenters’ accents in the British TV spelling competition Hardspell and with the presenters’ perception of the children’s accents in the Scripps National Spelling Bee (McMenamin and Kerr 2014). In England a school might well encourage children to use an RP ‘standard’ accent because of the broader life opportunities it affords but this is a separate issue from teaching children how to read and write, even if most teaching of spelling has prejudged it by using RP. The need for children to be aware of another accent when learning English spelling puts a burden on those who do not speak the ‘standard’ accent - the majority in many classes.


English makes use of a fairly standard set of ‘Western’ punctuation marks (Nunberg 1990). The actual forms used in English differ from those found in continental Europe chiefly over quotation marks. English uses single and double quotation marks at the top of the line < “ ” ‘ > rather than the up and down marks < „ “ > used for German and in many East European languages, the goose feet < « » > found in French and Russian texts, the reverse goose feet found in Switzerland < » « > (Cook 2004a) or the long dashes used in Spanish < - - > (and indeed in James Joyce).

Punctuation originated as a way of providing hints for poor readers on how to read manuscripts aloud (Parkes 1992), reflected in advice such as:

A Comma Stops the Voice while we may privately tell one, a Semi Colon two; a Colon three: and a Period four. (Mason 1748)

little different from that available to University of Hull students today:

Where you think a reader should make a major pause (draw breath), use a full stop. Where you think a reader should make a smaller pause, use a comma. (University of Hull 2007)

Note that these punctuation marks do not represent actual pauses in speech, which seldom occur at grammatical boundaries, but are guides to potential pauses in reading aloud.

Punctuation also helps the reader to understand the grammatical structure of the sentence, the focus of Chapter 5. Using the scheme in Halliday (1985) as a starting point, a paragraph can be shown by indent or leading (pronounced /ledɪŋ/); a lexical sentence by < ! ? . >; a word by space; a morpheme by < ’ - >; and so on; line breaks also function as punctuation in street signs (Cook 2013). The point where the phonological and grammatical functions of punctuation coincide is the overlap between the grammatical clause and the phonological tone group: ‘other things being equal, each clause is spoken as one tone group’ (Halliday 1985: 36). There is often tension between the two systems; some writers punctuate more by structure, some more by pauses; editing a book such as this reveals the wide individual differences in punctuation, particularly in the use of commas.

An easily overlooked feature of English is using spaces to separate words. Since spaces are essentially invisible to the reader, they are scarcely perceived as punctuation. Yet word spaces are not necessary to a script; letter-based writing systems such as Vietnamese and Thai do not have them. Historically, spaces only became standardised in European writing about the 7th century AD, originating from Irish scribes (Saenger 1997). Harris (1986) regards the invention of the word space as comparable in importance to the invention of zero in mathematics. In particular it facilitated silent reading; Saenger (1997) claims this had profound effects on intellectual life through the privacy it afforded compared with the public nature of reading aloud.

The materiality of writing

Writing and speech take different physical forms, whether material texts or sound waves. Writing is above all making symbols on a surface: ‘For most of the five thousand years of writing history, all our techniques and technologies have been aimed at making visible marks stick to surfaces’ (Levy 2001: 34). A writing system reflects the strengths and limitations of the material on which texts are written and the material that the letters are made of: ‘texts are material objects’ (Kress and van Leeuwen 1996: 231). A writing system is tied to the technology available at a particular moment in time. Clay tablets require a different kind of writing from printed books, blackboards a different kind of writing from computer monitors. The materiality of writing has mostly been considered by typographers, as in Chapter 25, and calligraphers like Clayton (2014).

Kress and van Leeuwen (1996: 232) distinguish three material elements of writing:
- the surface on which marks are made, such as paper pages, blackboards, stone
- the substance they are made of, such as ink, paint, pixels
- the means by which they are made, whether pens, printers, brushes, chisels.

Let us take some examples of how material has affected letters.


The Old English letters <þ> (thorn) and <ƿ> (wynn) were taken from the early runic futhark, relics of which are scattered sparsely across England and Scotland. Runes were made by carving with a knife or sharp object on something solid like stone or bone. A glance at the bone inscription raihan meaning ‘roe deer’, found at Caistor-by-Norwich and dated about 400 AD, shows that the runes are largely made of straight lines, because of the difficulty in cutting curves, like Ogham described in Chapter 8. Letter shapes are a consequence of the materials available to the writer (Jackson 1981).

Serif Roman capital letters

The technique of making Roman inscriptions was to draw the outline of the letter with a brush before cutting it out with a chisel. The serifs on Roman letters show the chisel following the flourish made by the brush at the end of the stroke (Catich 1968). Serif letters predominated until the introduction of sans serif fonts in the early nineteenth century, which became the very sign of modernity for the twentieth century (Tschichold 1928). Serif and sans serif are now familiar to every PC user through Times New Roman and Arial respectively. A quirk of Roman lettering technique has become a staple of our lives.

Reading on a screen

Letters on a computer monitor appear quite differently from those on a printed page, essentially lit from the back like a stained glass window compared to lit from the front like a painting: ‘The screen mimics the sky, not the earth’ (Bringhurst 2005: 193). The orientation of reading is usually different, typically a screen being vertical in front of the reader, a book horizontal. These demands led to a generation of fonts specifically designed to be legible on screen, such as Verdana and Georgia, designed by Matthew Carter (Re 2003), discussed in Chapter 25.

Pen and paper

The forms of letters depend upon the instrument used to make them, particularly affecting the thickness of the line. The development of the minuscule letter in England in the 10th century depended upon square-cut quill pens made from goose feathers (Clayton 2014), rediscovered by Johnston (1906). Most modern biros and felt-tips are pointed and so have no variation in line width. The development of pen technology goes hand in hand with advances in paper-making technology, which also affect the history of print (Mueller 2014). Serif screen fonts still mimic the effects of the pen both in the varying width of line and the ‘stress’ showing the angle of pen-hold; sans serif fonts tend to have uniform thickness of line.

Materials and socio-semiotics

The conventional choice of materials goes with the socio-semiotic meaning of the text. Scollon and Scollon (2003: 135) describe three aspects of material: permanence/durability, temporality/newness and quality:

- permanence: a written text can exist for seconds or millennia. Cook (2014) distinguishes functional permanence in which permanence is dictated by the intended use, say manhole covers and street-name signs, from asserted permanence in which the endurance and respectability of the sign and its owner is proclaimed through brass-plates, metal engravings, letters carved in stone and the like. Permanence is shown by the choice of material - stone or brass versus paper or plastic - and is often associated with serif Roman capital letters on stone; indeed a twentieth-century stone sign in Colchester with relief letters proclaims <LAW COVRTS>.

- temporality/newness. Many written texts have a short life, typically handwritten in ink on paper. They are disposable not only physically but also indexically in that they imply a limited time period, whether <Closed>, <Special clearance> or job offers.

- quality. In part quality is shown by the means through which writing is produced. Quality is a function of the skill involved in production: carving letters on slate, engraving letters on brass and painting a sign are skilled and expensive activities compared to scribbling with a felt-tip or printing out on a PC printer. To go back to the condolence letter from Gordon Brown, writing it by hand gave it quality - if not legibility.

Materials and the forms of letters thus express an identity, whether the impersonal identity of the permanent sign in quality materials or the individualism of many painted signs. The material is the message, or at least part of it.


This background chapter has tried to weave together some of the threads from the chapters. They suggest how diverse, rich and interesting the tapestry of the English writing system can be and how important it is both to the study of English and to the study of language in general. A recent book showed how writing enabled one individual, Ewan Clayton, to span the artistic community of Eric Gill, the world of monastic calligraphy, and the cutting edge technological community of PARC (Palo Alto Research Centre) (Clayton 2014). Research into the English writing system is not arcane academic description; writing permeates every aspect of our lives.

