Universal Grammar considered as a parser that acquires language
Written circa 1988; unpublished

Vivian Cook

Universal Grammar and Parsing

The Universal Grammar (UG) theory claims that the same principles are incorporated in the grammars of all languages; variation between languages amounts to differences in the settings for a limited number of parameters. The principles and parameters involved are couched in terms of the framework familiar in Chomskyan work of the 1980s (Chomsky, 1988; Cook, 1988), usually known as Government/Binding (GB) theory. This framework replaces construction-specific rules with principles and parameters that potentially affect all constructions. Conventional rules are examples of the interaction of principles and parameter settings for one construction rather than having status in their own right. The English passive for example reflects an interaction between at least syntactic movement, the Projection Principle, Case Theory, and Theta-theory, all of which are used in many other areas of the grammar rather than being unique to one construction. To be validly based on GB syntax, a parser must then give syntactic representation in the form of principles and parameters rather than rules; it cannot deal with, say, passive sentences or relative clauses as separate issues but has to show how these constructions arise in a particular language through the interaction of principles and particular settings for parameters. This premise has indeed been accepted by the small band of researchers using GB for parsing such as Barton (1984), Wehrli (1984), Sharp (1985), and Kuhns (1986).

The grammar for a particular language consists of a list of parameter settings rather than of the principles themselves; English has a grammar with the head parameter set one way and with the pro-drop parameter set another way; Japanese has a grammar with the parameters set differently; both incorporate the same language principles. Each is one of the finite number of grammars possible in human languages by setting the parameters of UG in particular ways; human languages are limited to the 'finitely many (in fact relatively few) possible core grammars' (Chomsky, 1982, p.17). A parser based on principles and parameters is not specific to one language - a parser of French or Arabic; the same principles of Universal Grammar are utilised regardless of the language being parsed; parsers of French or Arabic differ in the values of parameters that they incorporate, rather than in principles, let alone 'rules'. It is not meaningful to talk of a GB parser for French, only of a GB parser with parameter values set for French, which might equally be a parser of Arabic if the parameters were set differently. Yet so far the design of parsers has still concentrated on language-specific parsers rather than language-independent parsers with resettable parameters; a title such as 'A Government/Binding Parser for French' (Wehrli, 1984) is then paradoxical; a proper GB parser should consist of a uniform set of principles valid for any language along with a number of parameters whose settings vary from one language to another; it will be neither construction-specific nor language-specific.

But how are the values for parameters set? UG theory is above all a model of language acquisition; the principles of GB are built-in to the human mind; the parameters are set from the data the child encounters. Positive evidence of actually occurring sentences of the language triggers the values for parameters, proceeding from the initial zero state SO to the final steady state Ss of mature adult competence. Putting to one side everything but core grammatical knowledge, the difference between a person who knows French and one who knows Japanese is that they have encountered different language evidence that has set the parameters of UG differently. The SO of both the GB parser and the child consists of principles subject to parametric variation; acquiring a grammar or constructing a GB parser for English, Arabic, or Japanese means setting the parameters appropriately rather than devising them from scratch. The logical problem of language acquisition can in a sense be duplicated by the logical problem of language parsing - how does the GB parser know which way to set the parameters?

A proper GB parser has then a strong resemblance to a model of language acquisition. The linguist postulates principles of syntax in the mind of the native speaker, Ss, and deduces from the evidence plausibly available to the child that these must be built-in to the initial state SO - the poverty of the stimulus argument; the linguist builds the same principles of syntax into a GB parser and sets the values for parameters for particular languages. It would be perfectly possible for the linguist simply to preset the values for parameters, by specifying, say, that English is head-first, non-pro-drop, and so forth. But this goes against the claim of UG theory that the combination of built-in principles and variable parameters triggered by language evidence suffices to give the grammar of the language; the form of the principles is motivated by their role in language acquisition. A GB parser should potentially be able to acquire settings for parameters from positive evidence; if not, there is something wrong either with the UG model itself or with the ways in which the GB principles and parameters are described in the theory or with their implementation in the parser. Given the idealisation of the logical problem of language acquisition, setting values for the parameters of a parser is not so much simulation of acquisition as the same activity.

This paper tests this reasoning on one area of GB, namely the phrase structure captured by the X-bar principles and head parameter, as far as possible without reference to the rest of the grammar. The aim is to implement the Chomskyan version of UG literally as a computer program that ideally moves from SO where it contains nothing but X-bar principles and the unset head parameter to SS where it contains the grammar for the actual language it has been exposed to. The concrete task of implementing one area of GB into a parameter-setting parser is instructive since it shows up assumptions implicit in the UG approach, some familiar, some unexpected. The approach here is basically practical. It differs from most existing computer simulations of acquisition in relying on linguistic principles rather than general learning principles; in not assuming that the semantic representation of the sentence (Anderson, 1983) or its thematic structure (Berwick, 1985) is known by the child before parsing takes place (Langley & Carbonell, 1987); in seeing grammar as a system of principles and parameters rather than as rules or procedures; and in restricting its scope to one syntactic area of the grammar. While there are similarities to the approach of Berwick (1985) in the syntactic areas covered and in the relationship it sees between parsing and learning, it differs in its top-down direction of parsing, in its lack of reliance on stacks and buffers for storing information temporarily, and above all in its avoidance of the idea of 'rules'.

Since the GB framework is highly interconnected, taking the single sub-theory of X-bar syntax may not be a fair test. It is however a simplifying assumption known in learnability work as the Independence Principle (Wexler & Manzini, 1987); setting the value for one parameter can logically be treated in isolation from the setting of other parameters. There may also be biasing factors in the choice of X-bar syntax as a test case since the motivation is partly that it mostly avoids the problems of handling movement; other parts of the GB theory may be less successful. Nevertheless the principles and parameters of X-bar syntax are a central area of the theory, often used in discussions of language acquisition; a successful attempt here would seem a reasonable justification for some of the theory's central claims. The head parameter is also being investigated in other areas, namely the acquisition of Micro-Artificial Languages (Cook, to appear, a), transcripts of L1 children (Cook, to appear, b), and experiments with L2 learners (Cook, in progress).

The main principles of X-bar syntax can be summarised as follows, using a standard version with the Barriers extension to I'' and C'' assumed (Chomsky, 1986b):

XP --> ... X ...

This incorporates the axiom that a phrase always has a lexical head of the same type: a NP has an N, a VP a V, and so on.

X'' --> specifier X'

A two-bar phrase X'' has a head that is either one bar less or the same number of bars, and possible specifiers, no order being given for head and specifier. Thus a N'' has a specifier that may be a determiner such as "the", and a N' such as "destruction of the city".

X' --> X complements

A single-bar phrase has a lexical category and any complements that are part of the projection of the lexical entry, again in no particular order. Thus the V' "likes beer" has a V and a complement N'' "beer"; the P' "on the table" has a P and a complement N'' "the table".

Parametric variation between languages comes down to the head parameter - whether complements precede or follow lexical heads in the phrases of the language: languages may be head-first:

X' --> X complements

or head-last:

X' --> complements X

English as a head-first language has verbs before complements, and prepositions before complements; Japanese as a head-last language has verbs after complements, postpositions after complements, and so on. In addition the relative positions of specifiers and heads need to be stated, either taken as a separate parameter as in Huang (1984), or as the generalisation that specifiers occur on the opposite side to complements. A typical quotation putting the head parameter in an acquisition context is:

'.. the possible phrase structures of a language are fixed by general principles and invariant among languages, but there are some switches to be set. In English for example, nouns, verbs, adjectives and prepositions precede their objects; in Japanese, the comparable elements follow their objects. English is what is called a 'head-first' language, Japanese a 'head-last' language. These facts can be determined from very simple sentences, for example the sentence "John ate an apple" (in English) or "John an apple ate" (in Japanese). To acquire a language, the child's mind must determine how the switches are set, and simple data must suffice to determine the switch settings, as in this case.' (Chomsky, 1987)

Since a language-independent GB parser is based on non-construction-specific X-bar principles rather than on rules for particular types of phrase, it is unnecessary to deal with each phrase type separately. There is no need for a separate Verb Phrase rule that a V' "likes a drink" has a V coming before a direct object N'', and a Noun Phrase rule that a N'' "a drink" has a determiner coming before an N', and a Prepositional Phrase rule that a P' "after work" has a P coming before a N'', and a sentence rule that a I'' "John likes a drink after work" has a specifier N'' as Subject coming before the VP; all we need are the X-bar principles, as argued for instance in Chomsky (1982), Berwick & Weinberg (1984, p.209), and Cook (1988). The SO of a parser conceived in these terms contains only the X-bar principles, without any rules of syntax; the variation in the position of the head must be derivable from data given to the parser. It is a computer Language Acquisition Device (LAD) that acquires ordering in phrases from evidence. This is not unlike the approach of Berwick and Weinberg (1984, 204-210) except that their discussion deals only with certain phrase types, sees acquisition as rule-building rather than parameter-setting, and deals with the head parameter only incidentally; the syntactic analysis used here also differs from the version of the head parameter used in Berwick (1985), which utilises a single-level of Head, Arguments, and Specifier rather than the two levels used here.

Program for Acquiring Language (PAL)

The parser designed on these lines is called PAL - Program for Acquiring Language. Its overall form utilises the grammar rules that form part of PROLOG2, an IBM PC compatible version close to the standard Edinburgh PROLOG. To demonstrate the operation of PAL let us describe a brief interaction between user and computer. PAL starts with the zero state, SO, knowing no rules or vocabulary but incorporating the X-bar principles. From vocabulary it encounters and from sentences that are given to it, PAL arrives at the settings for the head parameter for the major phrase types of the language. PAL first asks for the language to be named:

What is the language called? English.

and then requests some vocabulary:

Type a noun: man.

Type a verb: see.

Type a preposition: in.

Type a relative pronoun: who.

Type an article: the.

At this point PAL displays a choice for the user:

Do you want    (a.) to test

(b.) to produce a sentence

(c.) to add vocabulary

(d.) to stop?

If (a.) is chosen, the existing lexicon is displayed together with a request:

Now type a test sentence in English using only the above words.

A typical test sentence using the vocabulary given might be:

The man see the man.

If PAL successfully parses this sentence, it produces four types of information:

i) the head parameter settings required for the particular sentence, using "l" for "head-last", "f" for "head-first", and "0" for no information. In the version of PAL described here the head parameter is separated into five actual parameters - I'' (Subject specifier), N'' (specifier, i.e. determiners), N' (complements, here relative pronouns), V' (complements, i.e. NPs and PPs), and P' (complements, i.e. NPs). Hence for this sentence it produces the message:

Settings: i/two=l v/one=f n/two=l n/one=0 p/one=0

ii) tree-diagrams for the phrases in the sentence, in this case:

I''
   specifier
   I'
      I
      complement

N''
   specifier
   N'
      N
      complement

V''
   V'
      V
      complement

iii) messages reporting on the structure of the sentence and on the subcategorisation of the verb:

i bar two OK with current [l] setting. Subject before VPs

v bar one OK with current [f] setting. Objects after Verbs

Transitive Verb correctly has object see tr 1

n bar two OK with current [l] setting. Articles before Nouns

iv) running totals for the head position in each type of phrase:

i bar two tot head first=0 last=1

v bar one tot head first=1 last=0

n bar two tot head first=0 last=1

n bar one tot head first=0 last=0

p bar one tot head first=0 last=0

The final screen display after the successful parse is as follows:

________________________________________________________________________

PROGRAM for ACQUIRING LANGUAGE

________________________________________________________________________

Language: english Settings: i/two=l v/one=f n/two=l n/one=0 p/one=0

Sentence: [the, man, see, the, man,.]

SUCCESSFUL PARSE

i bar two OK with current [l] setting. Subject before VPs

v bar one OK with current [f] setting. Objects after Verbs

Transitive Verb correctly has object see tr 1

n bar two OK with current [l] setting. Articles before Nouns

________________________________________________________________________

LEXICON          RUNNING TOTALS

[man]            i bar two tot head first=0 last=1

[see]            v bar one tot head first=1 last=0

[in]             n bar two tot head first=0 last=1

[who]            n bar one tot head first=0 last=0

[the]            p bar one tot head first=0 last=0

________________________________________________________________________

This displays not only the parse for the sentence but also the current settings for each phrase and the running totals from which they are derived.
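
Although the program itself appears only in the Appendix, the kind of record involved is easy to picture in PROLOG terms. The fragment below is purely illustrative - the predicate names setting/2 and total/3 are assumptions rather than PAL's own identifiers - but it shows one way the settings and running totals displayed above could be held as a database of facts:

:- dynamic setting/2, total/3.      % the facts are updated as sentences are parsed

setting(i/two, l).                  % f = head-first, l = head-last, 0 = not yet set
setting(v/one, f).
setting(n/two, l).
setting(n/one, 0).
setting(p/one, 0).

total(i/two, 0, 1).                 % total(Phrase, HeadFirstCount, HeadLastCount)
total(v/one, 1, 0).
total(n/two, 0, 1).
total(n/one, 0, 0).
total(p/one, 0, 0).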

Suppose the next sentence the parser encounters does not fit its existing settings, say:

See the man man.

As before the parser produces trees for the phrases; it then produces messages:

i bar two wrong with current [l] setting but OK with opposite [f] setting

v bar one OK with current [f] setting. Objects after Verbs

Transitive Verb correctly has object see tr 2

n bar two OK with current [l] setting. Articles before Nouns

The running totals take account of the new example:

i bar two tot head first=1 last=1

v bar one tot head first=2 last=0

n bar two tot head first=0 last=2

n bar one tot head first=0 last=0

p bar one tot head first=0 last=0

And so on, as it gets more sentences and sets values for parameters for the other phrases. Thus PAL continually updates the way it parses the sentence according to its experience, resetting parameters when necessary.

How accurately does PAL reflect the X-bar principles? The central section of the program is given in the Appendix for ease of reference. One minor point is the seeming contradiction between the PROLOG implementation involving grammar rules of the form:

XP --> ... X ...

and the initial argument for principles rather than 'rules'. The same apparent contradiction exists in the linguistic theory, where the X-bar principles are given in the form of rewrite rules. This problem is however terminological; the meaning of 'rule' attacked above was of a construction-specific statement about, say, the form of the passive or of the relative clause. The X-bar 'rules' are generalisations about the general form of constructions, not about a particular construction, i.e. not 'rules' in the rejected sense. The same is true of the PROLOG implementation in PAL where the X-bar rules apply to all phrases not only to NPs, VPs, etc; while the PROLOG grammar rules are used here for convenience, they may as always be translated to 'proper' PROLOG declarative clauses in the usual way.
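
For readers unfamiliar with the grammar-rule notation, the translation into ordinary clauses is mechanical. The toy fragment below is not taken from PAL; it simply illustrates the equivalence:

% The grammar rule
%    n2 --> specifier, n1.
% is shorthand for an ordinary clause over difference lists:
%    n2(S0, S) :- specifier(S0, S1), n1(S1, S).
% A self-contained toy example of the grammar-rule form:
specifier --> [the].
n1        --> [man].
n2        --> specifier, n1.
% ?- n2([the, man], []).    succeeds, exactly as the expanded n2/2 clause would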

Some adaptation has also been made to the X-bar principles in PAL for purposes of practicality. Firstly the program only recognises the words in the form in which they have been entered; it does not cater for morphological changes, or for case markers. This is to isolate X-bar syntax from other sub-theories such as Case Theory as much as possible. For the same reason the relative pronoun is here treated as representing the entire clause, so that relative clauses can be handled within phrase structure without involving wh-movement. Secondly P''s are attached to V's as complements to V and relative clauses are attached to N's as complements to N, though both of these are strictly speaking adjuncts or modifiers in X-bar theory rather than complements, not being projections of the lexical properties of the head; so far as the relative clause is concerned PAL is using the head direction of Principal Branching Direction (Lust, 1983) rather than the head parameter itself. The reasons were on the one hand to allow the parser to deal with a range of phrase types (I'', N'', V'', and P''), and on the other to make the input complete sentences rather than phrases in isolation. Hence the P'' and the relative clause had to be incorporated somewhere within a unified structure of the sentence, impossible in X-bar terms without the complication of adjuncts and modifiers. Thirdly the parser does not assume consistency of head position, (though this can be introduced as a simple variation); rather it has separate head parameters for each phrase type and bar-level, for reasons discussed below. Nor does it utilise the three unmentioned levels of structure, namely I', V'', and P'', or the C constituent, although they are present, since these are unnecessary to its main purpose. Given these adaptations and simplifications, PAL incorporates the principles essentially word for word. While it may be possible to devise a program that stays closer to the letter of X-bar theory, it is not felt that these alterations are significant to the discussion that follows.

The first principle:

XP --> ... X ...

converts into a PROLOG rule that any X acting as a head in a particular category must be a word in the lexicon belonging to the appropriate lexical category. Additionally the PROLOG rule directly incorporates subcategorisation information necessary for the Projection Principle that is crucial to GB:

'lexical structure must be represented categorially at every syntactic stage' (Chomsky, 1986a)

The subcategorisation for the head is taken from information in the lexical entry. Given the current simplified form of PAL, this projection from the entry is restricted to the complements of Verbs, i.e. it distinguishes intransitive from transitive verbs according to the sentences it has encountered.
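
A minimal sketch of what such a rule might look like is given below; the lexicon facts and the predicate names are illustrative assumptions rather than PAL's actual code (the word/4 form for Verbs anticipates the entry for "see" shown later):

word(noun, man).                    % lexicon facts created as the user enters vocabulary
word(prep, in).
word(verb, see, tr, 3).             % Verbs also carry a subcategorisation label and count

n_head(N)      --> [N], { word(noun, N) }.          % a head N must be a Noun in the lexicon
p_head(P)      --> [P], { word(prep, P) }.          % a head P must be a Preposition
v_head(V, Sub) --> [V], { word(verb, V, Sub, _) }.  % a head V projects its tr/int label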

The second principle:

X'' --> specifier X'

falls into two PROLOG rules stating that an XP with two bars can be either "head-first" or "head-last", i.e. that an X'' consists either of an X' followed by a specifier or of a specifier followed by X'. Because of the simplifications of PAL, specifiers are defined as empty for V'' and P'' but are allowed for N''. Given GB syntax, a specifier N'' is compulsory for I'', i.e. a subject for the sentence is needed in order to incorporate the Extended Projection Principle that all sentences have subjects (Chomsky, 1982, p.10). The head parameter as affecting specifiers is implemented as a choice between two rules rather than as a single unified rule, a programming convenience that does not affect its logical status.
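
In outline, and with assumed predicate names rather than PAL's own, the two-bar level for N'' might come out as the following pair of grammar rules, the choice between them being the value of the parameter at this level:

word(det, the).
word(noun, man).

specifier --> [D], { word(det, D) }.     % a specifier may be a determiner
specifier --> [].                        % or may be empty

n1 --> [N], { word(noun, N) }.           % a toy N' consisting only of its head

n2 --> specifier, n1.                    % "head-last": specifier before the head N'
n2 --> n1, specifier.                    % "head-first": the head N' before the specifier

% ?- n2([the, man], []).    parses via the head-last rule, as in English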

The third principle:

X' --> X complements

translates in a similar fashion into two PROLOG rules embodying the two possible orders for the X' - either X plus complement or complement plus X, i.e. the head parameter proper. In PAL the I' must have V'' as complement; the N' may have a relative pronoun; the P' must have N'', and the V' must have N'' or P''; or any of these complement or specifier positions may be empty (with the exception of the complements to I' and the EPP compulsory subject). In other words, in addition to the X-bar principles above, PAL incorporates information about the nature of complements; they are only maximal projections, that is to say V'', N'', and P''; more specifically, the complement of I' must be V'', of P' N'', of V'' either N'' or P'' or both, and of N' a relative pronoun standing for a relative clause.
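
Again in outline, with assumed names, the single-bar level for V' becomes two grammar rules, one for each order of head and complement, with complements restricted to maximal projections:

word(verb, see, tr, 1).
word(det, the).
word(noun, man).

det --> [D], { word(det, D) }.
det --> [].
n2  --> det, [N], { word(noun, N) }.                % a toy N'' for the complement position

v1 --> [V], { word(verb, V, _, _) }, complement.    % head-first: Verb before its Object
v1 --> complement, [V], { word(verb, V, _, _) }.    % head-last: Object before its Verb

complement --> n2.                                  % complements are maximal projections
complement --> [].                                  % or may be empty

% ?- v1([see, the, man], []).    succeeds via the head-first rule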

So far all possible settings for the head parameter are permitted. We now need to build in parameter-setting, in PAL a choice between the two versions of the specifier and complement rules. The crucial technique borrows a distinction from cryptography between decoding (working out what a coded message means according to an already known code) and codebreaking (working out the unknown code from a coded message) (Cook, 1977). PAL can make two passes at both levels of each phrase. At the first pass it decodes the phrase by using the head parameter setting it has already stored; if this succeeds, its parse is successful and it does not bother with a second parse. If the first pass fails, it tries a second pass in which it codebreaks the phrase by treating the setting as unknown and seeing whether it can find a possible phrase using another setting for the parameter than the one it already has. If this pass succeeds, it produces the correct alternative structure for the phrase; if it fails, no parse is possible. The SO setting for the head parameter for any phrase starts as unknown, "0"; it is reset to "l" (last) or "f" (first) after a successful parse is over, according to which of the running totals for "l" or "f" is higher. Hence in SO the first example sentence fails on the first decoding pass because all the settings are "0"; it must then undergo a second codebreaking pass; one sentence can trigger all the settings, as Chomsky claims for the child, namely, using the current vocabulary for 'english':

the man who see the man on man.

Subsequent sentences are first decoded phrase by phrase according to existing settings, "l" or "f"; any phrase that fails is codebroken using the alternative setting. Thus the earlier sentence:

see the man man.

was decoded according to the existing I'' "l" setting, failed, and was successfully codebroken with the opposite "f" setting, that is to say Subject final rather than Subject initial. The other phrases however were successfully decoded without a change of setting. The program keeps running totals for each parameter; permanent (i.e. decoding) settings change whenever there are more counterexamples than examples.
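
The control involved can be sketched as follows. The predicate names (parse_phrase/3, phrase_with/4, note/2) and the toy grammar stub are assumptions for illustration rather than PAL's own code, but the decode-then-codebreak order and the treatment of the running totals follow the description above:

:- dynamic setting/2, total/3.

setting(v/one, f).                  % the stored (decoding) setting for one phrase type
total(v/one, 1, 0).                 % running totals: head-first count, head-last count

parse_phrase(Phrase, Words, Rest) :-            % first pass: decode with the stored setting
    setting(Phrase, S), S \== 0,
    phrase_with(Phrase, S, Words, Rest), !,
    note(Phrase, S).
parse_phrase(Phrase, Words, Rest) :-            % second pass: codebreak with another setting
    try(S), \+ setting(Phrase, S),
    phrase_with(Phrase, S, Words, Rest),
    note(Phrase, S).

try(f).
try(l).

note(Phrase, S) :-                              % update the totals; reset the stored
    retract(total(Phrase, F, L)),               % setting only when counterexamples
    ( S == f -> F1 is F + 1, L1 = L             % outnumber examples
    ; F1 = F, L1 is L + 1 ),
    assert(total(Phrase, F1, L1)),
    ( F1 > L1 -> New = f
    ; L1 > F1 -> New = l
    ; setting(Phrase, New) ),
    retract(setting(Phrase, _)),
    assert(setting(Phrase, New)).

phrase_with(v/one, f, [see, man | R], R).       % stand-in for the real grammar rules:
phrase_with(v/one, l, [man, see | R], R).       % a toy V' in each of the two orders

% ?- parse_phrase(v/one, [man, see], []).    fails to decode with f, codebreaks with l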

The subcategorisation information needed for the projections from the Verb is achieved by similar means; the entry that PAL creates for a Verb shows whether it is "tr" or "int" based on the number of occasions it has been 'transitive' or 'intransitive'. Each time a particular Verb is followed by an Object NP within the VP, PAL adds one to the "tr" count; each time it occurs without an Object NP, PAL deducts one. Given the input sentences so far, the entry for "see" now records that it is "tr" based on 3 occurrences of Verb with Object, i.e.:

word(verb,see,tr,3).

Hence if the user types in:

man see.

PAL produces a message:

Transitive Verb needs an object. see tr 2

This simplifies Verb subcategorisation by assuming that a Verb is either transitive or not, rather than having all the complex shadings found in English ranging from "deplore" to "eat" to "faint".
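
One possible sketch of the updating mechanism, again with assumed predicate names rather than PAL's actual code, follows the add-one/deduct-one description above:

:- dynamic word/4.

word(verb, see, tr, 3).                       % the entry shown above

update_verb(Verb, object) :-                  % Verb occurred with an Object NP: add one
    retract(word(verb, Verb, _, N)),
    N1 is N + 1,
    label(N1, Sub),
    assert(word(verb, Verb, Sub, N1)).
update_verb(Verb, no_object) :-               % Verb occurred without an Object NP: deduct one
    retract(word(verb, Verb, _, N)),
    N1 is N - 1,
    label(N1, Sub),
    assert(word(verb, Verb, Sub, N1)).

label(N, tr)  :- N > 0.                       % on balance more transitive uses
label(N, int) :- N =< 0.                      % on balance more intransitive uses

% ?- update_verb(see, no_object), word(verb, see, Sub, N).
%    Sub = tr, N = 2    (as in the "see tr 2" message above)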

From this account it may be seen that PAL is a proper GB parser in that it uses principles rather than rules, it is language-independent, and settings for its parameters are triggered by evidence. PAL indeed works successfully in the way that a LAD is supposed to work. Run in uncompiled form on an IBM PC compatible it usually parses a sentence with known settings in about 3 seconds, a sentence with a major resetting in about 30 seconds; occasionally it gets trapped in a loop. While in many ways it is a simplification, so is the logical problem of language acquisition, and so for that matter is the real child. In addition to the basic principles and parameters of X-bar syntax, PAL includes the Projection Principle and the Extended Projection Principle, both usually taken to be part of the child's SO. The only other built-in information concerns the specifiers and complements as detailed above; this seems not to be of tremendous relevance to the central issue being pursued here. The fact that PAL is written in PROLOG might distort the picture, perhaps through the use of grammar rules for instance; some legerdemain might conceal the fact that the built-in assumptions of PROLOG are doing the work rather than the actual principles; because of the possibility of translating the grammar rules into PROLOG clauses and because of PROLOG's minimalist built-in assumptions, this however seems unlikely. PROLOG does however present some programming problems. At a trivial level counting routines for parameter totals and verb subcategorisation represent a type of procedural number-calculation for which PROLOG is not convenient. Similarly, while the sequence of clauses is not supposed to matter in PROLOG, in fact PROLOG2 handles them in a linear order; of necessity the possibilities are considered in a fixed sequence. At a more serious level, as an essentially left-to-right language, PROLOG encounters more problems with left-branching structures. One unresolved problem for example that causes the looping mentioned above is an interaction between unfilled complements and specifiers in codebreaking left-branching languages.

PAL and the evidence necessary for triggering parameters

Let us now see what the design of PAL tells us about the acquisition problem. To give a further quotation:

'The parameters must have the property that they can be fixed by quite simple evidence, because this is what is available to the child; the value of the head parameter for example, can be determined from such sentences as John saw Bill (versus John Bill saw)' (Chomsky, 1986a, p.146)

To what extent is this actually true? Like the child, PAL uses only positive evidence of sentences it encounters - text presentation rather than informant presentation; again this allows the problem of acquisition to be put in a rigorous form. Let us start by looking at the additional information other than the simple sequence of words that PAL requires to be able to codebreak the input - what has now come to be called 'bootstrapping' (Wanner & Gleitman, 1982; Pinker, 1984). For the head parameter to be set from input alone, at least three types of bootstrapping are required:

i) knowledge of the syntactic categories to which words belong. PAL has a built-in five way categorisation into N, V, P, determiner and relative pronoun. Furthermore PAL has to know which category each word belongs to; its principles do not work if the lexical category of a word is unknown; "man see" might be a Verb "man" followed by a Noun "see" so far as it is concerned. The same point is made by Weinberg (1988, p.261): 'In order to apply the phrase-structure rules of a grammar (or the principles from which they derive) we have to know the category associated with items in the input stream'. PAL solves the problem of assigning words to syntactic classes by having the user categorise the word as it is entered into the lexicon. The UG model of acquisition presumes the child knows major lexical categories; without this knowledge the child would be just as helpless as PAL, as illustrated in Pinker (1984, p.38-39) and Gleitman (1984, p.572). Developmentally the child may assign words to syntactic classes through semantic bootstrapping by recognising names and things as opposed to actions or changes of state, in order to get the categories of Noun and Verb (Pinker, 1984), or by using distributional clues, discovering, say, that determiners precede Nouns but not Verbs (Valian, 1986); Micro-Artificial Languages have been shown to be learnable provided the learner has access to semantic information or syntactic 'markers' (Meier & Bower, 1986). However, as Pinker (1984) points out, this is a matter of how the child assigns words to categories, not of how the categories themselves originate. The X-bar principles assume prior knowledge of the syntactic categorisation of words, whether in acquisition or in PAL; it is not enough just to hear "John saw Bill" to know that the head "saw" comes on the left of the Object NP without knowing the input is a sequence of Noun-Verb-Noun. PAL therefore assumes that knowledge of categories precedes setting of parameters; the syntactic categories used in X-bar principles are not derivable from syntactic structure.

ii) knowledge of lexical head categories versus non-lexical categories. The specifiers used in PAL are determiners and N''; the complements are either relative pronouns or maximal projections (X'') in the form of P'' or N''; determiners and relative pronouns never occur as heads of the phrase but only fill other parts of the structure of the phrase. For the X-bar principles to work, again PAL needs to know that heads are distinct from non-heads, i.e. introducing determiners and relative clauses only in specifier and complement positions. The child similarly needs information about this distinction, first demonstrated in Martin Braine's analysis of the two-word phase as open words (head-words) and pivots (non-head-words) (Braine, 1963); this may be phonologically derived at least in English where closed-class non-lexical categories have different stress and reduction characteristics from open-class lexical categories (Gleitman, 1984). However, in the X-bar analysis prepositions are treated as an open-class lexical category like Nouns and Verbs, whereas by Gleitman (1984) and others they are treated as closed-class items since they are not only restricted in number but also share many of the phonological properties of non-lexical categories. Bootstrapping the difference between lexical and non-lexical categories from phonological characteristics would not seem to work for prepositions. One scenario for development might be that the child starts with only lexical heads, as argued in Cook (to appear, b) from L1 developmental evidence; the well-known telegraphic stage consists of the child producing strings of 'content' words rather than lexical 'function' words (Brown, 1973). A counterargument from X-bar theory might be the crucial absence of one of the lexical categories from telegraphic speech, namely prepositions. Or are they absent? Looking at the order of acquisition for grammatical morphemes in Brown (1973), "in" and "on" are in fact the first such items to be learnt; Valian (1986) found children between 2:2 and 2:5 'frequently leave out Determiners when Determiners are required, but they almost never leave out a Preposition when a Preposition is required.' Furthermore both PAL and the child need not just to distinguish head categories and non-lexical categories but also to make a distinction between the two types of non-head that fill different functional categories in the X-bar principles of specifier and complement. To sum up, discovering the correct structure for the second NP in "John ate an apple" means knowing that "an" is a specifier, "apple" a lexical head.

iii) A more complex form of bootstrapping is involved in deciding the main order of Subject, Verb, and Object in the sentence. Again this is not so much discovering the category of Subject itself as finding out which part of the sentence is the Subject. In Barriers (Chomsky, 1986b) syntax the specifier of I'' is the Subject of the sentence, i.e. a N''; the complement is the V''; I itself is needed to carry features of AGR, Tense, etc., and for V to move into in the case of V-movement (Chomsky, 1986b). Chomsky has often insisted on a two-way distinction between the Subject and the rest of the sentence, rather than a three-way SVO division, claiming that it

'is empirical, therefore controversial, but it appears to be well supported by crosslinguistic evidence of varied types' (Chomsky, 1986a, p.59).

Indeed he suggests that, since it could not be learnt from positive evidence, the distinction must be innate;

'UG must restrict the rules of phrase structure so that only the VP analysis is available at the relevant level of representation' (ibid, p.62).

The problem is not so much building the two-way division into PAL as ensuring that it knows which NP is actually the subject of the sentence. The EPP requires every sentence to have a Subject, incorporated in PAL as making the specifier of I'' always N''. From a codebreaking point of view any sentence with more than one NP has more than one potential Subject. At least the following possibilities exist;

John saw Bill (Subject Verb Object versus Object Verb Subject)

John Bill saw (Subject Object Verb versus Object Subject Verb)

Saw Bill John (Verb Subject Object versus Verb Object Subject)

In GB syntax the notion of Subject is defined in terms of configurations; it is the 'NP of S (i.e. NP immediately contained in S)' (Chomsky, 1986a, p.59). To establish the Subject of the sentence, PAL needs to know the configuration and the setting for the head parameter, insofar as it applies to the Subject as a specifier of I''. But this is precisely what the codebreaking process cannot assume. Again some form of bootstrapping is required to give the requisite information. This might take several forms:

A) situational or semantic information. The child identifies the Subject by seeing who is carrying out the action in the situation, or who is the animate actor, or other related procedures, as argued in Pinker (1984). However feasible for children, this avenue is not open for PAL without bringing in the usual problem of building the world into the computer. Or the problem may be circumvented by assuming the parser already has a thematic representation for the sentence; Berwick (1985) allows the parser to derive the correct representation for "John kissed Sally" by virtue of knowing the Predicate is "kiss", the Agent "John", and the Affected Object "Sally".

B) sequence of development. Given that sentences are supplied in some order, it might be that the program could tackle intransitive single-NP sentences before transitive double-NP sentences. Thus it might ignore transitive sentences until it met:

"man faint"

Then it sets the I'' parameter to head-last and so knows that double-NP sentences such as:

"man see man"

have Subject Verb Object order. But clearly it is dangerous to make a logical account of acquisition depend upon a developmental history of this kind. In terms of actual child development, English children encounter the deceptive imperative construction:

"Help John."

early on and would need to be aware that this was not Verb Subject. Children indeed progress in a sense from single-element sentences to those with more and more elements (Crystal et al., 1976). But their early productions do not show much sign of relying on intransitive verbs rather than transitive ones.

C) access to bracketed input. A further quotation goes as follows:

'To set the value of the parameter for Spanish for example, it suffices to observe three-word sentences such as:

3. Juan [habla ingles].

Juan [speaks English]

such evidence suffices to establish that the value of the parameter is head first and, in the absence of explicit evidence to the contrary, to establish the head-complement order throughout the language .. Thus in (3) the structure is as indicated by the brackets, which demarcate a VP ...' (Chomsky, 1988, p.70)

Chomsky is perfectly correct: the reader or the child needs precisely the information given by the bracketing of "Juan [habla ingles]" to know that "Juan" is the subject of the sentence, "ingles" the object. PAL could be modified to acquire the subject position from bracketed input. Morgan et al (1981) have shown the necessity, so far as the acquisition of syntax is concerned, of knowing where sentences begin and end.

man see man see man

is only learnable if it is clearly:

man see. man see man.

or:

man. see man. see man.

or whatever. Morgan et al (1987) have argued for the importance of phonological clues such as pauses and intonation to the child. Surreptitiously PAL has incorporated such information by asking the user for a sentence and by insisting, in the usual PROLOG fashion, on a final full stop; sentences have in effect been bracketed by the user. It is however unclear whether the type of phonological support provided for the child's identification of sentence boundaries extends to the demarcation of the Subject; Brown (1975) noted the tendency for Subjects to have a separate tone-group but her data consisted of news bulletins rather than parental speech to children; Read & Schreiber (1982) argue for a phonological explanation for the ability of 7-year-old children to recognise multi-word Subjects but not single-word Subjects. And this explanation may be of great use for children learning languages where there are markers that clearly label Object and Subject such as Turkish (Slobin & Bever, 1982), and Japanese.

Part of the solution adopted in PAL is straightforward and unproblematic: Subjects come outside Verb Phrases. This is based on the implicit assumption of X-bar syntax, called by Randall (1985) the order principle, that 'within a maximal projection optional elements must follow obligatory elements'. In other words complements, being projections of the lexical entry, must come 'inside' specifiers that are not compulsory projections. This implies that (VO)S, S(VO), S(OV), and (OV)S are normal, but VSO and OSV are odd in that S interrupts the VP constituent. In terms of acquisition the order principle corresponds to Operating Principle D in Slobin (1973), 'avoid the interruption of linguistic units'. The classic X-bar account derives these languages by movement of the Verb from a final position: VSO is underlyingly SOV, and so does not break the order principle; 'there is evidence that in such languages the basic structure of the clause is NP VP and that the verb moves to the beginning of the clause' (Chomsky, 1988, p.71). This solution cannot be adopted in PAL without incorporating movement, which has not been attempted; if PAL therefore totally observed the order principle it would misinterpret VSO languages as VOS.

One way round this bias against VSO languages is to propose an additional requirement that, ignoring the Verb, Subject always comes before Object, on a par with Bever's NVN=Actor-Action-Object strategy (Bever, 1970). This universal SO sequence permits languages with SVO, SOV, and VSO but rules out languages in which Objects come before Subjects, e.g. OVS, OSV and VOS. The prediction is that OS order is highly marked and, as a consequence, rare in the world's languages. Tomlin (1984) provides statistically valid figures for different word orders based on 999 representative languages; the percentage for all SO languages, i.e. SOV, SVO, and VSO, is 96%; for OS languages, i.e. VOS, OVS and OSV, it is only 4%. One explanation for the rarity of OS languages may be that they are difficult to parse; the SO sequence predicts that these languages go against the usual tendency. Slobin & Bever (1982, p.240) also found an effect of this sequence in differences between Serbo-Croat and English-speaking children on a word order test. This universal SO sequence works successfully for the child or for a parser for 96% of the world's languages but implies a breach of the X-bar principles as stated. However, even if not adopted, some alternative explanation has to be found for the fact that the OVS and VOS languages, which fit the order principle, are found in 1% and 3% of the world's languages respectively, while VSO, which breaches the order principle but not the Subject before Object sequence, accounts for 9% (Tomlin, 1984). Both alternatives combine in OSV languages, which account for less than 1% of the world's languages (Tomlin, 1984).

While the preceding discussion has seen identifying the Subject as the prime need, it may be that identifying the Object is as relevant; Slobin and Bever (1982) show the facilitating influence of object markers for children learning Turkish. Nor is this problem confined to knowing the Subject of the sentence. Given a sequence:

"man in man"

PAL cannot decide, except arbitrarily, whether "in" is a postposition or a preposition, since it could equally belong with either "man".

PAL and memory systems

Ideally, as we have seen, the child could learn from one sentence; "John saw Bill" sets the value for the head parameter for all the phrases in the language from one example of a V'', "saw Bill"; let us call this 'one-time' setting. The current version of PAL does not however have one-time setting nor does it have a single head parameter; instead it has separate parameters for each level of each phrase continuously available for resetting. Consistency across all phrases, or a predictable relationship between complement and specifier settings, could simply be built into PAL in place of the current formulation; why then does PAL incorporate a series of counting systems that record how many phrases have been encountered of each type with head-first and head-last settings?

The main justification comes from language acquisition. Suppose the first appropriate sentence the child hears does not conform to the normal setting for the head parameter for that language; for example the title of a popular television series is "Murder she wrote", an Object Subject Verb sentence. If one-time setting were sufficient, a child who encountered a rare exception would be marred for life. The child must clearly be able to correct initial incorrect settings derived from insufficient data; children encounter performance which not only contains a number of sheer mistakes and relaxes some of the assumptions of core grammar but also contains deceptive false clues. Given one-time setting, a child would never recover from hearing "Pick your toys up", where the combination gives the appearance of English having Postpositions. Furthermore some languages are far from fixed in word order; in Turkish for instance, while the main order is SOV, all five of the other possible orders are also found (Slobin & Bever, 1982); in Hungarian too most orders are permissible, though with restrictions for particular combinations (Kiss, 1987). To succeed despite these handicaps, children need a robust acquisition system that incorporates some idea of frequency and hence a memory system for remembering instances of language; the canonical sentences of the language must outweigh the exceptions in some way.

As we saw, one way out of the dilemma about knowing the Subject is to assume the acceptable data is ordered from simple to complex and to wait for a suitable intransitive sentence; here too PAL implies that a memory system is involved. With Verb subcategorisation as well, PAL does not trust one sentence, say "John smokes", to decide that "smoke" is intransitive, but keeps a constant check on sentences such as "John smokes cigars" to find whether the Verb needs to be subcategorised as intransitive or transitive. It is hard to see how either of these could be accommodated by one-time setting and so memory systems are required.

A case for flexible rather than one-time setting of parameters emerges from second language acquisition. Inasmuch as bilingualism is a natural condition of much of mankind, the description of grammatical competence has to recognise that many speakers simultaneously know two settings for a parameter; for example a Japanese/English bilingual knows two opposite settings for the head parameter. Cook (1988) describes three possible models for L2 acquisition: direct access to UG, indirect access to UG via the L1, and no access to UG. The direct-access model suggests that L2 learners start from scratch with no setting for the head parameter presumed, except for any markedness condition present in L1 acquisition; the indirect-access model suggests L2 learners start from the L1 settings of the head parameter; the no-access model suggests that UG is irrelevant to L2 learning. Cook (1988) argues for the indirect-access model in which L2 learners start from the L1 settings for parameters. Often both settings may be active simultaneously, as in the case of code-switching - the way in which bilinguals speaking to other bilinguals change language in mid-sentence - which is perfectly normal speech behaviour among bilinguals (Grosjean & Soares, 1986). Japanese is head-last and so has Verbs after Objects, Postpositions after NPs, and so on: English has the opposite head-first setting. A code-switched sentence from a Japanese/English bilingual cited by Nishimura (1986) is:

I slept with her basement de.

(I slept with her in the basement)

The speaker has produced an English sentence with a head-first setting containing a Japanese PP with a head-last setting using the postposition "de" and an English Noun "basement"; effectively the opposite settings for the I'' and the V'' on the one hand, and the P'' on the other have been used. Or take another sentence from Nishimura (1986):

kaeri ni wa border de we got stopped eh?

return on topic marker border on we got stopped

(On the way home we got stopped on the border)

The speaker has produced a Japanese sentence with an initial Japanese topic phrase, then a PP with Japanese head setting and Postposition "de" and an English Noun, and finally an English Subject and Verb. Again not only are the head parameter settings for both languages simultaneously present in the mind but so are separate settings for P'' and I'', and so on, within the same sentence. An alternative account of code-switching within a GB framework is provided by Woolford (1983).

Neat as one-time setting of parameters may be, it seems an untenable simplification; PAL has to take into account certain cumulative properties of the sentences it encounters. This might be interpreted by seeing its input, not as a sentence, but as a representative sample of sentences, i.e. a text in the Gold sense (Gold, 1967); in a way PAL is simultaneously taking into account properties of languages, i.e. collections of sentences, rather than single sentences.

More mundanely PAL may be said to remember formal properties of past sentences and compare them with the properties of the present sentence, in other words to have memory systems that count head parameter settings and subcategorisation features. This argument suggests the logical acquisition model requires something similar; the model must allow the learner to profit from a sequence of sentences. Surreptitiously the UG model of acquisition has been relying on a memory system similar to Piaget's notion of a memory for instances (Piaget, 1973).

Conclusions

Though working in a simplified universe, PAL shows that a UG approach to GB parsing is indeed possible; the X-bar principles may be incorporated in a parser in such a form that the variation between parsers for different languages is derivable from simple input. We have seen that PAL is forced into various decisions, such as the SO sequence and the possession of counting systems, both with possible parallels in acquisition; more detailed modelling is envisaged, together with the testing of hypotheses derived from PAL against data from children's development (Cook, to appear, b). PAL has explored the no-growth model of UG in which the principles of language are assumed to be immutably present from birth even if not manifest in the child's speech, rather than a growth model in which the principles themselves are subject to development. Further variants of PAL will relax this assumption to see the extent to which changes in the X-bar principles could be included, say for instance by starting with the flatter structure of non-configurational languages or by trying out the suggestions that certain types of phrase are in fact absent from children's speech (Radford, 1986, 1988) and that learning of phrase structure has the sequence lexical head followed by complements followed by specifiers (Cook, to appear, b).

One intriguing point has so far been incidental to the discussion, namely the two-pass process of decoding followed by codebreaking, called in Berwick and Weinberg (1984, p.205-207) the parsing phase and the acquisition phase. Essentially the parser tries to decode the phrase according to its existing setting; if this fails, it tries to codebreak the sentence using the alternative setting: there are different ways of processing the sentence according to whether it fits the known grammar or not; in the terminology of Berwick (1985, p.18), 'if at any point during the analysis of a sentence no known grammar rule applies, the system attempts to build a new grammar rule to do the job'. If the child does not use one-time setting and relies on some memory system, something like the present distinction must be available to the child. A sentence is first handled by the child's existing grammar; when it fails, the child cannot simply discard the sentence as unsuccessful if its grammar is to develop, but has to explore alternative settings. Thus the child must have a two-pass parser in some form; as well as the existing grammar, the child has available a secondary grammar that allows alternative settings within the UG framework. Second language acquisition poses this in an acute form where the L2 learner has to reset parameters for each phrase, doing so rapidly and with ease, as we saw in the example of code-mixing.

The closest approach to relating parsing and acquisition is the model presented by Berwick in various publications (Berwick and Weinberg, 1984; Berwick, 1985; Berwick, 1987). However in several ways the Berwick approach does not test the UG model of acquisition as formulated here. Firstly it largely concentrates on the building of rules rather than on the setting of parameters, as seen in typical statements such as 'The acquisition of new rules is prompted by rule failures during the parsing process' (Berwick and Weinberg, 1984, p.202). Partly this lack may be simply terminological, since a 'rule' in Berwick and Weinberg seems non-construction-specific rather than confined to one construction. More crucially perhaps there is the matter of 'type-transparency' raised by Berwick and Weinberg (1984) - 'the condition that the logical organisation of rules and structures incorporated in a grammar be mirrored rather exactly in the organisation of the parsing mechanism' (ibid, p.39). This condition they relate essentially to the mirroring of comprehension processes, i.e. a descendant of the Derivational Theory of Complexity. The UG theory is a model of language knowledge rather than a set of procedures for understanding or producing a sentence; it would be dubious to try to make a GB parser type-transparent in this sense. While the Berwick and Weinberg definition of type-transparency given above indeed concerns the relationship of parsing to the processes of performance, a wider definition would concern the relationship to the state of language knowledge in the mind, a variant of the psychological reality issue.

A GB grammar answers the question 'Is this sentence grammatical?' by measuring it against the total body of language knowledge possessed by the speaker, Ss; the grammar represents a state of knowledge, not a process. This is reminiscent of a declarative computing language such as PROLOG, in which the program evaluates a query 'Is this statement true or false?' against the totality of its data-base, rather than of a procedural computing language, which gives the computer a set of commands to execute. The processes by which the sentence is interpreted are no concern of GB; the sentence is effectively evaluated against the simultaneous operation of all the interacting principles and sub-theories of the grammar; it is barely meaningful to say Binding Theory or Case Theory comes first or last. Even the concept of movement, which appears to be procedurally based, is now seen as chains that express relationships rather than as process (Chomsky, 1986a; 1986b). Rather than proceeding in a sequential procedural fashion, a GB parser should reflect a simultaneous total execution in which sequences and steps are unimportant: it answers the question "Is this sentence grammatical?" by drawing on its knowledge as a whole, just as a PROLOG program evaluates whatever is given to it in terms of its entire data-base of related clauses; it is not simply that the processing is parallel but that it is effectively simultaneous. So attempts to make process-oriented parsers, such as Marcus parsers with look-ahead buffers (Marcus, 1980), particularly necessary to the Berwick and Weinberg (1984) model, or ATN parsers incorporating HOLD (Wanner & Maratsos, 1978), are inappropriate for a proper GB parser.
A GB parser, since it reflects language knowledge in the mind, is type-transparent in a more appropriate sense than correspondence to psychological process; its correspondence is to an idealised declarative state of knowledge, rather than to the processes by which sentences are produced or understood. Such a final Ss must be derivable from an SO by utilising evidence to set parameters. Hence type transparency applied to a GB parser involves the logical problem of language acquisition. Since a GB grammar reflects a state of knowledge, it is related to the instantaneous model of acquisition, rather than to models of parsing or performance or indeed language development.

PAL does not then provide a series of steps for processing a sentence, even if in practice it has to go through a particular sequence. Hence it makes no use of temporary memory stores that look ahead in the sentence, such as the Marcus buffer, or suspend information temporarily until it can be dealt with, as in HOLD. Such temporary stores would be inappropriate in a declarative parser since they essentially depend upon the linear order of the sentence and the temporal sequence of processing, rather than on the simultaneous interaction of all relevant principles characteristic of GB.

However PAL does include two memory systems for setting parameters and for maintaining information in the lexical entry. Have we then slipped back into a procedural type transparency incompatible with a declarative model? The postulated memory systems are not components in speech processing but can be seen either as retaining certain information about phrases or lexical heads from one sentence to the next, or as generalisations about languages; they are not involved in decoding or codebreaking themselves but keep a record of some of their by-products.

At one level this paper has demonstrated that it is feasible to implement an important aspect of X-bar syntax in PROLOG; this obviously goes no further than showing that such a parser is indeed programmable, which seems hardly surprising, even if satisfying. At another level PAL has been used as a method of highlighting particular issues in the UG theory of acquisition. A parser can act as a heuristic aid to testing an acquisition model and throw up such points as the necessity for bootstrapping, the apparent need for memory systems, and the distinction between decoding and codebreaking. At a third level, however, it has been implicit all along that the presentation of the parser as a version of UG is not just one of the possible ways of incorporating GB theory into parsing; it is the only valid way in which it can be done. The GB parser must reflect the universal principles of language; it must have parameters that are set to different values in different languages; their values must be capable of being set from positive evidence of sentences in the language. Any other use of GB in parsing represents a misuse of the central tenets of the theory. The problem of designing a GB parser and the problem of language acquisition are variations on the same theme.

References

Anderson, J. (1983), The Architecture of Cognition, Harvard University Press

Barton, G.E. (1984), 'Toward A Principle-Based Parser', MIT Artificial Intelligence Lab Report, A.I. no 788

Berwick, R.C. (1985), The Acquisition of Syntactic Knowledge, MIT Press

Berwick, R.C. (1987), 'Principle-Based Parsing', in Shieber & Wasow (eds.), The Processing of Linguistic Structure, MIT AI Lab

Berwick, R.C., & Weinberg, A.S. (1984), The Grammatical Basis of Linguistic Performance, MIT Press

Braine, M. (1963), 'The ontogeny of English phrase structure: the first phase', Language, 39, 1-13

Brown, G. (1975), Listening to Spoken English, Longman

Brown, R. (1973), A First Language: The Early Stages, London, Allen & Unwin

Chomsky, N. (1981), 'Principles and parameters in syntactic theory', in N. Hornstein and D. Lightfoot (eds.), Explanations in Linguistics, London, Longman

Chomsky, N. (1982), Some Concepts and Consequences of the Theory of Government and Binding, MIT Press

Chomsky, N. (1986a), Knowledge of Language: Its Nature, Origin and Use, New York, Praeger

Chomsky, N. (1986b), Barriers, M.I.T. Press

Chomsky, N. (1987), Kyoto Lectures, mimeo

Chomsky, N. (1988), Language and Problems of Knowledge: The Managua Lectures, MIT Press

Cook, V.J. (1977), 'Language processes and language teaching', Indian Journal of Applied Linguistics, III, 1, 19-27

Cook, V.J. (1988), Chomsky's Universal Grammar: An Introduction, Blackwells [Later edition 2007 with M. Newson]

Cook, V.J. (to appear, a = 1988), 'Language learners' extrapolation of word order in phrases of Micro-Artificial Languages', Language Learning, 38, 4, 497-529

Cook, V.J. (to appear, b=1990), 'Observational evidence and the UG theory of language acquisition', in I. Roca (ed.), Logical Issues in Language Acquisition, Foris

Cook, V.J. (in progress; never published, though parts have appeared in other publications), Setting the Head Parameter: Finding Empirical Support for Universal Grammar

Cromer, R.F. (1987), 'Language growth with experience without feedback,' Journal of Psycholinguistic Research, 16/3, 223-231

Crystal, D., Fletcher, P., & Garman, M. (1976), The Grammatical Analysis of Language Disability, Arnold

Gleitman, L. (1984), 'Biological predispositions to learn language', in P. Marler & H. Terrace (eds.), The Biology of Learning, Springer

Gold, E.M. (1967), 'Language identification in the limit', Information and Control, 10, 447-474

Hawkins, J.A. (1983), Word Order Universals, New York, Academic Press

Huang, C.-T.J. (1982), Logical Relations in Chinese and the Theory of Grammar, MIT Ph.D.

Hyams, N. (1986), Language Acquisition and the Theory of Parameters, Dordrecht, Reidel

Jackendoff, R. (1977), X'-Syntax: A Study of Phrase Structure, MIT Press

Kiss, K.E. (1987), Configurationality in Hungarian, Akademiai Kiado, Budapest

Kuhns, R.J. (1986), 'A PROLOG Implementation of GB Theory', COLING 1986, 546-550

Langley, P., & Carbonell, J.G. (1987), 'Language acquisition and machine learning', in B. MacWhinney (1987)

Lust, B. (1983), 'On the notion "Principal Branching Direction": a Parameter in Universal Grammar', in Otsu, Y., van Riemsdijk, H., Inoue, K., Kasimo, A., & Kawasaki, N. (eds.), Studies in Generative Grammar and Language Acquisition, International Christian University, Tokyo

MacWhinney, B. (ed.) (1987), Mechanisms of Language Acquisition, LEA, Hillsdale, New Jersey

Maratsos, M. (1982), 'The child's construction of grammatical categories', in Wanner, E., & Gleitman, L. (eds), Language Acquisition: The State of the Art, CUP

Marcus, M.P. (1980), A Theory of Syntactic Recognition for Natural Language, MIT Press

Meier, R.P., & Bower, G.H. (1986), 'Semantic reference and phrasal grouping in the acquisition of a miniature phrase structure language', Journal of Memory and Language, 25, 492-505

Morgan, J.L. (1986), From Simple Input to Complex Grammar, MIT Press

Morgan, J.L., Meier, R.P., & Newport, E.L. (1987), 'Structural packaging in the input to language learning: contributions of prosodic and morphological marking of phrases to the acquisition of language', Cognitive Psychology, 19, 498-550

Morgan, J.L., & Newport, E.L. (1981), 'The role of constituent structure in the induction of an artificial language', Journal of Verbal Learning and Verbal Behavior, 20, 67-85

Nishimura, M. (1986), 'Intrasentential codeswitching: the case of language assignment', in Vaid, J. (1986), Language Processing in Bilinguals: Psycholinguistic and Neuropsychological Perspectives, Lawrence Erlbaum Associates, Hillsdale, New Jersey

Piaget, J. (1973), Memory and Intelligence

Pinker, S. (1984), Language Learnability and Language Development, Harvard University Press

Radford, A. (1986), 'Small children's small clauses', Bangor Research Papers in Linguistics, 1, 1-38

Radford, A. (1988), 'Small children's small clauses', Transactions of the Philological Society, 86, 1, 1-43

Randall, J.H. (1985), 'Indirect positive evidence: overturning generalisations in acquisition', mimeo

Read, S., & Schreiber, P. (1982), 'Why short subjects are harder to find than long ones', in Wanner, E., & Gleitman, L., Language Acquisition: the State of the Art, CUP

Sharp, R.M. (1985), 'A Model of Grammar Based on Principles of Government and Binding', Ph.D., University of British Columbia

Sinclair, H. & Bronckart, J. (1971), 'SVO a linguistic universal?', Journal of Experimental Child Psychology, 14, 329-348

Slobin, D.I., & Bever, T.G. (1982), 'Children use canonical sentence schemas: a crosslinguistic study of word order and inflection', Cognition, 12, 229-265

Smith, K.H., & Braine, M.D.S. (no date), 'Miniature languages and the problem of language acquisition', mimeo

Tomlin, R.S. (1984) 'The frequency of basic constituent orders', Papers in Linguistics, 17, pp.163-196

Valian, V. (1986), 'Syntactic categories in the speech of young children', Developmental Psychology, 22/4, 562-579

Vernoy, D. (1987), 'Principle Based Parsing', Eurotra-Essex Internal Memo

Wanner, E., & Maratsos, M. (1978), 'An ATN approach to comprehension', in M. Halle, J. Bresnan, and G.A. Miller (eds.), Linguistic Theory and Psychological Reality, MIT Press

Wehrli, E. (1984), 'A Government-Binding Parser for French', Institut pour les etudes semantiques et cognitives, Universite de Geneve, Working Papers no 48

Weinberg, A. (1987), 'Modularity in the syntactic parser', in J.L. Garfield (ed.), Modularity in Knowledge Representation and Natural Language Understanding, MIT Press

Wexler, K., & Manzini, M.R. (1987), 'Parameters and learnability', in T. Roeper & E. Williams (eds.), Parameters and Linguistic Theory, Dordrecht, Reidel

Woolford, E. (1983), 'Bilingual Code-switching and Linguistic Theory', Linguistic Inquiry, 14, 3, 520-536

Appendix. The X-bar Principles Implemented in PROLOG2

Enough of the program is given here to show how the X-bar principles are implemented in a codebreaking/decoding framework, together with some of the related clauses.

% Top level: a sentence is a double-bar projection of INFL.
s --> xxp(i,two).

% Double-bar projections: if a parameter setting for this category and bar
% level has already been recorded (decoding), use it; otherwise try either
% value (codebreaking).
xxp(Cat,two) --> {setting(Cat,two,Setting)}, xp(Cat,two,Setting).
xxp(Cat,two) --> xp(Cat,two,_).

% X'' consists of a specifier and X' in either order; tot/3 records which
% order was actually found.
xp(Cat,two,l) --> spec(Cat,l), xxp(Cat,one), {tot(Cat,two,l)}.
xp(Cat,two,f) --> xxp(Cat,one), spec(Cat,f), {tot(Cat,two,f)}.

% Single-bar projections, handled in the same way.
xxp(Cat,one) --> {setting(Cat,one,Setting)}, xp(Cat,one,Setting).
xxp(Cat,one) --> xp(Cat,one,_).

% X' consists of a head and its complement, head-first (f) or head-last (l);
% projection/3 updates the lexical entry of the head.
xp(Cat,one,f) --> x(Item,Cat,Subcat), comp(Cat,f),
    {tot(Cat,one,f), projection(Item,Cat,Subcat)}.
xp(Cat,one,l) --> comp(Cat,l), x(Item,Cat,Subcat),
    {tot(Cat,one,l), projection(Item,Cat,Subcat)}.

% Specifiers: the specifier of I is an N'' (the subject); the specifier of N
% is a determiner or empty; V and P take no overt specifier. fill/4 records
% whether the position was filled (f) or empty (e).
spec(i,Setting) --> xxp(n,two), {fill(i,two,Setting,f)}.
spec(n,Setting) --> x(Item,d,Subcat), {fill(n,two,Setting,f)}.
spec(n,Setting) --> [], {fill(n,two,Setting,e)}.
spec(v,Setting) --> [], {fill(v,two,Setting,e)}.
spec(p,Setting) --> [], {fill(p,two,Setting,e)}.

 

% Complements: the complement of I is a V''; the complement of N is an item
% of category r or empty; the complement of V may be empty, an N'', a P'',
% or an N'' followed by a P''; the complement of P is an N''.
comp(i,Setting) --> xxp(v,two), {fill(i,one,Setting,f)}.
comp(n,Setting) --> x(Item,r,Subcat), {fill(n,one,Setting,f)}.
comp(n,Setting) --> [], {fill(n,one,Setting,e)}.
comp(v,Setting) --> [], {fill(v,one,Setting,e)}.
comp(v,Setting) --> xxp(n,two), {fill(v,one,Setting,f)}.
comp(v,Setting) --> xxp(p,two), {fill(v,one,Setting,f)}.
comp(v,Setting) --> xxp(n,two), xxp(p,two), {fill(v,one,Setting,f)}.
comp(p,Setting) --> xxp(n,two), {fill(p,one,Setting,f)}.

% Lexical heads: I may be left empty; otherwise a head is a word of the
% appropriate category found in the lexicon of word/4 entries.
x(Item,i,Subcat) --> [].
x(Item,Cat,Subcat) --> [Item], {word(Cat,[Item],Subcat,Count)}.

% tot/3: when the position parsed under a given setting turns out to be
% filled, the temporary record of that setting for the category and bar
% level is updated; if it was empty, nothing is learned; the final clause
% lets the parse succeed in any case.
tot(Cat,Bar,Setting) :- pos(Cat,Bar,Setting,f),
    retract(temp(Cat,Bar,Old)), asserta(temp(Cat,Bar,Setting)).
tot(Cat,Bar,Setting) :- pos(Cat,Bar,Setting,e).
tot(Cat,Bar,Setting) :- true.

% fill/4: record whether the specifier or complement position for a given
% category, bar level, and setting was filled (f) or empty (e), replacing
% the previous record.
fill(Cat,Bar,Setting,Fullness) :-
    retract(pos(Cat,Bar,Setting,X)), asserta(pos(Cat,Bar,Setting,Fullness)).

% projection/3: adjust the lexical entry of a verb according to whether the
% relevant position was filled (posi(f)) or empty (posi(e)): the count in
% its word/4 entry is raised or lowered and its subcategorisation revised
% via ch_trans/2 and make_trans/2, which, like posi/1, belong to parts of
% the program not shown here; the final clause covers heads of other
% categories.
projection(Item,v,Subcat) :- posi(f),
    word(v,[Item],Subcat,Count),
    New_count is Count+1,
    retract(word(v,[Item],Subcat,Count)),
    ch_trans(f,Subcat),
    make_trans(New_count,New_cat),
    asserta(word(v,[Item],New_cat,New_count)).
projection(Item,v,Subcat) :- posi(e),
    word(v,[Item],Subcat,Count),
    New_count is Count-1,
    retract(word(v,[Item],Subcat,Count)),
    ch_trans(e,Subcat),
    make_trans(New_count,New_cat),
    asserta(word(v,[Item],New_cat,New_count)).
projection(Item,Cat,Subcat) :- true.

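
One way of exercising the fragment might be as follows; since only the X-bar clauses themselves are given above, the dynamic declarations, the initial position records, and the three-word lexicon here are illustrative assumptions rather than part of the original program.

% Hypothetical harness for the fragment above; everything here is assumed.
:- dynamic pos/4, temp/3, setting/3, posi/1, word/4.

% Start with every specifier and complement position recorded as empty.
init :-
    forall(( member(Cat,[i,n,v,p]), member(Bar,[one,two]), member(Set,[f,l]) ),
           assertz(pos(Cat,Bar,Set,e))).

% Illustrative lexical entries in the word/4 format used by the program.
word(d,[the],det,0).
word(n,[child],common,0).
word(v,[sleeps],intrans,0).

% ?- init, phrase(s,[the,child,sleeps]).
% succeeds: the sentence is accepted and the position records are updated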