Here are term papers on ‘Syntactic Language Processing’, especially written for school and college students.

Term Paper # 1. Introduction to Syntactic Language Processing:

Syntactic analysis is concerned with the structure of the sentence. Its role is to verify whether a given sentence is a valid construction within the language and to provide a representation of its structure, or to generate legal sentences. This is the process of parsing. There are a number of ways in which parsing can be done, but before these are pursued it is worth reviewing the simplest option for syntactic analysis: pattern matching.

Perhaps the simplest option is to use some form of pattern-matching. Templates of possible sentence patterns are stored, with variables to allow matching to specific sentences. For example, the template

< the ** rides ** >

(where the wild card symbol ** matches anything) fits lots of different sentences, such as “the motor-bike rides a clear round” or “the girl rides her motor-bike”. These sentences have similar syntax (both are basically noun-phrase verb noun-phrase), so does this mean that template matching works? Not really. What about the sentence “the theme park rides are terrifying”?

This also matches the template but is clearly a very different sentence structure from the first two. In the first two sentences ‘rides’ is a verb, whereas in the third it is a noun. This highlights the fundamental flaw in template matching: it has no representation of word ‘types’, which means that it cannot ensure that words are correctly sequenced and put together.
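As a concrete illustration, here is a minimal sketch of this failure in Python (the template and sentences are from above; the regular-expression encoding of the template is our own):

import re

# The template < the ** rides ** >, with ** as a wild card matching anything.
TEMPLATE = re.compile(r"^the .+ rides .+$")

sentences = [
    "the motor-bike rides a clear round",   # 'rides' is a verb
    "the girl rides her motor-bike",        # 'rides' is a verb
    "the theme park rides are terrifying",  # 'rides' is a noun
]
for s in sentences:
    print(bool(TEMPLATE.match(s)), "-", s)  # all three print True

The matcher accepts all three sentences, although the third has a quite different structure.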

Template matching is the method used in ELIZA, which fails to cope with ambiguity and so can accept (and generate) garbage. These problems are inherent in the approach: it is too simplistic to deal with a language of any complexity. However, it is a simple approach which may be useful in very constrained environments.

Term Paper # 2. Methods of Parsing in Syntactic Language Processing:

i. Transition Networks:

The transition network is a method of parsing which represents the grammar as a set of finite state machines. A finite state machine is a model of computational behaviour where each node represents an internal state of the system and the arcs are the means of moving between the states.

They are used in automata theory to represent grammars. In the case of parsing natural language, the arcs in the networks are labelled with either a terminal or a non-terminal symbol. Rules in the grammar correspond to paths through a network. Each non-terminal is represented by a different network.

To illustrate this we will represent the grammar fragment which we created earlier in Fig. 10.3, using the transition networks of Fig. 10.6. All rules are represented, but to save space only some lexical categories are included; the others would be represented similarly.

In Fig. 10.6, each network represents the rules for one non-terminal as paths from the initial state (I) to the final state (F). So, whereas we had three rules for NP in our grammar, here we have a single transition network with three possible paths through it, representing the three rules.

To move from one state to the next through the network, the parser tests the label on the arc. If it is a terminal symbol, the parser checks whether it matches the next word in the input sentence. If it is a non-terminal symbol, the parser moves to the network for that symbol and attempts to find a path through it.

If it finds a path through that network, it returns to the higher-level network and continues. If the parser fails to find a path at any point, it backtracks and attempts another path. If it succeeds in finding a path through the top-level network, the sentence is a valid one.
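The traversal procedure just described can be sketched in a few lines of Python. The toy networks below are illustrative (they are not the grammar fragment of Fig. 10.3); upper-case arc labels name sub-networks, and lower-case labels are lexical categories, i.e. terminals:

# Each network maps a non-terminal to its alternative paths (one per rule).
NETWORKS = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "noun"], ["noun"]],
    "VP": [["verb", "NP"], ["verb"]],
}
LEXICON = {"the": "det", "girl": "noun", "motor-bike": "noun", "rides": "verb"}

def traverse(label, words, pos):
    """Return every word position reachable after crossing `label` from `pos`."""
    if label not in NETWORKS:                       # terminal arc: match a word
        ok = pos < len(words) and LEXICON.get(words[pos]) == label
        return {pos + 1} if ok else set()
    reachable = set()
    for path in NETWORKS[label]:                    # trying each path in turn
        positions = {pos}                           # gives backtracking, in effect
        for arc in path:
            positions = {q for p in positions for q in traverse(arc, words, p)}
        reachable |= positions
    return reachable

words = "the girl rides the motor-bike".split()
print(len(words) in traverse("S", words, 0))        # True: a valid sentence

A sentence is accepted when some route through the top-level network consumes every word of the input.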

So, to parse the sentence “Who belongs to a college?”, the parser would start at the sentence network and find that the first part of a sentence is RelP. It would therefore go to the RelP network and test the first word in the input sentence, ‘who’, against the terminal symbol on the arc.

These match, so that network has been traversed successfully and the parser returns to the sentence network above to cross the RelP arc. Parsing of the sentence continues in this fashion until the top-level sentence network is successfully traversed. The full details of the networks for this sentence are shown in Fig. 10.7.

The transition network allows each non-terminal to be represented by a single network rather than by numerous rules, making this approach more concise than a list of grammar rules. However, as we can see from the networks for just two sentences, the approach is not really tenable for large languages, since the networks would become unworkable.

Another disadvantage compared with grammars is that the transition network does not produce a parse tree for sentences, and tracing the path through the network can be unclear for complex sentences. However, the transition network is the foundation for a more powerful tool.

ii. Recursive Transition Networks:

To extend the capacity of finite state transition diagrams to handle more general context-free languages, a recursion mechanism was added to the FSTD, making it a Recursive Transition Network (RTN). This extension is somewhat like adding a subroutine capability at each arc.

Now, in addition to containing a terminal (word), an arc may contain the name of a sub-network to which control may be transferred to parse, for instance, a prepositional phrase. When the sub-network is finished, control returns to the subsequent node and processing continues. This subroutine structure is illustrated in Fig. 10.8.

The capability for recursion gives the RTN a considerable advantage over its predecessor, the transition network. There is no restriction on the level of hierarchy or recursion with RTNs. That is, a sub-network may call another sub-network, which may, in turn, call another, and so on. Recursion is supported by allowing a sub-network to call itself.

There is also no restriction on the number of sub-networks which may be called from a given arc. Several sub-structures may be investigated simultaneously, and thus RTNs have the capability of parallel processing. Often, of course, many of the parallel sub-networks will fail to parse the phrase they are examining, because it does not fit the structure specified by the RTN.
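Continuing the earlier sketch, recursion is obtained simply by letting networks name one another, or themselves. For instance, adding a hypothetical PP network which calls NP, while NP may in turn call PP, permits unbounded nesting:

# Extends the NETWORKS, LEXICON and traverse() sketch given earlier.
NETWORKS["PP"] = [["prep", "NP"]]                 # PP pushes to NP ...
NETWORKS["NP"].append(["det", "noun", "PP"])      # ... and NP may push to PP
LEXICON.update({"in": "prep", "park": "noun"})

words = "the girl in the park rides the motor-bike".split()
print(len(words) in traverse("S", words, 0))      # True, via NP -> PP -> NP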

However, even context-free grammars cannot handle all English sentence structure. To extend the capabilities of syntactic parsers, the Augmented Transition Network (ATN) was invented. We discuss ATNs a little later; before that, we consider context-sensitive grammars.

Term Paper # 3. Context-Sensitive Grammars Used in Syntactic Language Processing:

The grammars considered so far are context-free grammars. They allow a single non-terminal on the left-hand side of a rule, as in A → B. The rule may be applied to any instance of that symbol, regardless of context. So the rule will match an occurrence of A whether it occurs in the string ABC or in ZAB.

The context-free grammar cannot restrict this to only those instances where A occurs surrounded by Z and B. In order to interpret the symbol in context, a context-sensitive grammar is required. This allows more than one symbol on the left-hand side, and insists that the right-hand side contains at least as many symbols as the left-hand side.

That is, in rewrite rules of the form

X → Y

the allowed productions are of the form:

UXV → UYV

where, X = a single non-terminal symbol

U, V = arbitrary strings including the null string

Y = a non-empty string over the vocabulary

This form of production (rewrite rule) is equivalent to saying ‘X can be replaced by Y in the context U, V’.

Consider, for example, the grammar:

where,

S = start symbol

A, B, C = variables

a, b, c = terminal symbols

To derive one of the sentences generated by this grammar, we substitute according to the rewrite rules.

It is easily shown that this particular grammar generates strings of the pattern abc, aabbcc, aaabbbccc, …, that is, aⁿbⁿcⁿ.
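The grammar itself is of the classic kind used to generate aⁿbⁿcⁿ. One standard formulation (which may differ in detail from the one in the figure above) is:

S → aSBC
S → aBC
CB → BC
aB → ab
bB → bb
bC → bc
cC → cc

A derivation of the second string, aabbcc, then proceeds:

S → aSBC → aaBCBC → aaBBCC → aabBCC → aabbCC → aabbcC → aabbcc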

Context-free grammars are not sufficient to represent natural language syntax. For example, they cannot distinguish between plural and singular nouns or verbs. So suppose that in a context-free grammar we have a set of simple definitions.
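The original definitions are not reproduced here, but a plausible set of the kind intended is:

S → NP VP
NP → DET N
VP → V
DET → THE | A
N → DOG | DOGS
V → GUIDE | GUIDES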

With these definitions we would be able to generate the sentences “THE DOG GUIDES” and “THE DOGS GUIDE”, both legal English sentences. However, we would also be able to generate sentences such as “A DOGS GUIDES”, which is clearly not an acceptable sentence. This illustrates an important point in computational linguistics: syntax is concerned only with structure and not with meaning.

By incorporating the context of agreement into the left-hand side of the rules, we can provide a grammar which resolves this kind of problem.

The use of the symbols ‘Sing’ and ‘Plur’ to indicate agreement does not allow generation of sentences which violate consistency rules. For example, using the grammar in Fig. 10.9, we can derive the sentence “a dog guides” but not “a dogs guides”. The derivation of the former uses the following substitutions.
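Fig. 10.9 is not reproduced here, but a grammar of the kind it describes (a hypothetical reconstruction) is:

S → Sing-NP Sing-VP
S → Plur-NP Plur-VP
Sing-NP → Sing-DET Sing-N
Plur-NP → Plur-DET Plur-N
Sing-DET → a | the
Plur-DET → the
Sing-N → dog
Plur-N → dogs
Sing-VP → guides
Plur-VP → guide

The derivation of “a dog guides” is then:

S → Sing-NP Sing-VP → Sing-DET Sing-N Sing-VP → a Sing-N Sing-VP → a dog Sing-VP → a dog guides

No derivation of “a dogs guides” exists, since ‘dogs’ is only a Plur-N and no rule combines a Plur-N with a Sing-DET or a Sing-VP.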

Unfortunately, context sensitivity increases the size of the grammar considerably, making it a complex method for a language of any size. Feature sets and augmented transition networks are alternative approaches to solving the context problem.

Term Paper # 4. Feature Sets Used during Syntactic Language Processing:

Another approach to incorporating context in syntactic processing is the use of feature sets. Feature sets provide a mechanism for sub-classifying syntactic categories (noun, verb, etc.) in terms of contextual properties such as number agreement and verb tense. The descriptions of the syntactic categories are framed in terms of constraints.

There are many variations of feature sets, but here we shall use one approach to illustrate the general principle: that of Pereira and Warren’s Definite Clause Grammars. In this grammar each syntactic category has an associated feature set, together with constraints which indicate what context is allowable.

For example:

S → NP (agreement = ?a) VP (agreement = ?b) : a = b
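This rule says that an S may be formed from an NP and a VP only when the value of the NP’s agreement feature (bound to ?a) equals that of the VP’s (bound to ?b). A minimal sketch of such a check in Python (illustrative names, not Pereira and Warren’s notation):

# Feature sets as dictionaries; the rule's constraint is an equality test.
def apply_s_rule(np, vp):
    """S -> NP (agreement = ?a) VP (agreement = ?b) : a = b"""
    if np["agreement"] != vp["agreement"]:
        return None                                   # constraint fails: no S
    return {"cat": "S", "agreement": np["agreement"]}

print(apply_s_rule({"agreement": "sing"}, {"agreement": "sing"}))  # builds an S
print(apply_s_rule({"agreement": "plur"}, {"agreement": "sing"}))  # None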

Feature sets are a relatively efficient mechanism for representing syntactic context. Augmented transition networks provide an approach which begins to bridge the gap between syntactic and semantic processing.

Term Paper # 5. Augmented Transition Networks in Syntactic Language Processing:

The augmented transition network provides context without an unacceptable increase in complexity (Luger and Stubblefield, 1993). It is a transition network which allows procedures to be attached to arcs to test for matching context. All terminals and non-terminals have frame-like structures associated with them which contain their contextual information.

By adding the following additional features to an RTN, it becomes an Augmented Transition Network (ATN):

a. Add a set of registers which store information such as partially formed derivation trees (a parsed sentence with words as leaves).

b. Allow arcs to be executed conditionally, i.e., test whether conditions such as number or person (1st, 2nd, 3rd) are satisfied, and accordingly allow or fail the transition before the arc is taken.

c. Attach certain actions to arcs, usually in the form of modifying the data structure returned.

These new features greatly expand the ability of ATNs to handle natural language.

In fact, an ATN has the power of a Turing machine, in which the class of labels which can be attached to the arcs defining transitions between states has been augmented. Arcs may be labelled with an arbitrary combination of the following:

a. Specific words, such as ‘in’.

b. Word categories such as ‘noun’.

c. Pushes to other networks which recognise significant components of a sentence. For example, a network designed to recognise a Prepositional Phrase (PP) may include an arc which asks for (pushes for) a noun phrase.

d. Procedures which perform arbitrary tests on both the current input and on the sentence components which have already been identified.

e. Procedures which build structures which will form part of the final parse.

For instance, the ATN which may be employed for parsing part of a sentence, the noun phrase ‘the elephant’, is presented below. Suppose that the grammatical characteristics or features of ‘NP’, ‘the’ and ‘elephant’ are available in a data dictionary, as shown in Fig. 10.10.

Also assume that the procedures are attached to the arcs of the article and noun, as described below:

Now, to parse the string sequence ‘the elephant’, suppose we use the ATN for NP. At state S0 of NP, it first assigns ART := ‘the’ and checks whether the part of speech is ART. If so, it assigns NP.DET := ART; otherwise it fails, and control returns to the point from which it was called. Generally we start with the sentence S, which contains NP and VP, so control from the NP ATN should return to the starting state of the ATN for S.
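The behaviour just described can be sketched as follows (the dictionary entries and register names are hypothetical, in the spirit of Fig. 10.10):

# A hypothetical data dictionary of grammatical features.
DICTIONARY = {
    "the":      {"part_of_speech": "ART"},
    "elephant": {"part_of_speech": "NOUN", "number": "sing"},
}

def np_atn(words):
    """ATN for NP: state S0 expects an ART, state S1 expects a NOUN.
    Registers hold the partially formed derivation tree."""
    if len(words) < 2:
        return None
    registers = {}
    # State S0: test the condition on the ART arc, then perform its action.
    if DICTIONARY.get(words[0], {}).get("part_of_speech") != "ART":
        return None                        # test fails: return to the caller
    registers["DET"] = words[0]            # action: NP.DET := ART
    # State S1: the NOUN arc.
    entry = DICTIONARY.get(words[1], {})
    if entry.get("part_of_speech") != "NOUN":
        return None
    registers["NOUN"] = words[1]           # action: NP.NOUN := NOUN
    registers["NUMBER"] = entry["number"]  # kept for later agreement tests
    return {"NP": registers}

print(np_atn(["the", "elephant"]))
# {'NP': {'DET': 'the', 'NOUN': 'elephant', 'NUMBER': 'sing'}}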

Thus, if we have the structural definitions of the ATNs for the sentence, NP and VP, the data dictionary definitions of the words ‘the’, ‘elephant’ and ‘gnawed’, and the known procedures at the arcs of the transition nets for NP and VP, we shall be able to parse the sentence using them. An example is given below for the sentence ‘the elephant gnawed’.

The parse tree for the sentence ‘the elephant gnawed’ can now be drawn (Fig. 10.12(c)) following the above definitions.

In constructing the ATN parsers throughout our discussion, we used conceptual graphs. Other techniques for constructing ATNs include frames, scripts and logic representations.

Augmented transition networks can be used to provide semantic information as well as syntactic, since information about meaning can also be stored in the structures. They are therefore a bridge between syntactic processing and semantic processing, the immediately following stage in natural language processing.

Because of this great power and flexibility, ATNs have been used in a number of successful Natural Language Processing (NLP) programs. Two such programs are Bolt Beranek and Newman’s HWIM (Hear What I Mean) speech recognition program and William Woods’ LUNAR program for information retrieval. The only failure of the ATN technique is in correctly interpreting ungrammatical utterances which may, nonetheless, contain meaning.

Term Paper # 6. LUNAR Used in Syntactic Language Processing:

LUNAR was developed by William Woods of Bolt Beranek and Newman (BBN) as an NLP program for retrieving and analyzing geological information obtained from the Apollo-11 mission to the moon. LUNAR used a query language based on predicate calculus, augmented transition networks for translation, and a dictionary of approximately 3,500 words.

It was one of the first “real world” NLP programs (as opposed to previous ‘toy’ programs) and could correctly understand and respond to queries such as:

(WHAT IS THE AVERAGE CONCENTRATION OF ALUMINUM IN HIGH ALKALI ROCKS?)

(WHAT SAMPLES CONTAIN P205?)

(GIVE ME THE MODAL ANALYSES OF P205 IN THOSE SAMPLES)

(GIVE ME THE DETERMINATIONS IN SAMPLES WHICH CONTAIN ILM)

The three steps used in processing these requests were:

I. Syntactic analysis using an ATN parser and heuristic information to generate the most probable derivation tree.

II. Semantic interpretation to produce a representation of the meaning in a formal query language.

III. Execution of the query language expression on the database to generate the answer to the request.

While the domain of LUNAR was restricted to information on the geology and chemistry of moon rocks, the program did display an impressive ability to understand a fairly broad range of English commands. BBN’s HWIM speech understanding system applies the ATN approach at the phoneme level to interpret instructions spoken to the computer through a microphone.

Term Paper # 7. Unification Grammar for Syntactic Language Processing:

ATN grammars have substantial procedural components. The grammar describes the order in which constituents must be built. Variables are explicitly given values, and they must have been assigned a value before they can be referenced.

This limits the effectiveness of ATN grammars in some cases: for example, in speech processing, where some later parts of the sentence may have been recognised clearly while earlier parts remain unknown; or in systems which want to use the same grammar to support both understanding and generation. Here, we describe a declarative approach to representing grammars.

When a parser applies grammar rules to a sentence, it performs two major kinds of operations:

I. Matching (of sentence constituents to grammar rules)

II. Building structure (corresponding to the result of combining constituents)

Unification was introduced earlier, in the discussion of knowledge representation, as a theorem-proving tool. Matching and structure building are both operations which unification performs. So a unification grammar is simply one which defines a unification operator over some structure, here Directed Acyclic Graphs (DAGs).

Each DAG represents a set of attribute-value pairs. For example, the graphs corresponding to the words ‘the’ and ‘file’ are:
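The graphs are not reproduced here; written as attribute-value structures (a reconstruction from the description which follows), they would be:

the:  [CAT: DET, LEX: the]
file: [CAT: N,   LEX: file, NUMBER: SING]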

Both words have a lexical category (CAT) and a lexical entry (LEX), but the word ‘file’ has an additional value, SING, for the NUMBER attribute.

The result of combining these words is to form a simple NP, also described by a graph:
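In the same attribute-value notation, the resulting NP graph (again a reconstruction, using the attributes named below) would be:

NP: [DET: the, HEAD: file, NUMBER: SING]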

The rule which builds this new structure can also be represented as a graph. In order to do that, we label pieces of the graph so that they can be referred to elsewhere in it. We let {n}, for any value of n, be a label for the constituent that follows it in the graph. The constituent may be empty, in which case the label functions very much like a variable, and it will be treated as one by the unification process. Using such labels, the NP rule

NP → DET N

can be written as the following graph:
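In attribute-value form, the rule graph (reconstructed from the reading given next) is:

CONSTITUENT 1: [CAT: DET, LEX: {1}]
CONSTITUENT 2: [CAT: N,   LEX: {2}, NUMBER: {3}]
BUILD:         [DET: {1}, HEAD: {2}, NUMBER: {3}]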

This rule reads as follows:

Two constituents, described in the sub-graphs labelled CONSTITUENT 1 and CONSTITUENT 2, are to be combined. The first must be of CAT DET; its lexical entry is bound to the label {1}. The second must be of CAT N; its lexical entry is bound to the label {2}, and its number is bound to the label {3}.

The result of combining these two constituents is described in the sub-graph labelled BUILD. The result will be a graph corresponding to an NP with three attributes: DET, HEAD and NUMBER. The values for all these attributes are to be taken from the appropriate pieces of graphs which are being combined by the rule.

The unification operator which is applied to the graphs is similar to logical unification: two graphs unify if, recursively, all their sub-graphs unify. The result of a successful unification is a graph composed of the union of the sub-graphs of the two inputs, with all bindings made as indicated.

The difference between logical unification and graph unification is this:

The inputs to logical unification are treated as logical formulas, in which order matters; i.e., f(g(a), h(b)) is different from f(h(b), g(a)). But the inputs to graph unification must be treated as sets, since the order in which attribute-value pairs are stated does not matter.

For example, if a rule describes a constituent as:

we should be able to match a constituent which lists the same attribute-value pairs in a different order.

A simple parser can use this method to apply a grammar rule: unify CONSTITUENT 1 with a proposed first constituent; if that succeeds, unify CONSTITUENT 2 with a proposed second constituent; and if that also succeeds, produce a new constituent corresponding to the value of BUILD.

If there are variables in the value of BUILD which were bound during the matching of the constituents, those bindings are used to build the new constituent. Variations on this notation appear in Knight (1989).
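To make the mechanism concrete, here is a compact sketch of this parsing step over nested dictionaries (an illustrative simplification: real DAG unification must also handle shared, re-entrant structure):

def unify(g1, g2, bindings):
    """Unify two attribute-value structures; labels such as '{1}' are variables."""
    if isinstance(g1, str) and g1.startswith("{"):
        bindings[g1] = g2
        return g2
    if isinstance(g2, str) and g2.startswith("{"):
        bindings[g2] = g1
        return g1
    if isinstance(g1, dict) and isinstance(g2, dict):
        merged = dict(g1)                     # union of the two sets of pairs;
        for attr, val in g2.items():          # the order of pairs is irrelevant
            if attr in merged:
                sub = unify(merged[attr], val, bindings)
                if sub is None:
                    return None               # a shared attribute failed to unify
                merged[attr] = sub
            else:
                merged[attr] = val
        return merged
    return g1 if g1 == g2 else None           # atoms must be identical

def substitute(graph, bindings):
    """Replace bound labels in `graph` by their values."""
    if isinstance(graph, str):
        return bindings.get(graph, graph)
    return {a: substitute(v, bindings) for a, v in graph.items()}

# The NP -> DET N rule from above, applied to 'the file'.
rule = {
    "CONSTITUENT1": {"CAT": "DET", "LEX": "{1}"},
    "CONSTITUENT2": {"CAT": "N", "LEX": "{2}", "NUMBER": "{3}"},
    "BUILD":        {"DET": "{1}", "HEAD": "{2}", "NUMBER": "{3}"},
}
bindings = {}
first = unify(rule["CONSTITUENT1"], {"CAT": "DET", "LEX": "the"}, bindings)
second = first and unify(rule["CONSTITUENT2"],
                         {"CAT": "N", "LEX": "file", "NUMBER": "SING"}, bindings)
if second:
    print(substitute(rule["BUILD"], bindings))
    # prints {'DET': 'the', 'HEAD': 'file', 'NUMBER': 'SING'}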