
Top-down parsing is a strategy of analyzing unknown data relationships by hypothesizing general parse tree structures and then considering whether the known fundamental structures are compatible with the hypothesis. It occurs in the analysis of both natural languages and computer languages.

Top-down parsing can be viewed as an attempt to find left-most derivations of an input stream by searching for parse trees using a top-down expansion of the given formal grammar rules. Tokens are consumed from left to right. Inclusive choice is used to accommodate ambiguity by expanding all alternative right-hand sides of grammar rules. [1]
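The top-down expansion with inclusive choice described above can be sketched as follows. This is a minimal illustration, not a production parser: the function names (`parses`, `sequence`, `all_trees`), the dictionary encoding of the grammar, and the small ambiguous grammar S → A A, A → a | a a are all assumptions introduced for the example. The grammar must not be left-recursive, or the expansion would not terminate.

```python
def parses(grammar, symbol, tokens, pos=0):
    """Yield (tree, end) for every way `symbol` can match tokens from `pos`."""
    if symbol not in grammar:
        # Terminal: consume exactly one token, left to right.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    # Inclusive choice: expand every alternative right-hand side.
    for alternative in grammar[symbol]:
        for children, end in sequence(grammar, alternative, tokens, pos):
            yield (symbol, *children), end


def sequence(grammar, symbols, tokens, pos):
    """Yield (children, end) for matching `symbols` in order from `pos`."""
    if not symbols:
        yield (), pos
        return
    for first, mid in parses(grammar, symbols[0], tokens, pos):
        for rest, end in sequence(grammar, symbols[1:], tokens, mid):
            yield (first, *rest), end


def all_trees(grammar, start, tokens):
    """Collect the parse trees that cover the whole input."""
    return [tree for tree, end in parses(grammar, start, tokens)
            if end == len(tokens)]


# Hypothetical ambiguous grammar: S -> A A, A -> 'a' | 'a' 'a'
grammar = {"S": [["A", "A"]], "A": [["a"], ["a", "a"]]}
trees = all_trees(grammar, "S", list("aaa"))
```

For the input `aaa` this sketch finds two trees, one in which the first A covers a single `a` and one in which it covers two.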

Simple implementations of top-down parsing do not terminate for left-recursive grammars, and top-down parsing with backtracking may have exponential time complexity with respect to the length of the input for ambiguous context-free grammars (CFGs) [2]. However, more sophisticated top-down parsers have been created by Frost, Hafiz, and Callaghan [3] [4] which do accommodate ambiguity and left recursion in polynomial time and which generate polynomial-sized representations of the potentially exponential number of parse trees.

See also: Parsing
See also: Bottom-up parsing


Programming language application

A compiler parses input written in a programming language into an internal representation, or translates it to assembly language, by matching the incoming symbols to production rules, which are commonly written in Backus-Naur form. An LL parser, a kind of top-down parser, applies each production rule to the incoming symbols by working from the left-most symbol yielded by a production rule and then proceeding to the next production rule for each non-terminal symbol encountered. In this way the parsing starts on the left of the result side (right side) of the production rule and evaluates non-terminals from the left first and, thus, proceeds down the parse tree for each new non-terminal before continuing to the next symbol of a production rule.

For example, given the production rules

  A → aBC
  B → c | cd
  C → df | eg

the parser would match A → aBC and attempt to match B → c | cd next. Then C → df | eg would be tried. As one may expect, some grammars are harder to parse in this way than others. If, for every non-terminal, the strings produced by its alternative productions all start with different symbols, the parser can choose the right production by looking at the next token only. Such a grammar is LL(1), where the (1) signifies that the parser reads ahead one token at a time. If the alternatives of a non-terminal can start with the same symbols, as with B → c | cd above, an LL parser must look ahead more than one symbol, e.g. LL(3).
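The lookahead requirement can be illustrated with a hand-written predictive parser for the example grammar A → aBC, B → c | cd, C → df | eg. This is only a sketch: the function names and the tuple representation of the parse tree are assumptions made for the example. Choosing between the two alternatives for B takes three symbols of lookahead, because both `c` followed by `df` and `cd` followed by anything begin with `cd`; C needs only one.

```python
def parse(s):
    """Predictive parse for A -> aBC, B -> c | cd, C -> df | eg."""
    pos = 0

    def expect(ch):
        # Consume one expected terminal or report a syntax error.
        nonlocal pos
        if s[pos:pos + 1] != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1
        return ch

    def parse_A():
        return ("A", expect("a"), parse_B(), parse_C())

    def parse_B():
        # Both alternatives start with 'c', so one symbol is not enough.
        # Only "cdd" and "cde" can begin the alternative B -> cd.
        if s[pos:pos + 3] in ("cdd", "cde"):
            return ("B", expect("c"), expect("d"))
        return ("B", expect("c"))

    def parse_C():
        # The alternatives start with different symbols: LL(1) choice.
        if s[pos:pos + 1] == "d":
            return ("C", expect("d"), expect("f"))
        return ("C", expect("e"), expect("g"))

    tree = parse_A()
    if pos != len(s):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree


tree = parse("acdeg")
```

Here `parse("acdeg")` selects B → cd, whereas `parse("acdf")` selects B → c, although both inputs agree on their first three symbols after `a` only up to `cd`.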

The common solution to this problem is to use an LR parser, a kind of bottom-up (shift-reduce) parser.


Accommodating Left Recursion in Top-down Parsing

A formal grammar that contains left recursion cannot be parsed by a naive recursive descent parser unless it is converted to a weakly equivalent right-recursive form. However, recent research demonstrates that it is possible to accommodate left-recursive grammars (along with all other forms of general CFGs) in a more sophisticated top-down parser by use of curtailment. A recognition algorithm which accommodates ambiguous grammars and curtails an ever-growing direct left-recursive parse by imposing depth restrictions with respect to input length and current input position is described by Frost and Hafiz in 2006 [5]. That algorithm was extended by Frost, Hafiz and Callaghan in 2007 [3] to a complete parsing algorithm that accommodates indirect as well as direct left recursion in polynomial time (the indirect case by comparing previously computed context with current context), and that generates compact polynomial-size representations of the potentially exponential number of parse trees for highly ambiguous grammars. The algorithm has since been implemented as a set of parser combinators written in the Haskell programming language. The implementation details of this new set of combinators can be found in a paper [4] by the above-mentioned authors, which was presented at PADL'08. The X-SAIGA site has more about the algorithms and implementation details.
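The idea of curtailing a left-recursive parse by a depth restriction can be sketched as a small recognizer. This is a simplified illustration and not the published algorithm: it has no memoization (so it is not polynomial), builds no parse trees, and the function name `recognize_curtailed`, the dictionary grammar encoding, and the example grammar E → E + n | n are assumptions made here. The bound used is that a non-terminal applied at position j is cut off once the number of nested applications at j exceeds the number of remaining tokens plus one.

```python
def recognize_curtailed(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps a non-terminal to a list of alternatives, each a list
    of symbols. Symbols that are not keys of `grammar` are terminals.
    """
    n = len(tokens)
    depth = {}  # (non-terminal, position) -> number of active applications

    def apply(symbol, pos):
        """Set of positions where `symbol` can end when started at `pos`."""
        if symbol not in grammar:  # terminal
            return {pos + 1} if pos < n and tokens[pos] == symbol else set()
        key = (symbol, pos)
        # Curtailment: this application would exceed (remaining tokens + 1)
        # nested applications at the same position, so fail instead.
        if depth.get(key, 0) > n - pos:
            return set()
        depth[key] = depth.get(key, 0) + 1
        ends = set()
        for alternative in grammar[symbol]:
            positions = {pos}
            for sym in alternative:
                positions = {e for p in positions for e in apply(sym, p)}
            ends |= positions
        depth[key] -= 1
        return ends

    return n in apply(start, 0)


# Hypothetical directly left-recursive grammar: E -> E '+' 'n' | 'n'
grammar = {"E": [["E", "+", "n"], ["n"]]}
accepted = recognize_curtailed(grammar, "E", ["n", "+", "n"])
```

Without the depth check, the first alternative of E would call E at the same position forever. With it, the recursion bottoms out and the second alternative supplies the base case.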

Time and Space Complexity of Top-down Parsing

When a top-down parser tries to parse an ambiguous input with respect to an ambiguous CFG, it may need an exponential number of steps (with respect to the length of the input) to try all alternatives of the CFG in order to produce all possible parse trees, which eventually would require exponential memory space. The problem of exponential time complexity in top-down parsers constructed as sets of mutually recursive functions was solved by Norvig in 1991 [6]. His technique is similar to the use of dynamic programming and state-sets in Earley's algorithm (1970), and tables in the CYK algorithm of Cocke, Younger and Kasami. The key idea is to store results of applying a parser p at position j in a memotable and to reuse results whenever the same situation arises. Frost, Hafiz and Callaghan [3] [4] also use memoization to avoid redundant computations and accommodate any form of CFG in polynomial time (Θ(n^4) for left-recursive grammars and Θ(n^3) for non-left-recursive grammars). Their top-down parsing algorithm also requires only polynomial space for potentially exponentially many ambiguous parse trees, by means of 'compact representation' and 'local ambiguities grouping'. Their compact representation is comparable with Tomita's compact representation of bottom-up parsing [7].
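The memotable idea, storing the result of applying a parser at position j and reusing it, can be sketched as follows. This is a minimal recognizer under stated assumptions, not the implementation from the cited papers: the name `recognize_memo`, the expansion counter, and the highly ambiguous grammar S → a S S | ε are introduced here for illustration, and the grammar is assumed to contain no left recursion.

```python
def recognize_memo(grammar, start, tokens, memoize=True):
    """Return (accepted, expansions) for a top-down recognizer.

    `grammar` maps each non-terminal to a list of alternatives, each a
    list of symbols; any other symbol is a terminal.
    """
    n = len(tokens)
    memotable = {}   # (non-terminal, position j) -> set of end positions
    expansions = 0   # how often a non-terminal was actually expanded

    def apply(symbol, j):
        nonlocal expansions
        if symbol not in grammar:  # terminal
            return {j + 1} if j < n and tokens[j] == symbol else set()
        if memoize and (symbol, j) in memotable:
            return memotable[(symbol, j)]  # reuse the stored result
        expansions += 1
        ends = set()
        for alternative in grammar[symbol]:
            positions = {j}
            for sym in alternative:
                positions = {e for p in positions for e in apply(sym, p)}
            ends |= positions
        memotable[(symbol, j)] = ends
        return ends

    accepted = n in apply(start, 0)
    return accepted, expansions


# Hypothetical highly ambiguous grammar: S -> 'a' S S | (empty)
grammar = {"S": [["a", "S", "S"], []]}
tokens = ["a"] * 8
with_memo = recognize_memo(grammar, "S", tokens, memoize=True)
without_memo = recognize_memo(grammar, "S", tokens, memoize=False)
```

With memoization each pair (S, j) is expanded once, so the eight-token input needs nine expansions. With `memoize=False` the same sub-parses are recomputed many times and the count grows rapidly with the input length.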




References

  1. Aho, A. V., Sethi, R. and Ullman, J. D. (1986) "Compilers: Principles, Techniques, and Tools." Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
  2. Aho, A. V. and Ullman, J. D. (1972) "The Theory of Parsing, Translation, and Compiling, Volume 1: Parsing." Englewood Cliffs, N.J.: Prentice-Hall.
  3. Frost, R., Hafiz, R. and Callaghan, P. (2007) "Modular and Efficient Top-Down Parsing for Ambiguous Left-Recursive Grammars." 10th International Workshop on Parsing Technologies (IWPT), ACL-SIGPARSE, Pages: 109-120, June 2007, Prague.
  4. Frost, R., Hafiz, R. and Callaghan, P. (2008) "Parser Combinators for Ambiguous Left-Recursive Grammars." 10th International Symposium on Practical Aspects of Declarative Languages (PADL), ACM-SIGPLAN, Volume 4902/2008, Pages: 167-181, January 2008, San Francisco.
  5. Frost, R. and Hafiz, R. (2006) "A New Top-Down Parsing Algorithm to Accommodate Ambiguity and Left Recursion in Polynomial Time." ACM SIGPLAN Notices, Volume 41, Issue 5, Pages: 46-54.
  6. Norvig, P. (1991) "Techniques for automatic memoisation with applications to context-free parsing." Computational Linguistics, Volume 17, Issue 1, Pages: 91-98.
  7. Tomita, M. (1985) "Efficient Parsing for Natural Language." Kluwer, Boston, MA.

© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org