# LR parsing

**LR parsing** A bottom-up parsing technique, LR standing for *L*eft-to-right scan, *R*ightmost derivation sequence (constructed in reverse). Originally developed by D. E. Knuth, it is the most powerful left-to-right parsing method for context-free grammars that requires no backtracking.

An LR parser consists of a pushdown stack, a parsing table, and a driving routine. The driving routine is the same for all grammars. The stack is manipulated by the driving routine using the information contained in the state on top of the stack and the next *k* symbols in the input stream (called the *k lookahead*); *k* is an integer ≥ 0, but for most practical purposes *k* = 1. The stack consists of a string

*s*_{0}*X*_{0}*s*_{1}*X*_{1}…*s*_{n}*X*_{n}*s*_{n+1}

where each *X*_{i} is a symbol of the input grammar and each *s*_{i} is called a *state*.

The parsing table is indexed by pairs (*s*,*a*) where *s* is a state and *a* is the lookahead. Each entry in the table has two parts: (a) an action, which may be shift, reduce *p* (for some production *p*), accept, or error, and (b) a state, called the *goto state*. When the action is shift, the next input symbol and the goto state are pushed onto the stack (in that order). When the action is reduce *p*, the top 2*l* elements of the stack spell the right-hand side of *p* with goto states interspersed, where *l* is the length of this right-hand side. These 2*l* elements are popped from the stack and replaced by the left-hand side of *p* and a new goto state; in the usual table layout this state is looked up from the state exposed by the pop together with the left-hand side of *p*. This operation corresponds to adding a new node to the parse tree for the input string. The accept action is encountered only when the start symbol *S* is the only symbol on the stack (i.e. the stack contains *s*_{0}*Ss*_{1} for some states *s*_{0} and *s*_{1}) and the lookahead is the end-of-input symbol; it signifies that parsing has been successfully completed. An error entry in the parse table, on the other hand, indicates an error in the input string.
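The driving routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy grammar (*S* → ( *S* ) | x), the hand-built tables `ACTION` and `GOTO`, and the function name `parse` are all hypothetical and not taken from any particular parser generator. The sketch also assumes the common layout in which the goto state for a reduction is held in a separate table indexed by the exposed state and the left-hand side.

```python
# Hypothetical toy grammar:  1: S -> ( S )    2: S -> x
# Production number -> (left-hand side, length l of the right-hand side).
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}

# Hand-built table for the toy grammar, indexed by (state, lookahead).
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
# Goto states used after a reduction, indexed by (exposed state, left-hand side).
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(tokens):
    """Return the production numbers in the order they are reduced."""
    stack = [0]                      # s0; grows as s0 X0 s1 X1 ...
    tokens = list(tokens) + ["$"]    # "$" is the end-of-input symbol
    pos = 0
    reductions = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        kind, arg = ACTION.get((state, lookahead), ("error", None))
        if kind == "shift":
            stack += [lookahead, arg]            # push symbol, then goto state
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[len(stack) - 2 * length:]  # pop the top 2l elements
            stack += [lhs, GOTO[(stack[-1], lhs)]]
            reductions.append(arg)
        elif kind == "accept":
            return reductions                    # stack is now s0 S s1
        else:
            raise SyntaxError(f"unexpected {lookahead!r} at position {pos}")
```

Calling `parse(["(", "x", ")"])` performs reduce 2 followed by reduce 1, which is the rightmost derivation of the input read in reverse; an input such as `["(", "x"]` reaches an error entry and raises `SyntaxError`.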

A grammar that can be parsed by an LR parser using *k*-symbol lookahead is called an *LR(k) grammar*. The power of the LR parsing method comes from the fact that the LR(1) grammars properly include other grammar types such as precedence grammars and LL(1) grammars (see LL parsing). This, together with its efficiency, makes it a popular choice of parsing method in compiler-compilers. If a grammar is not LR(1), this shows up as multiply defined entries in the parsing table, called *shift-reduce conflicts* or *reduce-reduce conflicts*.
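How a multiply defined entry might be detected while a table is being filled can be sketched as follows. The function `build_table` and the sample entries are hypothetical; the entries are meant to resemble what a table constructor could propose for the ambiguous grammar *E* → *E* + *E* | n, and are not the output of any real tool.

```python
def build_table(entries):
    """Fill a table from (state, lookahead, action) triples and report conflicts.

    Each action is a pair such as ("shift", 3) or ("reduce", 1).
    """
    table, conflicts = {}, []
    for state, lookahead, action in entries:
        # Keep the first action proposed for a cell; later ones are compared to it.
        existing = table.setdefault((state, lookahead), action)
        if existing == action:
            continue
        kinds = {existing[0], action[0]}
        if kinds == {"shift", "reduce"}:
            name = "shift-reduce"
        elif kinds == {"reduce"}:
            name = "reduce-reduce"   # two different productions for one cell
        else:
            name = "other"           # not expected from a standard construction
        conflicts.append((state, lookahead, name))
    return table, conflicts


# Hypothetical entries: in state 4, lookahead "+" allows both a shift and a reduction.
entries = [
    (4, "+", ("shift", 3)),
    (4, "+", ("reduce", 1)),
    (4, "$", ("reduce", 1)),
]
table, conflicts = build_table(entries)
```

Here `conflicts` holds a single shift-reduce conflict for state 4 on lookahead `+`, which signals that the grammar as written cannot be parsed deterministically with this table.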

Many different parsing tables may be constructed for one grammar, each differing in the number of states it defines. The so-called *canonical LR table* tends to be too large for practical purposes and is commonly replaced by an *SLR* (simple LR) or *LALR* (lookahead LR) table. A grammar that is LR(1) is not, however, necessarily SLR(1) or LALR(1).