Parser
Regular Language
- The weakest formal languages widely used
- Many applications are simply not regular languages
-
${(^{i})^{i}} i >= 0$ - Regular language can never parse balance brackets
- Nested
if ... fi
structures
if ... if ... fi fi
Parser
-
- Inputs: list of tokens
- Outputs: parse tree of the program (AST )
Context-Free Grammars
- Not all strings of tokens are programs
- Parser must distinguish between valid and invalid strings of tokens
- We need
- A language for describing valid strings of tokens
- A method for distinguishing valid from invalid strings of tokens
- Programming languages have recursive structure
- An
EXPR
is
if EXPR then EXPR else EXPR fi
whie EXPR loop EXPR pool
- Context-free grammars are a natural notation for this recursive structure
- A CFG consists of
- A set of terminals
- should be tokens of the language
- A set of non-terminals
- A start symbol
- A set of productions (产生式)
X -> Y1...Yn
- A set of terminals
Derivations
- A derivation is a sequence