Module parser :: Class Lexer

Class Lexer

Lexical analyser class for use with the parser

Instance Methods
 
__init__(self, rules_list)
For now the lexer is kept as simple as possible, so the order of the rules is essential.
 
scan(self, string)
Performs the lexical analysis on string
 
scanOneRule(self, rule, st)
Scans the space st according to only one rule
 
scanUnknown(self, st)
Scans the resulting structure, turning the remaining unmatched strings into Unknown tokens
 
readscan(self)
Scans a string read from stdin
Instance Variables
operators (dictionary)
precedence and associativity for operators
rules
lexical rules
Method Details

__init__(self, rules_list)
(Constructor)

For now the lexer is kept as simple as possible, so the order of the rules is essential: if one keyword is a substring of another, its rule must appear after the rule for the longer keyword. Otherwise the shorter rule would consume part of the longer keyword first.

Parameters:
  • rules_list - a list of tuples (re, funct, op?) where:

    re: is an uncompiled Python regular expression

    funct: the name of a function that returns the pair (TOKEN, SPECIAL_VALUE), where TOKEN is the token to be used by the parser and SPECIAL_VALUE is an optional associated value. The argument is the matched string. If funct equals "", the match is ignored. This can be used for delimiters.

    op: if present, is a tuple with operator information: (TOKEN, PRECEDENCE, ASSOC), where PRECEDENCE is an integer and ASSOC is the string 'left' or 'right'.
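
As an illustration of the tuple shapes described above, here is a minimal sketch of a rules list. The token functions (make_keyword, make_number, make_plus, make_power), the precedence values, and the helper collect_operators are hypothetical names invented for this example; passing function objects rather than function names, and the exact layout of the resulting operator table, are assumptions rather than documented behaviour of the class.

```python
import re

# Hypothetical token functions: each receives the matched string and
# returns a (TOKEN, SPECIAL_VALUE) pair.
def make_keyword(text):
    return (text.upper(), text)

def make_number(text):
    return ("NUM", int(text))

def make_plus(text):
    return ("+", text)

def make_power(text):
    return ("^", text)

# Order matters: "else" is a substring of "elseif", so the rule for the
# longer keyword is listed first.
rules_list = [
    (r"elseif", make_keyword),
    (r"else", make_keyword),
    (r"[0-9]+", make_number),
    (r"\+", make_plus, ("+", 10, "left")),    # op = (TOKEN, PRECEDENCE, ASSOC)
    (r"\^", make_power, ("^", 30, "right")),
    (r"\s+", ""),                             # funct == "": match is ignored
]

# The regular expressions are given uncompiled; compiling them here only
# checks that each pattern is valid.
compiled = [re.compile(rule[0]) for rule in rules_list]

def collect_operators(rules):
    """Gather the optional op field into {TOKEN: (PRECEDENCE, ASSOC)} (assumed layout)."""
    operators = {}
    for rule in rules:
        if len(rule) == 3:
            token, precedence, assoc = rule[2]
            operators[token] = (precedence, assoc)
    return operators

operators = collect_operators(rules_list)
```

Rules without an op field simply contribute nothing to the operator table.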

scan(self, string)

Performs the lexical analysis on string

Returns:
a list of tokens: pairs (TOKEN, SPECIAL_VALUE) for recognized elements and ("@UNK", string) for the others
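
A caller will often want to know whether any input was left unrecognized. The sketch below works on a hand-written token list in the documented shape; the helper unknown_fragments and the sample tokens are hypothetical and not part of the class.

```python
def unknown_fragments(tokens):
    """Return the strings that the lexer could not match to any rule."""
    return [value for token, value in tokens if token == "@UNK"]

# Hand-written sample in the shape scan is documented to return.
tokens = [("NUM", 12), ("+", "+"), ("@UNK", "?"), ("NUM", 3)]

leftovers = unknown_fragments(tokens)
if leftovers:
    print("unrecognized input:", leftovers)
```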

scanOneRule(self, rule, st)

Scans the space st according to only one rule

Parameters:
  • rule - one rule (re, funct, op)
  • st - a list of strings and already matched structures
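
One way such a single-rule pass could work is sketched below. This is not the class's actual source: the function name scan_one_rule, the use of re.finditer, and the choice to skip empty matches are all assumptions made for illustration.

```python
import re

def scan_one_rule(rule, st):
    """Apply one rule to every still-unmatched string in st.

    st mixes plain strings (not yet matched) with tuples (already matched).
    """
    pattern = re.compile(rule[0])
    funct = rule[1]
    result = []
    for item in st:
        if not isinstance(item, str):
            result.append(item)            # already a token: leave untouched
            continue
        pos = 0
        for match in pattern.finditer(item):
            if match.end() == match.start():
                continue                   # skip empty matches
            if match.start() > pos:
                result.append(item[pos:match.start()])  # text before the match
            if funct != "":
                result.append(funct(match.group()))     # emit the token
            pos = match.end()
        if pos < len(item):
            result.append(item[pos:])      # text after the last match
    return result

number_rule = (r"[0-9]+", lambda s: ("NUM", int(s)))
space = scan_one_rule(number_rule, ["a 12 b", ("ID", "x"), "7"])
```

Applying the rules one after another in list order is what makes rule order significant: whatever an earlier rule matches is no longer a plain string when later rules run.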

scanUnknown(self, st)

Scans the resulting structure, turning the remaining unmatched strings into Unknown tokens

Unknown parts will be of the form ("@UNK", string)
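
A minimal sketch of this final pass, assuming the structure is a flat list in which plain strings are the still-unmatched parts and tuples are already matched tokens; the function name scan_unknown is hypothetical and this is not the class's actual source.

```python
def scan_unknown(st):
    """Wrap every leftover plain string in st as an ("@UNK", string) pair."""
    return [("@UNK", item) if isinstance(item, str) else item for item in st]

# Sample structure as it might look after all rules have been applied.
structure = [("NUM", 12), "?", ("+", "+"), ("NUM", 3)]
tokens = scan_unknown(structure)
```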