Class Tokenizer

object --+
         |
        Tokenizer

A DNS master file format tokenizer.

A token is a (type, value) tuple, where type is an int, and value is a string. The valid types are EOF, EOL, WHITESPACE, IDENTIFIER, QUOTED_STRING, COMMENT, and DELIMITER.
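
As a rough illustration of the (type, value) shape described above, the sketch below builds a few token tuples by hand. The numeric values given to the type constants, the `TYPE_NAMES` table, and the `describe` helper are assumptions made for this example and are not taken from the tokenizer module.

```python
# Assumed numeric values, for illustration only; the description above states
# just that the type is an int.
EOF, EOL, WHITESPACE, IDENTIFIER, QUOTED_STRING, COMMENT, DELIMITER = range(7)

# Hypothetical lookup table used to print a readable name for each type.
TYPE_NAMES = {
    EOF: "EOF",
    EOL: "EOL",
    WHITESPACE: "WHITESPACE",
    IDENTIFIER: "IDENTIFIER",
    QUOTED_STRING: "QUOTED_STRING",
    COMMENT: "COMMENT",
    DELIMITER: "DELIMITER",
}


def describe(token):
    """Render a (type, value) token tuple as readable text."""
    ttype, value = token
    return "%s %r" % (TYPE_NAMES[ttype], value)


# A hand-written token sequence, not the output of a real tokenizer run.
tokens = [
    (IDENTIFIER, "www"),
    (QUOTED_STRING, "hello world"),
    (EOL, "\n"),
    (EOF, ""),
]
for token in tokens:
    print(describe(token))
```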

Instance Methods
__init__(self, f=sys.stdin, filename=None)
Initialize a tokenizer instance.
string _get_char(self)
Read a character from input.
(string, int) tuple where(self)
Return the current location in the input. The first item is the filename of the input, the second is the current line number.
_unget_char(self, c)
Unget a character.
int skip_whitespace(self)
Consume input until a non-whitespace character is encountered.
Token object get(self, want_leading=False, want_comment=False)
Get the next token.
unget(self, token)
Unget a token.
(int, string) next(self)
Return the next item in an iteration.
(int, string) __next__(self)
Return the next item in an iteration.
__iter__(self)
int get_int(self)
Read the next token and interpret it as an integer.
int get_uint8(self)
Read the next token and interpret it as an 8-bit unsigned integer.
int get_uint16(self)
Read the next token and interpret it as a 16-bit unsigned integer.
int get_uint32(self)
Read the next token and interpret it as a 32-bit unsigned integer.
string get_string(self, origin=None)
Read the next token and interpret it as a string.
string get_identifier(self, origin=None)
Read the next token and raise an exception if it is not an identifier.
dns.name.Name object get_name(self, origin=None)
Read the next token and interpret it as a DNS name.
string get_eol(self)
Read the next token and raise an exception if it isn't EOL or EOF.
get_ttl(self)

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Instance Variables
dict delimiters
The current delimiter dictionary.
bool eof
This variable is true if the tokenizer has encountered EOF.
file file
The file to tokenize.
string filename
A filename that will be returned by the where method.
int line_number
The current line number.
int multiline
The current multiline level.
bool quoting
This variable is true if the tokenizer is currently reading a quoted string.
string ungotten_char
The most recently ungotten character, or None.
(int, string) token tuple ungotten_token
The most recently ungotten token, or None.

Properties

Inherited from object: __class__

Method Details

__init__(self, f=sys.stdin, filename=None)
(Constructor)

Initialize a tokenizer instance.

Parameters:
  • f (file or string) - The file to tokenize. The default is sys.stdin. This parameter may also be a string, in which case the tokenizer will take its input from the contents of the string.
  • filename (string) - The filename that the where method will return.
Overrides: object.__init__
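
One way a constructor could accept either a file or a string for f is to wrap string input in an in-memory file, roughly as sketched below. The helper name `open_input` and the placeholder filenames are assumptions made for this example, not behaviour documented on this page.

```python
import io
import sys


def open_input(f=sys.stdin, filename=None):
    """Return a (file-like object, filename) pair for file or string input."""
    if isinstance(f, str):
        # String input: read from the contents of the string.
        f = io.StringIO(f)
        if filename is None:
            filename = "<string>"  # assumed placeholder name
    elif filename is None:
        # Assumed placeholder names for file input without a filename.
        filename = "<stdin>" if f is sys.stdin else "<file>"
    return f, filename


source, name = open_input("www 3600 IN A 10.0.0.1\n")
print(name, repr(source.read()))
```

Wrapping the string early means the rest of a tokenizer only ever has to deal with one kind of input object.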

_unget_char(self, c)

Unget a character.

The unget buffer for characters is only one character large; it is an error to try to unget a character when the unget buffer is not empty.
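
The one-character unget buffer described above can be sketched as follows. This is a standalone illustration and not the library's code: the `CharReader` class is hypothetical, and `RuntimeError` stands in for whatever exception the tokenizer actually raises.

```python
import io


class CharReader:
    """Hypothetical reader with a one-character unget buffer."""

    def __init__(self, text):
        self._file = io.StringIO(text)
        self._ungotten = None  # holds at most one character

    def get_char(self):
        # Serve the ungotten character first, then fall back to the input.
        if self._ungotten is not None:
            c, self._ungotten = self._ungotten, None
            return c
        return self._file.read(1)  # returns "" at end of input

    def unget_char(self, c):
        # Stand-in exception: the buffer can hold only one character.
        if self._ungotten is not None:
            raise RuntimeError("unget buffer is full")
        self._ungotten = c


reader = CharReader("ab")
first = reader.get_char()
reader.unget_char(first)   # push "a" back
print(reader.get_char())   # reads the ungotten character again
```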

Parameters:
  • c (string) - the character to unget
Raises:

skip_whitespace(self)

Consume input until a non-whitespace character is encountered.

The non-whitespace character is then ungotten, and the number of whitespace characters consumed is returned.

If the tokenizer is in multiline mode, newlines are also treated as whitespace.

Returns: int
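
The behaviour described above can be approximated with a small standalone function. In this sketch the input is assumed to be a seekable text stream, "ungetting" is emulated with `seek`, whitespace is assumed to mean space and tab, and the `multiline` argument is a hypothetical stand-in for the tokenizer's multiline level.

```python
import io


def skip_whitespace(stream, multiline=0):
    """Consume whitespace from a seekable text stream; return the count skipped."""
    skipped = 0
    while True:
        pos = stream.tell()
        c = stream.read(1)
        is_space = c in (" ", "\t")
        is_multiline_newline = c == "\n" and multiline > 0
        if not (is_space or is_multiline_newline):
            # Non-whitespace character (or end of input): put it back.
            stream.seek(pos)
            return skipped
        skipped += 1


print(skip_whitespace(io.StringIO("  \t x")))
print(skip_whitespace(io.StringIO(" \n a"), multiline=1))
```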

get(self, want_leading=False, want_comment=False)

Get the next token.

Parameters:
  • want_leading (bool) - If True, return a WHITESPACE token if the first character read is whitespace. The default is False.
  • want_comment (bool) - If True, return a COMMENT token if the first token read is a comment. The default is False.
Returns: Token object
Raises:

unget(self, token)

Unget a token.

The unget buffer for tokens is only one token large; it is an error to try to unget a token when the unget buffer is not empty.
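
A common use of a one-token unget buffer is single-token lookahead: get a token, inspect it, and put it back. The sketch below shows that pattern with a hypothetical `TokenBuffer` class over plain tuples; the string type tags and the `RuntimeError` are assumptions made for this example.

```python
class TokenBuffer:
    """Hypothetical token source with a one-token unget buffer."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self._ungotten = None  # holds at most one token

    def get(self):
        if self._ungotten is not None:
            token, self._ungotten = self._ungotten, None
            return token
        # Assumed end-of-input marker once the tokens run out.
        return next(self._tokens, ("EOF", ""))

    def unget(self, token):
        # Stand-in exception: only one token can be pending at a time.
        if self._ungotten is not None:
            raise RuntimeError("unget buffer is full")
        self._ungotten = token


def peek(buf):
    """Look at the next token without consuming it."""
    token = buf.get()
    buf.unget(token)
    return token


buf = TokenBuffer([("IDENTIFIER", "IN"), ("IDENTIFIER", "A")])
print(peek(buf))   # inspect the next token
print(buf.get())   # the same token is still available
```

Because the buffer holds a single token, two calls to `unget` without a `get` in between fail in this sketch.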

Parameters:
  • token (Token object) - the token to unget
Raises:

get_int(self)

Read the next token and interpret it as an integer.

Returns: int
Raises:

get_uint8(self)

Read the next token and interpret it as an 8-bit unsigned integer.

Returns: int
Raises:

get_uint16(self)

Read the next token and interpret it as a 16-bit unsigned integer.

Returns: int
Raises:

get_uint32(self)

Read the next token and interpret it as a 32-bit unsigned integer.
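
The 8-, 16-, and 32-bit variants differ only in the range they accept. A minimal sketch of that range check, applied to text that has already been read, might look like this; the `parse_uint` helper and its use of `ValueError` are assumptions for illustration, not the library's implementation.

```python
def parse_uint(text, bits):
    """Parse decimal text as an unsigned integer that fits in `bits` bits."""
    if not (text.isascii() and text.isdigit()):
        # Stand-in exception for a token that is not a plain integer.
        raise ValueError("expecting an integer, got %r" % text)
    value = int(text)
    if value >= 1 << bits:
        raise ValueError("%d does not fit in %d bits" % (value, bits))
    return value


print(parse_uint("255", 8))          # largest 8-bit value
print(parse_uint("65535", 16))       # largest 16-bit value
print(parse_uint("4294967295", 32))  # largest 32-bit value
```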

Returns: int
Raises:

get_string(self, origin=None)

Read the next token and interpret it as a string.

Returns: string
Raises:

get_identifier(self, origin=None)

Read the next token and raise an exception if it is not an identifier.

Returns: string
Raises:

get_name(self, origin=None)

Read the next token and interpret it as a DNS name.

Returns: dns.name.Name object
Raises:

get_eol(self)

Read the next token and raise an exception if it isn't EOL or EOF.

Returns: string
Raises:

Instance Variable Details

multiline

The current multiline level. This value is increased by one every time a '(' delimiter is read, and decreased by one every time a ')' delimiter is read.
Type: int
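
The counting rule for the multiline level can be traced with a short standalone function. This sketch is simplified: it ignores quoted strings and comments, and raising `ValueError` on an unmatched ')' is an assumption made for the example. The name `multiline_levels` is hypothetical.

```python
def multiline_levels(text):
    """Return the multiline level in effect after each character of text."""
    level = 0
    levels = []
    for c in text:
        if c == "(":
            level += 1
        elif c == ")":
            if level == 0:
                # Assumed handling of an unmatched closing parenthesis.
                raise ValueError("unbalanced parentheses")
            level -= 1
        levels.append(level)
    return levels


# The level is 1 between the parentheses, so a newline there would count
# as whitespace rather than as the end of the line.
print(multiline_levels("a (\nb ) c"))
```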