
Class Tokenizer

object --+
         |
        Tokenizer

A DNS master file format tokenizer.

A token object is basically a (type, value) tuple. The valid types are EOF, EOL, WHITESPACE, IDENTIFIER, QUOTED_STRING, COMMENT, and DELIMITER.
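As a rough illustration of the (type, value) shape described above, a token can be modelled with a named tuple. The names Token, ttype, and is_eol_or_eof, and the integer values of the constants, are assumptions made for this sketch; they are not a statement of how the library defines them.

```python
from collections import namedtuple

# Hypothetical constants for the seven token types listed above.
# The numeric values are arbitrary choices for this sketch.
EOF, EOL, WHITESPACE, IDENTIFIER, QUOTED_STRING, COMMENT, DELIMITER = range(7)

# A token pairs a type with the text it was read from.
Token = namedtuple("Token", ["ttype", "value"])


def is_eol_or_eof(token):
    """Return True if the token ends a line or the whole input."""
    return token.ttype in (EOL, EOF)


tok = Token(IDENTIFIER, "example.com.")
ttype, value = tok  # unpacks like a plain (type, value) tuple
```

A named tuple keeps the tuple behaviour (unpacking, equality with plain tuples) while giving the two fields readable names.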

file: The file to tokenize.

ungotten_char: The most recently ungotten character, or None.

ungotten_token: The most recently ungotten token, or None.

multiline: The current multiline level. This value is increased by one every time a '(' delimiter is read, and decreased by one every time a ')' delimiter is read.

quoting: This variable is true if the tokenizer is currently reading a quoted string.

eof: This variable is true if the tokenizer has encountered EOF.

delimiters: The current delimiter dictionary.

line_number: The current line number.

filename: A filename that will be returned by the where() method.

Instance Methods

__init__(self, f=sys.stdin, filename=None)
Initialize a tokenizer instance.

_get_char(self)
Read a character from input.

where(self)
Return the current location in the input.

_unget_char(self, c)
Unget a character.

skip_whitespace(self)
Consume input until a non-whitespace character is encountered.

get(self, want_leading=False, want_comment=False)
Get the next token.

unget(self, token)
Unget a token.

next(self)
Return the next item in an iteration.

__next__(self)
Return the next item in an iteration.

__iter__(self)

get_int(self, base=10)
Read the next token and interpret it as an integer.

get_uint8(self)
Read the next token and interpret it as an 8-bit unsigned integer.

get_uint16(self, base=10)
Read the next token and interpret it as a 16-bit unsigned integer.

get_uint32(self)
Read the next token and interpret it as a 32-bit unsigned integer.

get_string(self, origin=None)
Read the next token and interpret it as a string.

get_identifier(self, origin=None)
Read the next token, which should be an identifier.

get_name(self, origin=None)
Read the next token and interpret it as a DNS name.

get_eol(self)
Read the next token and raise an exception if it isn't EOL or EOF.

get_ttl(self)
Read the next token and interpret it as a DNS TTL.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, f=sys.stdin, filename=None)
(Constructor)

Initialize a tokenizer instance.

f: The file to tokenize. The default is sys.stdin. This parameter may also be a string, in which case the tokenizer will take its input from the contents of the string.

filename: The filename that the where() method will return.

Overrides: object.__init__
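One way to accept either a file or a string, as the f parameter does, is to wrap strings in an in-memory file object. The helper below is a minimal sketch of that idea only; open_input is a hypothetical name, and the placeholder labels used when no filename is given are assumptions for illustration.

```python
import io
import sys


def open_input(f=sys.stdin, filename=None):
    """Return a (file_like, filename) pair for the given input.

    Hypothetical helper: a string is wrapped in io.StringIO so the rest
    of the code can always call read() on a file-like object.
    """
    if isinstance(f, str):
        f = io.StringIO(f)
        if filename is None:
            filename = "<string>"  # placeholder label chosen for this sketch
    elif filename is None:
        filename = "<stdin>" if f is sys.stdin else "<file>"
    return f, filename


stream, name = open_input("www 300 IN A 192.0.2.1\n")
first_char = stream.read(1)
```

Normalizing the input once in the constructor means every later read can treat the source uniformly, whatever the caller passed in.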

where(self)

Return the current location in the input.

Returns a (string, int) tuple. The first item is the filename of the input, the second is the current line number.

_unget_char(self, c)

Unget a character.

The unget buffer for characters is only one character large; it is an error to try to unget a character when the unget buffer is not empty.

c: the character to unget

Raises UngetBufferFull: there is already an ungotten char

skip_whitespace(self)

Consume input until a non-whitespace character is encountered.

The non-whitespace character is then ungotten, and the number of whitespace characters consumed is returned.

If the tokenizer is in multiline mode, then newlines are whitespace.

Returns the number of characters skipped.
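The behaviour described above (count whitespace, push back the first non-whitespace character, treat newlines as whitespace only in multiline mode) can be sketched as follows. WhitespaceSkipper is a hypothetical stand-in that models only this one method; treating just space and tab as whitespace is an assumption of the sketch.

```python
import io


class WhitespaceSkipper:
    """Hypothetical stand-in that models only whitespace skipping."""

    def __init__(self, text, multiline=0):
        self.file = io.StringIO(text)
        self.ungotten_char = None
        self.multiline = multiline  # > 0 means we are inside parentheses

    def _get_char(self):
        # Prefer the ungotten character, if any, over fresh input.
        if self.ungotten_char is not None:
            c, self.ungotten_char = self.ungotten_char, None
            return c
        return self.file.read(1)  # '' at end of input

    def skip_whitespace(self):
        skipped = 0
        while True:
            c = self._get_char()
            is_space = c in (" ", "\t") or (c == "\n" and self.multiline > 0)
            if not is_space:
                # Put the non-whitespace character back for the next reader.
                self.ungotten_char = c if c != "" else None
                return skipped
            skipped += 1


single = WhitespaceSkipper("  \n x")
multi = WhitespaceSkipper("  \n x", multiline=1)
```

With the same input, single stops at the newline while multi skips past it, because only multi is inside parentheses.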

get(self, want_leading=False, want_comment=False)

Get the next token.

want_leading: If True, return a WHITESPACE token if the first character read is whitespace. The default is False.

want_comment: If True, return a COMMENT token if the first token read is a comment. The default is False.

Raises dns.exception.UnexpectedEnd: input ended prematurely

Raises dns.exception.SyntaxError: input was badly formed

Returns a Token.

unget(self, token)

Unget a token.

The unget buffer for tokens is only one token large; it is an error to try to unget a token when the unget buffer is not empty.

token: the token to unget

Raises UngetBufferFull: there is already an ungotten token
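A one-slot unget buffer is enough to build a simple look-ahead: get a token, inspect it, and unget it. The sketch below models that pattern with a list-backed source. TokenSource and peek are hypothetical names, and UngetBufferFull is redefined locally so the example is self-contained.

```python
class UngetBufferFull(Exception):
    """Stand-in for the exception named above; defined locally for the sketch."""


class TokenSource:
    """Hypothetical token source backed by a list, with a one-slot unget buffer."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self.ungotten_token = None

    def get(self):
        # Serve the ungotten token first, then fall back to fresh input.
        if self.ungotten_token is not None:
            token, self.ungotten_token = self.ungotten_token, None
            return token
        return next(self._tokens, None)  # None once the list is exhausted

    def unget(self, token):
        if self.ungotten_token is not None:
            raise UngetBufferFull
        self.ungotten_token = token

    def peek(self):
        """Look at the next token without consuming it (get, then unget)."""
        token = self.get()
        self.unget(token)
        return token


src = TokenSource(["IN", "A"])
peeked = src.peek()   # the token stays available
first = src.get()     # the same token, now consumed
src.unget(first)      # the buffer is occupied again
```

Because the buffer holds a single token, a second unget before the next get is an error in this sketch, which mirrors the rule stated above.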

next(self)

Return the next item in an iteration.

Returns a Token.

__next__(self)

Return the next item in an iteration.

Returns a Token.
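Providing both next() and __next__() is the usual way to support the Python 2 and Python 3 iterator protocols with one class. The sketch below shows the pairing on a hypothetical TokenStream; ending the iteration when an EOF marker is reached is an assumption of the sketch, as is the string used for that marker.

```python
EOF = "EOF"  # placeholder marker for this sketch


class TokenStream:
    """Hypothetical iterable that mimics the next()/__next__() pairing."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def get(self):
        # Return EOF forever once the underlying list runs out.
        if self._pos >= len(self._tokens):
            return EOF
        token = self._tokens[self._pos]
        self._pos += 1
        return token

    def __iter__(self):
        return self

    def __next__(self):
        token = self.get()
        if token == EOF:
            raise StopIteration  # assumption: iteration ends at EOF
        return token

    next = __next__  # Python 2 spelling of the same method


collected = list(TokenStream(["www", "300", "IN", "A"]))
```

With __iter__ returning the object itself, the stream can be used directly in a for loop or passed to list().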

get_int(self, base=10)

Read the next token and interpret it as an integer.

Raises dns.exception.SyntaxError if not an integer.

Returns an int.

get_uint8(self)

Read the next token and interpret it as an 8-bit unsigned integer.

Raises dns.exception.SyntaxError if not an 8-bit unsigned integer.

Returns an int.

get_uint16(self, base=10)

Read the next token and interpret it as a 16-bit unsigned integer.

Raises dns.exception.SyntaxError if not a 16-bit unsigned integer.

Returns an int.

get_uint32(self)

Read the next token and interpret it as a 32-bit unsigned integer.

Raises dns.exception.SyntaxError if not a 32-bit unsigned integer.

Returns an int.
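The three unsigned readers differ only in the range they accept, so the check can be sketched with a single helper parameterized by bit width. parse_uint and TokenSyntaxError are hypothetical names; the latter stands in for dns.exception.SyntaxError so the example needs no third-party imports. Note that Python's int() is more permissive than a strict digit check (it accepts surrounding whitespace and a leading sign), so this is an approximation.

```python
class TokenSyntaxError(Exception):
    """Local stand-in for dns.exception.SyntaxError."""


def parse_uint(text, bits, base=10):
    """Parse text as an unsigned integer that fits in the given bit width."""
    try:
        value = int(text, base)
    except ValueError:
        raise TokenSyntaxError("expecting an integer, got %r" % text)
    limit = (1 << bits) - 1  # 255, 65535, or 4294967295
    if value < 0 or value > limit:
        raise TokenSyntaxError(
            "%d is not an unsigned %d-bit integer" % (value, bits)
        )
    return value


octet = parse_uint("255", 8)
port = parse_uint("177777", 16, base=8)  # octal input, 16-bit range
serial = parse_uint("4294967295", 32)
```

Values outside the range, such as "256" for 8 bits, raise TokenSyntaxError in this sketch.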

get_string(self, origin=None)

Read the next token and interpret it as a string.

Raises dns.exception.SyntaxError if not a string.

Returns a string.

get_identifier(self, origin=None)

Read the next token, which should be an identifier.

Raises dns.exception.SyntaxError if not an identifier.

Returns a string.

get_name(self, origin=None)

Read the next token and interpret it as a DNS name.

Raises dns.exception.SyntaxError if not a name.

Returns a dns.name.Name.

get_eol(self)

Read the next token and raise an exception if it isn't EOL or EOF.

Raises dns.exception.SyntaxError if the token is not EOL or EOF.

Returns a string.

get_ttl(self)

Read the next token and interpret it as a DNS TTL.

Raises dns.exception.SyntaxError if the token is not an identifier, or dns.ttl.BadTTL if the TTL is badly formed.

Returns an int.