Package dns :: Module tokenizer :: Class Tokenizer

Type Tokenizer

object --+
         |
        Tokenizer


A DNS master file format tokenizer.

A token is a (type, value) tuple, where type is an int, and value is a string. The valid types are EOF, EOL, WHITESPACE, IDENTIFIER, QUOTED_STRING, COMMENT, and DELIMITER.
Method Summary
  __init__(self, f, filename)
Initialize a tokenizer instance.
  __iter__(self)
(int, string) tuple get(self, want_leading, want_comment)
Get the next token.
string get_eol(self)
Read the next token and raise an exception if it isn't EOL or EOF.
int get_int(self)
Read the next token and interpret it as an integer.
dns.name.Name object get_name(self, origin)
Read the next token and interpret it as a DNS name.
string get_string(self, origin)
Read the next token and interpret it as a string.
  get_ttl(self)
int get_uint16(self)
Read the next token and interpret it as a 16-bit unsigned integer.
int get_uint32(self)
Read the next token and interpret it as a 32-bit unsigned integer.
int get_uint8(self)
Read the next token and interpret it as an 8-bit unsigned integer.
(int, string) next(self)
Return the next item in an iteration.
int skip_whitespace(self)
Consume input until a non-whitespace character is encountered.
  unget(self, token)
Unget a token.
(string, int) tuple where(self)
Return the current location in the input.
string _get_char(self)
Read a character from input.
  _unget_char(self, c)
Unget a character.
    Inherited from object
  __delattr__(...)
x.__delattr__('name') <==> del x.name
  __getattribute__(...)
x.__getattribute__('name') <==> x.name
  __hash__(x)
x.__hash__() <==> hash(x)
  __new__(T, S, ...)
T.__new__(S, ...) -> a new object with type S, a subtype of T
  __reduce__(...)
helper for pickle
  __reduce_ex__(...)
helper for pickle
  __repr__(x)
x.__repr__() <==> repr(x)
  __setattr__(...)
x.__setattr__('name', value) <==> x.name = value
  __str__(x)
x.__str__() <==> str(x)

Instance Variable Summary
dict delimiters: The current delimiter dictionary.
bool eof: This variable is true if the tokenizer has encountered EOF.
file file: The file to tokenize.
string filename: A filename that will be returned by the where method.
int line_number: The current line number.
int multiline: The current multiline level.
bool quoting: This variable is true if the tokenizer is currently reading a quoted string.
string ungotten_char: The most recently ungotten character, or None.
(int, string) token tuple ungotten_token: The most recently ungotten token, or None.

Method Details

__init__(self, f=sys.stdin, filename=None)
(Constructor)

Initialize a tokenizer instance.
Parameters:
f - The file to tokenize. The default is sys.stdin. This parameter may also be a string, in which case the tokenizer will take its input from the contents of the string.
           (type=file or string)
filename - The filename that the where method will return.
           (type=string)
Overrides:
__builtin__.object.__init__

get(self, want_leading=False, want_comment=False)

Get the next token.
Parameters:
want_leading - If True, return a WHITESPACE token if the first character read is whitespace. The default is False.
           (type=bool)
want_comment - If True, return a COMMENT token if the first token read is a comment. The default is False.
           (type=bool)
Returns:
(int, string) tuple
Raises:
dns.exception.UnexpectedEnd - input ended prematurely
dns.exception.SyntaxError - input was badly formed

get_eol(self)

Read the next token and raise an exception if it isn't EOL or EOF.
Returns:
string
Raises:
dns.exception.SyntaxError - the next token is neither EOL nor EOF

get_int(self)

Read the next token and interpret it as an integer.
Returns:
int
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as an integer

get_name(self, origin=None)

Read the next token and interpret it as a DNS name.
Returns:
dns.name.Name object
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as a DNS name

get_string(self, origin=None)

Read the next token and interpret it as a string.
Returns:
string
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as a string

get_uint16(self)

Read the next token and interpret it as a 16-bit unsigned integer.
Returns:
int
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as a 16-bit unsigned integer

get_uint32(self)

Read the next token and interpret it as a 32-bit unsigned integer.
Returns:
int
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as a 32-bit unsigned integer

get_uint8(self)

Read the next token and interpret it as an 8-bit unsigned integer.
Returns:
int
Raises:
dns.exception.SyntaxError - the next token cannot be interpreted as an 8-bit unsigned integer
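
The 8-, 16- and 32-bit readers differ only in the range they accept. The following is a minimal sketch of such a range check, not the module's own code: parse_unsigned and TokenSyntaxError are hypothetical names, and raising on out-of-range values is an assumption about how a failed interpretation is reported.

```python
import re


class TokenSyntaxError(Exception):
    """Hypothetical stand-in for dns.exception.SyntaxError."""


def parse_unsigned(text, bits):
    """Parse text as an unsigned integer that fits in `bits` bits."""
    # Accept only plain ASCII decimal digits; signs and spaces are rejected.
    if not re.fullmatch(r"[0-9]+", text):
        raise TokenSyntaxError("expecting an integer, got %r" % (text,))
    value = int(text)
    limit = (1 << bits) - 1  # 255, 65535 or 4294967295
    if value > limit:
        raise TokenSyntaxError("%d is not in the range 0..%d" % (value, limit))
    return value
```

With this sketch, parse_unsigned("255", 8) returns 255, while parse_unsigned("256", 8) raises TokenSyntaxError.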

next(self)

Return the next item in an iteration.
Returns:
(int, string)
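
One way an iterator can be layered on a get-style method is sketched below. TokenStream is a hypothetical stand-in that serves tokens from a prepared list, the numeric values of EOF and IDENTIFIER are assumed, and ending the iteration at the EOF token is an assumption about the real class.

```python
EOF = 0         # assumed value; the real constants live in dns.tokenizer
IDENTIFIER = 3  # assumed value


class TokenStream(object):
    """Hypothetical stand-in that serves tokens from a prepared list."""

    def __init__(self, tokens):
        self._tokens = list(tokens)

    def get(self):
        # Return the next prepared token, or an EOF token when exhausted.
        if self._tokens:
            return self._tokens.pop(0)
        return (EOF, '')

    def __iter__(self):
        return self

    def __next__(self):
        # Stop iterating as soon as get() reports the end of input.
        token = self.get()
        if token[0] == EOF:
            raise StopIteration
        return token

    next = __next__  # Python 2 spelling, matching the method documented here


tokens = list(TokenStream([(IDENTIFIER, 'example.'), (IDENTIFIER, 'IN')]))
```

Here tokens holds the two IDENTIFIER tuples; the EOF token itself is never yielded.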

skip_whitespace(self)

Consume input until a non-whitespace character is encountered.

The non-whitespace character is then ungotten, and the number of whitespace characters consumed is returned.

If the tokenizer is in multiline mode, then newlines are whitespace.
Returns:
int
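
The behaviour described above can be sketched against a simple character source. CharReader is a hypothetical class, not the tokenizer itself; treating only space and tab as whitespace outside multiline mode is an assumption.

```python
import io


class CharReader(object):
    """Hypothetical character source with a one-character unget slot."""

    def __init__(self, text, multiline=0):
        self._stream = io.StringIO(text)
        self._ungotten = None
        self.multiline = multiline

    def get_char(self):
        # Serve the ungotten character first, then read from the stream.
        if self._ungotten is not None:
            c, self._ungotten = self._ungotten, None
            return c
        return self._stream.read(1)  # '' at end of input

    def unget_char(self, c):
        self._ungotten = c

    def skip_whitespace(self):
        skipped = 0
        while True:
            c = self.get_char()
            # Newlines count as whitespace only inside parentheses.
            is_space = c in (' ', '\t') or (c == '\n' and self.multiline > 0)
            if not is_space:
                if c != '':
                    self.unget_char(c)  # leave it for the next reader
                return skipped
            skipped += 1
```

For the input "  \tfoo" the sketch returns 3 and the next character read is 'f'. For " \n x" it returns 1 outside multiline mode and 3 inside it.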

unget(self, token)

Unget a token.

The unget buffer for tokens is only one token large; it is an error to try to unget a token when the unget buffer is not empty.
Parameters:
token - the token to unget
           (type=(int, string) token tuple)
Raises:
UngetBufferFull - there is already an ungotten token
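
A one-slot unget buffer of the kind described above can be sketched as follows. OneTokenBuffer and its take method are hypothetical names used for illustration, and the UngetBufferFull class here is a local stand-in for the exception named above.

```python
class UngetBufferFull(Exception):
    """Local stand-in for the UngetBufferFull exception named above."""


class OneTokenBuffer(object):
    """Hypothetical holder for at most one ungotten token."""

    def __init__(self):
        self.ungotten_token = None

    def unget(self, token):
        # Refuse a second token while the first is still waiting.
        if self.ungotten_token is not None:
            raise UngetBufferFull
        self.ungotten_token = token

    def take(self):
        # Hand back the stored token (or None) and empty the slot.
        token, self.ungotten_token = self.ungotten_token, None
        return token
```

A caller that needs to look ahead by more than one token has to keep its own list, since a second unget before the first token is consumed raises UngetBufferFull.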

where(self)

Return the current location in the input.
Returns:
(string, int) tuple. The first item is the filename of the input, the second is the current line number.
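
The returned tuple is convenient for building error messages. A small sketch, in which format_location is a hypothetical helper and the "filename:line: message" layout is only one possible choice:

```python
def format_location(where, message):
    """Prefix message with 'filename:line: ' built from a where() result."""
    filename, line_number = where
    return "%s:%d: %s" % (filename, line_number, message)


text = format_location(('example.zone', 12), 'unexpected token')
```

Here text is 'example.zone:12: unexpected token'.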

_get_char(self)

Read a character from input.
Returns:
string

_unget_char(self, c)

Unget a character.

The unget buffer for characters is only one character large; it is an error to try to unget a character when the unget buffer is not empty.
Parameters:
c - the character to unget
           (type=string)
Raises:
UngetBufferFull - there is already an ungotten char

Instance Variable Details

delimiters

The current delimiter dictionary.
Type:
dict

eof

This variable is true if the tokenizer has encountered EOF.
Type:
bool

file

The file to tokenize.
Type:
file

filename

A filename that will be returned by the where method.
Type:
string

line_number

The current line number.
Type:
int

multiline

The current multiline level. This value is increased by one every time a '(' delimiter is read, and decreased by one every time a ')' delimiter is read.
Type:
int
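
The counting rule can be illustrated with a short sketch. multiline_levels is a hypothetical helper that tracks the level character by character; it deliberately ignores quoting and comments, which a real tokenizer has to take into account.

```python
def multiline_levels(text):
    """Return the multiline level in effect after each character of text."""
    level = 0
    levels = []
    for c in text:
        if c == '(':
            level += 1
        elif c == ')':
            level -= 1
        levels.append(level)
    return levels


levels = multiline_levels("a ( b\nc ) d")
```

Here levels is [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]: the newline falls at level 1, so it would be treated as whitespace rather than as the end of a line.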

quoting

This variable is true if the tokenizer is currently reading a quoted string.
Type:
bool

ungotten_char

The most recently ungotten character, or None.
Type:
string

ungotten_token

The most recently ungotten token, or None.
Type:
(int, string) token tuple

Generated by Epydoc 2.1 on Sun Dec 10 12:46:08 2006 http://epydoc.sf.net