Parsers

Newick

Parser for trees represented in newick format.

The main functions are loads() and dumps(), which read/write a tree from/to its newick text representation.

When reading a newick file, the argument parser=... specifies which kind of parser to use.

The classical ones in ete are the following:

Format  Description                              Example
0 (*)   internal nodes with support (flexible)   ((D:2,E:5)1.0:9,(F:6,G):7);
1       internal nodes with names (flexible)     ((D:2,E:5)B:9,(F:6,G):7);
2       internal w/ support, all values present  ((D:2,E:5)1.0:9,(F:6,G:3)1.0:7);
3       internal w/ names, all values present    ((D:2,E:5)B:9,(F:6,G:3)C:7);
4       names and lengths for leaves only        ((D:2,E:5),(F:6,G:3));
5       leaf names and all lengths               ((D:2,E:5):9,(F:6,G:3):7);
6       leaf names and internal lengths          ((D,F):6,(B,H):8);
7       all names and leaf lengths               ((D:2,E:5)B,(F:6,G:3)C);
8       all names (leaves and internal nodes)    ((D,E)B,(F,G)C);
9       leaf names only                          ((D,E),(F,G));
100     topology only                            ((,),(,));

where the example tree would look (more or less) like:

       ╭╴D:2
 ╭╴B:9╶┤
 │     ╰╴E:5
╶┤
 │     ╭╴F:6
 ╰╴C:7╶┤
       ╰╴G:3

There are other valid values for parser:

  • 'name', same as 1

  • 'support', same as 0

  • 'multisupport', internal nodes look like ((X:5)80/100:7)..., that is, have multiple values of support separated by /
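To make the 'multisupport' case concrete, here is a minimal sketch of how a support field such as 80/100 could be split into numbers and joined back. The helper names read_multisupport_sketch and write_multisupport_sketch are hypothetical and are not part of the module; the library's own handling may differ.

```python
def read_multisupport_sketch(text):
    """Split a support field like '80/100' into a list of floats."""
    return [float(x) for x in text.split('/')]


def write_multisupport_sketch(values):
    """Join support values back into a '/'-separated field."""
    return '/'.join('%g' % float(v) for v in values)


values = read_multisupport_sketch('80/100')
print(values)
print(write_multisupport_sketch(values))
```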

More generally, parser can be a dictionary that specifies in detail how to read/write each field. It must say, for leaf and internal nodes, what p0:p1 means (which properties they are, including how to read and write them). For example, the default parser looks like:

PARSER_DEFAULT = {
    'leaf':     [NAME,    DIST],  # ((name:dist)x:y);
    'internal': [SUPPORT, DIST],  # ((x:y)support:dist);
}

where NAME, DIST and SUPPORT are “property dicts”: each one names a property (pname) and gives the functions used to read it from a string and to write it back to a string. For example, DIST is:

DIST = {'pname': 'dist', 'read': float, 'write': lambda x: '%g' % float(x)}
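
The sketch below illustrates how a parser dictionary of this shape could drive reading and writing of the p0:p1 fields. DIST is copied from the text above; NAME and SUPPORT are assumed to follow the same shape, and read_fields and write_fields are hypothetical helpers written for this illustration, not functions of the module. It ignores quoted names and NHX properties.

```python
# DIST is taken from the documentation above. NAME and SUPPORT are
# assumptions that follow the same shape (the real NAME presumably
# also takes care of quoting).
NAME = {'pname': 'name', 'read': str, 'write': str}
DIST = {'pname': 'dist', 'read': float, 'write': lambda x: '%g' % float(x)}
SUPPORT = {'pname': 'support', 'read': float,
           'write': lambda x: '%g' % float(x)}

PARSER_DEFAULT = {
    'leaf':     [NAME,    DIST],  # ((name:dist)x:y);
    'internal': [SUPPORT, DIST],  # ((x:y)support:dist);
}


def read_fields(content, is_leaf, parser=PARSER_DEFAULT):
    """Hypothetical helper: turn 'p0:p1' into a dict of properties."""
    prop_dicts = parser['leaf' if is_leaf else 'internal']
    props = {}
    for field, pdict in zip(content.split(':'), prop_dicts):
        if field:  # empty fields are skipped
            props[pdict['pname']] = pdict['read'](field)
    return props


def write_fields(props, is_leaf, parser=PARSER_DEFAULT):
    """Hypothetical helper: turn a dict of properties into 'p0:p1'."""
    prop_dicts = parser['leaf' if is_leaf else 'internal']
    fields = [pdict['write'](props[pdict['pname']])
              if pdict['pname'] in props else ''
              for pdict in prop_dicts]
    return ':'.join(fields).rstrip(':')  # drop a trailing empty field


print(read_fields('D:2', is_leaf=True))
print(read_fields('1.0:9', is_leaf=False))
print(write_fields({'dist': 7.0}, is_leaf=False))
```

Swapping NAME in place of SUPPORT in the 'internal' list would make the same helpers read internal nodes as name:dist, which is the difference between formats 0 and 1 in the table.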
exception NewickError

content_repr(node, props=None, parser=None)

Return content of a node as represented in newick format.

dump(tree, fp, props=None, parser=None, format_root_node=True, is_leaf_fn=None)
dumps(tree, props=None, parser=None, format_root_node=True, is_leaf_fn=None)

Return newick representation of the given tree.

error(text)

get_extended_props(unicode text)

Return a dict with the properties extracted from the text in NHX format.

Example: '&&NHX:x=foo:y=bar' -> {'x': 'foo', 'y': 'bar'}
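
The mapping in that example can be reproduced with a few lines of plain Python. This is only a sketch of the idea, under the assumption that keys and values contain no ':' characters; extended_props_sketch is a hypothetical name and not the module's implementation.

```python
def extended_props_sketch(text):
    """Parse '&&NHX:k1=v1:k2=v2' into {'k1': 'v1', 'k2': 'v2'}."""
    prefix = '&&NHX:'
    if not text.startswith(prefix):
        raise ValueError('not an NHX text: %r' % text)
    props = {}
    for pair in text[len(prefix):].split(':'):
        if pair:  # tolerate empty pieces such as a trailing ':'
            key, value = pair.split('=', 1)
            props[key] = value
    return props


print(extended_props_sketch('&&NHX:x=foo:y=bar'))
```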

load(fp, parser=None)
loads(unicode text, parser=None, tree_class=Tree)

Return tree from its newick representation.

make_parser(parser=None, name='%s', dist='%g', support='%g')

Return parser changing the format of properties name, dist or support.

prop_repr(prop)

Return a newick-acceptable representation of the given property.

quote(name, escaped_chars=" \t\r\n()[]':;,")

Return the name quoted if it has any characters that need escaping.

read_content(unicode text, long pos, endings=u', );')

Return content starting at position pos in text, and where it ends.
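
As an illustration of what "content" and "where it ends" mean here, the following sketch scans from pos until it meets one of the ending characters, treating anything inside single quotes as part of the content. It is a simplified stand-in with a hypothetical name (read_content_sketch), it does not handle bracketed comments, and it is not the module's code.

```python
def read_content_sketch(text, pos, endings=', );'):
    """Return (content, end) for the node content that starts at pos."""
    start = pos
    while pos < len(text) and text[pos] not in endings:
        if text[pos] == "'":
            # Jump to the closing quote so that ending characters inside
            # a quoted name are ignored. str.index raises ValueError if
            # the quote is never closed.
            pos = text.index("'", pos + 1)
        pos += 1
    return text[start:pos], pos


newick = '((D:2,E:5)B:9,(F:6,G:3)C:7);'
print(read_content_sketch(newick, 2))
print(read_content_sketch("'a,b':1,x", 0))
```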

read_node(unicode text, long pos, dict parser, tree_class=Tree, check_req=True)

Return a node and the position in the text where it ends.

read_nodes(unicode text, long pos, dict parser, tree_class=Tree)

Return a list of nodes and the position in the text where they end.

read_props(unicode text, long pos, is_leaf, dict parser, check_req)

Return the properties from the content of a node, and where it ends.

Example (for the default format of a leaf node):

'abc:123[&&NHX:x=foo]' -> {'name': 'abc', 'dist': 123, 'x': 'foo'}

repr_short(obj, max_len=50)

Return a representation of the given object, limited in length.

skip_content(unicode text, long pos, endings=u', );')

Return the position where the content ends.

skip_quoted_name(unicode text, long pos)

Return the position where a quoted name ends.

skip_spaces_and_comments(unicode text, long pos)

Return the position in text reached after skipping, starting at pos, all whitespace and comments.
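
A rough sketch of this kind of skipping is shown below. It assumes that comments are delimited by square brackets and that a block starting with [&&NHX holds properties rather than a comment; both points are assumptions made for the illustration, and skip_spaces_and_comments_sketch is a hypothetical name.

```python
def skip_spaces_and_comments_sketch(text, pos):
    """Return the first position at or after pos that is neither
    whitespace nor inside a [bracketed] comment."""
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
        elif text[pos] == '[' and not text.startswith('[&&NHX', pos):
            end = text.find(']', pos)
            if end == -1:
                raise ValueError('unfinished comment at position %d' % pos)
            pos = end + 1  # continue right after the closing bracket
        else:
            break
    return pos


text = '  [a comment] (A,B);'
pos = skip_spaces_and_comments_sketch(text, 0)
print(pos, repr(text[pos]))
```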

unquote(name)

Return the name unquoted if it was quoted.
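
The pair quote() / unquote() can be pictured with the sketch below. It assumes the usual newick convention of wrapping a name in single quotes and doubling any single quote inside it; the functions quote_sketch and unquote_sketch are hypothetical stand-ins, and only the default value of escaped_chars is taken from the signature above.

```python
def quote_sketch(name, escaped_chars=" \t\r\n()[]':;,"):
    """Quote the name only if it contains a character to escape."""
    if any(c in escaped_chars for c in name):
        return "'" + name.replace("'", "''") + "'"
    return name


def unquote_sketch(name):
    """Undo quote_sketch: strip the outer quotes and undouble inner ones."""
    if len(name) >= 2 and name[0] == name[-1] == "'":
        return name[1:-1].replace("''", "'")
    return name


for name in ['Homo', 'Homo sapiens', "O'Brien"]:
    quoted = quote_sketch(name)
    print(quoted, '->', unquote_sketch(quoted))
```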

Nexus

Read trees from a file in nexus format.

exception NexusError

apply_translations(translate, newick, parser=None)

Return newick with node names translated according to the given dict.
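
To show the effect of such a translation, here is a deliberately simplified sketch that works on the newick text with a regular expression. It only renames unquoted leaf names that directly follow "(" or ",", and it ignores the parser argument; apply_translations_sketch is a hypothetical name, and the actual function may well work on a parsed tree instead.

```python
import re

# An unquoted name right after '(' or ',' (that is, a leaf name).
LEAF_NAME = re.compile(r"(?<=[(,])[^\s():,;\[\]']+")


def apply_translations_sketch(translate, newick):
    """Return newick with leaf names replaced according to translate."""
    return LEAF_NAME.sub(
        lambda m: translate.get(m.group(0), m.group(0)), newick)


translate = {'1': 'Homo', '2': 'Pan'}
print(apply_translations_sketch(translate, '((1:2,2:5):9,3:1);'))
```

Names without an entry in the dict (like 3 here) are left unchanged, and branch lengths are untouched because they follow ":" rather than "(" or ",".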

get_commands(text_section)

Return a dict that for each command has a list with its arguments.

get_section(text, section_name)

Return commands ({name: [args]}) that correspond to the given section.

get_sections(text)

Return {section: commands} read from the full text of a nexus file.

get_trees(text, parser=None)

Return trees as {name: newick} with all the name transformations done.

load(fp, parser=None)
loads(text, parser=None)