Parsers
Newick
Parser for trees represented in newick format.
The main functions are loads() and dumps(), which read/write a tree from/to its newick text representation.
When reading a newick file, the argument parser=... specifies which kind of parser to use.
The classical ones in ete are the following:
Format | Description | Example |
---|---|---|
0 (*) | internal nodes with support (flexible) | ((D:2,E:5)1.0:9,(F:6,G):7); |
1 | internal nodes with names (flexible) | ((D:2,E:5)B:9,(F:6,G):7); |
2 | internal w/ support, all values present | ((D:2,E:5)1.0:9,(F:6,G:3)1.0:7); |
3 | internal w/ names, all values present | ((D:2,E:5)B:9,(F:6,G:3)C:7); |
4 | names and lengths for leaves only | ((D:2,E:5),(F:6,G:3)); |
5 | leaf names and all lengths | ((D:2,E:5):9,(F:6,G:3):7); |
6 | leaf names and internal lengths | ((D,F):6,(B,H):8); |
7 | all names and leaf lengths | ((D:2,E:5)B,(F:6,G:3)C); |
8 | all names (leaves and internal nodes) | ((D,E)B,(F,G)C); |
9 | leaf names only | ((D,E),(F,G)); |
100 | topology only | ((,),(,)); |
where the example tree would look (more or less) like:
       ╭╴D:2
 ╭╴B:9╶┤
 │     ╰╴E:5
╶┤
 │     ╭╴F:6
 ╰╴C:7╶┤
       ╰╴G:3
There are other valid values for parser:
- 'name', same as 1
- 'support', same as 0
- 'multisupport', internal nodes look like ((X:5)80/100:7)..., that is, have multiple values of support separated by /
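To make the 'multisupport' layout concrete, the following is a minimal sketch of how a field such as 80/100 could be read and written. The names read_multisupport and write_multisupport are hypothetical helpers introduced here for illustration; they are not part of the module.

```python
def read_multisupport(text):
    """Hypothetical reader: '80/100' -> [80.0, 100.0]."""
    return [float(x) for x in text.split('/')]


def write_multisupport(values):
    """Hypothetical writer: [80.0, 100.0] -> '80/100'."""
    return '/'.join('%g' % v for v in values)


supports = read_multisupport('80/100')
print(supports)
print(write_multisupport(supports))
```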
More generally, parser can be a dictionary that specifies in detail how to read/write each field. It must say, for leaf and internal nodes, what p0:p1 means (which properties they are, including how to read and write them). For example, the default parser looks like:
PARSER_DEFAULT = {
    'leaf':     [NAME,    DIST],  # ((name:dist)x:y);
    'internal': [SUPPORT, DIST],  # ((x:y)support:dist);
}
where NAME and DIST are “property dicts” that hold all the information for a property: its name (pname) and the functions to apply to read it from, and write it to, a string. For example, DIST is:
DIST = {'pname': 'dist', 'read': float, 'write': lambda x: '%g' % float(x)}
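As a rough illustration of how such a parser dictionary could drive reading and writing, the sketch below applies property dicts to the p0:p1 part of a node. DIST is copied from the text; NAME and SUPPORT are assumed analogues, and read_p0_p1 and write_p0_p1 are hypothetical helpers, not functions of this module. Quoted names containing ':' are not handled.

```python
# DIST is taken from the text above. NAME and SUPPORT are assumptions
# written in the same shape; the library's own definitions may differ.
DIST = {'pname': 'dist', 'read': float, 'write': lambda x: '%g' % float(x)}
NAME = {'pname': 'name', 'read': str, 'write': str}
SUPPORT = {'pname': 'support', 'read': float,
           'write': lambda x: '%g' % float(x)}

PARSER = {
    'leaf': [NAME, DIST],         # ((name:dist)x:y);
    'internal': [SUPPORT, DIST],  # ((x:y)support:dist);
}


def read_p0_p1(content, is_leaf, parser=PARSER):
    """Hypothetical helper: split 'p0:p1' and apply each 'read' function."""
    p0, p1 = parser['leaf' if is_leaf else 'internal']
    text0, _, text1 = content.partition(':')
    props = {}
    if text0:
        props[p0['pname']] = p0['read'](text0)
    if text1:
        props[p1['pname']] = p1['read'](text1)
    return props


def write_p0_p1(props, is_leaf, parser=PARSER):
    """Hypothetical helper: the inverse, using each 'write' function."""
    p0, p1 = parser['leaf' if is_leaf else 'internal']
    text0 = p0['write'](props[p0['pname']]) if p0['pname'] in props else ''
    text1 = p1['write'](props[p1['pname']]) if p1['pname'] in props else ''
    return text0 + (':' + text1 if text1 else '')


print(read_p0_p1('D:2', is_leaf=True))
print(read_p0_p1('1.0:9', is_leaf=False))
print(write_p0_p1({'name': 'D', 'dist': 2.0}, is_leaf=True))
```

Swapping NAME for SUPPORT in the 'internal' entry is all it takes in this sketch to switch between reading internal labels as names or as support values, which mirrors the difference between formats 1 and 0 in the table.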
- exception NewickError
- content_repr(node, props=None, parser=None)
Return content of a node as represented in newick format.
- dump(tree, fp, props=None, parser=None, format_root_node=True, is_leaf_fn=None)
- dumps(tree, props=None, parser=None, format_root_node=True, is_leaf_fn=None)
Return newick representation of the given tree.
- error(text)
- get_extended_props(unicode text)
Return a dict with the properties extracted from the text in NHX format.
Example: '&&NHX:x=foo:y=bar' -> {'x': 'foo', 'y': 'bar'}
- load(fp, parser=None)
- loads(unicode text, parser=None, tree_class=Tree)
Return tree from its newick representation.
- make_parser(parser=None, name='%s', dist='%g', support='%g')
Return parser changing the format of properties name, dist or support.
- prop_repr(prop)
Return a newick-acceptable representation of the given property.
- quote(name, escaped_chars=" \t\r\n()[]':;,")
Return the name quoted if it has any characters that need escaping.
- read_content(unicode text, long pos, endings=u', );')
Return content starting at position pos in text, and where it ends.
- read_node(unicode text, long pos, dict parser, tree_class=Tree, check_req=True)
Return a node and the position in the text where it ends.
- read_nodes(unicode text, long pos, dict parser, tree_class=Tree)
Return a list of nodes and the position in the text where they end.
- read_props(unicode text, long pos, is_leaf, dict parser, check_req)
Return the properties from the content of a node, and where it ends.
Example (for the default format of a leaf node): 'abc:123[&&NHX:x=foo]' -> {'name': 'abc', 'dist': 123, 'x': 'foo'}
- repr_short(obj, max_len=50)
Return a representation of the given object, limited in length.
- skip_content(unicode text, long pos, endings=u', );')
Return the position where the content ends.
- skip_quoted_name(unicode text, long pos)
Return the position where a quoted name ends.
- skip_spaces_and_comments(unicode text, long pos)
Return position in text after pos and all whitespaces and comments.
- unquote(name)
Return the name unquoted if it was quoted.
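The quoting and NHX behavior described in the entries above can be approximated in plain Python. The sketch below is illustrative only and is not the module's code: all four function names are hypothetical, doubling an inner single quote is an assumption based on the common newick convention, and the leaf reader ignores quoted names that contain ':' or '['.

```python
ESCAPED_CHARS = " \t\r\n()[]':;,"  # default shown in the quote() signature


def quote_sketch(name, escaped_chars=ESCAPED_CHARS):
    """Quote name only if it contains a character that needs escaping."""
    if any(c in escaped_chars for c in name):
        # Assumption: inner single quotes are doubled.
        return "'" + name.replace("'", "''") + "'"
    return name


def unquote_sketch(name):
    """Undo quote_sketch(); leave unquoted names untouched."""
    if len(name) >= 2 and name.startswith("'") and name.endswith("'"):
        return name[1:-1].replace("''", "'")
    return name


def extended_props_sketch(text):
    """Turn '&&NHX:x=foo:y=bar' into {'x': 'foo', 'y': 'bar'}."""
    prefix = '&&NHX:'
    if not text.startswith(prefix):
        raise ValueError('not an NHX comment: %r' % text)
    props = {}
    for pair in text[len(prefix):].split(':'):
        key, sep, value = pair.partition('=')
        if not sep:
            raise ValueError('expected key=value, got %r' % pair)
        props[key] = value
    return props


def leaf_props_sketch(content):
    """Read 'name:dist[&&NHX:...]' for a leaf in the default format."""
    main, bracket, rest = content.partition('[')
    name, _, dist = main.partition(':')
    props = {}
    if name:
        props['name'] = unquote_sketch(name)
    if dist:
        props['dist'] = float(dist)
    if bracket:
        props.update(extended_props_sketch(rest.rstrip(']')))
    return props


print(quote_sketch('Homo sapiens'))
print(leaf_props_sketch('abc:123[&&NHX:x=foo]'))
```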
Nexus
Read trees from a file in nexus format.
- apply_translations(translate, newick, parser=None)
Return newick with node names translated according to the given dict.
- get_commands(text_section)
Return a dict that for each command has a list with its arguments.
- get_section(text, section_name)
Return commands ({name: [args]}) that correspond to the given section.
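To give an idea of what translating node names in a newick string involves, here is a deliberately simplified sketch. It is not how the module does it: translate_names is a hypothetical helper that works on the raw text with a regular expression, assumes unquoted names and no comments, and replaces any label found in the dict, including labels of internal nodes.

```python
import re

# A label is a run of non-punctuation characters that directly follows
# '(', ',' or ')'. Branch lengths follow ':' and are therefore skipped.
LABEL = re.compile(r"(?<=[(,)])([^(),:;\[\]'\s]+)")


def translate_names(newick, translate):
    """Replace each label that appears as a key in translate."""
    return LABEL.sub(lambda m: translate.get(m.group(1), m.group(1)), newick)


translate = {'1': 'D', '2': 'E', '3': 'F'}
print(translate_names('((1:2,2:5):9,3:7);', translate))
```

A parser-based approach (read the tree, rename the nodes, write it back) avoids the limitations listed above, at the cost of a full parse.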