The latest ncbi-taxonomist Docker image can be pulled from registry.gitlab.com/janpb/ncbi-taxonomist:latest
ncbi-taxonomist Documentation
Release 1.2.1+8580b9b
Jan P Buchmann
2020-11-15

Contents:

1 Installation
2 Basic functions
3 Cookbook
4 Container
5 Frequently Asked Questions
6 Module references
7 Synopsis
8 Requirements and Dependencies
9 Contact
10 Indices and tables
Python Module Index
Index

CHAPTER 1 Installation

Content
• Local pip install (no root required)
• Global pip install (root required)

ncbi-taxonomist is available on PyPI via pip. If you use a Python package manager other than pip, please consult its documentation. If you are installing ncbi-taxonomist on a non-Linux system, consider the proposed methods as guidelines and adjust as required.

Important: Please note
If some of the proposed commands are unfamiliar to you, don’t just invoke them but look them up, e.g. in man pages or search online. Should you be unfamiliar with pip, check pip -h.

Note: Python 3 vs. Python 2
Because Python 2 and Python 3 co-exist on many systems, some installation commands may have to be invoked slightly differently. In addition, development and support for Python 2 stopped in January 2020, and it should not be used anymore. ncbi-taxonomist requires Python >= 3.8. Depending on your OS and/or distribution, the default pip command can install either Python 2 or Python 3 packages. Make sure you use pip for Python 3, e.g. pip3 on Ubuntu.

1.1 Local pip install (no root required)

    $: pip install ncbi-taxonomist --user

On Linux, ncbi-taxonomist will be installed to $HOME/.local/bin. If you cannot invoke ncbi-taxonomist from the command line, it's likely that $HOME/.local/bin is not in your $PATH (check echo $PATH).
In such a case, choose one of the following possibilities:
• add $HOME/.local/bin to your $PATH:
  – echo 'export PATH="${PATH}:${HOME}/.local/bin"' >> ~/.bashrc
• add an alias:
  – see man bash or https://www.tldp.org/LDP/abs/html/aliases.html
• use $HOME/.local/bin/ncbi-taxonomist explicitly

1.2 Global pip install (root required)

    $: pip install ncbi-taxonomist

ncbi-taxonomist should now be in /usr/local/bin and in your $PATH.

CHAPTER 2 Basic functions

All ncbi-taxonomist commands have the following underlying structure:

    ncbi-taxonomist <command> <options>

This section shows the basic usage of ncbi-taxonomist. More complex examples, including data extraction with jq, can be found in the Cookbook chapter. The output is a single JSON object or XML tree per line for each queried taxid, name, or accession. The examples show pretty-printed single results for clarity only.

Contents
• Collect
  – Output format
    * JSON output
    * XML output
• Map
  – Taxids and names
  – Mapping accession
  – Supported access Entrez databases
  – Output format
    * JSON output
      · Single mapping result
      · Multiple mapping results
    * XML output
      · Single mapping result
      · Multiple mapping results
• Resolve
  – Taxids and names
  – Accessions
  – Output format
    * JSON output
      · Single mapping result
      · Multiple mapping results
    * XML output
      · Single mapping result
      · Multiple mapping results
• Import
  – Local database schema
  – Import taxa via collect
  – Import taxa via resolve
  – Import accessions
• Subtree
  – Collecting subtrees
    * Between two given ranks
    * Collect one specific rank
    * Collect from a given rank to root and print XML
    * Collect from a given rank to lowest rank
  – Output format
    * JSON output
    * XML output
• Group
  – Creating a group
  – Retrieve a group

2.1 Collect

The collect command fetches taxa from the Entrez database. If taxids or names share parts of the same lineage, the shared taxa are printed only once.
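The "printed only once" behaviour for shared lineages can be pictured with a small sketch. This is an illustration only, not ncbi-taxonomist's implementation: the function name merge_lineages and the shortened lineages are hypothetical, while the taxids, ranks, and names are those of the human and chimpanzee taxa used in this section.

```python
# Hypothetical illustration; NOT how ncbi-taxonomist is implemented.
# Each lineage is listed from the queried species up to the first
# ancestor shared by both species (Homininae, taxid 207598).
human = [
    {"taxid": 9606, "rank": "species", "parentid": 9605, "name": "Homo sapiens"},
    {"taxid": 9605, "rank": "genus", "parentid": 207598, "name": "Homo"},
    {"taxid": 207598, "rank": "subfamily", "parentid": 9604, "name": "Homininae"},
]
chimpanzee = [
    {"taxid": 9598, "rank": "species", "parentid": 9596, "name": "Pan troglodytes"},
    {"taxid": 9596, "rank": "genus", "parentid": 207598, "name": "Pan"},
    {"taxid": 207598, "rank": "subfamily", "parentid": 9604, "name": "Homininae"},
]


def merge_lineages(*lineages):
    """Return every taxon exactly once, in first-seen order, keyed by taxid."""
    seen = {}
    for lineage in lineages:
        for taxon in lineage:
            # setdefault keeps the first occurrence and ignores repeats
            seen.setdefault(taxon["taxid"], taxon)
    return list(seen.values())


merged = merge_lineages(human, chimpanzee)
for taxon in merged:
    print(taxon["taxid"], taxon["rank"], taxon["name"])
```

Homininae appears in both input lineages but only once in the merged result, which mirrors the single Homininae line in the collect output for chimpanzee and human.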
2.1.1 Output format

The output describes the collected taxa, one per line. A single taxon has the following structure, for example chimpanzee (tx9598):

    {
      "taxid" : 9598,
      "rank" : "species",
      "parentid" : 9596,
      "name" : "Pan troglodytes",
      "names" : {
        "Pan troglodytes" : "scientific_name",
        "chimpanzee" : "GenbankCommonName"
      }
    }

Collecting taxa for chimpanzee and human:

    ncbi-taxonomist collect -n chimpanzee human

JSON output

{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}
{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"}
{"taxid":33154,"rank":"clade","names":{"Opisthokonta":"scientific_name"},"parentid":2759,"name":"Opisthokonta"}
{"taxid":33208,"rank":"kingdom","names":{"Metazoa":"scientific_name"},"parentid":33154,"name":"Metazoa"}
{"taxid":6072,"rank":"clade","names":{"Eumetazoa":"scientific_name"},"parentid":33208,"name":"Eumetazoa"}
{"taxid":33213,"rank":"clade","names":{"Bilateria":"scientific_name"},"parentid":6072,"name":"Bilateria"}
{"taxid":33511,"rank":"clade","names":{"Deuterostomia":"scientific_name"},"parentid":33213,"name":"Deuterostomia"}
{"taxid":7711,"rank":"phylum","names":{"Chordata":"scientific_name"},"parentid":33511,"name":"Chordata"}
{"taxid":89593,"rank":"subphylum","names":{"Craniata":"scientific_name"},"parentid":7711,"name":"Craniata"}
{"taxid":7742,"rank":"clade","names":{"Vertebrata":"scientific_name"},"parentid":89593,"name":"Vertebrata"}
{"taxid":7776,"rank":"clade","names":{"Gnathostomata":"scientific_name"},"parentid":7742,"name":"Gnathostomata"}
{"taxid":117570,"rank":"clade","names":{"Teleostomi":"scientific_name"},"parentid":7776,"name":"Teleostomi"}
{"taxid":117571,"rank":"clade","names":{"Euteleostomi":"scientific_name"},"parentid":117570,"name":"Euteleostomi"}
{"taxid":8287,"rank":"superclass","names":{"Sarcopterygii":"scientific_name"},"parentid":117571,"name":"Sarcopterygii"}
{"taxid":1338369,"rank":"clade","names":{"Dipnotetrapodomorpha":"scientific_name"},"parentid":8287,"name":"Dipnotetrapodomorpha"}
{"taxid":32523,"rank":"clade","names":{"Tetrapoda":"scientific_name"},"parentid":1338369,"name":"Tetrapoda"}
{"taxid":32524,"rank":"clade","names":{"Amniota":"scientific_name"},"parentid":32523,"name":"Amniota"}
{"taxid":40674,"rank":"class","names":{"Mammalia":"scientific_name"},"parentid":32524,"name":"Mammalia"}
{"taxid":32525,"rank":"clade","names":{"Theria":"scientific_name"},"parentid":40674,"name":"Theria"}
{"taxid":9347,"rank":"clade","names":{"Eutheria":"scientific_name"},"parentid":32525,"name":"Eutheria"}
{"taxid":1437010,"rank":"clade","names":{"Boreoeutheria":"scientific_name"},"parentid":9347,"name":"Boreoeutheria"}
{"taxid":314146,"rank":"superorder","names":{"Euarchontoglires":"scientific_name"},"parentid":1437010,"name":"Euarchontoglires"}
{"taxid":9443,"rank":"order","names":{"Primates":"scientific_name"},"parentid":314146,"name":"Primates"}
{"taxid":376913,"rank":"suborder","names":{"Haplorrhini":"scientific_name"},"parentid":9443,"name":"Haplorrhini"}
{"taxid":314293,"rank":"infraorder","names":{"Simiiformes":"scientific_name"},"parentid":376913,"name":"Simiiformes"}
{"taxid":9526,"rank":"parvorder","names":{"Catarrhini":"scientific_name"},"parentid":314293,"name":"Catarrhini"}
{"taxid":314295,"rank":"superfamily","names":{"Hominoidea":"scientific_name"},"parentid":9526,"name":"Hominoidea"}
{"taxid":9604,"rank":"family","names":{"Hominidae":"scientific_name"},"parentid":314295,"name":"Hominidae"}
{"taxid":207598,"rank":"subfamily","names":{"Homininae":"scientific_name"},"parentid":9604,"name":"Homininae"}
{"taxid":9605,"rank":"genus","names":{"Homo":"scientific_name"},"parentid":207598,"name":"Homo"}
{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"}
{"taxid":9596,"rank":"genus","names":{"Pan":"scientific_name"},"parentid":207598,"name":"Pan"}
{"taxid":9598,"rank":"species","names":{"Pan troglodytes":"scientific_name","chimpanzee":"GenbankCommonName"},"parentid":9596,"name":"Pan troglodytes"}

XML output

<taxon><taxid>131567</taxid><rank>no rank</rank><name>cellular organisms</name><parentid>None</parentid><names><name type="scientific_name">cellular organisms</name></names></taxon>
<taxon><taxid>2759</taxid><rank>superkingdom</rank><name>Eukaryota</name><parentid>131567</parentid><names><name type="scientific_name">Eukaryota</name></names></taxon>
<taxon><taxid>33154</taxid><rank>clade</rank><name>Opisthokonta</name><parentid>2759</parentid><names><name type="scientific_name">Opisthokonta</name></names></taxon>
<taxon><taxid>33208</taxid><rank>kingdom</rank><name>Metazoa</name><parentid>33154</parentid><names><name type="scientific_name">Metazoa</name></names></taxon>
<taxon><taxid>6072</taxid><rank>clade</rank><name>Eumetazoa</name><parentid>33208</parentid><names><name type="scientific_name">Eumetazoa</name></names></taxon>
<taxon><taxid>33213</taxid><rank>clade</rank><name>Bilateria</name><parentid>6072</parentid><names><name type="scientific_name">Bilateria</name></names></taxon>
<taxon><taxid>33511</taxid><rank>clade</rank><name>Deuterostomia</name><parentid>33213</parentid><names><name type="scientific_name">Deuterostomia</name></names></taxon>
<taxon><taxid>7711</taxid><rank>phylum</rank><name>Chordata</name><parentid>33511</parentid><names><name type="scientific_name">Chordata</name></names></taxon>