A tool kit for accessing NCBI's GenBank
A tool kit for downloading and curating collections of genomes retrieved from the National Center for Biotechnology Information’s public database, GenBank.
Requires rsync. Tested only with rsync version 3.1.2 protocol version 31.
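If you want to confirm which rsync you have before running NCBITK, the check can be scripted. The following is a minimal sketch, not part of NCBITK: the helper names (`parse_rsync_version`, `installed_rsync_version`) are hypothetical, and it assumes `rsync --version` prints a banner such as "rsync  version 3.1.2  protocol version 31".

```python
import re
import shutil
import subprocess

# The only version this README reports testing against.
TESTED_VERSION = (3, 1, 2)


def parse_rsync_version(banner):
    """Return (major, minor, patch) parsed from an rsync banner, or None."""
    match = re.search(r"rsync\s+version\s+v?(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_rsync_version():
    """Return the installed rsync version, or None if it cannot be determined."""
    if shutil.which("rsync") is None:
        return None  # rsync is not on PATH
    result = subprocess.run(
        ["rsync", "--version"], capture_output=True, text=True, check=False
    )
    return parse_rsync_version(result.stdout)


if __name__ == "__main__":
    version = installed_rsync_version()
    if version is None:
        print("rsync not found, or its version banner was not recognized")
    elif version != TESTED_VERSION:
        print("found rsync %d.%d.%d (untested; 3.1.2 is the tested version)" % version)
    else:
        print("found rsync 3.1.2 (tested version)")
```

A different version is not necessarily a problem; the check only tells you whether you are outside the tested configuration.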
Install with pip:
pip install ncbitk
Or install from source:
git clone https://github.com/andrewsanchez/NCBITK.git
cd NCBITK
python setup.py install
Regardless of which installation method you choose, I recommend using a virtual environment.
Download all GenBank bacteria:
ncbitk [directory] --update
If you have already run NCBITK, the same command updates your local collection: it removes old genomes that are no longer in the assembly summary and downloads the latest assembly versions.
Get the status of your collection:
ncbitk [directory] --status
This will tell you how many genomes you have, what is missing from your collection, and how many deprecated genomes are present.
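The status report comes down to comparing two sets of genome identifiers: what is in your local directory and what the assembly summary currently lists. The sketch below illustrates that comparison only; it is not NCBITK's actual implementation. The function name `collection_status` and the accession IDs are hypothetical, and it assumes each genome can be identified by a versioned accession string.

```python
def collection_status(local_ids, summary_ids):
    """Compare local genome IDs against the IDs listed in the assembly summary.

    Both arguments are iterables of accession strings (an assumption of this
    sketch; NCBITK may track genomes differently).
    """
    local = set(local_ids)
    summary = set(summary_ids)
    return {
        # Number of genomes currently in the local collection.
        "present": len(local),
        # Listed in the assembly summary but not yet downloaded.
        "missing": sorted(summary - local),
        # Held locally but no longer listed in the assembly summary.
        "deprecated": sorted(local - summary),
    }


if __name__ == "__main__":
    # Made-up accession IDs, for illustration only.
    local = ["GCA_000000001.1", "GCA_000000002.1"]
    summary = ["GCA_000000002.1", "GCA_000000003.1"]
    print(collection_status(local, summary))
```

Under the same assumptions, an update would download the "missing" entries and remove the "deprecated" ones.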