# ProDuSe
An analysis pipeline, helper scripts and Python classes to **Pro**cess **Du**plex **Se**quence data

## Description

## Installation

### Dependencies

You will need to install the following tools before installing the ProDuSe package:

* `python==2.7`
* `bwa==0.7.12`
* `samtools==1.3.1`
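
Before installing, it can help to confirm that the external tools are reachable. The following is a minimal sketch, not part of ProDuSe: `missing_tools` is a hypothetical helper, it only checks that the `bwa` and `samtools` executables are on `PATH` (it does not verify the pinned versions), and it uses `shutil.which`, which requires Python 3.3 or later, so it would need to run outside the Python 2.7 environment that ProDuSe itself uses.

```python
import shutil

# Executables named in the dependency list above.
REQUIRED_TOOLS = ("bwa", "samtools")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("bwa and samtools were found on PATH")
```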

To install the ProDuSe package run the following command:

    pip install ProDuSe

## Running ProDuSe

### The Analysis Pipeline

You will first need to retrieve two configuration files:

#### `config.ini`
* command line arguments for each stage in the analysis pipeline
* retrieve a sample config.ini file [here](

#### `sample_config.ini`
* paired fastq files for all samples you wish to run the analysis pipeline on
* retrieve a sample sample_config.ini file [here](

To set up the analysis pipeline, run the following command:

    produse analysis_pipeline \
        -c config.ini \
        -sc sample_config.ini \
        -r /path/to/ref.fa \
        -o /path/to/output

Once the above command has completed successfully, change to the following directory:

    cd /path/to/output/produse_analysis_directory

This directory includes a subdirectory for each sample listed in `sample_config.ini`, as well as a Makefile. To run the analysis pipeline, run:

    make -j 4

The `-j 4` option lets `make` run up to four jobs in parallel. Adjust the value to suit the number of available cores and the number of samples to process.
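
One way to pick a value for `-j` is sketched below. This is a simple heuristic rather than anything ProDuSe prescribes: `suggest_jobs` is a hypothetical helper that assumes one parallel job per sample is the most that is useful and caps the result at the number of cores reported by the operating system.

```python
import os


def suggest_jobs(n_samples, n_cores=None):
    """Suggest a value for `make -j` (hypothetical helper, simple heuristic)."""
    if n_cores is None:
        # os.cpu_count() may return None when the count cannot be determined.
        n_cores = os.cpu_count() or 1
    # Never suggest fewer than one job, nor more than cores or samples allow.
    return max(1, min(n_samples, n_cores))


if __name__ == "__main__":
    print("make -j {}".format(suggest_jobs(n_samples=6)))
```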

### Helper Scripts

The ProDuSe package includes a variety of helper scripts to aid in the analysis of duplex sequencing data.

To list all scripts included in the installed package, run:

    produse -h

#### produse adapter_predict

To confirm the expected adapter sequence of a sample, run the following command:

    produse adapter_predict -i input1.fastq input2.fastq

This tool prints a predicted adapter sequence based on the abundance of A, C, G and T at each position. For each position, it compares the observed abundances with the abundances expected for each IUPAC base code, unambiguous or ambiguous, and reports the closest match.
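
The idea can be illustrated with a short sketch. It is not the code used by `produse adapter_predict`: the function names are hypothetical, the expected abundance of each IUPAC code is assumed to be uniform over the bases it allows, and "closest" is assumed to mean the smallest squared Euclidean distance.

```python
# IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expected_abundance(bases):
    """Assumed expectation: equal abundance for every allowed base."""
    return [1.0 / len(bases) if b in bases else 0.0 for b in "ACGT"]


def closest_iupac(counts):
    """Return the IUPAC code whose expected abundances best match `counts`."""
    total = float(sum(counts.get(b, 0) for b in "ACGT"))
    if total == 0:
        return "N"  # no information at this position
    observed = [counts.get(b, 0) / total for b in "ACGT"]

    def distance(code):
        expected = expected_abundance(IUPAC[code])
        return sum((o - e) ** 2 for o, e in zip(observed, expected))

    return min(IUPAC, key=distance)


def predict_adapter(sequences):
    """Predict one IUPAC code per position from equally aligned sequences."""
    predicted = []
    for column in zip(*sequences):  # stops at the shortest sequence
        counts = {b: column.count(b) for b in "ACGT"}
        predicted.append(closest_iupac(counts))
    return "".join(predicted)
```

For example, `predict_adapter(["ACGT", "ACGA"])` gives `"ACGW"`, because the last position is split evenly between A and T.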

### Python Classes

Two major Python classes are included with ProDuSe.

#### The Alignment Class

The first is the alignment class. It processes reads from a BAM file in order until both reads of a pair have been identified, at which point it yields to the developer for the first time.
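
The pairing behaviour described above can be sketched as a generator. This is an illustration only, not the ProDuSe alignment class: `iter_read_pairs` is a hypothetical function, reads are represented as plain `(name, sequence)` tuples instead of BAM records, and mates are assumed to share a read name.

```python
def iter_read_pairs(reads):
    """Yield (first_seen, second_seen) once both mates of a pair are known."""
    pending = {}  # read name -> the mate seen so far
    for read in reads:
        name = read[0]
        mate = pending.pop(name, None)
        if mate is None:
            # First mate of this pair: hold it until its partner arrives.
            pending[name] = read
        else:
            yield mate, read


if __name__ == "__main__":
    reads = [("r1", "AAA"), ("r2", "CCC"), ("r2", "GGG"), ("r1", "TTT")]
    for first, second in iter_read_pairs(reads):
        print(first, second)
```

Reads whose mate never appears stay in `pending` and are not yielded, which keeps the sketch short but would need handling in real use.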

#### The Position Class

The second is the position class. This class aims to provide a duplex-sequencing-ready mpileup class.

Full descriptions of the two Python classes can be retrieved here

## Download Files

* `ProDuSe-0.1.6.tar.gz` (39.9 kB), source distribution, uploaded Nov 18, 2016