PacMin

Assembler for PacBio reads.

Methods

We'll overlap the PacBio reads using the MinHash sketch method proposed in:

Berlin, Konstantin, et al. "Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing." bioRxiv (2014): 008003.
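
To make the idea concrete, here is a minimal sketch of MinHash over k-mers. It is not the implementation from the paper or from PacMin. The function names, the k-mer size, the number of hash functions, and the use of salted BLAKE2b as the hash family are all assumptions chosen for illustration.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer, seed):
    """Hash a k-mer to a 64-bit integer; the seed selects one hash function."""
    digest = hashlib.blake2b(
        kmer.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(seq, k=8, num_hashes=64):
    """Keep, for each hash function, the minimum hash over the read's k-mers."""
    ks = kmers(seq, k)
    if not ks:
        raise ValueError("sequence is shorter than k")
    return [min(hash_kmer(km, seed) for km in ks) for seed in range(num_hashes)]


def estimate_jaccard(sketch_a, sketch_b):
    """The fraction of matching sketch positions estimates the Jaccard
    similarity of the two reads' k-mer sets."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)
```

Two reads whose sketches agree in many positions likely share many k-mers, so they are candidates for a more careful overlap check. Comparing fixed-size sketches is much cheaper than comparing the full k-mer sets of long reads.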

Once the reads are overlapped, we will assemble the reads into a string graph. String graphs are described in:

Myers, Eugene W. "The fragment assembly string graph." Bioinformatics 21.suppl 2 (2005): ii79-ii85.
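
One step in building a string graph is removing transitive overlap edges. The sketch below shows that step only, under strong simplifying assumptions: reads are plain identifiers, every overlap is a directed edge with no length or strand, contained reads have already been dropped, and the graph is acyclic. The function name and the dictionary representation are hypothetical, and this naive loop is not the linear-time procedure from the paper.

```python
def transitive_reduction(overlaps):
    """Drop every edge a -> c that is implied by a two-step path a -> b -> c.

    overlaps maps each read id to the set of read ids that overlap its
    right end.
    """
    reduced = {read: set(succs) for read, succs in overlaps.items()}
    for a, succs in overlaps.items():
        for b in succs:
            if b == a:
                continue
            for c in overlaps.get(b, ()):
                # a -> c is assumed redundant: a -> b -> c covers the same layout.
                if c != b and c in succs:
                    reduced[a].discard(c)
    return reduced


# Three reads tiled left to right; r1 also reaches r3 directly.
overlaps = {"r1": {"r2", "r3"}, "r2": {"r3"}, "r3": set()}
print(transitive_reduction(overlaps))
```

After reduction, only the edges r1 -> r2 and r2 -> r3 remain, so the three reads form a single unambiguous path.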

We do not assume that reads are "correct"; instead, we maintain "probabilistic" overlaps between the fragments in the string graph. Once we have these probabilistic overlaps, we can estimate the ploidy of each overlap by normalizing the overlap coverage by length. We can then apply traditional genotyping methods (e.g., the likelihood estimation stages used in SAMtools) to find the consensus sequences at each overlap.
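
As a rough illustration of normalizing coverage by length, the sketch below divides the bases aligned to an overlap by the overlap's length to get a depth, then compares that depth to an expected single-copy depth. The function name, its parameters, and the idea of a known haploid_depth baseline are assumptions made for this example; they are not taken from PacMin.

```python
def estimate_copy_number(aligned_bases, overlap_length, haploid_depth):
    """Estimate how many copies an overlap represents.

    aligned_bases:  total read bases aligned across the overlap
    overlap_length: length of the overlap in bases
    haploid_depth:  assumed average depth of a single-copy region
    """
    if overlap_length <= 0 or haploid_depth <= 0:
        raise ValueError("overlap_length and haploid_depth must be positive")
    depth = aligned_bases / overlap_length  # coverage normalized by length
    return max(1, round(depth / haploid_depth))
```

For example, 60,000 aligned bases over a 1,000-base overlap give a depth of 60; with an assumed haploid depth of 30, the estimate is 2 copies.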

Getting Started

Building PacMin

PacMin uses Maven to build. To build PacMin, cd into the repository and run "mvn package".

Running PacMin

PacMin is packaged via appassembler and includes all necessary dependencies.

You might want to add the following to your .bashrc to make running PacMin easier:

alias pacmin=". $PACMIN_HOME/pacmin-cli/target/appassembler/bin/pacmin"

$PACMIN_HOME should be the path to your local checkout of PacMin. To change any Java options (e.g., memory settings such as "-Xmx4g", or Java properties), set the $JAVA_OPTS environment variable. Additional details about customizing the appassembler runtime can be found in the appassembler documentation.

Once this alias is in place, you can run PacMin by typing pacmin at the command line.

Getting In Touch

License

PacMin is released under an Apache 2.0 license.

Distribution

Snapshots of PacMin are available from the Sonatype OSS repository:

<groupId>org.bdgenomics.pacmin</groupId>
<artifactId>pacmin-core</artifactId>
<version>0.0.1-SNAPSHOT</version>

Once we've got a release, we will publish to Maven Central.
