---
title: HLi Lab - Software
layout: textlay
excerpt: Software and Resources
sitemap: false
permalink: /software
---

Software and Resources

Our lab has developed several alignment and assembly algorithms critical to high-throughput sequence analysis. These include samtools, BWA, minimap2 and hifiasm, each cited more than 1000 times per year. We also explore a variety of algorithms related to variant calling (e.g. longcallR and longcallD), pangenome analysis (e.g. minigraph and pangene), protein alignment (e.g. miniprot), full-text indexing (e.g. ropebwt3), immunology (e.g. Immuannot and T1K), evolution (e.g. psmc and compleasm) and high-performance data structures in general (e.g. bedtk and BGT). Most of our tools still work years after their initial publication and are often well received.

Software

Current

  • minipileup: simple pileup-based variant caller, unpublished
  • seqtk: a small toolkit for manipulating sequences in FASTA/FASTQ, unpublished
  • gfatools: a toolkit for working with graphs in the GFA format, unpublished
  • miniwfa: a low-memory reimplementation of the wavefront alignment algorithm, unpublished but used in minigraph
  • jstreeview: interactive phylogenetic tree viewer/editor in JavaScript, unpublished

Developed by past members or maintained by others

Old but functional

  • dna-nn: model and predict short DNA sequence features with neural networks, published in Li (2019).
  • hickit: 3D modeling for single-cell Hi-C, developed for Tan et al (2018). It was not used in that paper but was used in Longzhi Tan's later work.
  • BGT: fast and lightweight genotype query across many samples, published in Li (2016).
  • fermi, fermi2 and FermiKit: short-read assemblers, published in Li (2012) and Li (2015).
  • fermi-lite: a library in C for short-read assembly in small regions, adapted from FermiKit
  • BFC: correcting sequencing errors in short reads, published in Li (2015).
  • bioawk: BWK awk modified for biological data, unpublished
  • psmc: infer historical population sizes from a diploid genome, published in Li and Durbin (2011).

Graveyard

  • MAQ: short-read aligner, published in Li et al (2008). It still works, but there is no point in using it now.

Resources

Updated resources since publication

Unpublished resources

Graveyard