Bug 229330 - [NEW PORT] biology/hisat2: Graph-based alignment of sequencing reads to genomes
Status: In Progress
Alias: None
Product: Ports & Packages
Classification: Unclassified
Component: Individual Port(s)
Version: Latest
Hardware: Any Any
Importance: --- Affects Some People
Assignee: Yuri Victorovich
Depends on: 223273
Reported: 2018-06-25 09:15 UTC by Motomichi Matsuzaki
Modified: 2019-10-07 16:14 UTC (History)

svn diff (9.14 KB, patch)
2018-06-25 09:15 UTC, Motomichi Matsuzaki

Description Motomichi Matsuzaki 2018-06-25 09:15:50 UTC
Created attachment 194628
svn diff

HISAT2 is a fast and sensitive alignment program for mapping next-generation
sequencing reads to a population of genomes as well as to a single reference.
HISAT2 is a successor to HISAT and TopHat2, both of which are spliced alignment
programs for mapping RNA-seq reads; additionally, HISAT2 is designed to map
sequencing reads from genomic DNA of generic human population having SNPs.

WWW: https://ccb.jhu.edu/software/hisat2/

HISAT2 requires SSE2 instructions, as other modern alignment programs often do.
It can be built on i386 machines; however, CPUTYPE must be set appropriately
in /etc/make.conf.
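
As a minimal sketch of the make.conf setting mentioned above, assuming an i386
build host whose CPU supports SSE2 (pentium4 is only an illustrative choice and
is not taken from the attached patch; use the value that matches the actual CPU):

```make
# /etc/make.conf (illustrative fragment)
# Assumption: the build host is i386 and its CPU supports SSE2.
# pentium4 is one CPUTYPE value that implies SSE2 support;
# substitute the value that matches the actual CPU.
CPUTYPE?=pentium4
```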

It detects the POPCNT instruction at runtime and uses it for acceleration,
but the POPCNT code path can be built only for the amd64 architecture.

NOTE: HISAT2 can access SRA data, but this requires NCBI's NGS/VDB libraries,
whose port is currently pending (bug #223273).

Tested by: poudriere testport (amd64/i386 11.1/10.4)
Comment 1 Walter Schwarzenfeld freebsd_triage 2019-09-04 19:18:27 UTC
Comment 2 Jason W. Bacon freebsd_committer 2019-10-07 16:14:18 UTC
biology/hisat2 already exists.

Sorry if I overlooked this PR, but it had been pending in my wip collection since 2017.  It's a good idea to look there before creating any biology ports:


The SRA Tools issue is a complicated one.  Currently SRA Tools can only download *some* files, while others require Aspera (closed-source but the ascp command it includes should work on FreeBSD) or Fusera.  It appears that SRA Tools will still play a role in processing data downloaded by other means, but the documentation is very confusing.  I'm currently working on a TOPMed project and dealing with this directly.

Maybe this PR can be closed and we can revisit the SRA features after SRA Tools is committed?