
SeqAn Tutorials

The SeqAn tutorials are the best way to start learning how to develop with SeqAn. In contrast, the API Documentation gives more comprehensive but less verbose documentation about the library, while the How-Tos are strictly task-driven and narrower in scope.

The main audience of the tutorials is graduate students and professionals who want to learn how to use SeqAn. Previous programming knowledge is required; knowledge of C++ is recommended.


Introduction

These tutorials show you how to get started with SeqAn, including installation, and explain the background and motivation of SeqAn. After that, be sure to read the First Steps in SeqAn tutorial, which walks through an example highlighting many important concepts in the SeqAn library.

  • Getting Started This tutorial will walk you through the installation of SeqAn and its dependencies. Then, you will create your first minimal SeqAn application!
  • First Steps in SeqAn This tutorial gives practical examples and applications of the most important basic techniques. You should read this tutorial if you are starting out with SeqAn.
  • Background and Motivation This tutorial gives an overview of the design aims and principles of SeqAn and motivates the mechanisms it employs.

We highly recommend that you follow the Getting Started instructions if you are starting out with SeqAn. Note that it is also possible to use SeqAn strictly as a library with your own build system. The How To: Use SeqAn In Your Projects page contains detailed information about this.


A Stroll Through SeqAn

This part of the tutorial introduces you to the content of SeqAn. We grouped the content into (somewhat arbitrary) categories, following those of the SeqAn book, namely:

Sequences

  • Sequences This tutorial introduces you to the basics of the fundamental concept of sequences, namely Strings and Segments.
  • Alphabets This tutorial introduces you to SeqAn's alphabets, or in other words, the contained types of sequences.
  • StringSets This tutorial introduces you to SeqAn's StringSet, an efficient data structure to store a set of sequences.
  • Sequences In-Depth In this tutorial you will learn how to optimize your work with sequences by using different specializations of Strings and different overflow strategies for capacity changes.

Iterators

  • Iterators This tutorial explains how to use iterators in SeqAn, illustrated on containers.

Alignments

  • Alignment Representation This section of the tutorial introduces you to the data structures that are used to represent alignments in SeqAn.
  • Pairwise Sequence Alignment In this part of the tutorial we demonstrate how to compute pairwise sequence alignments in SeqAn. It shows the use of different scoring schemes, and which parameters can be used to customize the alignment algorithms.
  • Multiple Sequence Alignment In the last section of this tutorial we show how to compute multiple sequence alignments in SeqAn using a scoring matrix.

Indices

  • Indices This tutorial introduces you to the various indices in SeqAn like extended suffix arrays or k-mer indices.
  • Index Iterators This tutorial introduces you to the various index iterators with which you can use indices as if traversing search trees or tries.
  • q-gram Index This tutorial introduces you to SeqAn's q-gram index.

Pattern Matching

  • Pattern Matching This section of the tutorial introduces you to the algorithms in SeqAn for exact and approximate pattern matching.

Graphs

  • Graphs This section of the tutorial introduces you to the graph type in SeqAn. We will discuss the various graph specializations and show you how to create directed and undirected graphs as well as HMMs, how to store additional information for edges and vertices and last but not least how to apply standard algorithms to the graphs.

I/O Basics

  • Basic Sequence I/O This tutorial explains how to use the high-level API for reading and writing sequence files.
  • Indexed FASTA I/O This tutorial explains how to use FASTA index files for quick random access within FASTA files: read contigs or just sections without having to read through the whole FASTA file.
  • Basic SAM and BAM I/O This tutorial explains how to use the high-level API for reading and writing SAM and BAM files.
  • VCF I/O (new) This tutorial explains how to use the high-level API for reading and writing VCF files.
  • BED I/O (new) This tutorial explains how to use the high-level API for reading and writing BED files.
  • GFF and GTF I/O (new) This tutorial explains how to use the high-level API for reading and writing GFF and GTF files.

Modifiers

  • Modifiers Modifiers can be used to present the elements of a container in changed form without modifying the underlying container. Here you will see which modifiers are available in SeqAn.

Randomness

  • Randomness This chapter introduces the random module, which provides pseudo-random number generation functionality.

Seed-And-Extend

  • Seed-And-Extend (updated) In this part of the tutorial we will introduce SeqAn's seed class, demonstrate seed extension and banded alignment with seeds, and finally show the usage of seed chaining algorithms.

Parsing Command Line Arguments

Genome Annotations

More I/O

These tutorials explain how to use the I/O functionality in SeqAn beyond the basic sequence, SAM/BAM, and indexed FASTA I/O from above. They are targeted at developers who want to either use the lower-level I/O routines in SeqAn or write their own parsers. We recommend that you start by reading the I/O Overview and then jump to the chapter that interests you most.

  • I/O Overview This article gives an overview of the I/O functionality in SeqAn. After reading, you will have a better understanding of the different bits in this section of the library.

The following tutorials introduce the lower level I/O routines for specific file formats.

  • Reading Sequence Files This tutorial explains the RecordReader- and Stream-based interface for reading sequence files.
  • SAM and BAM I/O This tutorial explains the lower level API for reading and writing SAM and BAM files.

Read the following tutorials to learn how to write your own I/O routines.

  • File I/O This chapter shows how to use the file I/O facilities of SeqAn, including streams, compressed streams and memory mapped files.
  • Lexical Casting This tutorial explains the lexicalCast and lexicalCast2 functions, which allow you to convert strings representing numbers into their numeric values.
  • Parsing In this part of the tutorial, you will be introduced to the parsing and tokenizing functionality using the RecordReader class. You will get the necessary information to write your own file parsers.

Motif Finding

  • Motif Finding This section of the tutorial introduces you to the algorithms in SeqAn for motif finding.

Advanced Tutorials

Fragment Store
This tutorial shows how to use the fragment store, which is a database for read mapping, sequence assembly, or gene annotation. It supports reading and writing multiple read alignments in SAM or AMOS format, as well as accessing and modifying them. It also supports reading and writing gene annotations in GFF/GTF and UCSC format, creating custom annotation types, and traversing and modifying the annotation tree.
Simple RNA-Seq
In this tutorial you will learn how to implement a simple RNA-Seq based gene quantification tool that computes RPKM expression levels from a given genome annotation and RNA-Seq read alignments.
Simple Read Mapping
This tutorial shows how to implement a simple read mapping program based on the SWIFT filter, with an online Hamming finder for verification.
Mini-Bowtie
Mini-Bowtie is a very basic read aligner inspired by the well-known Bowtie program (Langmead, 2009). It serves as an example to show that you can write sophisticated programs with SeqAn in only a few lines of code.
Data Journaling
In this tutorial we demonstrate how you can handle multiple large sequences in main memory while the data structures themselves support a form of parallel sequence analysis.
SeqAn KNIME Nodes
Here you can learn how to use SeqAn apps in KNIME.

Developers Corner

First, congratulations on becoming an official SeqAn developer! After you have gone through the tutorials, and before you start developing your own application with SeqAn, you might want to learn how we document code, write tests, write tickets, submit patches, and so on. In addition, we follow a SeqAn-specific Style Guide. Information like this can be found on the HowTo site. There is plenty of information there to round out your knowledge of SeqAn, so have a look!

Frequently used Software Techniques

We assume that you are acquainted with the basic data types of SeqAn, the introductory example, and the demo programs. You should also be acquainted with the STL and template programming. In this section we introduce the three main techniques of programming in SeqAn, namely the global function interface, the use of metafunctions, and the concept of template subclassing.

STL and template programming
Here we remind you of the basics of template programming and the use of the STL.
Metafunctions
In this section you find an introductory explanation of how metafunctions are used in SeqAn to obtain information about the data types in use; this information is resolved entirely at compile time.
Template subclassing
In this section you find a short example that illustrates the power of template subclassing.
Global function interface
In this section you find a useful piece of code that shows you the flexibility of the global function interface.