SeqAn Tutorial
This tutorial is aimed at graduate students and professionals who want to get acquainted with programming in SeqAn. We assume standard familiarity with C++ and the STL.
Getting Started
To set up SeqAn, please follow the getting started instructions. Note that the setup described there is specific to compiling the tutorial demos and SeqAn apps within the SeqAn CMake framework. Refer to How To: Use SeqAn In Your Projects for information on how to use SeqAn as a library.
Basic Techniques
We assume that you are acquainted with the basic data types of SeqAn, the introductory example and the demo programs. You should also be acquainted with the STL and template programming. In this section we introduce the three main techniques of programming in SeqAn, namely the global function interface, the use of metafunctions, and the concept of template subclassing.
- STL and template programming
- Here we remind you of the basics of template programming and the use of the STL.
- Metafunctions
- In this section you find an introductory explanation of how metafunctions are used in SeqAn to obtain information about the data types in use, which are only instantiated at compile time.
- Template subclassing
- In this section you find a short example that illustrates the power of template subclassing.
- Global function interface
- In this section you find a useful piece of code that shows you the flexibility of the global function interface.
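As a rough illustration of the three techniques listed above, the following sketch uses plain C++ and the STL only. It is not SeqAn code: every name in it (ValueOf, lengthOf, countValue, Text, Plain, Reversed, at, render) is hypothetical and chosen for this example, and SeqAn's own metafunctions and functions may be named and structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// All names below are hypothetical; this is plain C++, not SeqAn's API.

// Metafunction: maps a container type to its value type at compile time.
template <typename TContainer>
struct ValueOf;  // primary template intentionally left undefined

template <typename T>
struct ValueOf<std::vector<T> > { typedef T Type; };

template <>
struct ValueOf<std::string> { typedef char Type; };

// The mapping is evaluated by the compiler; no object is ever created.
static_assert(std::is_same<ValueOf<std::string>::Type, char>::value,
              "ValueOf<std::string>::Type should be char");

// Global function interface: free functions instead of member functions,
// so existing types can be adapted without modifying them.
template <typename T>
std::size_t lengthOf(std::vector<T> const & c) { return c.size(); }

inline std::size_t lengthOf(std::string const & s) { return s.size(); }

// A generic algorithm written only against ValueOf<> and lengthOf().
template <typename TContainer>
std::size_t countValue(TContainer const & c,
                       typename ValueOf<TContainer>::Type const & x)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < lengthOf(c); ++i)
        if (c[i] == x)
            ++n;
    return n;
}

// Template subclassing: a tag type selects the specialization, and
// overload resolution picks the most specific function at compile time.
struct Plain {};
struct Reversed {};

template <typename TSpec = Plain>
struct Text { std::string data; };

template <typename TSpec>
std::size_t lengthOf(Text<TSpec> const & t) { return t.data.size(); }

// General version, used for every Text<TSpec> without a better match.
template <typename TSpec>
char at(Text<TSpec> const & t, std::size_t i)
{
    assert(i < t.data.size());
    return t.data[i];
}

// More specific version for Text<Reversed>: reads from the back.
inline char at(Text<Reversed> const & t, std::size_t i)
{
    assert(i < t.data.size());
    return t.data[t.data.size() - 1 - i];
}

// Written once; behaves differently depending on the tag, with no
// virtual functions involved.
template <typename TSpec>
std::string render(Text<TSpec> const & t)
{
    std::string out;
    for (std::size_t i = 0; i < lengthOf(t); ++i)
        out += at(t, i);
    return out;
}
```

With these definitions, countValue works unchanged on a std::vector<int> and on a std::string, and render yields the stored characters in forward order for Text<Plain> and in reverse order for Text<Reversed>. The choice between the two at overloads is made by the compiler, which is the point of the tag-based design.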
A Stroll Through SeqAn
This part of the tutorial introduces you to the content of SeqAn. We grouped the content into (somewhat arbitrary) categories, following those of the SeqAn book, namely:
- Basics
- This section of the tutorial introduces you to the concepts of containers and values, memory allocation, alphabet types and different iterators in SeqAn.
- Sequences
- This section of the tutorial introduces you to the central container in SeqAn, the sequence. We will discuss the types String and StringSet, and briefly demonstrate how to manipulate sequences.
- Modifiers
- Modifiers can be used to present the elements of a container in changed form without modifying the underlying data. Here you will see which modifiers are available in SeqAn.
- Alignments
- This section of the tutorial introduces you to the algorithms in SeqAn for computing pairwise and multiple alignments. You will learn about the two data structures available for representing alignments, about scoring matrices, parameter settings, and about computing suboptimal alignments.
- Pattern Matching
- This section of the tutorial introduces you to the algorithms in SeqAn for exact and approximate pattern matching.
- Motif Finding
- This section of the tutorial introduces you to the algorithms in SeqAn for motif finding.
- Indices
- This section of the tutorial introduces you to the various indices in SeqAn, such as extended suffix arrays or k-mer indices.
- Graphs
- This section of the tutorial introduces you to the graph type in SeqAn. We will discuss the various graph specializations and show you how to create directed and undirected graphs as well as HMMs, how to store additional information for edges and vertices, and how to apply standard algorithms to the graphs.
- Seed-And-Extend
- In this part of the tutorial we will introduce SeqAn's seed class, demonstrate seed extension and banded alignment with seeds, and finally show the usage of seed chaining algorithms.
- Randomness
- This chapter introduces the random module, which provides pseudorandom number generation functionality.
- File I/O 2.0
- This chapter shows how to use the file I/O facilities of SeqAn, including streams, compressed streams and memory-mapped files. Furthermore, the RecordReader class and the lexical casting routines are introduced.
- Reading Sequence Files
- This chapter shows how to read sequence files (e.g. FASTA and FASTQ) in SeqAn.
- Parsing
- In this part of the tutorial, you will be introduced to the parsing and tokenizing functionality. You will get the necessary information to write your own file parsers.
- SAM and BAM I/O
- Most modern read mapper programs allow the creation of SAM and BAM files. This chapter familiarizes you with record-wise reading and writing of SAM and BAM files in SeqAn.
- Writing Tests
- Writing tests is essential for creating robust software. This chapter introduces you to SeqAn's test system.
Advanced Tutorials
- Fragment Store
- This tutorial shows how to use the fragment store, which is a database for read mapping, sequence assembly or gene annotation. It supports reading and writing multiple read alignments in SAM or AMOS format, as well as accessing and modifying them. It also supports reading and writing gene annotations in GFF/GTF and UCSC format, creating custom annotation types, and traversing and modifying the annotation tree.
- Simple Read Mapping
- This tutorial shows how to implement a simple read mapping program based on the SWIFT filter and an online Hamming finder for verification.
Before you start programming
- Testing and Documenting in SeqAn
- In this section we show you how to use the internal documentation system in SeqAn and also how to write tests for your code.
- I/O for commonly used data
- In this section we show you how to efficiently get your data in and out of SeqAn.
How Tos
This section of the tutorial is a growing list of popular code snippets and examples that have proven useful for the SeqAn team and developers.
File I/O of SeqAn types:
Indices:
General Graphs:
Alignment Graphs:
Misc
If you want to edit this tutorial (and are allowed to do it)
In this case, please read the Tutorial Instructions first.
Information For Developers
The following pages focus on specific points for SeqAn developers. They are more reference than tutorial in character:
- How To: Write Tickets
- How To: Submit Patches
- How To: Document Code
- How To: Write Tests
- The SeqAn Style Guide
