TACG

TACG is a command-line program that performs many of the common pattern-matching routines on biological strings. It was originally designed for restriction enzyme analysis and was later expanded to fill more roles: if you are a Unix user, it might be considered a 'grep' for DNA. TACG searches the input nucleic acid sequence for matches to patterns stored in a database.

TACG accepts all IUPAC degeneracies (yrswkmbdhv) and performs all possible operations on that sequence. How it treats degeneracies in the input sequence depends on the -D flag, and there are two approaches. It either strips all letters other than 'a', 'c', 'g', or 't' and analyzes the sequence as 'pure' using a fast incremental hashing algorithm, or it treats the sequence as degenerate and analyzes it with a slower algorithm. By default, it treats the sequence as 'pure' unless it detects IUPAC degeneracies (yrswkmbdhv), in which case it adaptively switches back and forth between the fast and slow hashing routines.
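To make the 'pure' versus degenerate distinction concrete, here is a minimal Python sketch. It is not TACG's code and does not reproduce its hashing routines: it uses regular expressions as a stand-in, and every function name in it (strip_to_pure, has_degeneracies, find_sites) is hypothetical. The pattern side may contain IUPAC codes; the sequence side is assumed to be 'pure' when it is searched.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}

def strip_to_pure(seq):
    """Drop every character other than a, c, g, t (the 'pure' treatment)."""
    return "".join(ch for ch in seq.lower() if ch in "acgt")

def has_degeneracies(seq):
    """True if seq contains any of the codes listed in the text above."""
    return any(ch in "yrswkmbdhv" for ch in seq.lower())

def find_sites(pattern, seq):
    """Return 0-based start positions where a (possibly degenerate)
    pattern matches seq. A lookahead is used so overlapping hits count.
    Illustration only: a regex scan, not TACG's hashing algorithm."""
    body = "".join("[" + IUPAC[ch] + "]" for ch in pattern.lower())
    regex = re.compile("(?=(" + body + "))")
    return [m.start() for m in regex.finditer(seq.lower())]

if __name__ == "__main__":
    raw = "AAGTTAACGTCGAC"
    seq = strip_to_pure(raw) if not has_degeneracies(raw) else raw.lower()
    # GTYRAC is a degenerate recognition pattern (Y = C/T, R = A/G).
    print(find_sites("gtyrac", seq))
```

In this sketch the degenerate pattern is expanded into character classes, so one scan finds every concrete variant of the site; a real tool would typically trade this simplicity for speed.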

If you use TACG, please cite: Mangalam, H. (2002) "tacg - a grep for DNA" BMC Bioinformatics 3:8.

Manual: http://tacg.sourceforge.net/tacg-4.3-manual.html

INPUT = Nucleic Acid Sequences.

Input file: tacg_in.txt

Output file: tacg_out.txt