Multalin supported sequence formats


Multalin/fasta

The MultAlin and FASTA formats are very similar.

        > SeqName the sequence name is the
        > first word of the first comment line 
        > max: 8 letters 
        > comment lines begin with >
        AAAACCGTTAAA...
        > SeqNam2 the 2nd sequence beginning  
        > shows the end of the first one 
        AAACCTGGAC...

actual sequence set

>CCPC50       129
     1  QDGDAAKGEK EFNKCKACHM IQAPDGTDII KGGKTGPNLY GVVGRKIASE
    51  EGFKYGEGIL EVAEKNPDLT WTEADLIEYV TDPKPWLVKM TDDKGAKTKM
   101  TFKMGKNQAD VVAFLAQNSP DAGGDGEAA
>CCRF2S       124
     1  QEGDPEAGAK AFNQCQTCHV IVDDSGTTIA GRNAKTGPNL YGVVGRTAGT
    51  QADFKGYGEG MKEAGAKGLA WDEEHFVQYV QDPTKFLKEY TGDAKAKGKM
   101  TFKLKKEADA HNIWAYLQQV AVRP
>CCRF2C       116
     1  GDAAKGEKEF NKCKTCHSII APDGTEIVKG AKTGPNLYGV VGRTAGTYPE
    51  FKYKDSIVAL GASGFAWTEE DIATYVKDPG AFLKEKLDDK KAKTGMAFKL
   101  AKGGEDVAAY LASVVK
>CCQF2R       112
     1  EGDAAAGEKV SKKCLACHTF DQGGANKVGP NLFGVFENTA AHKDNYAYSE
    51  SYTEMKAKGL TWTEANLAAY VKNPKAFVLE KSGDPKAKSK MTFKLTKDDE
   101  IENVIAYLKT LK

This format works with the auto and mul sequence input format settings.
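
The rules above (comment lines begin with >, the name is the first word of the first comment line, a new comment line after sequence data starts the next sequence) can be sketched as a small Python reader. This is only an illustration of the layout described here, not the code Multalin itself uses. The function name parse_multalin is hypothetical, and dropping digits, spaces and dots from sequence lines is an assumption based on the numbered example set above. The sketch does not enforce the 8-letter limit on names.

```python
def parse_multalin(text):
    """Return a list of (name, sequence) pairs from Multalin/FASTA-style text.

    Assumptions (not confirmed by the Multalin documentation):
    - position numbers, blanks and dots inside sequence lines are ignored;
    - only the first of several consecutive '>' lines carries the name.
    """
    records = []
    name = None
    residues = []
    in_comments = False  # True while reading consecutive '>' lines

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(">"):
            if not in_comments:
                # First comment line of a new record: close the previous one.
                if name is not None:
                    records.append((name, "".join(residues)))
                words = stripped[1:].split()
                name = words[0] if words else ""
                residues = []
                in_comments = True
            continue
        # Any other non-empty line is sequence data.
        in_comments = False
        residues.append("".join(ch for ch in stripped if ch.isalpha()))

    if name is not None:
        records.append((name, "".join(residues)))
    return records


if __name__ == "__main__":
    example = ">CCRF2C       116\n     1  GDAAKGEKEF NKCKTCHSII\n"
    for seq_name, seq in parse_multalin(example):
        print(seq_name, len(seq))
```

Keeping only alphabetic characters lets the same reader accept both the bare template layout and the numbered, space-separated blocks of the actual sequence set.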


GenBank


        LOCUS      SeqName  
        any lines  
        ORIGIN     anything              
        1 aggtcccttt tgtgttgttt
        //

The sequence name is the first word after the LOCUS keyword. The sequence begins on the line following the ORIGIN keyword and ends at the // line. The next sequence entry begins with the next LOCUS keyword.

actual sequence set


LOCUS       S65070         90 bp    mRNA            INV       01-NOV-1993
DEFINITION  VD1/RPD2 alpha peptide {alternatively spliced, clone AMP6} [Lymnaea
            stagnalis=snails, central nervous system, mRNA Partial, 90 nt].
ORIGIN
GACATGTATG AGGTGGCTAC AACGAGAATT GGTACAGGGG GACTAGCTGG 
GCGTTGTCAA CATCATCCAC GGAACTGTCC TGGATTTAAT 
//
LOCUS       S65071         57 bp    mRNA            INV       01-NOV-1993
DEFINITION  VD1/RPD2 alpha peptide {alternatively spliced, clone AMP7} [Lymnaea
            stagnalis=snails, central nervous system, mRNA Partial, 57 nt].
ORIGIN
GACATGGGAC TAGCTGGGCG TTGTCAACAT CATCCACGGA ACTGTCCTGG ATTTAAT
//
LOCUS       S65072         48 bp    mRNA            INV       01-NOV-1993
DEFINITION  VD1/RPD2 alpha peptide {alternatively spliced, clone AMP8} [Lymnaea
            stagnalis=snails, central nervous system, mRNA Partial, 48 nt].
ORIGIN
GACATGGGGC GTTGTCAACA TCATCCACGG AACTGTCCTG GATTTAAT
//
LOCUS       S65078         45 bp    mRNA            INV       01-NOV-1993
DEFINITION  VD1/RPD2 alpha peptide {alternatively spliced, clone AMP9} [Lymnaea
            stagnalis=snails, central nervous system, mRNA Partial, 45 nt].
ORIGIN
GACATGCGTT GTCAACATCA TCCACGGAAC TGTCCTGGAT TTAAT
//

This format only works with the auto sequence input format setting.
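
As an illustration of the LOCUS / ORIGIN / // layout described in this section, the following Python sketch extracts names and sequences from GenBank-style text. It is not Multalin's own reader. The function name parse_genbank is hypothetical, and ignoring position numbers and blanks inside sequence lines is an assumption drawn from the template and the example set above.

```python
def parse_genbank(text):
    """Return a list of (name, sequence) pairs from GenBank-style text.

    Assumption: digits and blanks on sequence lines are layout only
    and can be dropped.
    """
    records = []
    name = None
    residues = []
    in_sequence = False  # True between an ORIGIN line and the closing //

    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        key = words[0]
        if key == "LOCUS":
            # A LOCUS line always starts a new entry.
            if name is not None:
                records.append((name, "".join(residues)))
            name = words[1] if len(words) > 1 else ""
            residues = []
            in_sequence = False
        elif key == "ORIGIN":
            # The sequence starts on the line after ORIGIN.
            in_sequence = True
        elif key == "//":
            in_sequence = False
        elif in_sequence:
            residues.append("".join(ch for ch in line if ch.isalpha()))
        # Every other line (DEFINITION and so on) is skipped.

    if name is not None:
        records.append((name, "".join(residues)))
    return records


if __name__ == "__main__":
    example = (
        "LOCUS       S65078         45 bp    mRNA\n"
        "ORIGIN\n"
        "GACATGCGTT GTCAACATCA TCCACGGAAC TGTCCTGGAT TTAAT\n"
        "//\n"
    )
    for seq_name, seq in parse_genbank(example):
        print(seq_name, len(seq))
```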


EMBL - SwissProt


        ID   SeqName  
        any lines 
        SQ   anything  
        aauccagug gagaucaaag          
        any sequence lines  
        //

actual sequence set

ID   CYC_HORSE      STANDARD;      PRT;   104 AA.
KW   MITOCHONDRION; ELECTRON TRANSPORT; RESPIRATORY CHAIN; HEME;
KW   ACETYLATION; 3D-STRUCTURE.
SQ   SEQUENCE   104 AA;  11702 MW;  82586B67 CRC32;
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGFTYTD 
ANKNKGITWKEETLMEYLENPKKYIPGTKMIFAGIKKKTEREDLIAYLKK 
ATNE
//
ID   CYC_BOVIN      STANDARD;      PRT;   104 AA.
AC   P00006;
SQ   SEQUENCE   104 AA;  11572 MW;  9DBE33E7 CRC32;
GDVEKGKKIFVQKCAQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGFSYTD 
ANKNKGITWGEETLMEYLENPKKYIPGTKMIFAGIKKKGEREDLIAYLKK 
ATNE
//
ID   CYC_EUGGR      STANDARD;      PRT;   102 AA.
RX   MEDLINE; 76039443.
SQ   SEQUENCE   102 AA;  11210 MW;  8B46A67F CRC32;
GDAERGKKLFESRAAQCHSAQKGVNSTGPSLWGVYGRTSGSVPGYAYSNA 
NKNAAIVWEEETLHKFLENPKKYVPGTKMAFAGIKAKKDRQDIIAYMKTL 
KD
//
ID   CYC_HUMAN      STANDARD;      PRT;   104 AA.
DE   CYTOCHROME C.
RC   SPECIES=HUMAN; TISSUE=HEART;
CC --------------------------------------------------------------------------
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between  the Swiss Institute of Bioinformatics  and the  EMBL outstation -
CC   the European Bioinformatics Institute.  There are no  restrictions on its
CC   use  by  non-profit  institutions as long  as its content  is  in  no way
CC   modified and this statement is not removed.  Usage  by  and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------
SQ   SEQUENCE   104 AA;  11617 MW;  3B7A76B0 CRC32;
GDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTA 
ANKNKGIIWGEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKK 
ATNE
//

This format only works with the auto sequence input format setting.
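
The ID / SQ / // layout shown in this section can be read with a short Python generator. This is a sketch under stated assumptions rather than a description of Multalin's internals. The name iter_embl is hypothetical, the sequence name is taken to be the first word after ID, and every line between the SQ line and the closing // is treated as sequence data with non-letters dropped.

```python
def iter_embl(text):
    """Yield (name, sequence) pairs from EMBL/SwissProt-style text.

    Assumptions: each entry is closed by a // line, and digits or
    blanks on sequence lines are layout only.
    """
    name = None
    chunks = []
    in_sequence = False  # True between an SQ line and the closing //

    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        key = words[0]
        if key == "//":
            # End of entry: emit it and reset the state.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks, in_sequence = None, [], False
        elif in_sequence:
            chunks.append("".join(ch for ch in line if ch.isalpha()))
        elif key == "ID":
            name = words[1] if len(words) > 1 else ""
            chunks = []
        elif key == "SQ":
            in_sequence = True
        # Other line types (KW, AC, DE, CC and so on) are skipped.

    # Tolerate a final entry that lacks its // terminator.
    if name is not None:
        yield name, "".join(chunks)


if __name__ == "__main__":
    example = (
        "ID   CYC_EUGGR      STANDARD;      PRT;   102 AA.\n"
        "SQ   SEQUENCE   102 AA;  11210 MW;  8B46A67F CRC32;\n"
        "GDAERGKKLFESRAAQCHSAQKGVNSTGPSLWGVYGRTSGSVPGYAYSNA\n"
        "NKNAAIVWEEETLHKFLENPKKYVPGTKMAFAGIKAKKDRQDIIAYMKTL\n"
        "KD\n"
        "//\n"
    )
    for seq_name, seq in iter_embl(example):
        print(seq_name, len(seq))
```

Testing for the sequence state before the ID and SQ keywords keeps a sequence line from being mistaken for a header line.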