What Will MSAReveal Do For You?

Given one or more protein amino acid sequences, aligned or not:
Advanced Features:

How To Use MSAReveal:

  1. Collect amino acid sequences, e.g. from UniProt.Org. Instructions are provided.
  2. Align sequences. Instructions are provided using free, straightforward, powerful Jalview. MSAReveal does not align sequences.
  3. Save the alignment in a file in FASTA format.
  4. Open the saved alignment file (a plain text editor works well), select all, copy, and paste into MSAReveal.
  5. Press the Process Sequences button. Voilà!

Still Learning 1-Letter Amino Acid Codes?

No problem! MSAReveal shows you the 3-letter abbreviation in a tooltip whenever you touch a one-letter code in the color scheme options, or in the sequence alignment listing. When you touch a one-letter code column header in the statistics table, the full name of the amino acid is shown.

And here is a handy reference chart.

How To Download FASTA Sequences

We recommend downloading FASTA sequences from UniProt.Org:

  1. At UniProt.Org, use the search slot at the top to describe a sequence. Examples: "yeast gal4", "sulfurreducens pila", "human pla2g6".
  2. In the list of hits, click on the Entry code (in the left column of the table) for the sequence you want. (We recommend viewing the entire entry to confirm this is what you want.)
  3. Click on the blue Sequence button at the left side of the page.

  For a single sequence:
    1. Click on the blue FASTA button.
    2. Open your browser's File menu, and click Save Page As.
    3. You may wish to rename the file to add the name of the protein or taxon. Keeping the file type ".fasta" is a good idea.

  For a group of sequences:
    1. Click on the blue button Add to basket.
    2. When you have added all the desired sequences to your basket, scroll to the top of the page and click on the blue Basket button.
    3. In the box that opens, click on Download.
    4. Select Uncompressed and click Go.
    5. Select Save File and click OK.

You can now open your saved FASTA file (a plain text editor would be ideal, see below), select all, copy, and paste into MSAReveal.

NOTE that your sequences are not yet aligned. See How To Align Sequences.

FASTA files are plain text. You can edit them with a plain text editor, for example to separate or gather sequences. A plain text editor is one that does not "mark up" the text with formatting codes. On Windows, use Notepad. On a Mac, use the free program TextWrangler. Word processors such as WordPad, Word, or TextEdit often make it tricky to save a file as plain text.
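If you would rather check or reorganize a FASTA file with a script than with a text editor, the layout is simple enough to read in a few lines. The Python sketch below is not part of MSAReveal. The function name parse_fasta and the toy sequences are made up for illustration. It assumes that each record starts with a line beginning with ">" and that a sequence may be wrapped over several lines.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without breaks
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


toy = ">toy1 first example\nMKLLS\nSIEQA\n\n>toy2 second example\nMK--S\n"
for header, sequence in parse_fasta(toy):
    print(header, len(sequence))
```

In this sketch, any sequence text that appears before the first header is silently ignored.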

How To Align Sequences

We recommend the free program Jalview because it is straightforward, and preserves the full UniProt headers (including genus and species). Jalview requires that free Java be installed on your computer. Alignments done in UniProt suffer from FASTA headers that have only the UniProt Accession Number, without the taxon (genus and species). Instructions for Jalview:

  1. You will need files containing FASTA sequences that have been saved on your computer. See How To Download FASTA Sequences.
  2. Run Jalview.
  3. Drag a file containing one or more FASTA sequences and drop it into Jalview. A window should appear that displays the sequence(s) at the top.
  4. Drag additional files into the SAME window if you wish to add more sequences.
  5. At the top of the window containing your sequences, click on Web Service and then click on Alignment.
  6. Choose an alignment algorithm (such as MAFFT, MUSCLE, or TCOFFEE) and click on with defaults.
  7. A second window opens and the alignment is performed. If you have many or long sequences, this might take a while.
  8. A third window titled "So and so alignment" opens when the alignment is completed.
  9. Open the File menu at the top left of the third window, and "Save As". You may want to double-click on Desktop to save it there temporarily. Use FASTA format, and name the file appropriately.
  10. Your saved alignment is now ready to open (a plain text editor would be good), select all, copy and paste into MSAReveal.



Options (preferences) are remembered automatically between sessions, unless you have disabled "cookies" in your browser.

Sequences: Finding Sequence Fragments: Headers: Output: Consensus:

A consensus is shown below the sequence alignment. Touching any position (column) in the consensus reports the frequencies of amino acids and dashes in that column in a tooltip.
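To make the idea of per-column frequencies concrete, here is a minimal Python sketch. It is not MSAReveal's own code, and the rule MSAReveal uses to choose the consensus character is not stated here. The function plurality_consensus simply assumes "most frequent character wins", with ties going to the character seen first. The function names and the toy alignment are made up for illustration.

```python
from collections import Counter


def column_frequencies(aligned, column):
    """Fraction of each character (amino acid or dash) in one alignment column."""
    counts = Counter(seq[column] for seq in aligned)
    total = sum(counts.values())
    return {char: count / total for char, count in counts.items()}


def plurality_consensus(aligned):
    """Most frequent character in each column (an assumed rule, see above)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return "".join(
        Counter(seq[i] for seq in aligned).most_common(1)[0][0]
        for i in range(length)
    )


toy = ["MK-LS", "MKALS", "MR-LT"]
print(column_frequencies(toy, 1))
print(plurality_consensus(toy))  # MK-LS
```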

Statistics: Browser-Specific Behavior:

Errors Detected and Reported

The following conditions are detected and reported. Each of these can be demonstrated with one of the Demo tests provided.

  1. No header. Demo: Header Missing.
  2. Illegal characters not representing amino acids. When present, a button appears offering to list all instances with links to jump to each one. Demo: Illegal Characters.
  3. Legal but ambiguous amino acid characters BJOUXZ. When present, a button appears offering to list all instances with links to jump to each one. Demo: 1: With Gaps, Ambiguous AA.
  4. Nucleic acid sequence instead of protein sequence. Demo: DNA/RNA.
  5. A single sequence containing gaps (dashes), hence not an alignment. Demo: 1: With Gaps, Ambiguous AA.
  6. Sequences contain gaps, hence apparently an alignment, but have different lengths. Demo: Mismatched Lengths.
  7. Header containing more than one distinct 6- or 10-character UniProt Accession Number. Demo: Multiple accession numbers.
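The checks listed above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not MSAReveal's implementation. The function name check_records and the message wording are hypothetical. The set of legal characters is assumed to be the 20 standard amino acids plus BJOUXZ and the dash. The nucleic acid test is a simple guess (only A, C, G, T, U present). The accession pattern is the one UniProt publishes for its accession numbers.

```python
import re

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
AMBIGUOUS = set("BJOUXZ")               # legal but flagged, per the list above
# UniProt's published accession pattern (6 or 10 characters).
ACCESSION = re.compile(
    r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
)


def check_records(records):
    """records: list of (header, sequence) pairs. Returns a list of messages."""
    problems = []
    for number, (header, seq) in enumerate(records, start=1):
        label = f"sequence {number}"
        residues = set(seq.upper().replace("-", ""))
        if not header.strip():
            problems.append(f"{label}: no header")
        if len(set(ACCESSION.findall(header))) > 1:
            problems.append(f"{label}: multiple accession numbers")
        illegal = residues - STANDARD - AMBIGUOUS
        if illegal:
            problems.append(f"{label}: illegal characters {sorted(illegal)}")
        if residues & AMBIGUOUS:
            problems.append(
                f"{label}: ambiguous characters {sorted(residues & AMBIGUOUS)}"
            )
        # Heuristic (assumption): only A, C, G, T, U suggests DNA/RNA.
        if residues and residues <= set("ACGTU"):
            problems.append(f"{label}: looks like a nucleic acid sequence")
    has_gaps = any("-" in seq for _, seq in records)
    if has_gaps and len(records) == 1:
        problems.append("single sequence contains gaps")
    elif has_gaps and len({len(seq) for _, seq in records}) > 1:
        problems.append("aligned sequences have different lengths")
    return problems


# Toy input: the accession-shaped strings are used only for illustration.
for message in check_records([("toy1 P12345 Q9XYZ1", "MK-LS"), ("toy2", "MKAL")]):
    print(message)
```

Sequences without any gaps are treated as unaligned, so their lengths are not compared.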

3D Structures (PDB Codes)

When a sequence has an empirical 3D structure in the Protein Data Bank, you may add "PDB=xxxx" to the header, where xxxx is the PDB accession code. Such PDB codes will appear in a "3D" column in the Statistics table, linked to display the corresponding structures in FirstGlance in Jmol. The addition must be before >> or >>>. Example: Demo "9: Pilins".
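As a rough illustration of the "PDB=xxxx" convention, the Python sketch below pulls PDB codes out of a header. It is not MSAReveal's code. The function name pdb_codes is hypothetical, "1abc" is a placeholder rather than a reference to a particular structure, and the pattern assumes the classic four-character PDB code (a digit followed by three letters or digits). Text after the first ">>" is ignored, following the rule that the addition must come before >> or >>>.

```python
import re

# Assumes the classic four-character PDB code: a digit, then three letters or digits.
PDB_TAG = re.compile(r"PDB=([0-9][A-Za-z0-9]{3})\b")


def pdb_codes(header):
    """Return PDB codes tagged in a header, ignoring text after the first '>>'."""
    before_descriptions = header.split(">>", 1)[0]
    return PDB_TAG.findall(before_descriptions)


print(pdb_codes("toy pilin PDB=1abc >> Mutant Y57W"))  # ['1abc']
```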

>>> & >>: Descriptions

Group Descriptions: If you add, for example, ">>> Aligned by MAFFT" to the end of a header, this will be displayed above the table of sequences, with a light green background. Such a group description would normally be added to only one header in a group of sequences. If several headers contain ">>>", the descriptions will be concatenated. Example: Gal4 Demo.

Sequence Descriptions: If you add, for example, ">> Mutant Y57W" to the end of a header, when you touch the Taxon in this row with the mouse, this sequence description will be shown above the table of sequences, with a pink background. Example: Gal4 Demo.

">>>" and ">>" can be in either order, but both must be at the end of the header.

Hyperlinks in descriptions: If you wish to include a hyperlink in a description, replace the space following "<a" with a vertical bar "|", so your anchor tag becomes "<a|href=...>linked text</a>". This avoids having the line broken on the space within the hyperlink, which causes the link to display incorrectly in the Full Headers section. Demo: "1: Gs pilA". The only place you will see the vertical bar is in the box where the Demo is pasted. It is replaced with a space (after wrapping) in the Results.
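The ">>" and ">>>" conventions and the vertical-bar trick can be sketched in Python as follows. This is only an illustration of the rules described above, not MSAReveal's code. The function names split_descriptions and restore_links are hypothetical, and the header is assumed to have had its leading ">" already removed.

```python
import re

MARKER = re.compile(r">>>|>>")  # try the longer marker first


def split_descriptions(header):
    """Return (core_header, sequence_description, group_description)."""
    matches = list(MARKER.finditer(header))
    if not matches:
        return header.strip(), None, None
    core = header[:matches[0].start()].strip()
    found = {">>": None, ">>>": None}
    for i, match in enumerate(matches):
        # Each description runs up to the next marker or the end of the header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(header)
        found[match.group()] = header[match.end():end].strip()
    return core, found[">>"], found[">>>"]


def restore_links(description):
    """Turn the protective '<a|' back into '<a ' for display."""
    return description.replace("<a|", "<a ")


print(split_descriptions("toy protein >> Mutant Y57W >>> Aligned by MAFFT"))
print(restore_links('see <a|href="x">linked text</a>'))
```

Because the two markers are located independently, the same result is returned whichever of them comes first.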