Bioinformatics Tutorial

Building Models

This section of the tutorial is for those who are particularly interested in protein structure, and in obtaining models of proteins whose structures are not yet known. If your interest in bioinformatics is primarily in sequence analysis and comparison, you can skip to the NEXT section.

For this part of the tutorial, you must be familiar with the molecular graphics program DeepView. If you are not, first learn how to use it by working through at least Sections 1-6 and 11 of the DeepView Tutorial. If you are new to DeepView or to molecular graphics, this will require a commitment of at least a few hours. If you make this commitment, you will gain a very friendly, powerful, and FREE tool for studying protein models, as well as for analyzing, comparing, and building them, and judging their quality. When you have completed the DeepView Tutorial, return here.


In this section, you will obtain a structural model of the mysterious peropsin from an automated server of homology models, and view it with DeepView.

How do you get models of proteins not in the PDB?

When you searched the PDB with the FASTA sequence of the red opsin, you found no structures of red opsin, nor of any other opsin, including peropsin. But if you wanted to try to infer as much as possible about the structure of peropsin, it's a good bet that it is similar to rhodopsin, the only opsin of known structure. If a sequence of interest to you has even one homolog in the PDB, you can build a model of your unknown protein by assuming that its structure is similar to that of the known protein. If two sequences share 25 or 30 percent sequence identity, their three-dimensional structures (that is, their conformations) are very likely to be similar. Building a model of an unknown protein from a model of a homolog is called homology modeling. The protein whose sequence is known, but whose structure is unknown, is the target, and the known model is the template. Rigorous modeling processes employ two or more distinct templates if available, and in each region of the model, the process favors the template that is most like the target in sequence. Homology models can be built automatically, and a number of modeling servers provide homology models for all database sequences for which templates (homologs) exist in the PDB.
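To make the idea of percent sequence identity concrete, here is a minimal Python sketch that computes it for two sequences that have already been aligned. The function name percent_identity and the toy sequences are my own inventions for illustration, and the choice to count only columns where neither sequence has a gap is one of several conventions in use; alignment programs may report a somewhat different number for the same alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two strings are rows of the same pairwise alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned fragments, made up for illustration (not real opsin sequences).
target = "MKT-AYIAKQR"
template = "MRTLAYLA-QR"
print(round(percent_identity(target, template), 1))  # prints 77.8
```

Here 9 columns are gap-free and 7 of them match. Short toy fragments like these say nothing about structure, of course; the 25 to 30 percent rule of thumb applies to alignments over the full length of a protein or domain.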

Now you will visit an automated modeling server and download a structural model of peropsin. You will use the flexible and powerful SWISS-MODEL, which allows model building for users of all levels of knowledge, from beginner to expert. As a beginner, you can produce a model of peropsin by a fully automated method. All you need is the UniProt entry code, O14718.

Point your browser to the home of SWISS-MODEL Workspace. Read about this resource, and then click [ login ] at the top of the page. Because this is your first visit, you will need to set up an account (if you already have an account, just log in and skip the rest of this paragraph). Below the boxes for login ID and password, click create your workspace to begin setting up your workspace account. Follow the instructions, and then you will have a password-protected workspace for using SWISS-MODEL. Log in to your workspace.

Each model request you submit will be listed as a numbered Workunit in the main table of your workspace. Click [ Modelling ] near the top of the page. On the resulting page, click Automated Mode. Enter your email address (the same one you used to set up your account) and a project title (in this case, use Peropsin Automated) in the appropriate boxes. In the box labeled Provide a protein sequence or a UniProt AC Code: enter the UniProt code O14718 for peropsin. Notice that you could also specify a template, but leave that box blank, to get the PDB template that aligns best with this peropsin sequence. Click Submit Modelling Request.

To watch the status of your request, click [ Workspace ] at the top of the acknowledgment window. When the Workspace table indicates that your Workunit is finished, click the Workunit number to see the results. From the results page, you can examine many aspects of the model and the process that produced it. Most notable are the model itself and the template. To find out more about the template, click the four-character code next to based on template, and you will see an entry in the SWISS-MODEL TEMPLATE LIBRARY, with brief information about the template. To learn even more, click PDB on the library page.

To obtain your model for viewing and analysis, just below the image of the model, in the line that says download model: as pdb - as Deepview project - as text, click Deepview project. Save the downloaded file, named Model_1_project.pdb, to a convenient place. Start DeepView, and use the menu command File: Open PDB File... to open the file.

In the remainder of this section, I assume that you are familiar with DeepView, and I use the same conventions for specifying operations as you learned in the DeepView Tutorial. If the instructions seem sketchy, you might need to spend more time with the DeepView Tutorial.

SWISS-MODEL project files contain both the model and the template (or templates), superimposed on each other. The name of the model layer is TARGET, and the name of the template layer is the PDB code of the template. Blink (hold down ctrl and press tab repeatedly) to compare the models. With the TARGET layer active, display ribbons only, and note the colors of the ribbon model. Green signifies areas that aligned well with the template; the backbone of the model in green regions is practically identical to that of the template. Red signifies areas that could not be aligned well with the template. If you blink to compare the ribbon model of peropsin with the template, you will see that some red areas correspond to surface loops that are of quite different length in the two models. These areas of the peropsin model were built by various methods other than simple threading onto the template. One method is to search loop libraries for loops (in the PDB) that contain the same number of residues and the same distance between end points, and then try to fit them in. By whatever method the red areas are built, you should have far less confidence in their accuracy.
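The loop-library idea can be sketched in a few lines of Python. This is only an illustration of the selection criterion described above (same number of residues, similar distance between end points); it is not SWISS-MODEL's actual procedure, and the LoopCandidate type, the matching_loops function, the 1.0 angstrom tolerance, and the library entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoopCandidate:
    source: str          # hypothetical label for the loop's origin
    n_residues: int      # number of residues in the loop
    end_distance: float  # distance between the loop's end points, in angstroms


def matching_loops(library: List[LoopCandidate], n_residues: int,
                   end_distance: float, tolerance: float = 1.0) -> List[LoopCandidate]:
    """Return loops of the right length whose end-to-end distance is within
    `tolerance` of the gap to be bridged, best distance match first."""
    hits = [c for c in library
            if c.n_residues == n_residues
            and abs(c.end_distance - end_distance) <= tolerance]
    return sorted(hits, key=lambda c: abs(c.end_distance - end_distance))


# Made-up library entries, for illustration only.
library = [
    LoopCandidate("loop_a", 6, 9.8),
    LoopCandidate("loop_b", 6, 12.5),
    LoopCandidate("loop_c", 7, 10.0),
    LoopCandidate("loop_d", 6, 10.3),
]

# Look for a 6-residue loop to bridge a 10.0 angstrom gap in the model.
for hit in matching_loops(library, n_residues=6, end_distance=10.0):
    print(hit.source, hit.end_distance)
```

A real modeling program would go on to fit each candidate into the model and check it for clashes and reasonable geometry; the filter above covers only the first, purely geometric, screening step.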

NOTE: Data relating to fit-to-template are in the B-factor column of the coordinate file. You can apply the same color scheme to any aspect of the model (such as backbone or surface) with Color:B-Factor.
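If you would rather inspect these values as numbers than as colors, you can read them directly from the coordinate file. The following Python sketch assumes that the file uses standard fixed-column PDB ATOM records, in which the B-factor occupies columns 61-66; the function names are my own, and make_atom_line exists only to build demonstration lines with dummy coordinates. I make no assumption here about whether high or low values mean a better fit, so check that against the colors DeepView shows you.

```python
from typing import Iterable, List, Tuple


def make_atom_line(serial: int, atom: str, res: str, chain: str,
                   resseq: int, bfactor: float) -> str:
    """Build a fixed-column ATOM record for the demo (coordinates are dummy zeros)."""
    return (f"ATOM  {serial:>5} {atom:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{bfactor:6.2f}")


def ca_bfactors(lines: Iterable[str]) -> List[Tuple[str, int, str, float]]:
    """Return (chain, residue number, residue name, B-factor) for each
    alpha-carbon ATOM record, one entry per residue."""
    records = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":   # atom name, columns 13-16
            continue
        chain = line[21]                  # chain identifier, column 22
        resseq = int(line[22:26])         # residue number, columns 23-26
        resname = line[17:20].strip()     # residue name, columns 18-20
        bfactor = float(line[60:66])      # B-factor, columns 61-66
        records.append((chain, resseq, resname, bfactor))
    return records


# Demonstration records with made-up B-factor values.
demo = [
    make_atom_line(1, "N", "MET", "A", 1, 20.0),
    make_atom_line(2, "CA", "MET", "A", 1, 20.0),
    make_atom_line(3, "CA", "GLY", "A", 2, 55.5),
]
for chain, resseq, resname, bfactor in ca_bfactors(demo):
    print(chain, resseq, resname, bfactor)
```

To try this on a real file, open it and pass the file object to ca_bfactors, which accepts any iterable of lines. If a file holds more than one layer, the result may list residues from all of them.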

If you plan to use a homology model to guide you in your research (such as helping you decide where an active site might be, or where to try site-directed mutagenesis in order to alter properties), you must learn how to assess the quality of a model. That subject is beyond the scope of this tutorial, but it is covered in depth in the section Judging the Quality of Models: Homology Models, in the DeepView Tutorial.

You have just scratched the surface of SWISS-MODEL Workspace in this tutorial. You will find additional tutorial material and help at the workspace site, including guidance in how to control the template choice and many other aspects of the modeling. Sections of the DeepView Tutorial and additional tutorials at DeepView Home provide much more information about homology modeling.

NOTE: On 2008/09/25 (to my surprise!) the template selected by SWISS-MODEL for this modeling task was PDB 2z73, a newly deposited squid rhodopsin. Recall that your PDB sequence search picked (and still does!) 1f88, bovine rhodopsin, as the top hit, but the statistics on 2z73 make it a very close second. This should tell you that search tools do not all use the same criteria for ordering the results. Remember that SWISS-MODEL gives you the option of selecting a template in Automated Mode, so if you prefer to base your choice on a search from another site, you can do so. In the less automated modes, SWISS-MODEL allows you to use multiple templates, as well as to use your own alignments of target and template. With DeepView, you can make alignments with multiple templates; adjust the alignments (for example, in the light of other experimental information about homologous residues); submit target, templates, and alignments as a Workunit to SWISS-MODEL; and retrieve the results, all without leaving DeepView.

If you plan to use homology modeling in your research, be sure to learn how to judge the quality of models (sometimes called model validation). Section 9 of the DeepView Tutorial guides you through validation for all types of protein models, and points you to advanced tutorials on this important subject.