HelpVersion2
PathVisio 1.0
Overview
PathVisio is a tool for displaying and editing biological pathways. In a sense, PathVisio lets you draw pathways as though you were drawing them in PowerPoint. The difference is that PathVisio understands the biological context of a pathway, because you can link biological components (genes or proteins) in your pathways to biological data using database identifiers. This lets you map experimental data (e.g. microarray data) and visualize it on top of the pathway drawing (Note: mapping experimental data is not yet a feature of PathVisio 1.1, but you can do this in combination with GenMAPP or with additional plug-ins for PathVisio).
Here is a screenshot of the program in action:
Editing pathways
How do I ... Create a new pathway?
- Create a new pathway by pressing Ctrl+N or going to File > New.
- A blank drawing area will open where you can start adding and editing new elements. Specific editing features are described below.
Placing new objects
Add new objects by selecting one of the object icons in the toolbar, then drag and drop the object in the desired location in the drawing area. If you're unsure what a certain icon does, you can hover over it with the mouse cursor to get a description.
Selecting objects
Single or multiple objects can be selected in two ways:
- Click and drag anywhere in the drawing area to select any number of objects using a rectangular selection tool.
- Click on any object to select it. Hold down the Ctrl or Shift key and click on additional items to add them to the selection.
Moving objects
You can move objects in two ways:
- Click and drag: Select one or more objects and then click and drag the selection to the desired location.
- Arrow keys: Select one or more objects and press any of the arrow keys to move the selected objects. Holding down the Shift key while pressing the arrow keys moves the objects by a larger increment.
Changing the size of objects
You can change the size of objects in two ways:
- Click on an object and drag the handles to resize or move objects.
- Open the properties tab in the panel on the right side of the window. Change the size of the object by entering information for Width and Height.
Grouping objects
PathVisio supports the notion of groups, to facilitate editing and for describing biological entities. Any selection of objects can be grouped in PathVisio:
- Select the group of objects to be grouped.
- Press Ctrl+G to group the objects. The objects can now be moved as a group and the group can also be linked to other objects. Pressing Ctrl+G again ungroups the group.
Aligning, scaling and stacking objects
To facilitate editing, objects can be aligned, scaled and stacked using a set of buttons in the toolbar.
- Align vertically: Centers of objects are aligned vertically based on the location of the top selected object
- Align horizontally: Centers of objects are aligned horizontally based on the location of the top selected object
- Set common width: Widths of selected objects are set to the width of the widest selected object
- Set common height: Heights of selected objects are set to the height of the tallest selected object
- Stack vertically: Objects are arranged into a vertical stack based on the top selected object
- Stack horizontally: Objects are arranged into a horizontal stack based on the top selected object
Zoom
Pathways can be zoomed using the zoom drop-down in the toolbar or by going to View > Zoom. The Zoom to fit option sizes the pathway to fit in the current drawing area.
Copy and paste
Copy and paste works for objects within a given pathway as well as between pathways open in different windows of PathVisio, and between PathVisio and WikiPathways. Copy and Paste can be accessed in the Edit menu as well as with shortcut keys: Ctrl+C copies selected objects and Ctrl+V pastes copied objects.
Linking two biological entities together
You can link any line endpoint to a shape, gene product or brace in the following way:
- Click on the line and select the end point handle
- Drag the end point to the object you want to link it to. When you come near the object, little round "snap targets" appear. When you move close to a snap target, the line snaps to it and the target shows a green line. When this happens, release the handle.
The end point of the line is now tied to the object. If you move the object, the endpoint of the line follows. However, you can still move the line independently of the shape.
Changing line and arrow type
There are three possible line styles for connectors:
- Straight, a direct line
- Elbow, connects points with straight angles
- Curved, connects points with a smooth curve
Make the connection line or arrow first, then right-click to bring up the line style pop-up menu.
Pointing arrows to other arrows
It is possible to create an arrow that points to another arrow. This is useful, for example, when one arrow represents a reaction and another arrow represents catalysis of that reaction by an enzyme.
First you need to create an anchor on the target line. There are two ways to do this:
- Select the line and press Ctrl+R
- Right-click on the line and select "add anchor" from the pop-up menu.
You can choose between two anchor styles:
- Circle, a big, visible dot.
- None, represented as a tiny dot.
To select an anchor style, right-click on it and choose from the anchor type menu.
Changing the drawing order
You can change the drawing order of objects in a Pathway by right-clicking on an element. However, in the current version of PathVisio the drawing order is lost after saving and re-opening a pathway. This is a known issue and we're working to resolve this.
Adding references to database identifiers
When creating a pathway, it is useful to link the genes to public database identifiers. To do this, you must first select one of the synonym databases distributed by PathVisio:
- Download and install the synonym database you want to use (How?).
- Go to Data > Select Gene Database and choose the database file you want to use. The status bar at the bottom of the PathVisio window will show which gene database is currently used.
- Select the gene product for which you want to enter an identifier.
Now there are two ways to set the right identifier:
- Double-click on the gene. This opens a dialog. Enter a gene identifier or (partial) gene name and click "search". If one or more results are found, you can select one from a list.
- In the property panel on the right side of the PathVisio main window, enter the database identifier in the "Name" property, and select the database the identifier belongs to from the System Code property.
Certain genes may not be found in the synonym databases that come with PathVisio. You may still fill out the gene name and identifier of your choosing, but PathVisio will not be able to cross-reference this to online databases.
Adding Comments
To add comments to any element, double-click on it to open a properties dialog. Click on the "comments" tab, and click "add comment", to create a new comment. You can add unlimited separate comments to any element in a pathway.
Adding Literature References
To add a literature reference to any element, double-click on it to open a properties dialog. Click on the "Literature" tab and click the "add" button. The easiest way to enter a reference is to enter a PubMed identifier if you know it, and click the "Query pubmed" button. PathVisio will connect to PubMed over the internet and obtain the full reference info. Alternatively, you can fill in the fields for Title, Year, Author and Journal manually.
Editing object properties
In the property panel on the right side of the PathVisio window you have precise control over all the properties of the object you have currently selected. For example, you can position objects by entering exact coordinates, or change their color or size. You can also change the font and style of labels, or add notes to any object. For many properties, editing them in the property panel is the only way to change them. Not all object types have the same set of properties; for example, only gene products have the "Gene ID" property.
To edit a certain property, double-click on the entry in the table of properties to make the field editable, then enter the desired value.
It is possible to select multiple objects at once, and edit one property for all of them.
Changing general pathway information
To access the general pathway properties, click on the info box that is always present in the top-left of every pathway. When the info box is selected, you can edit pathway properties in the property pane on the right. You can set the following properties:
- Pathway Name
- Author, you could fill out more than one author here if you like
- Email, email address of the contact person for this pathway (usually the maintainer)
- Maintainer, full name of the maintainer of this pathway
- Data-Source, source where this pathway is derived from such as another pathway database. May be left empty if there is no suitable data source.
- Organism, the organism that this pathway is designed for.
- Availability, this should describe the terms of usage for this pathway. We encourage the use of Creative Commons licenses for pathway data.
Importing and exporting to / from GenMAPP
You can import/export your pathway from and to the GenMAPP format (.mapp).
- To import a GenMAPP pathway, go to File > Import and select the GenMAPP file you want to import. The pathway will now be loaded in PathVisio where you can edit it and save it to the GPML format, or export it back to a GenMAPP file.
- To export your PathVisio pathway to a pathway file that can be read by GenMAPP, go to File > Export. Specify the GenMAPP file you want to save the pathway to.
NOTE: Importing / exporting GenMAPP format is only available on the Microsoft Windows operating system.
Browsing Pathways
Searching pathways for genes
PathVisio allows you to search for pathways that contain a given gene product. You can search pathways in two ways:
- Gene symbol (gene name)
- Gene identifier (as defined in the synonym database)
Search by gene symbol
This method searches for all pathways containing one or more gene products for which the gene symbol contains the specified search text. To find only gene products where the symbol exactly matches the search text, add "\b" before and after the text (e.g. "\bTP53\b" to search for exact matches of "TP53"). To start the search, go to the Pathway search tab in the side panel:
|
You can choose to search by gene symbol or gene id using the drop-down list (red arrow). This screen shot shows how to search by gene symbol:
|
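The "\b" markers described above are regular-expression word boundaries. As a rough illustration of the difference between a "contains" search and an exact-match search, here is a small sketch using Python's re module as a stand-in. The gene symbols in the list are only examples, and PathVisio's own matching may differ in details such as case sensitivity.

```python
import re

# Example gene symbols, for illustration only.
symbols = ["TP53", "TP53BP1", "TP53INP1", "MDM2"]

def search_symbols(pattern, candidates):
    """Return the candidates in which the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [s for s in candidates if regex.search(s)]

# Plain text matches every symbol that contains it.
contains = search_symbols("TP53", symbols)

# Word boundaries on both sides only match the symbol on its own.
exact = search_symbols(r"\bTP53\b", symbols)

print(contains)
print(exact)
```

The first search returns every symbol that contains "TP53", while the second only returns "TP53" itself.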
Search by gene identifier
This method searches for all pathways containing one or more gene products that correspond to the given identifier. Gene products that are linked via cross references in the synonym database are also included. To start the search, go to the Search tab in the side panel:
|
You can choose to search by gene identifier using the drop-down list (red arrow). This screen shot shows how to search by gene identifier:
|
Browsing search results
The search results will be displayed in the side panel. For example, results of a gene symbol search for "TNF" in all human pathways will look like this:
The results table (yellow box) shows the pathways that have one or more gene-products with a symbol that contains "TNF". If you double-click on a pathway in the table, PathVisio will open this pathway and highlight the gene products that were found (here the gene products are highlighted with a green box). You can disable this highlighting by unchecking highlight found genes (red arrow).
Retrieving Databases and Pathways
Downloading and Installing pathways
In the download section on the Main Page you can find a link to zip files containing available pathways converted from GenMAPP. After you download one or more of these zip files, unpack them to the PathVisio\pathways folder in your user directory (e.g. C:\Documents and Settings\Username\Pathvisio\pathways on Windows XP). This is the default folder where PathVisio looks for pathway files.
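If you prefer to script the unpacking step, the sketch below shows one way to do it with Python's standard library. The function name install_pathways, the default_dir location and the assumption that the archives contain .gpml files are illustrative and not part of PathVisio itself; adjust the folder to match your own installation.

```python
import zipfile
from pathlib import Path

def install_pathways(zip_path, pathways_dir):
    """Unpack a downloaded pathway archive into the given pathways folder."""
    target = Path(pathways_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Report the pathway files now present (assumes the .gpml extension).
    return sorted(p.name for p in target.rglob("*.gpml"))

# Assumed default location in the user directory; change it if yours differs.
default_dir = Path.home() / "PathVisio" / "pathways"
```

Calling install_pathways with the path of a downloaded zip file and default_dir unpacks the archive and returns the names of the pathway files found in the folder.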
Downloading and Installing synonym databases
Gene Databases provide PathVisio with cross references between gene identifiers in pathways and other database identifiers. Databases are downloaded and installed locally for use with PathVisio. In step 3 in the download section you can find a link to these databases. Download and unzip databases to the Pathvisio\gene databases directory in your user directory on your computer. Be warned: gene databases can take up a lot of disk space. For the human gene database you need at least 400 MB of free disk space.
Available synonym databases
Our synonym databases contain the latest information from Ensembl. A synonym database for any species that is in Ensembl can be generated. You can get the available synonym databases from the download page. If you would like to have a PathVisio synonym database for another species that is present in Ensembl, please contact us.
Supported database systems
Each public database supported in PathVisio synonym databases has a specific system code that identifies which system a particular identifier belongs to. The following list displays the system codes used by PathVisio:
system codes for genes and proteins:
| A | TAIR |
| Ag | Agilent |
| Bg | BioGrid |
| C | Cint |
| Cc | CCDS |
| D | SGD (S. cerevisiae) |
| E | EC (Enzyme Code) |
| Ec | Ecoli |
| Em | EMBL |
| En | Ensembl |
| F | FlyBase |
| G | GenBank |
| Ge | CodeLink |
| Gg | Gramene GenesDB |
| Gm | Gramene Pathway |
| Gp | GenPept |
| H | HUGO |
| Hs | HsGene |
| I | InterPro |
| Il | Illumina |
| Ip | IPI |
| Ir | IRGSP Gene |
| L | Entrez Gene |
| M | MGI (M. musculus) |
| Mb | miRBase |
| N | NASC Gene |
| Nw | NuGO Wiki |
| Om | OMIM |
| Pd | PDB |
| Pf | Pfam |
| Pl | PlantGDB |
| Q | RefSeq |
| R | RGD (R. norvegicus) |
| Rf | Rfam |
| S | Uniprot/TrEMBL |
| Sn | dbSNP |
| T | Gene Ontology |
| Ti | J. Craig Venter Institute (formerly TIGR) |
| U | UniGene |
| Uc | UCSC Genome Browser |
| W | WormBase |
| X | Affymetrix Probe Set ID |
| Z | ZFIN |
| O | Other, for use with a custom database |
system codes for metabolites:
| Ca | Chemical Abstracts (CAS) |
| Ce | ChEBI |
| Ch | HMDB |
| Cp | PubChem |
| Cs | Chemspider |
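When preparing data outside PathVisio, it can help to keep the system codes in a small lookup table. The sketch below uses a handful of the codes listed above. The describe helper is hypothetical and only illustrates the idea of pairing an identifier with its system code.

```python
# A small subset of the system codes listed above.
SYSTEM_CODES = {
    "En": "Ensembl",
    "L": "Entrez Gene",
    "S": "Uniprot/TrEMBL",
    "X": "Affymetrix Probe Set ID",
    "Ce": "ChEBI",
}

def describe(identifier, code):
    """Return a readable label for an identifier and its system code."""
    name = SYSTEM_CODES.get(code, "unknown system code")
    return f"{identifier} ({name})"

print(describe("ENSG0001", "En"))
```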
Other Features
Setting Preferences
Several preferences can be set in PathVisio to customize the appearance and usage of the program. The Preferences menu can be accessed from the View menu. Options in the Preferences menu are:
Display
- The initial size of the right side panel when PathVisio is first opened can be set. The default value for this is 30.
- Gene product rectangles (data nodes) can be displayed with rounded or squared edges. The default is squared edge display.
- Movement of lines on the drawing board can either be free or lines can be set to "snap-to-angles" to facilitate drawing. The default is free movement. The degree of angles for "snap-to-angle" is user-defined as well; the default is 15.
- Support for MIMs (Molecular Interaction Maps) can be loaded automatically when PathVisio starts. MIMs support is not loaded by default.
- Advanced object attributes, such as references, can be displayed. Advanced attributes are hidden by default.
- Coloring preferences for managing the coloring of highlighted and selected objects, as well as options for coloring related to displaying data.
- Default color for "no criteria met"
- Default color for "gene not found"
- Default color for "no data found"
- Outline color for selected objects
- Outline color for highlighted objects
NOTE: You need additional plug-ins for displaying data on pathways with PathVisio 1.1
Directories
Preferences for file locations determine in which directories PathVisio will look for files by default. The following file locations can be set:
- gpml pathways
- Gene databases
- Expression Datasets
NOTE: You need additional plug-ins to display data on pathways in PathVisio 1.1
Files
The location where the PathVisio log file is stored can be set as a preference.
Database
The type of database connector for gene databases and expression datasets can be stored as a preference.
NOTE: PathVisio 1.1 requires additional plug-ins to support display of microarray data on pathways.
Keyboard shortcuts
Ctrl + N: Create a new pathway
Ctrl + O: Open a pathway
Ctrl + S: Save pathway
Ctrl + C: Copy
Ctrl + V: Paste
Ctrl + left mouse click: Add or remove object from selection
Ctrl + A: Select all gene products on the pathway
Ctrl + G: Create a group of selected objects
Ctrl + Z: Undo
Using Molecular Interaction Map styles
PathVisio has limited support for Molecular Interaction Map styles (aka Kohn Maps). See the following link for a legend of MIM symbols: [1]
See Also
PathVisio 2.0
Pathway statistics
PathVisio uses the statistical environment R to perform pathway statistics. Before you start check the following:
- Make sure you've installed R (>= v2.2.1). Download it here.
- Windows
- During the installation of R, be sure to check "Add version to registry" (this is the default setting)
- Linux (as always....slightly more complicated ;-)
- Make sure the R binary is in your PATH variable (check this by typing 'R' in the shell, if R starts then it's fine)
- Make sure that the directory containing libR.so is in your LD_LIBRARY_PATH variable
- When you perform pathway statistics for the first time, make sure you have sudo user rights, because PathVisio needs to install an R-package
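The two Linux checks above can also be done from a small script. This is only a sketch using Python's standard library. The helper names find_r_binary and dirs_with_libr are made up for this example, and the script does not check the R version or the registry setting on Windows.

```python
import os
import shutil

def find_r_binary():
    """Return the full path of the R executable found on PATH, or None."""
    return shutil.which("R")

def dirs_with_libr(ld_library_path):
    """Return the directories of an LD_LIBRARY_PATH-style string that contain libR.so."""
    dirs = [d for d in ld_library_path.split(os.pathsep) if d]
    return [d for d in dirs if os.path.isfile(os.path.join(d, "libR.so"))]

if __name__ == "__main__":
    print("R binary:", find_r_binary() or "not found on PATH")
    found = dirs_with_libr(os.environ.get("LD_LIBRARY_PATH", ""))
    print("libR.so found in:", found or "no LD_LIBRARY_PATH directory")
```

If either line reports that nothing was found, adjust the corresponding environment variable before starting PathVisio.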
Furthermore, make sure that:
PathVisio performs pathway statistics in two steps:
- Export the pathways and expression data to R
- Apply a statistical function
You will be guided through these steps by a wizard. Start this wizard by going to Data->Pathway statistics->Perform statistical test. After the progress bar is finished, you'll see a new window:
In this window you can choose between two options:
Let's assume you haven't already exported your data. In the Include pathways in directory field (green box), fill in the directory that contains the pathways you want to include in the statistical test. All pathways in this directory and its sub-directories will be exported to R. You can optionally specify the variable names of the objects in R (purple arrows). In the Save as R data file field (red box), specify a file to which you want the data to be saved. This file can be re-used later to perform other statistical tests (if you select the Load previously exported data option).
Display experimental data
- Select the right synonym database; (How?)
- Create an expression dataset; (How?)
- Create a color-set; (How?)
- Create a visualization; (How?)
- Open the pathway on which the data has to be displayed
- Apply the visualization (How?)
Create an expression dataset
Start data import wizard
- Go to Data->Create New Expression Dataset
- Be sure your raw data is in the right format ( How?)
- The gene database can be selected before importing the data (with Data->Select Gene Database), but this can still be adjusted during the wizard.
Pages during wizard
- File Locations: select location for the text file with expression data, select location to save expression dataset and, if necessary, change location of gene database
- Header Information and Delimiter: select the line at which the header starts and the line at which the data starts. If multiple header lines are present, for example at rows 1 and 2, select 'header' at line 1 and 'data' at line 3. Select the delimiter used in the data.
- Column Information: Select the column with gene IDs and the column with the system code. If no system code is present in the data, select the system code manually.
- Create expression dataset: Finish the import
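To make the header line, data line and delimiter settings more concrete, here is a sketch of how a text file with two header rows could be split into column names and data rows. The function parse_expression_text and the sample text are hypothetical. The import wizard does this work for you, and its internal logic may differ.

```python
def parse_expression_text(text, header_line, data_line, delimiter="\t"):
    """Split delimited text into column names and data rows (line numbers start at 1)."""
    lines = text.splitlines()
    header = lines[header_line - 1].split(delimiter)
    rows = [line.split(delimiter) for line in lines[data_line - 1:] if line.strip()]
    return header, rows

# Made-up file content: header at line 1, an extra header at line 2, data from line 3.
sample = (
    "ID\tCode\tSample 1\n"
    "units\t-\tlog2\n"
    "ENSG0001\tEn\t103\n"
    "10000_at\tX\t34\n"
)

header, rows = parse_expression_text(sample, header_line=1, data_line=3)
print(header)
print(rows)
```

Choosing 'header' at line 1 and 'data' at line 3 skips the second header row, as described in the wizard step above.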
Preparing raw data for PathVisio
To visualize data on PathVisio pathways you need to convert your raw data to an Expression dataset. Such a dataset can be created from a text file, which can be prepared with most spreadsheet software (e.g. Microsoft Excel). This file must have the following structure:
- It can contain a header line, which provides a name for every column, but this is not necessary. The names of the columns define the names of the samples (variables) in your dataset.
- It must contain two columns that contain the gene identifier and the system code, respectively. If no system code is present, it can also be selected manually during the wizard. Every other column contains data associated with the gene-products. The names of the columns containing the id and code are arbitrary; you can specify which column contains the id/code while creating the dataset.
An example of the structure of a raw data file:
| ID | Code | Sample 1 | ... | Sample n |
| ENSG0001 | En | 103 | ... | 264 |
| ... | ... | ... | ... | ... |
| 10000_at | X | 34 | ... | 326 |
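If your data does not come from a spreadsheet, a file with this structure can also be written by a short script. The sketch below assumes a tab as delimiter and uses two sample columns. The column name "Sample 2" and the function to_tab_delimited are illustrative only.

```python
import csv
import io

# Rows following the structure shown above; "Sample 2" is an illustrative column name.
table = [
    ["ID", "Code", "Sample 1", "Sample 2"],
    ["ENSG0001", "En", "103", "264"],
    ["10000_at", "X", "34", "326"],
]

def to_tab_delimited(rows):
    """Return the rows as tab-delimited text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

text = to_tab_delimited(table)
print(text)
```

Saving the resulting text to a file gives you an input that can be selected on the File Locations page of the import wizard.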
Creating color-sets
With a color-set you can assign colors to your data that can be used for visualization on the pathway drawings. Color-set information will be stored together with your dataset, so before you can create a color-set you have to select or create an expression dataset (How?). To create or edit a color-set, go to Data->Color Set Manager or click the color-set button on the toolbar (screenshot). This will open a new window (screenshot). Click the Add button in the top-left of the window to add a new color-set. In the upper part of the window you can change the name of the color-set and choose colors for the following situations:
- No gene found
- The gene-product could not be found in the gene database
- No data found
- The dataset contains no data for this gene-product
- No criteria met
- The data for this gene-product did not meet any of the criteria that are defined in this color-set
You can assign color-codings by adding one or more criteria to the color-set. After adding or selecting a color-set, click the Add button on the bottom of the screen. A dialog will appear that asks you what type of criterion to create. You have two options:
- Color by gradient
- Use this option when you want to color a range of values. For example, the example gradient displayed below applies a color-shading from green to yellow to red for the data values between -1 and 1:
- Color by boolean expression
- Use this option when you want to assign a color to a specific data value. You can specify this data value by a Boolean expression, e.g. "[foldchange] > 2". This type of criterion is similar to the way GenMAPP applies color-codings.
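To give an idea of what a gradient criterion does, the sketch below maps a data value to a color between green, yellow and red for values between -1 and 1. It assumes linear interpolation in RGB, a middle stop at 0, and that values outside the range get the nearest end color. These are assumptions made for the illustration and not a description of PathVisio's actual implementation.

```python
def interpolate(value, stops):
    """Linearly interpolate an RGB color for value from sorted (value, rgb) stops."""
    # Values outside the gradient range get the nearest end color (assumption).
    if value <= stops[0][0]:
        return stops[0][1]
    if value >= stops[-1][0]:
        return stops[-1][1]
    # Find the pair of neighbouring stops that encloses the value.
    for (v0, c0), (v1, c1) in zip(stops, stops[1:]):
        if v0 <= value <= v1:
            t = (value - v0) / (v1 - v0)
            return tuple(round(a + t * (b - a)) for a, b in zip(c0, c1))

GREEN, YELLOW, RED = (0, 255, 0), (255, 255, 0), (255, 0, 0)

# Green at -1, yellow at 0 (assumed midpoint) and red at 1.
stops = [(-1.0, GREEN), (0.0, YELLOW), (1.0, RED)]

print(interpolate(-1.0, stops))
print(interpolate(0.0, stops))
print(interpolate(1.0, stops))
```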
After you have selected one of the options and clicked Ok, a new criterion will appear in the criterion list and options to configure the criterion will appear on the right of this list. For details on how to configure a criterion see here for the gradient criterion, and here for the boolean expression criterion.
You can add multiple color criteria to a color-set, which will all be displayed in the criteria list. Clicking a criterion in the list will allow you to modify its settings. When a color-set contains multiple criteria, the first criterion in the list will be evaluated last and override the previous criteria. The criteria in the criteria list can be dragged to another position to change the evaluation order.
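The evaluation order described above can be pictured with a small sketch. The criteria, colors and the pick_color function below are hypothetical. Boolean expressions such as "[foldchange] > 2" are written here as plain Python functions rather than in PathVisio's own expression syntax.

```python
# Each criterion is a (name, test, color) entry, in the order shown in the list.
criteria = [
    ("strongly up", lambda data: data["foldchange"] > 2, "red"),
    ("up", lambda data: data["foldchange"] > 0, "orange"),
]

NO_CRITERIA_MET = "grey"  # stand-in for the "no criteria met" color

def pick_color(data, criteria_list):
    """Evaluate criteria from the bottom of the list to the top; later results override."""
    color = NO_CRITERIA_MET
    for _name, test, criterion_color in reversed(criteria_list):
        if test(data):
            color = criterion_color
    return color

print(pick_color({"foldchange": 3}, criteria))
print(pick_color({"foldchange": 1}, criteria))
print(pick_color({"foldchange": -1}, criteria))
```

A fold change of 3 meets both criteria, but because the first criterion in the list is evaluated last, its color wins.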
Example
In this example, a gradient from blue to yellow is created that colors the data in a range from -3 to 3. For example, blue could represent down-regulated genes (foldchange <= -3), yellow up-regulated genes (foldchange >= 3), and everything in between will have a shade between blue and yellow:
To create this color-set, go to Data->Color set manager. A new window opens, which is shown in the screenshot below.
You now see that the new gradient, named "New gradient", appears in the criteria list (red arrow). Also, new items are added to the window that allow you to configure the gradient. The table in the green circle shows the colors and corresponding values for the gradient. We need a gradient with only two colors, so let's first delete one color. Do this by clicking the red color or value and then clicking "Remove color".
Now we have to change the green color to blue. First select the green color by clicking on the green color box. Now a button appears (red arrow). Click this button and a dialog will appear where you can pick the color you want. Choose blue as the color and click Ok; the color of the box will now change to blue.
Finally, the values to which the colors will be assigned have to be changed. We want to assign -3 (down-regulated) to blue and 3 (up-regulated) to yellow. To do this, click on the values to the right of the color boxes and type in the right number. Now the window should look like this:
Creating visualizations
To visualize your expression dataset, you need to create a visualization. A visualization is a collection of plug-ins, each responsible for providing a different way of visualizing the data. A plug-in can show up at three locations in PathVisio: the pathway drawing, the tool tip and the visualization tab in the side panel. By selecting which plug-ins are active in what places, you can adapt the visualization to your dataset. How to create a visualization is best explained with an example, which you can find below.
Example
Here is an example on how to create a simple visualization, that shows the gene-product label, colors the gene-product box according to your data and displays your data in a tool-tip. Here we use an example dataset that contains 5 samples, each corresponding to a tumor progression stage and containing the foldchange relative to the gene-expression in the normal sample.
Create the color-set
First a color-set has to be created to assign colors to the data. In this example we use the color-set that is described in the example of the section Creating color-sets. First create the color-set described in this example before you continue.
Create the visualization
Now that you have created a color-set that assigns colors to the data, you have to create a visualization to specify how the data will be visualized. The best way to see how the visualization options affect how the data is shown is to open a pathway before creating the visualization. When you then change a setting in the visualization, you can immediately see its effect on the pathway. So open a pathway of the right organism (in case of the example data, a human pathway) before going to the next step (how?).
Data visualization is performed by plug-ins, that each show data in a different way. For example, there is a plug-in to color gene-boxes, a plug-in to show numerical data and a plug-in that displays the gene names. By creating a visualization, you specify which plug-ins should be activated and you tell the plug-ins what data they have to show.
Go to Data->Visualizations. This opens a new window where you can create and edit visualizations:
Create a new visualization by clicking the Add button (green circle).
A dialog shows up that asks you for a name for the visualization. Let's call it "tumor progression" and click "Ok".
An item called "tumor progression" shows up in the visualizations list (green circle). Also the plug-in table appears, where you have to activate the plug-ins (purple box). This table shows the names of the available plug-ins (blue arrow), followed by three check-boxes (red arrow). Using these check-boxes you can activate or deactivate the plug-in at a location in the program. A plug-in can be active at three locations in PathVisio: the pathway drawing, the tool tip and the visualization tab in the side panel.
So if you check the check-box in the column named "Drawing", the plug-in will visualize the data in the gene-boxes.
We want to color the gene-box according to the data and also display the gene name in the gene-box, therefore two plug-ins need to be activated at the "drawing" location.
To color the gene-box, the "Color by expression" plug-in needs to be activated. This can be done by clicking the checkbox in the column called "Drawing". You now see that the previously greyed-out "Drawing object..." button becomes clickable. By clicking this button, you can change how the plug-in colors the gene-products. Click on this button and a new window appears that shows a list of available samples on the left, and an empty list called "Selected samples" on the right. Samples in the right list will be displayed in the gene-box. For our dataset, the available samples are the tumor progression stages. To show all progression stages, select them in the "Available samples" list and click the ">" button to transfer them to the "Selected samples" list. To the right of the items in the "Selected samples" list, the name of the color-set that will be used to color this sample is shown. If you have multiple color-sets, you can click this name and select the color-set you want to use to color the corresponding sample. In this case we only have a single color-set that we want to use for all samples, so keep the default setting. For an explanation of the other settings in this window see here.
To see what effect the modifications on the visualization have on the pathway, shift the visualization window aside to see the pathway drawing. You can see that the gene-boxes on the pathway are colored when you add the progression stages to the "Selected samples" list:
| Before: | After: |
You see that the label in the gene-boxes is replaced by the visualization, which consists of the plug-in you activated. To show the label on the gene-box, you also have to activate the "Gene product label" plug-in. To do this, close the configuration window of the "Color by expression" plug-in and check the checkbox in the "Drawing" column for the "Gene product label" plug-in. Again you see the pathway drawing change:
| Before: | After: |
The result of these steps is a visualization which displays the gene-label and the color-coded data in the gene-box:
Apply a visualization
Be sure:
- you selected the right synonym database (how?)
- you selected an expression dataset (how?)
- you have created one or more visualizations (how?)
You can apply a visualization by using the toolbar:
