GPML is the native format used by PathVisio and WikiPathways.


GPML is simply an XML-based format. You can use it to define a pathway consisting of purely graphical elements (such as lines and shapes) or of graphical elements with added biological information (such as genes, proteins and interactions).

GPML has very strong ties to the GenMAPP MAPP format. This matters because it explains some of the idiosyncrasies in the definition, most of which exist for backwards compatibility with GenMAPP.


Naming

GPML stands for Graphical Pathway Markup Language. It used to be called GenMAPP Pathway Markup Language, which in turn was derived from GMML, the GenMAPP Markup Language. GMML was renamed to GPML to make it more distinct from both GML and XGMML, two markup languages used extensively by the Cytoscape community.


Structure of a GPML file

The root element is always the <Pathway> element. Below it there are three important types of elements:

  • pure Graphical elements: Shape, Label, Line
  • elements with a biological context: DataNode
  • the only element that can connect elements: Interaction

Contrary to most XML definitions, in GPML all elements and attributes start with an uppercase letter.
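
The three kinds of elements can be illustrated with a small Python sketch that sorts the children of a pathway into these categories. The sample document is hypothetical and deliberately omits the XML namespace; if a file declares a namespace, ElementTree reports each tag with a "{namespace}" prefix that would have to be stripped first.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical minimal pathway, for illustration only (namespace omitted).
SAMPLE_GPML = """\
<Pathway Name="Example">
  <DataNode TextLabel="TP53" GraphId="a1b2c" Type="GeneProduct"/>
  <DataNode TextLabel="MDM2" GraphId="d3e4f" Type="GeneProduct"/>
  <Interaction GraphId="f0001"/>
  <Label TextLabel="Nucleus"/>
  <Shape GraphId="c0ffe"/>
</Pathway>
"""

# Categories as described in the text above.
GRAPHICAL = {"Shape", "Label", "Line"}
BIOLOGICAL = {"DataNode"}
CONNECTING = {"Interaction"}


def classify(tag):
    """Map an element name to one of the three categories, or 'other'."""
    if tag in GRAPHICAL:
        return "graphical"
    if tag in BIOLOGICAL:
        return "biological"
    if tag in CONNECTING:
        return "connecting"
    return "other"


root = ET.fromstring(SAMPLE_GPML)
counts = Counter(classify(child.tag) for child in root)
print(root.tag, dict(counts))
```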

Root level: Pathway

At the root there is always one Pathway element.

  • Subelement Comment
  • Subelement Graphics
    • BoardWidth -> width of the drawing field, all elements should fit in it.
    • BoardHeight -> height of the drawing field, all elements should fit in it.
    • WindowWidth -> exists only for backwards compatibility with GenMAPP.
    • WindowHeight -> exists only for backwards compatibility with GenMAPP.
  • Name -> Pathway Title. Will be displayed in the infobox in PathVisio.
  • Organism -> The full Latin name, e.g. “Homo sapiens”. There is no equivalent in the MAPP format, where the file name determines the organism.
  • Data-Source -> e.g. ‘Kegg’, ‘GenMAPP’ etc.
  • Version -> GenMAPP version; use for MAPPs exported from GenMAPP only
  • Author -> Name of the author of this pathway
  • Maintainer -> Maintainer, if different from the author. Maintainer is “GenMAPP.org” for many GenMAPP pathways
  • Email -> Email address of the maintainer.
  • Copyright -> Our policy is to use Creative Commons licensing
  • Last-Modified -> Last time this pathway was updated
  • BiopaxRef -> reference to a BioPAX element
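
As a rough illustration of how the root-level attributes and the Graphics subelement fit together, the following Python sketch reads them from a hypothetical pathway. The sample values are invented, the namespace is omitted, and falling back to Author when Maintainer is absent is this sketch's reading of "if different from author", not a documented rule.

```python
import xml.etree.ElementTree as ET

# Hypothetical root element, for illustration only (namespace omitted).
SAMPLE = """\
<Pathway Name="Example pathway" Organism="Homo sapiens" Author="A. Author">
  <Comment>Illustrative only.</Comment>
  <Graphics BoardWidth="800.0" BoardHeight="600.0"/>
</Pathway>
"""

root = ET.fromstring(SAMPLE)

# The drawing field that all elements should fit in.
graphics = root.find("Graphics")
board = (float(graphics.get("BoardWidth")), float(graphics.get("BoardHeight")))

summary = {
    "name": root.get("Name"),
    "organism": root.get("Organism"),
    # Assumption: no Maintainer attribute means the author maintains the pathway.
    "maintainer": root.get("Maintainer", root.get("Author")),
}
print(summary, board)
```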

First level: Pathway Elements

Below the Pathway element, the following elements appear (in order):

Comment, Graphics, DataNode, Interaction, Line, Label, Link, Shape, Group, InfoBox, Legend, Biopax.

Note that the first two elements, Comment and Graphics, are not specific to the first level; they are described under “Shared attributes and minor elements”.

DataNode

  • Comment
  • Graphics
    • CenterX, CenterY, Width, Height
    • Color
  • Xref
    • Database
    • ID
  • BiopaxRef
  • GraphId
  • GroupRef
  • ObjectType
  • TextLabel
  • BackpageHead
  • GenMAPP-Xref -> deprecated, there for backwards compatibility
  • Type -> one of Unknown, Gene, Protein, GeneProduct or Metabolite
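
A minimal Python sketch of reading a DataNode is shown here. The sample element and its database name and identifier are illustrative only, and treating a missing Type as "Unknown" is an assumption of this sketch rather than something the format description states.

```python
import xml.etree.ElementTree as ET

# The five Type values listed above.
VALID_TYPES = {"Unknown", "Gene", "Protein", "GeneProduct", "Metabolite"}

# Hypothetical DataNode, for illustration only (namespace omitted).
SAMPLE = """\
<DataNode TextLabel="TP53" GraphId="a1b2c" Type="GeneProduct">
  <Graphics CenterX="100.0" CenterY="50.0" Width="80.0" Height="20.0"/>
  <Xref Database="Entrez Gene" ID="7157"/>
</DataNode>
"""


def read_datanode(elem):
    """Collect the label, type, position and cross-reference of a DataNode."""
    node_type = elem.get("Type", "Unknown")  # assumed default, see lead-in
    if node_type not in VALID_TYPES:
        raise ValueError(f"unexpected DataNode Type: {node_type!r}")
    graphics = elem.find("Graphics")
    xref = elem.find("Xref")
    return {
        "label": elem.get("TextLabel"),
        "type": node_type,
        "center": (float(graphics.get("CenterX")), float(graphics.get("CenterY"))),
        "xref": (xref.get("Database"), xref.get("ID")) if xref is not None else None,
    }


node = read_datanode(ET.fromstring(SAMPLE))
print(node)
```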

Interaction

  • Comment subelement
  • Graphics subelement
    • Point subelement
      • x, y -> coordinates of the point
      • GraphRef
      • GraphId
      • ArrowHead -> determines the presence and type of arrowhead. Possible values include: Line, Arrow, Receptor, ReceptorRound, ReceptorSquare, LigandRound, LigandSquare, TBar. Other values are possible too, but they are not recognized by GenMAPP.
      • Head -> deprecated, use ArrowHead instead. Will be removed in the future.
    • Color -> either six hexadecimal digits specifying an “HTML color” (three groups of two digits, representing the red, green and blue levels as values from 00 to FF in hexadecimal) or one of several named colors, including the special color “Transparent”.
  • Style -> The line style, one of “Solid” or “Broken”
  • GroupId
  • GraphId
  • BiopaxRef
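
The Color rule above can be sketched in Python as follows. The function name is hypothetical. Because the list of named colors is not given here, the sketch handles only the six-digit form and the special value “Transparent”, and rejects everything else.

```python
import re

# Exactly six hexadecimal digits: RRGGBB.
_HEX_COLOR = re.compile(r"[0-9A-Fa-f]{6}")


def parse_color(value):
    """Return an (r, g, b) tuple, or None for the special color Transparent.

    Named colors other than Transparent are not handled in this sketch.
    """
    if value == "Transparent":
        return None
    if _HEX_COLOR.fullmatch(value):
        # Split into three groups of two digits and read each as base 16.
        return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
    raise ValueError(f"unsupported color value: {value!r}")


print(parse_color("FF0000"))
print(parse_color("Transparent"))
```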

Label

  • Comment
  • Graphics
    • CenterX, CenterY, Width, Height
    • Color
    • FontName
    • FontStyle
    • FontDecoration
    • FontStrikethru
    • FontWeight
    • FontSize

Link

Very similar to Label, with only one additional, optional attribute:

  • Href -> a url pointing to another pathway. This will probably be done using pathway identifiers.

The idea of Link is that it will represent links between pathways, i.e. labels in blue, underlined, that you can click on to go to another pathway. This feature is currently not implemented in either GenMAPP or PathVisio.

This part of the spec may change in the future.

Shape

  • subelement Graphics
    • CenterX, CenterY, Width, Height
    • Color -> default is Black
    • Rotation
    • FillColor -> default is Transparent
  • subelement Comment
  • BiopaxRef
  • GraphId
  • GroupRef
  • ObjectType -> “Node”, “Edge” or “Annotation”. Defaults to “Annotation”. Currently unused.
  • Style -> Solid or Broken, like Line.Style

Group

Groups are used to group elements together, making it possible to select them as a unit. The members of a group can share a functional relation, e.g. proteins that form a protein complex can be grouped. Because a group has a GroupId and can itself carry a GroupRef, it can be part of another group, so groups can be nested.

  • Comment
  • GroupId
  • GroupRef
  • Style
  • TextLabel
  • GraphId
  • BiopaxRef
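
Nesting can be followed by walking from an element's GroupRef to the Group with that GroupId, and then on to that group's own GroupRef. The Python sketch below does this for a hypothetical pathway; the helper name and the cycle guard are choices of this sketch, and the namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical pathway with group g1 nested inside group g2.
SAMPLE = """\
<Pathway Name="Groups">
  <DataNode TextLabel="A" GroupRef="g1"/>
  <DataNode TextLabel="B" GroupRef="g1"/>
  <Group GroupId="g1" GroupRef="g2"/>
  <Group GroupId="g2"/>
</Pathway>
"""


def enclosing_groups(elem, groups):
    """Return the GroupIds that contain elem, innermost first."""
    chain = []
    ref = elem.get("GroupRef")
    # Stop at the outermost group, or if a reference loops back on itself.
    while ref is not None and ref not in chain:
        chain.append(ref)
        ref = groups[ref].get("GroupRef")  # KeyError if the reference is dangling
    return chain


root = ET.fromstring(SAMPLE)
groups = {g.get("GroupId"): g for g in root.iter("Group")}
node_a = root.find("DataNode")
print(enclosing_groups(node_a, groups))
```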

InfoBox

InfoBox is used by GenMAPP and is currently ignored by PathVisio.

  • CenterX, CenterY

Legend

Legend is used by GenMAPP and is currently ignored by PathVisio.

  • CenterX, CenterY

Biopax

A pool of BioPAX objects that other elements can refer to. These objects have to be part of the BioPAX namespace: http://www.biopax.org/release/biopax-level2.owl. PathVisio itself does not test that the objects are valid BioPAX; as long as they are well-formed XML in the right namespace, PathVisio will accept them.

Shared attributes and minor elements

Graphics

Many elements have a Graphics subelement. Each Graphics subelement has an implementation that is specific to the super element, i.e. the Graphics subelement of a DataNode is totally different from the Graphics subelement of a Label. The only shared feature is that the Graphics subelement groups purely graphical attributes together. Different Graphics subelements are described in the sections of the super elements.

Comment

All pathway elements, and the Pathway element itself, can have zero or more Comment subelements. Each Comment has one attribute:

  • Source -> value designating the source of the comment, i.e. the program or script that added the comment. For pathways converted from GenMAPP, the Source value is either “GenMAPP notes” when the comment came from the notes column or “GenMAPP remarks” when it came from the remarks column.

GroupIds

GPML allows nested groups of elements.

GraphRefs/GraphIds

GraphIds have the XML Schema ID type. They have to be identifiers consisting of a sequence of letters, digits and underscores, not starting with a digit. They are unique within the document.

In the current implementation of PathVisio, GraphIds are randomly generated hexadecimals in the ranges A0000 to FFFFF and a00 to FFF. This is only a quick & dirty approach to generating identifiers that comply with the XML Schema ID type. GraphIds do not have to be interpretable as a hexadecimal number and applications should not rely on this aspect.
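
The identifier rule and the random-hexadecimal approach can be sketched in Python as follows. This is not PathVisio's code: the function names are hypothetical, the pattern encodes only the rule as worded above, and the generator covers only the five-digit range A0000 to FFFFF.

```python
import random
import re

# Letters, digits and underscores, not starting with a digit.
_GRAPH_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_graph_id(value):
    """Check a candidate GraphId against the rule described above."""
    return _GRAPH_ID.fullmatch(value) is not None


def new_graph_id(existing, rng=random):
    """Draw a random hexadecimal id that is not yet in the set `existing`.

    Every value in A0000..FFFFF starts with a letter (a-f), so the result
    never begins with a digit. The new id is added to `existing`.
    """
    while True:
        candidate = format(rng.randint(0xA0000, 0xFFFFF), "x")
        if candidate not in existing:
            existing.add(candidate)
            return candidate


used = {"a1b2c"}
fresh = new_graph_id(used)
print(fresh, is_valid_graph_id(fresh))
```

Keeping the set of used ids per document reflects the requirement that ids be unique within one pathway only.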

GraphIds are used to link elements together. Elements that have a GraphRef attribute use it to refer to the GraphId of another element.

GraphIds are only meaningful within a pathway: if an element is copied from one pathway to another, its GraphId may be changed. Two pathways may contain totally different elements with the same GraphId by coincidence; in fact, it would be impossible to prevent this from happening.
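
Resolving these references amounts to building a per-document lookup from GraphId to element and following each GraphRef through it. The Python sketch below does so for the two end points of a hypothetical Interaction; the sample content is invented and the namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical pathway: one Interaction connecting two DataNodes.
SAMPLE = """\
<Pathway Name="Refs">
  <DataNode TextLabel="A" GraphId="aaa01"/>
  <DataNode TextLabel="B" GraphId="bbb02"/>
  <Interaction GraphId="ccc03">
    <Graphics>
      <Point x="10.0" y="20.0" GraphRef="aaa01"/>
      <Point x="90.0" y="20.0" GraphRef="bbb02" ArrowHead="Arrow"/>
    </Graphics>
  </Interaction>
</Pathway>
"""

root = ET.fromstring(SAMPLE)

# GraphIds are unique within one document, so a plain dict is enough.
by_id = {e.get("GraphId"): e for e in root.iter() if e.get("GraphId")}

# Follow each Point's GraphRef to the element it is attached to.
endpoints = [
    by_id[point.get("GraphRef")].get("TextLabel")
    for point in root.iter("Point")
    if point.get("GraphRef")
]
print(endpoints)
```

Because the lookup is built per document, the same GraphId appearing in another pathway causes no conflict.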

BiopaxRef

Reference to any BioPAX element stored in the Biopax object pool. This is the means to link GPML elements to BioPAX definitions.


Technical documentation

GPML2013a (current) Schema documentation
GPML2010a Schema documentation
GPML2008a Schema documentation
GPML2007 Schema documentation