This page documents the KGML to GPML converter, which converts pathways from Kegg for use in PathVisio and WikiPathways.

How to build

You can build the Kegg converter with the Ant build script.

  • Check out the PathVisio code from the repository:
    svn checkout http://svn.bigcat.unimaas.nl/pathvisio/trunk pathvisio
    
  • Build the PathVisio jars
    cd pathvisio
    ant jar
    
  • Build the kegg converter package
    cd tools/KeggConverter
    ant dist
    
  • All the files you need to run the converter are now in the dist directory.

How to run

Run the jar you built in the previous step:

cd dist
java -jar kegg_converter.jar -species "Mus musculus" -kgml /home/user/kegg-20090203/mmu -out /home/user/kegg-20090203/mmu/gpml -useMap

Required parameters:

  • -species: The species of the pathways to convert. Use the Latin name (e.g. "Homo sapiens").
  • -kgml: The path to the KGML files. You can download the KGML files from the Kegg FTP site.
  • -out: The path to write the converted files to.
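
When several organisms need converting, the command above can be generated once per organism. The following is a minimal sketch and not part of the converter: the function names species_name and converter_command are hypothetical, and the directory layout (one subdirectory per Kegg organism code under a common root, with output written to a gpml subdirectory) is an assumption taken from the example paths. It only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# Map a Kegg organism code to the Latin name expected by -species.
# Only three well-known codes are listed; extend the list as needed.
species_name() {
    case "$1" in
        hsa) echo "Homo sapiens" ;;
        mmu) echo "Mus musculus" ;;
        rno) echo "Rattus norvegicus" ;;
        *)   echo "unknown organism code: $1" >&2; return 1 ;;
    esac
}

# Print (do not execute) the converter command for one organism.
# Assumption: KGML files live in <root>/<code>, output goes to <root>/<code>/gpml.
converter_command() {
    root=$1
    code=$2
    species=$(species_name "$code") || return 1
    printf 'java -jar kegg_converter.jar -species "%s" -kgml %s/%s -out %s/%s/gpml -useMap\n' \
        "$species" "$root" "$code" "$root" "$code"
}
```

For example, `for code in hsa mmu rno; do converter_command /home/user/kegg-20090203 "$code"; done` prints one command per organism. Run the printed commands from the dist directory once they look right.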

Optional parameters:

  • -useMap: Use both the species-specific and reference pathways for conversion. Both types of pathways need to be available in the KGML path and named following the Kegg convention (e.g. 'mmu00020.xml' for the species-specific mouse pathway and 'map00020.xml' for the reference pathway). Enabling this option is recommended.
  • -overwrite: Overwrite existing converted files.
  • -offline: Don't use the Kegg web service to fetch gene and metabolite names.
  • -spacing: Stretch the converted pathway by the given factor to create more space for metabolite and gene boxes. By default, all coordinates will be multiplied by 2.
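
Because -useMap needs a reference pathway file next to each species-specific file, it can help to check the KGML directory before converting. The sketch below is not part of the converter; the function name missing_reference_maps is hypothetical, and it assumes only the naming convention described above (for example 'mmu00020.xml' paired with 'map00020.xml').

```shell
#!/bin/sh
# List species-specific KGML files that have no matching reference pathway file.
# Usage: missing_reference_maps <kgml_dir> <organism_prefix>
missing_reference_maps() {
    kgml_dir=$1
    prefix=$2
    for species_file in "$kgml_dir"/"$prefix"*.xml; do
        # Skip the unexpanded glob pattern when no files match.
        [ -e "$species_file" ] || continue
        name=$(basename "$species_file" .xml)
        # Strip the organism prefix, e.g. mmu00020 -> 00020.
        pathway_id=${name#"$prefix"}
        if [ ! -e "$kgml_dir/map$pathway_id.xml" ]; then
            echo "$name.xml"
        fi
    done
}
```

Calling `missing_reference_maps /home/user/kegg-20090203/mmu mmu` prints one file name per line for every mouse pathway without a reference counterpart. Empty output means every species-specific file has its 'map' file.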

Known problems

Here are some known problems that might occur when converting a KGML file.

  • Maplink directionality is sometimes wrong: The information in the KGML file sometimes does not match the image shown on the Kegg website.
  • Missing reactions: On rare occasions, a reaction might be missing. This typically occurs when two or more identical reactions are present in the same pathway.
  • Connections to the wrong metabolite: In some cases, a reaction connects to the wrong metabolite. This typically occurs when multiple instances of the same metabolite are present in the pathway and KGML doesn't provide instance-specific information.
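
To spot pathways where the last two problems are likely, a KGML file can be scanned for repeated reactions and repeated metabolites before conversion. This is a rough sketch and not part of the converter. The function names are hypothetical, and it rests on several assumptions: reactions appear as `<reaction ... name="...">` elements, metabolites appear as `<entry ... type="compound">` elements, the name and type attributes sit on the same line as the opening tag, and "identical" means sharing the same name attribute.

```shell
#!/bin/sh
# Print reaction names that occur more than once in a KGML file.
duplicate_reactions() {
    grep -o '<reaction [^>]*' "$1" \
        | sed -n 's/.*name="\([^"]*\)".*/\1/p' \
        | sort | uniq -d
}

# Print compound (metabolite) names that occur more than once in a KGML file.
duplicate_compounds() {
    grep -o '<entry [^>]*' "$1" \
        | grep 'type="compound"' \
        | sed -n 's/.*name="\([^"]*\)".*/\1/p' \
        | sort | uniq -d
}
```

Any name printed by `duplicate_reactions mmu00020.xml` or `duplicate_compounds mmu00020.xml` marks a place where the converted pathway is worth checking by hand against the image on the Kegg website.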