The KEGG Markup Language (KGML) is an exchange format for KEGG graph objects, especially the KEGG pathway maps, which are manually drawn and updated. KGML enables automatic drawing of KEGG pathways and provides facilities for computational analysis and modeling of protein networks and chemical networks.
The KEGG pathway maps are graphical image maps representing networks of interacting molecules responsible for specific cellular functions. There are two types of KEGG pathways: reference pathways and organism-specific pathways.
The KGML files contain computerized information about graphical objects and their relations in the KEGG pathways as well as information about orthologous gene assignments in the KEGG GENES database.
In KGML the pathway element specifies one graph object with the entry elements as its nodes and the relation and reaction elements as its edges. The relation and reaction elements indicate the connection patterns of rectangles (gene products) and the connection patterns of circles (chemical compounds), respectively, in the KEGG pathways. The two types of graph objects, those consisting of entry and relation elements and those consisting of entry and reaction elements, are called the protein network and the chemical network, respectively. Since the metabolic pathway can be viewed both as a network of proteins (enzymes) and as a network of chemical compounds, another distinction of KEGG pathways is:
The following figure shows an overview of KGML.
The pathway element is a root element, and one pathway element is specified for one pathway map in KGML. The entry, relation, and reaction elements specify the graph information, and additional elements are used to specify more detailed information about nodes and edges of the graph.
The pathway element specifies graph information stored in the KEGG pathway map. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
name | keggid.type | the KEGGID of this pathway map | REQUIRED |
org | maporg.type | ko/ec/[org prefix] | REQUIRED |
number | mapnumber.type | the map number of this pathway map | REQUIRED |
title | string.type | the title of this pathway map | IMPLIED |
image | url.type | the resource location of the image file of this pathway map | IMPLIED |
link | url.type | the resource location of the information about this pathway map | IMPLIED |
The name attribute contains the KEGG identifier of this pathway map.
attribute value | explanation |
path:ko***** path:[org prefix]***** | the KEGGID of this pathway map ex) name="path:ko00010" name="path:hsa00010" |
Here ***** represents the five-digit pathway map number and [org prefix] is a three- or four-letter organism code in KEGG.
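As a minimal sketch of how the name value can be taken apart, the following Python function splits it into its prefix and map number. The function name `split_pathway_name` and the regular expression are illustrative assumptions, not part of KGML; in particular, accepting any two to four lowercase letters as the prefix is a simplification.

```python
import re

# Matches the two forms listed above: "path:ko*****" and "path:[org prefix]*****".
# Accepting any 2-4 lowercase letters as the prefix is an assumption of this sketch.
_PATHWAY_NAME = re.compile(r"^path:([a-z]{2,4})(\d{5})$")


def split_pathway_name(name):
    """Return (prefix, map_number) for a pathway name attribute value."""
    match = _PATHWAY_NAME.match(name)
    if match is None:
        raise ValueError(f"not a pathway name: {name!r}")
    # The map number is kept as a string so that leading zeros survive.
    return match.group(1), match.group(2)


print(split_pathway_name("path:ko00010"))
print(split_pathway_name("path:hsa00010"))
```

Keeping the map number as a string rather than an integer preserves the five-digit form used by the number attribute.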
The org attribute specifies the classification of this pathway map. The distinction of reference pathways and pathways for various organisms is made according to the attribute value.
attribute value | explanation |
ko | the reference pathway map represented by KO identifiers |
ec | the reference pathway map represented by ENZYME identifiers |
[org prefix] | the organism-specific pathway map for "org" |
The number attribute specifies the five-digit pathway map number.
attribute value | explanation |
five-digit integer | ex) number="00030" |
The title attribute specifies the title of this pathway map.
attribute value | explanation |
string | ex) title="Pentose phosphate pathway" |
The image attribute specifies the resource location of the image file for this pathway map in the KEGG Web service.
attribute value | explanation |
URL | ex) image="https://www.kegg.jp/kegg/pathway/ko/ko00010.png" |
The link attribute specifies the resource location of the information about this pathway map in the KEGG Web service.
attribute value | explanation |
URL | ex) link="https://www.kegg.jp/kegg-bin/show_pathway?ko00010" |
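The pathway attributes can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `SAMPLE` fragment is hand-written for illustration (it is not a real KEGG file), and `read_pathway_attributes` and the `is_reference` key are hypothetical names. REQUIRED attributes raise an error when absent, while IMPLIED attributes simply come back as `None`.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; the IMPLIED attributes
# image and link are left out on purpose.
SAMPLE = """<pathway name="path:ko00030" org="ko" number="00030"
         title="Pentose phosphate pathway"/>"""

REQUIRED = ("name", "org", "number")
IMPLIED = ("title", "image", "link")


def read_pathway_attributes(root):
    """Collect the pathway attributes, failing if a REQUIRED one is absent."""
    missing = [key for key in REQUIRED if key not in root.attrib]
    if missing:
        raise ValueError(f"missing REQUIRED attributes: {missing}")
    info = {key: root.get(key) for key in REQUIRED + IMPLIED}
    # org is "ko" or "ec" for reference maps, an organism prefix otherwise.
    info["is_reference"] = info["org"] in ("ko", "ec")
    return info


info = read_pathway_attributes(ET.fromstring(SAMPLE))
print(info["title"], info["image"], info["is_reference"])
```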
The entry element contains information about a node of the pathway. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
id | id.type | the ID of this entry in the pathway map | REQUIRED |
name | keggid.type | the KEGGID of this entry | REQUIRED |
type | entry_type.type | the type of this entry | REQUIRED |
link | url.type | the resource location of the information about this entry | IMPLIED |
reaction | keggid.type | the KEGGID of corresponding reaction | IMPLIED |
The id attribute specifies the identification number of this entry. Each entry element in the pathway element is uniquely specified according to this id attribute value.
attribute value | explanation |
positive integer | the identification number of this entry |
The name attribute contains the KEGG identifier of this entry, which is generally in the form of db:accession where db is the database name and accession is the accession number.
attribute value | explanation |
path:(accession) | pathway map ex) name="path:map00040" |
ko:(accession) | KO (ortholog group) ex) name="ko:E3.1.4.11" |
ec:(accession) | enzyme ex) name="ec:1.1.3.5" |
rn:(accession) | reaction ex) name="rn:R00120" |
cpd:(accession) | chemical compound ex) name="cpd:C01243" |
gl:(accession) | glycan ex) name="gl:G00166" |
[org prefix]:(accession) | gene product of a given organism ex) name="eco:b1207" |
group:(accession) | complex of KOs If accession is undefined, "undefined" is specified. ex) name="group:ORC" |
The type attribute specifies the type of this entry. Note that when the pathway map is linked to another map, the linked pathway map is treated as a node, a clickable graphics object (round rectangle) in the KEGG Web service.
attribute value | explanation |
ortholog | the node is a KO (ortholog group) |
enzyme | the node is an enzyme |
reaction | the node is a reaction |
gene | the node is a gene product (mostly a protein) |
group | the node is a complex of gene products (mostly a protein complex) |
compound | the node is a chemical compound (including a glycan) |
map | the node is a linked pathway map |
brite | the node is a linked brite hierarchy |
other | the node is an unclassified type |
The link attribute specifies the resource location of the information about this entry in the KEGG Web service. In the organism-specific pathways, this attribute is not defined if the organism does not have the entry (gene).
attribute value | explanation |
URL | ex) link="https://www.kegg.jp/dbget-bin/www_bget?eco+b1207" |
The reaction attribute specifies the KEGGID of the corresponding chemical reaction(s) in the KEGG LIGAND database.
attribute value | explanation |
rn:(accession) | ex) reaction="rn:R02749" |
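To illustrate how the entry attributes fit together, the following sketch indexes the entry elements of a pathway by their id and splits each name into its db and accession parts. The `SAMPLE` fragment is hand-written: the identifiers are taken from the examples above, but their combination in one pathway (and the pairing of a gene with a reaction) is purely illustrative. `index_entries` is a hypothetical helper name, and the sketch assumes each name holds a single db:accession value.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; the combination of identifiers is illustrative only.
SAMPLE = """<pathway name="path:eco00040" org="eco" number="00040">
  <entry id="1" name="eco:b1207" type="gene"
         link="https://www.kegg.jp/dbget-bin/www_bget?eco+b1207"
         reaction="rn:R02749"/>
  <entry id="2" name="cpd:C01243" type="compound"/>
  <entry id="3" name="path:map00040" type="map"/>
</pathway>"""


def index_entries(root):
    """Map each entry id to its parsed attributes."""
    entries = {}
    for entry in root.findall("entry"):
        # name has the form db:accession; split at the first colon only.
        db, _, accession = entry.get("name").partition(":")
        entries[int(entry.get("id"))] = {
            "db": db,
            "accession": accession,
            "type": entry.get("type"),
            "link": entry.get("link"),          # IMPLIED, may be None
            "reaction": entry.get("reaction"),  # IMPLIED, may be None
        }
    return entries


entries = index_entries(ET.fromstring(SAMPLE))
for entry_id, fields in entries.items():
    print(entry_id, fields["type"], fields["db"], fields["accession"])
```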
The graphics element is a subelement of the entry element, specifying drawing information about the graphics object.
attribute name | data type | explanation | REQUIRED/IMPLIED |
name | string.type | the label of this graphics object | IMPLIED |
x | number.type | the X axis position of this graphics object | IMPLIED |
y | number.type | the Y axis position of this graphics object | IMPLIED |
coords | string.type | the polyline coordinates | IMPLIED |
type | graphics.type | the shape of this graphics object | IMPLIED |
width | number.type | the width of this graphics object | IMPLIED |
height | number.type | the height of this graphics object | IMPLIED |
fgcolor | graphics-color.type | the foreground color used by this graphics object | IMPLIED |
bgcolor | graphics-color.type | the background color used by this graphics object | IMPLIED |
The name attribute contains the label that is associated with this graphics object. When two or more name attributes are specified in the same entry element, the first one is taken as the attribute value. When the type attribute value of the entry element is "gene", the gene name is specified for this attribute value.
attribute value | explanation |
string | the label of this graphics object ex) name="1.1.1.43" name="Methane metabolism" |
The x attribute specifies the x-coordinate value of this graphics object in the manually drawn KEGG pathway map.
attribute value | explanation |
positive integer | ex) x="190" |
The y attribute specifies the y-coordinate value of this graphics object in the manually drawn KEGG pathway map.
attribute value | explanation |
positive integer | ex) y="51" |
The coords attribute specifies a set of coordinates, x1,y1,x2,y2,..., for the line object.
attribute value | explanation |
string | ex) coords="573,729,573,779" |
The type attribute specifies the shape of this object. The default value is "rectangle".
attribute value | explanation |
rectangle | the shape is a rectangle, which is used to represent a gene product and its complex (including an ortholog group). |
circle | the shape is a circle, which is used to specify any other molecule such as a chemical compound and a glycan. |
roundrectangle | the shape is a round rectangle, which is used to represent a linked pathway. |
line | the shape is a polyline, which is used to represent a reaction or a relation (and also a gene or an ortholog group). |
The width attribute specifies the width of this object. The default value is "45".
attribute value | explanation |
positive integer | ex) width="73" |
The height attribute specifies the height of this object. The default value is "17".
attribute value | explanation |
positive integer | ex) height="34" |
The fgcolor attribute specifies the foreground color of this object. It applies to the frame and the character string. The default value is "#000000".
attribute value | explanation |
numerical RGB | ex) fgcolor="#000000" |
The bgcolor attribute specifies the background color of this object. The default value is "#FFFFFF". The background color for the gene product is "#BFFFBF".
attribute value | explanation |
numerical RGB | ex) bgcolor="#BFFFBF" |
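Because every graphics attribute is IMPLIED, a reader has to supply the default values stated above. The following sketch shows one way to do so and to turn a coords string into coordinate pairs. The two sample entries are hand-written for illustration, and `read_graphics`, `GRAPHICS_DEFAULTS`, and the `points` key are hypothetical names; only the first graphics subelement of an entry is read, which is a simplification.

```python
import xml.etree.ElementTree as ET

# Defaults as stated in the attribute descriptions above.
GRAPHICS_DEFAULTS = {
    "type": "rectangle",
    "width": "45",
    "height": "17",
    "fgcolor": "#000000",
    "bgcolor": "#FFFFFF",
}

# Hand-written entries for illustration only.
BOX_ENTRY = """<entry id="1" name="ec:1.1.1.43" type="enzyme">
  <graphics name="1.1.1.43" x="190" y="51"/>
</entry>"""
LINE_ENTRY = """<entry id="2" name="ec:1.1.1.43" type="enzyme">
  <graphics type="line" coords="573,729,573,779"/>
</entry>"""


def read_graphics(entry):
    """Return the first graphics subelement's attributes with defaults filled in."""
    graphics = entry.find("graphics")
    if graphics is None:
        return None
    attrs = dict(GRAPHICS_DEFAULTS)
    attrs.update(graphics.attrib)
    if "coords" in attrs:
        # coords is x1,y1,x2,y2,...; pair the values up as (x, y) tuples.
        numbers = [int(value) for value in attrs["coords"].split(",")]
        attrs["points"] = list(zip(numbers[0::2], numbers[1::2]))
    return attrs


box = read_graphics(ET.fromstring(BOX_ENTRY))
line = read_graphics(ET.fromstring(LINE_ENTRY))
print(box["type"], box["width"], box["height"])
print(line["points"])
```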
The component element is a subelement of the entry element, and is used when the entry element is a complex node; namely, when the type attribute value of the entry element is "group". The nodes that constitute the complex are specified by repeating the component element, one per node. For example, when the complex is composed of two nodes, two component elements are specified. The attribute of this element is as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
id | idref.type | the ID of the component which is part of the complex | REQUIRED |
The id attribute specifies the identification number of this component. The entry element of "group" type is specified by a complete set of component elements.
attribute value | explanation |
positive integer | the identification number of this component |
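A reader typically resolves each component id back to the entry it refers to. The sketch below does this for every entry of type "group". The sample fragment is hand-written, the member names `ko:placeholderA` and `ko:placeholderB` are placeholders rather than real KO identifiers, and `group_members` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; the member names are placeholders, not real KOs.
SAMPLE = """<pathway name="path:ko00000" org="ko" number="00000">
  <entry id="1" name="ko:placeholderA" type="ortholog"/>
  <entry id="2" name="ko:placeholderB" type="ortholog"/>
  <entry id="3" name="group:ORC" type="group">
    <component id="1"/>
    <component id="2"/>
  </entry>
</pathway>"""


def group_members(root):
    """Map each group entry id to the names of its component entries."""
    names = {entry.get("id"): entry.get("name") for entry in root.findall("entry")}
    groups = {}
    for entry in root.findall("entry"):
        if entry.get("type") != "group":
            continue
        # A component id that matches no entry raises KeyError here.
        groups[int(entry.get("id"))] = [
            names[component.get("id")] for component in entry.findall("component")
        ]
    return groups


print(group_members(ET.fromstring(SAMPLE)))
```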
The relation element specifies a relationship between two proteins (gene products), between two KOs (ortholog groups), or between a protein and a compound, which is indicated by an arrow or a line connecting two nodes in the KEGG pathways. The relation element has a subelement named the subtype element. When the name attribute value of the subtype element has directionality, as with "activation", the direction of the interaction is from entry1 to entry2. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
entry1 | idref.type | the first (from) entry that defines this relation | REQUIRED |
entry2 | idref.type | the second (to) entry that defines this relation | REQUIRED |
type | relation-type.type | the type of this relation | REQUIRED |
The entry1 attribute specifies the id attribute value of the first entry element.
attribute value | explanation |
positive integer | the ID of node which takes part in this relation |
The entry2 attribute specifies the id attribute value of the second entry element.
attribute value | explanation |
positive integer | the ID of node which takes part in this relation |
The type attribute specifies the type of this relation. Three types, ECrel, PPrel, and GErel, are the so-called generalized protein interactions in KEGG. Two additional types are PCrel, for an interaction between a protein and a chemical compound, and maplink, for a link between a protein and a map. The maplink relation represents an interaction between a protein and another protein in the specified map.
attribute value | explanation |
ECrel | enzyme-enzyme relation, indicating two enzymes catalyzing successive reaction steps |
PPrel | protein-protein interaction, such as binding and modification |
GErel | gene expression interaction, indicating relation of transcription factor and target gene product |
PCrel | protein-compound interaction |
maplink | link to another map |
The subtype element specifies more detailed information about the nature of the interaction or the relation. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
name | subtype-name.type | Interaction/relation information | REQUIRED |
value | subtype-value.type | Interaction/relation property value | REQUIRED |
The name attribute specifies the subcategory and/or the additional information in each of the three types of the generalized protein interactions. The correspondence between the type attribute of the relation element (ECrel, PPrel or GErel) and the name and value attributes of the subtype element is shown below.
name | value | ECrel | PPrel | GErel | Explanation |
compound | Entry element id attribute value for compound. | * | * | | shared with two successive reactions (ECrel) or intermediate of two interacting proteins (PPrel) |
hidden compound | Entry element id attribute value for hidden compound. | * | | | shared with two successive reactions but not displayed in the pathway map |
activation | --> | | * | | positive and negative effects which may be associated with molecular information below |
inhibition | --| | | * | | positive and negative effects which may be associated with molecular information below |
expression | --> | | | * | interactions via DNA binding |
repression | --| | | | * | interactions via DNA binding |
indirect effect | ..> | | * | * | indirect effect without molecular details |
state change | ... | | * | | state transition |
binding/association | --- | | * | | association and dissociation |
dissociation | -+- | | * | | association and dissociation |
missing interaction | -/- | | * | * | missing interaction due to mutation, etc. |
phosphorylation | +p | | * | | molecular events |
dephosphorylation | -p | | * | | molecular events |
glycosylation | +g | | * | | molecular events |
ubiquitination | +u | | * | | molecular events |
methylation | +m | | * | | molecular events |
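The relation and subtype elements together describe the edges of the protein network. As a sketch under stated assumptions, the following function turns them into a list of directed edges from entry1 to entry2, each carrying its subtypes. The sample fragment is hand-written for illustration, `relation_edges` is a hypothetical name, and converting the value of "compound" and "hidden compound" subtypes to an integer entry id is a choice made here.

```python
import xml.etree.ElementTree as ET

# Subtypes whose value is an entry id rather than an arrow symbol.
ID_VALUED = {"compound", "hidden compound"}

# Hand-written fragment for illustration only.
SAMPLE = """<pathway name="path:ko00000" org="ko" number="00000">
  <relation entry1="1" entry2="2" type="PPrel">
    <subtype name="activation" value="--&gt;"/>
    <subtype name="phosphorylation" value="+p"/>
  </relation>
  <relation entry1="2" entry2="3" type="ECrel">
    <subtype name="compound" value="4"/>
  </relation>
</pathway>"""


def relation_edges(root):
    """Return (entry1, entry2, type, subtypes) tuples, directed entry1 -> entry2."""
    edges = []
    for relation in root.findall("relation"):
        subtypes = []
        for subtype in relation.findall("subtype"):
            name, value = subtype.get("name"), subtype.get("value")
            if name in ID_VALUED:
                value = int(value)  # refers to the id of a compound entry
            subtypes.append((name, value))
        edges.append((
            int(relation.get("entry1")),
            int(relation.get("entry2")),
            relation.get("type"),
            subtypes,
        ))
    return edges


for edge in relation_edges(ET.fromstring(SAMPLE)):
    print(edge)
```

Note that a relation may carry several subtype elements, as in the first relation of the sample, so the subtypes are kept as a list.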
The reaction element specifies a chemical reaction between a substrate and a product, indicated by an arrow connecting two circles in the KEGG pathways. The reaction element has the substrate element and the product element as subelements. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
id | idref.type | the ID of this reaction | REQUIRED |
name | keggid.type | the KEGGID of this reaction | REQUIRED |
type | reaction-type.type | the type of this reaction | REQUIRED |
The id attribute specifies the identification number of this reaction.
attribute value | explanation |
positive integer | the identification number of this reaction |
The name attribute contains the KEGG identifier of the REACTION database.
attribute value | explanation |
rn:(accession) | ex) name="rn:R02749" |
The type attribute distinguishes reversible from irreversible reactions, which are indicated by bi-directional and uni-directional arrows in the KEGG pathways. Note that the terms "reversible" and "irreversible" do not necessarily reflect the biochemical properties of each reaction. Rather, they indicate the direction of the reaction as drawn on the pathway map, which is taken from textbooks and the literature.
attribute value | explanation |
reversible | reversible reaction |
irreversible | irreversible reaction |
The substrate element specifies the substrate node of this reaction. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
id | idref.type | the ID of this substrate | REQUIRED |
name | keggid.type | KEGGID of substrate node | REQUIRED |
The id attribute specifies the identification number of this substrate.
attribute value | explanation |
positive integer | the identification number of this substrate |
The name attribute contains the KEGG identifier of the COMPOUND database or the GLYCAN database.
attribute value | explanation |
cpd:(accession) gl:(accession) | ex) cpd:C05378 gl:G00037 |
The product element specifies the product node of this reaction. The attributes of this element are as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
id | idref.type | the ID of this product | REQUIRED |
name | keggid.type | the KEGGID of product node | REQUIRED |
The id attribute specifies the identification number of this product.
attribute value | explanation |
positive integer | the identification number of this product |
The name attribute contains the KEGG identifier of the COMPOUND database or the GLYCAN database.
attribute value | explanation |
cpd:(accession) gl:(accession) | ex) cpd:C05378 gl:G00037 |
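The reaction, substrate, and product elements together describe the edges of the chemical network. The sketch below extracts substrate-to-product edges from them. The sample fragment is hand-written: its identifiers come from the examples above, but pairing them in one reaction is purely illustrative. `reaction_edges` is a hypothetical name, and emitting a second, reversed edge for "reversible" reactions is a design choice of this sketch that follows the drawn arrow direction, not a statement about biochemistry.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment; the pairing of these identifiers is illustrative only.
SAMPLE = """<pathway name="path:ko00000" org="ko" number="00000">
  <reaction id="5" name="rn:R02749" type="reversible">
    <substrate id="6" name="cpd:C05378"/>
    <product id="7" name="gl:G00037"/>
  </reaction>
</pathway>"""


def reaction_edges(root):
    """Return (from_name, to_name, reaction_name) tuples for each reaction."""
    edges = []
    for reaction in root.findall("reaction"):
        name = reaction.get("name")
        substrates = [s.get("name") for s in reaction.findall("substrate")]
        products = [p.get("name") for p in reaction.findall("product")]
        for substrate in substrates:
            for product in products:
                edges.append((substrate, product, name))
                if reaction.get("type") == "reversible":
                    # Bi-directional arrow on the map: add the reverse edge too.
                    edges.append((product, substrate, name))
    return edges


for edge in reaction_edges(ET.fromstring(SAMPLE)):
    print(edge)
```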
The alt element specifies the alternative name of its parent element. The attribute of this element is as follows.
attribute name | data type | explanation | REQUIRED/IMPLIED |
name | keggid.type | the KEGGID of node | REQUIRED |
The name attribute contains the KEGG identifier of the COMPOUND database or the GLYCAN database.
attribute value | explanation |
cpd:(accession) gl:(accession) | ex) cpd:C05378 gl:G00037 |
Each KGML file may be obtained through the KEGG API (academic users only).
Academic users with a KEGG FTP subscription can obtain the entire set of KGML files.
Non-academic users are requested to obtain a licensing agreement. Please refer to the page below.