Booklet for the PiER
2022-05-01
Section 1 Background

FIGURE 1.1: The logo for the PiER. The above-water pillar structure in red (symbolising the infrastructure) and water waves in blue (by analogy the piano stave) collectively illustrate the web-based PiER facilities enabling ab initio and real-time genetic target prioritisation.
Motivation
The field of target discovery has been advanced by genetics-led target prioritisation approaches. Integrative prioritisation for early-stage genetic target discovery has proven cost-effective in promoting the translational use of disease genetic associations, which is increasingly recognised in reducing drug attrition rate in late-stage clinical trials.
Design
Building on the verified Pi approach (see Nature Genetics 2019), here I introduce web-based servers/facilities called PiER. The PiER is free and open to all users and there is no login requirement, allowing users to perform ab initio and real-time target prioritisation harnessing human disease genetics, functional genomics and protein interactions.
By analogy to the piano stave, the PiER consists of five horizontal lines: three lines represent the elementary facility (eV2CG, eCG2PG and eCrosstalk), each performing a specific task on its own, and the remaining two lines signify the combinatory facility (cTGene and cTCrosstalk).
eV2CG, linking variants to core genes; see Example Output
eCG2PG, networking core genes to peripheral genes; see Example Output
eCrosstalk, identifying the crosstalk between pathways; see Example Output
cTGene, prioritising targets at the gene level; see Example Output
cTCrosstalk, prioritising targets at the crosstalk level; see Example Output
Section 2 Facilities
The elementary facility supports three specific tasks through three online tools: (i) eV2CG, utilising functional genomics to link disease-associated variants (including those located in the non-coding genome) to core genes likely responsible for genetic associations; (ii) eCG2PG, using knowledge of protein interactions to ‘network’ core genes with each other and with additional peripheral genes, producing a ranked list of core and peripheral genes; and (iii) eCrosstalk, exploiting the information of pathway-derived interactions to identify highly ranked genes that mediate the crosstalk between molecular pathways. By chaining together the elementary tasks supported in the elementary facility, the combinatory facility automates genetics-led and network-based integrative prioritisation for genetic targets, both at the gene level (cTGene) and at the crosstalk level (cTCrosstalk). Notably, in addition to target crosstalk, cTCrosstalk further supports target pathway prioritisation and crosstalk-based drug repurposing analysis (that is, repositioning approved drugs from original disease indications into new ones).

FIGURE 2.1: Schematic illustration of two facilities supported in the PiER.
Section 3 Compatibility
Browser | MacOS (Big Sur) | Windows (10) | Linux (Ubuntu) |
---|---|---|---|
Safari | 14.1.2 | N/A | N/A |
Microsoft Edge | N/A | 85.0.564.67 | N/A |
Google Chrome | 96.0.4664.110 | 90.0.4430.93 | 96.0.4664.110 |
Firefox | 95.0.2 | 95.0.2 | 95.0.2 |
Section 4 Runtime
Facilities | Tools | Runtime (Server + Client) |
---|---|---|
Elementary | eV2CG | (67 + 82) seconds |
Elementary | eCG2PG | (15 + 70) seconds |
Elementary | eCrosstalk | (53 + 71) seconds |
Combinatory | cTGene | (90 + 91) seconds |
Combinatory | cTCrosstalk | (143 + 97) seconds |
Section 5 Frontpage

FIGURE 5.1: The landing frontpage (visited using Google Chrome on a MacBook Pro) of the PiER, featuring two facilities (elementary and combinatory). The elementary facility includes: (i) eV2CG, linking disease-associated variants (particularly those located in non-coding genomic regions) to core genes likely responsible for genetic associations, based on either promoter capture Hi-C (PCHi-C, that is, conformation evidence), quantitative trait locus (QTL) mapping (that is, genetic regulation of gene expression or protein abundance), or simply genomic proximity; (ii) eCG2PG, using knowledge of protein interactions to ‘network’ core genes with each other and with additional peripheral genes, producing a ranked list of core and peripheral genes; and (iii) eCrosstalk, exploiting the information of pathway-derived interactions to identify highly ranked genes that mediate the crosstalk between molecular pathways. By chaining together the elementary tasks supported in the elementary facility, the combinatory facility automates genetics-led and network-based integrative prioritisation for genetic targets: (iv) at the gene level (cTGene); and (v) at the crosstalk level (cTCrosstalk). Also included is the tutorial-like booklet (in HTML format) giving step-by-step instructions on how to use the PiER.
Section 6 Development
The PiER was developed using Mojolicious, a next-generation Perl web framework that requires nearly zero-effort maintenance for interface updates. The PiER was also built using Bootstrap, which supports a mobile-first and responsive web interface. The source code is available on GitHub.

FIGURE 6.1: Screenshots of the PiER visited using Google Chrome on an iPhone. Left: the frontpage; Right: the eV2CG interface.
Section 8 Error messages
Error messages are displayed, for example, if the input into cTCrosstalk is invalid (see the screenshot below). Notably, on the results page, a summary of input data is also returned to users for reference.

FIGURE 8.1: A screenshot of the error messages shown when the input is invalid, for example, in the cTCrosstalk interface.
Section 9 eV2CG
9.1 Interface
Input
- Step 1: a list of user-input SNPs, with the 1st column for dbSNP rsIDs and the 2nd column for significance information (p-values between 0 and 1). An error message is displayed if the input is invalid. Example input data are shared genetic variants identified from cross-disease genome-wide association studies in inflammatory disorders; see Nature Genetics 2016.
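The two-column input format can be checked before submission. The following is a minimal sketch of such a check, not the validation that the PiER itself performs: the function name `parse_snp_input`, the whitespace column separator and the `rs` + digits pattern are all assumptions made for illustration.

```python
import re

RSID_PATTERN = re.compile(r"^rs\d+$")  # assumed shape of a dbSNP rsID


def parse_snp_input(text):
    """Parse two-column SNP input; return (records, errors).

    Columns are assumed to be separated by whitespace (tab or spaces).
    """
    records, errors = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            errors.append(f"line {line_no}: expected 2 columns, got {len(fields)}")
            continue
        rsid, raw_p = fields
        if not RSID_PATTERN.match(rsid):
            errors.append(f"line {line_no}: '{rsid}' is not an rsID")
            continue
        try:
            p_value = float(raw_p)
        except ValueError:
            errors.append(f"line {line_no}: '{raw_p}' is not a number")
            continue
        if not 0.0 <= p_value <= 1.0:
            errors.append(f"line {line_no}: p-value {p_value} is outside [0, 1]")
            continue
        records.append((rsid, p_value))
    return records, errors
```

Calling `parse_snp_input` on pasted text returns the valid rows together with a list of human-readable problems, so invalid lines can be corrected before the data are submitted.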
Mechanism
- Step 2: includes SNPs in linkage disequilibrium (LD). By default, input SNPs passing the typical genome-wide significance threshold (p-value < 5e−8) are considered, and additional SNPs in linkage disequilibrium (R2 ≥ 0.8) can also be included according to the European population.
- Step 3: uses genomic proximity, quantitative trait locus (QTL) mapping, or promoter capture Hi-C (PCHi-C) to identify core genes.
- More Controls: fine-tunes parameters involved in the steps described above.
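The SNP selection in Step 2 can be sketched as follows. This is an illustration under stated assumptions rather than the PiER implementation: the function `select_snps` and the `ld_pairs` lookup table are hypothetical, and "in LD" is taken to mean an R2 value at or above the cutoff.

```python
def select_snps(input_snps, ld_pairs, p_cutoff=5e-8, r2_cutoff=0.8):
    """Return {rsID: SNP type} for significant input SNPs and their LD SNPs.

    input_snps: dict mapping rsID -> p-value
    ld_pairs:   iterable of (lead_rsid, ld_rsid, r2); a hypothetical LD
                lookup table for the population of interest
    """
    # Keep only the input SNPs passing the significance threshold.
    snp_type = {rsid: "Input" for rsid, p in input_snps.items() if p < p_cutoff}
    lead_snps = set(snp_type)
    # Add SNPs in LD with a retained input SNP (assumed: R2 >= cutoff).
    for lead, partner, r2 in ld_pairs:
        if lead in lead_snps and r2 >= r2_cutoff:
            snp_type.setdefault(partner, "LD")  # never downgrade an input SNP
    return snp_type
```

The returned labels (`Input` or `LD`) mirror the values described for the SNP type column of the evidence table.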
Output
- Example Output includes two interactive tables for core genes and the evidence used, and a Manhattan plot (illustrating scored core genes colour-coded by chromosome). A summary of input data and the runtime (computed on the server side) is also returned to users for reference.

FIGURE 9.1: The interface of eV2CG, linking disease-associated variants (particularly those located in non-coding genomic regions) to (core) genes likely responsible for associations, based on either promoter capture Hi-C (PCHi-C; conformation evidence), quantitative trait locus (QTL) mapping (that is, genetic regulation of gene expression or protein abundance), or simply genomic proximity. The Show/Hide Info toggle button contains the help information on how to use eV2CG, including input, output, mechanism, etc.
9.2 Linking results
Under the tab Output: core genes:
- Manhattan plot illustrates scored core genes that are colour-coded by chromosome. Also provided is a downloadable PDF file.
- An interactive table lists core genes linked from the input SNPs, with scores (capped at 100) quantifying the degree to which genes are responsible for genetic associations. Genes are cross-referenced and hyperlinked to GeneCards. Also provided is the column Evidence used to define core genes.
- Evidence table for core genes shows which SNPs (see the column SNPs) are used to define core genes (the column Core genes) based on which evidence (see the column Evidence). The column SNP type tells the SNP type (either Input for user-input SNPs or LD for LD SNPs). Notably, the column Evidence details the datasets used: the prefix Proximity_ indicates SNPs in the proximity, the prefix PCHiC_ indicates PCHi-C datasets, and the prefix QTL_ indicates e/pQTL datasets.
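When working with a downloaded evidence table, the prefixes in the Evidence column can be used to group rows by evidence type. The sketch below assumes only the three prefixes named above; the function name `classify_evidence` and the dataset names in the usage note are placeholders, not real PiER dataset identifiers.

```python
EVIDENCE_PREFIXES = {
    "Proximity_": "genomic proximity",
    "PCHiC_": "promoter capture Hi-C",
    "QTL_": "e/pQTL",
}


def classify_evidence(label):
    """Split an Evidence label into (evidence type, dataset name)."""
    for prefix, kind in EVIDENCE_PREFIXES.items():
        if label.startswith(prefix):
            return kind, label[len(prefix):]
    return "unknown", label  # label does not carry a recognised prefix
```

For example, a placeholder label such as `PCHiC_DatasetA` would be classified as promoter capture Hi-C evidence from the dataset `DatasetA`.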

FIGURE 9.2: Interactive results for eV2CG. Under the tab Output: core genes is a Manhattan plot illustrating scores for core genes. The user-input data under the tab Input into eV2CG are also returned for exploration.

FIGURE 9.3: Two tabular displays about core genes (top) and evidence (bottom) under the tab Output: core genes.
Section 10 eCG2PG
10.1 Interface
Input
- Step 1: a list of user-defined core genes, with the 1st column for gene symbols and the 2nd column for weights (positive values), such as results from eV2CG above. An error message is displayed if the input is invalid.
Mechanism
- Step 2: networks core genes with each other and with additional (peripheral) genes based on the knowledge of protein interactions, generating a ranked list of core and peripheral genes. This is achieved using the random walk with restart (RWR) algorithm. By default, the restarting probability is set to 0.7, empirically optimised for immune-mediated diseases; selecting a value smaller than 0.6 is not recommended as performance is more likely to be low.
- More Controls: fine-tunes parameters involved in the steps described above.
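To make the role of the restarting probability concrete, here is a minimal pure-Python sketch of a random walk with restart on an undirected network. It is a textbook formulation under assumptions of my own (unweighted edges, degree-normalised transitions, the function name `random_walk_with_restart`), not the code running behind eCG2PG.

```python
def random_walk_with_restart(edges, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Return {gene: affinity score}; the scores sum to 1.

    edges: iterable of (gene_a, gene_b) undirected interactions
    seeds: dict mapping core gene -> positive weight
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    for gene in seeds:
        neighbours.setdefault(gene, set())  # keep seeds absent from the network

    total = sum(seeds.values())
    p0 = {g: seeds.get(g, 0.0) / total for g in neighbours}  # restart vector
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to the seeds ...
        nxt = {g: restart * p0[g] for g in neighbours}
        # ... otherwise it moves to a uniformly chosen neighbour.
        for g, nbrs in neighbours.items():
            if nbrs:
                share = (1.0 - restart) * p[g] / len(nbrs)
                for n in nbrs:
                    nxt[n] += share
            else:
                nxt[g] += (1.0 - restart) * p[g]  # isolated node: stay put
        change = sum(abs(nxt[g] - p[g]) for g in neighbours)
        p = nxt
        if change < tol:
            break
    return p
```

A higher restarting probability keeps more of the score on the core (seed) genes, whereas a lower value lets the score diffuse further to peripheral genes; genes can then be ranked by the returned scores.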
Output
- Example Output includes an interactive table for core and peripheral genes, and a Manhattan plot (illustrating scores for genes colour-coded by chromosome). A summary of input data and the runtime (computed on the server side) is also returned to users for reference.

FIGURE 10.1: The interface of eCG2PG, using the knowledge of protein interactions to ‘network’ core genes with each other and with additional (peripheral) genes, generating a ranked list of core and peripheral genes. The Show/Hide Info toggle button contains the help information on how to use eCG2PG, including input, output, mechanism, etc.
10.2 Networking results
Under the tab Output: core and peripheral genes:
- Manhattan plot illustrates affinity scores for genes that are colour-coded by chromosome. Also provided is a downloadable PDF file.
- An interactive table lists core and peripheral genes, with scores quantifying the affinity to core genes (summing to 1). Genes are cross-referenced and hyperlinked to GeneCards.

FIGURE 10.2: Interactive results for eCG2PG under the tab Output: core and peripheral genes. The user-input data under the tab Input into eCG2PG are also returned for exploration.
Section 11 eCrosstalk
11.1 Interface
Input
- Step 1: a ranked list of genes, with the 1st column for gene symbols and the 2nd column for scores (positive values), such as results from eCG2PG above. An error message is displayed if the input is invalid.
Mechanism
- Step 2: identifies the subnetwork of highly ranked genes that mediate the crosstalk between molecular pathways. The significance (p-value) of observing the identified crosstalk by chance is estimated by a degree-preserving node permutation test.
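The idea behind a degree-preserving node permutation test can be illustrated with a simplified sketch. Everything specific here is an assumption: the test statistic (the summed score of a fixed set of subnetwork genes), the function name `degree_preserving_p_value` and the number of permutations are chosen for illustration, as the booklet does not specify how the PiER computes the statistic.

```python
import random


def degree_preserving_p_value(edges, scores, subnetwork, n_perm=1000, seed=0):
    """Empirical p-value for the summed score of `subnetwork`.

    Scores are shuffled only among nodes of equal degree, so each node
    keeps its degree while receiving a score from a same-degree node.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Group nodes by degree (nodes with a score but no edge have degree 0).
    by_degree = {}
    for node in sorted(set(degree) | set(scores)):
        by_degree.setdefault(degree.get(node, 0), []).append(node)

    def total(score_map):
        return sum(score_map.get(n, 0.0) for n in subnetwork)

    observed = total(scores)
    rng = random.Random(seed)  # fixed seed for reproducibility
    hits = 0
    for _ in range(n_perm):
        permuted = {}
        for group in by_degree.values():
            values = [scores.get(n, 0.0) for n in group]
            rng.shuffle(values)
            permuted.update(zip(group, values))
        if total(permuted) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Restricting the shuffle to nodes of equal degree prevents highly connected genes from appearing significant merely because of their connectivity.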
Output
- Example Output includes an interactive table for pathway crosstalk genes, and a network visualisation (illustrating the crosstalk between pathways).

FIGURE 11.1: The interface of eCrosstalk, exploiting the information of well-curated pathway-derived interactions to identify the subnetwork of highly ranked genes that mediate pathway crosstalk. The Show/Hide Info toggle button contains the help information on how to use eCrosstalk, including input, output, mechanism, etc.
11.2 Crosstalk results
Under the tab Output: pathway crosstalk:
- A network visualisation illustrates crosstalk genes colour-coded by input scores. The significance (p-value) of observing the identified crosstalk by chance is estimated by a degree-preserving node permutation test. Also provided is a downloadable PDF file.
- An interactive table lists crosstalk genes together with input scores. Genes are cross-referenced and hyperlinked to GeneCards.

FIGURE 11.2: Interactive results for eCrosstalk under the tab Output: pathway crosstalk. The user-input data under the tab Input into eCrosstalk are also returned for exploration.
Section 12 cTGene
12.1 Interface
Input
- Step 1: a list of user-input SNPs, with the 1st column for dbSNP rsIDs and the 2nd column for significance information (p-values between 0 and 1). An error message is displayed if the input is invalid. Example input data are shared genetic variants identified from cross-disease genome-wide association studies in inflammatory disorders; see Nature Genetics 2016.
Mechanism
- Step 2: includes SNPs in linkage disequilibrium (LD). By default, input SNPs passing the typical genome-wide significance threshold (p-value < 5e−8) are considered, and additional SNPs in linkage disequilibrium (R2 ≥ 0.8) can also be included according to the European population.
- Step 3: uses functional genomic datasets, including genomic proximity, quantitative trait locus (QTL) mapping and promoter capture Hi-C (PCHi-C), to identify core genes.
- Step 4: networks core genes with each other and with additional (peripheral) genes based on the knowledge of protein interactions, generating a ranked list of core and peripheral genes. This is achieved using the random walk with restart (RWR) algorithm. By default, the restarting probability is set to 0.7, empirically optimised for immune-mediated diseases; selecting a value smaller than 0.6 is not recommended as performance is more likely to be low.
- More Controls: fine-tunes parameters involved in the steps described above.
Output
- Example Output includes a Manhattan plot (illustrating priority rating for target genes colour-coded by chromosome), and two tabular displays about prioritisation and evidence. A summary of input data and the runtime (computed on the server side) is also returned to users for reference.

FIGURE 12.1: The interface of cTGene, enabling/automating genetics-led and network-based identification and prioritisation of drug targets at the gene level. The Show/Hide Info toggle button contains the help information on how to use cTGene, including input, output, mechanism, etc.
12.2 Prioritisation results
Under the tab Output: target genes:
- Manhattan plot illustrates priority rating for target genes that are colour-coded by chromosome. Also provided is a downloadable PDF file.
- Prioritisation table lists all prioritised genes, each receiving a 5-star priority rating (scored 0-5). Genes are cross-referenced and hyperlinked to GeneCards. The column Type tells the target gene type (either Core for core genes or Peripheral for peripheral genes). Also provided is a summary of the evidence used to define core genes, including the columns Proximity (evidence of genomic proximity), QTL (e/pQTL evidence) and PCHiC (conformation evidence).
- Evidence table for core genes shows which SNPs (see the column SNPs) are used to define core genes (the column Core genes) based on which evidence (see the column Evidence). The column SNP type tells the SNP type (either Input for user-input SNPs or LD for LD SNPs). Notably, the column Evidence details the datasets used: the prefix Proximity_ indicates SNPs in the proximity, the prefix PCHiC_ indicates PCHi-C datasets, and the prefix QTL_ indicates e/pQTL datasets.

FIGURE 12.2: Prioritisation results for cTGene. Under the tab Output: target genes is a Manhattan plot illustrating priority rating for target genes. The user-input data under the tab Input into cTGene are also returned for exploration.

FIGURE 12.3: Two tabular displays about target genes (top) and evidence (bottom) under the tab Output: target genes.
Section 13 cTCrosstalk
13.1 Interface
Input
- Step 1: a list of user-input SNPs, with the 1st column for dbSNP rsIDs and the 2nd column for significance information (p-values between 0 and 1). An error message is displayed if the input is invalid. Example input data are shared genetic variants identified from cross-disease genome-wide association studies in inflammatory disorders; see Nature Genetics 2016.
Mechanism
- Step 2: includes SNPs in linkage disequilibrium (LD). By default, input SNPs passing the typical genome-wide significance threshold (p-value < 5e−8) are considered, and additional SNPs in linkage disequilibrium (R2 ≥ 0.8) can also be included according to the European population.
- Step 3: uses functional genomic datasets, including genomic proximity, quantitative trait locus (QTL) mapping and promoter capture Hi-C (PCHi-C), to identify core genes.
- Step 4: networks core genes with each other and with additional (peripheral) genes based on the knowledge of protein interactions, generating a ranked list of core and peripheral genes. This is achieved using the random walk with restart (RWR) algorithm. By default, the restarting probability is set to 0.7, empirically optimised for immune-mediated diseases; selecting a value smaller than 0.6 is not recommended as performance is more likely to be low.
- Step 5: identifies the subnetwork of highly ranked genes that mediate the crosstalk between molecular pathways. The significance (p-value) of observing the identified crosstalk by chance is estimated by a degree-preserving node permutation test.
- More Controls: fine-tunes parameters involved in the steps described above.
Output
- Example Output includes target genes, target pathways, targets at the crosstalk level, and crosstalk-based drug repurposing. A summary of input data and the runtime (computed on the server side) is also returned to users for reference.

FIGURE 13.1: The interface of cTCrosstalk, enabling/automating genetics-led and network-based identification and prioritisation of drug targets at the crosstalk level. The Show/Hide Info toggle button contains the help information on how to use cTCrosstalk, including input, output, mechanism, etc.
13.2 Prioritisation results
- Output: target genes: includes a Manhattan plot illustrating priority rating for target genes that are colour-coded by chromosome. Also provided is a downloadable PDF file. It also includes a Prioritisation table listing all prioritised genes, each receiving a 5-star priority rating (scored 0-5), and an Evidence table for core genes showing which SNPs are used to define core genes based on which evidence. Genes are cross-referenced and hyperlinked to GeneCards.
- Output: target pathways: includes a dot plot and a prioritisation table for target pathways. Also provided is a downloadable PDF file.
- Output: targets at the crosstalk level: includes a network visualisation illustrating the crosstalk between pathways, with genes coloured by priority rating and labelled in the form of rating®rank, a Prioritisation table listing crosstalk genes, each receiving a 5-star priority rating (scored 0-5), and an Evidence table for pathway crosstalk genes, showing which SNPs are used to define crosstalk genes based on which evidence. Genes are cross-referenced and hyperlinked to GeneCards.
- Output: crosstalk-based drug repurposing: includes a heatmap-like illustration showing drug repurposing analysis of approved drugs (licensed medications) based on pathway crosstalk genes, with crosstalk genes on the y-axis, disease indications on the x-axis, and red dots indexed by number and referenced beneath in the table where the information on approved drugs and mechanisms of action is detailed. It also includes an interactive table of crosstalk genes (the column Crosstalk genes), disease indications (the column Disease indications), approved drugs and mechanisms (the column Approved drugs [mechanisms of action]), and the drug index (the column Index) shown in the illustration above.

FIGURE 13.2: Prioritisation results for cTCrosstalk. In addition to a summary of input data and the runtime (computed on the server side) under the tab Input into cTCrosstalk, the prioritisation results page provides the output, including target genes under the tab Output: target genes (the same as shown in cTGene), target pathways under the tab Output: target pathways, targets at the crosstalk level under the tab Output: targets at the crosstalk level, and crosstalk-based drug repurposing under the tab Output: crosstalk-based drug repurposing. The tab Output: targets at the crosstalk level includes a network visualisation of the crosstalk, with genes/nodes colour-coded by priority rating and labelled in the form of rating®rank, and two tabular displays about prioritisation and evidence for crosstalk genes.

FIGURE 13.3: A dot plot for prioritised target pathways, with the top five labelled, available under the tab Output: target pathways. Also available is a Prioritisation table for target pathways.

FIGURE 13.4: A heatmap-like illustration, with crosstalk genes on the y-axis, disease indications on the x-axis, and red dots indexed by number, under the tab Output: crosstalk-based drug repurposing. The index numbers are referenced in a table where the information on approved drugs and mechanisms of action is detailed.