What does HGCA do?
Human Gene Correlation Analysis (HGCA) is used for the identification of transcriptionally correlated (coexpressed) genes in Homo sapiens.
How do I cite HGCA?
- Michalopoulos, I., Pavlopoulos, G.A., Malatras, A., Karelas, A., Kostadima, M.A., Schneider, R., and Kossida, S. (2012). Human gene correlation analysis (HGCA): A tool for the identification of transcriptionally coexpressed genes. BMC Res Notes 5, 265.
Further reading on coexpression analysis:
- Zogopoulos, V.L., Saxami, G., Malatras, A., Papadopoulos, K., Tsotra, I., Iconomidou, V.A., and Michalopoulos, I. (2022). Approaches in Gene Coexpression Analysis in Eukaryotes. Biology 11, 1019.
What data are stored and where do they come from?
HGCA database mainly contains:
What is the input and output of HGCA?
The ENSG Gene Name or the Gene Symbol of a gene of interest
can be used as input. The output shows the genes most closely coexpressed
with the driver gene as a coexpression subtree, together with their ENSG Gene Names,
Gene Symbols and descriptions. A biological term category can be picked
from the drop-down menu to perform an Over-representation analysis.
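As a rough illustration of the two accepted input types, the sketch below tells an Ensembl gene identifier apart from a gene symbol before a query is submitted. Human Ensembl gene IDs consist of the prefix ENSG followed by 11 digits, optionally with a version suffix. The function name `classify_query` is hypothetical and is not part of HGCA; how HGCA itself validates input is not described here.

```python
import re

# Human Ensembl gene IDs: "ENSG" + 11 digits, optionally followed by ".version".
ENSG_PATTERN = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def classify_query(query: str) -> str:
    """Return 'ensembl_id' or 'gene_symbol' for a query string.

    Hypothetical helper: anything that does not look like an ENSG
    identifier is assumed to be a gene symbol.
    """
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("empty query")
    if ENSG_PATTERN.match(cleaned.upper()):
        return "ensembl_id"
    return "gene_symbol"


if __name__ == "__main__":
    for example in ("ENSG00000141510", "TP53"):
        print(example, classify_query(example))
```

This check only inspects the shape of the string; it does not confirm that the gene exists in the HGCA database.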
What is the Over-representation analysis?
When one of the available Enrichment Analyses is selected from the drop-down menu,
HGCA performs an over-representation analysis for each term in
that category that describes the list of coexpressed genes. The
statistical significance (p-value) of the over-representation of each term
is based on the hypergeometric distribution. P-values are adjusted
using the Benjamini–Hochberg procedure. The Enrichment
Summary only outputs terms whose over-representation
p-value is below the 0.05 cut-off.
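The calculation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not HGCA's own code: the function names, the choice of background gene set, and the decision to apply the 0.05 cut-off to the adjusted p-values are all assumptions made for the example.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the coexpression list, k: list genes annotated with the term.
    """
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(gene_list, term_to_genes, background, cutoff=0.05):
    """Return (term, p, adjusted p) for terms below the cut-off.

    Assumption: the cut-off is applied to the adjusted p-value.
    """
    background = set(background)
    genes = set(gene_list) & background
    terms, pvals = [], []
    for term, members in term_to_genes.items():
        members = set(members) & background
        k = len(genes & members)
        terms.append(term)
        pvals.append(hypergeom_pvalue(k, len(genes), len(members), len(background)))
    adjusted = benjamini_hochberg(pvals)
    rows = [(t, p, q) for t, p, q in zip(terms, pvals, adjusted) if q < cutoff]
    return sorted(rows, key=lambda row: row[2])
```

For example, with a background of 20 genes, a term annotating 5 of them, and a coexpression list consisting of exactly those 5 genes, the unadjusted p-value is 1/C(20,5) = 1/15504.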
What are the Ensembl Gene Annotation, Gene Ontology: Biological Process, Gene Ontology: Cellular Component, Gene Ontology: Molecular Function, KEGG Pathway, WikiPathways, ENCODE, OMIM, DisGeNET, Pfam and Chromosome Band lists useful for?
The following sorts of analyses can be performed:
- Ensembl Gene Annotation: This is the default output of the tool. A
description of each gene is shown. There is no Over-representation
analysis at this stage.
- Biological Process: Biological Process is one of the Gene Ontology
aspects. The Biological Process GO Terms for each gene are shown.
Over-representation analysis shows the most over-represented GO Terms of
the collection of GO Terms. This analysis is useful when a repeated term
can imply the biological process the gene is likely to participate in.
- Molecular Function: Molecular Function is one of the Gene Ontology
aspects. The Molecular Function GO Terms for each gene are shown.
Over-representation analysis shows the most over-represented GO Terms of
the collection of GO Terms. This analysis is useful when a repeated term
can imply the molecular function the gene is likely to have.
- Cellular Component: Cellular Component is one of the Gene Ontology
aspects. The Cellular Component GO Terms for each gene are shown.
Over-representation analysis shows the most over-represented GO Terms of
the collection of GO Terms. This analysis is useful when a repeated term
can imply the cellular component the gene is likely to be part of.
- KEGG Pathway: The KEGG Pathway terms and their descriptions for each
gene are shown. Over-representation analysis shows the most
over-represented KEGG terms. This analysis is useful when a repeated
term can imply the pathway the gene is likely to participate in.
- WikiPathways: The WikiPathways terms and their descriptions for each
gene are shown. Over-representation analysis shows the most
over-represented WikiPathways terms. This analysis is useful when a
repeated term can imply the pathway the gene is likely to participate
in.
- ENCODE: The transcription factors and their target genes are shown. Over-representation analysis shows the most
over-represented transcription factors. This analysis is useful when a repeated transcription factor can imply which transcription factors drive the coexpression.
- OMIM: The OMIM genetic disease IDs and their descriptions are shown. This analysis is useful to discover whether coexpressed genes play a vital role in the same genetic disease.
- DisGeNET: The DisGeNET disease IDs and their descriptions are shown. Over-representation analysis shows the most over-represented diseases in which the coexpressed genes in the list are involved.
- Pfam: The Pfam terms and their descriptions for each gene are shown.
Over-representation analysis shows the most over-represented Pfam terms
of the collection of Pfam terms. This analysis is useful when a repeated
term can imply the protein family the gene is likely to belong to.
- Chromosome Band: The Chromosome Bands of each gene are shown. Each band links to the corresponding genomic location in the Ensembl genome browser.
How can I navigate through the lists?
The user can change the enrichment analysis by selecting
another category. Alternatively, the user can select a different driver gene
by clicking on a different ENSG ID.
The user can also visit external sources that are related to the terms shown in the analysis.
What is the gene list useful for?
The user can download the gene list of the current tree, which can be used for further analyses on external websites. Automatic redirections to several websites, such as STRING and g:Profiler, are already provided.
How can I navigate through the trees?
In addition to the list navigation, the user can choose to
see more or fewer nodes of the subtree. The Newick-formatted subtree can
also be downloaded. The tree can also be viewed externally in the iTOL tree viewer.
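A downloaded Newick subtree can be processed outside HGCA. The sketch below pulls the leaf labels out of a Newick string using only the standard library. It assumes simple, unquoted labels and uses a made-up example tree with ENSG identifiers as leaf names; the labels in an actual HGCA download may differ, and a dedicated phylogenetics library would be more robust for quoted or otherwise complex labels.

```python
import re

# A leaf label directly follows "(" or ","; labels after ")" belong to
# internal nodes and are therefore skipped.
LEAF_PATTERN = re.compile(r"[(,]\s*([^(),:;]+)")


def leaf_names(newick: str) -> list:
    """Return leaf labels of a Newick string in order of appearance.

    Assumes unquoted labels without parentheses, commas, colons or semicolons.
    """
    return [name.strip() for name in LEAF_PATTERN.findall(newick)]


if __name__ == "__main__":
    # Hypothetical example tree, not taken from HGCA.
    tree = "((ENSG00000141510:0.1,ENSG00000012048:0.2):0.3,ENSG00000139618:0.4);"
    print(leaf_names(tree))
```

The resulting list can be pasted into external tools in the same way as the downloadable gene list.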
What are your contact details?
Contact Dr Ioannis Michalopoulos

The “ELIXIR-GR: Managing and Analysing Life Sciences Data (MIS: 5002780)” Project is co-financed by Greece and the European Union - European Regional Development Fund