GO Enrichment Analysis


Introduction to GO Enrichment Analysis

The Gene Ontology (GO) database is an international standard classification system for gene function. It aims to establish a controlled vocabulary that applies across species, defines and describes the functions of genes and proteins, and is updated as research continues. GO is divided into three parts: Molecular Function, Biological Process, and Cellular Component. GO enrichment analysis classifies a gene set, such as differentially expressed genes, by GO term, then tests the classification results for significance based on a discrete distribution and estimates the associated error rate. The result is a set of gene function categories that are significantly associated with the experimental condition. These categories point to the functional differences most likely to underlie the differences in sample traits.
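The significance test mentioned above can be sketched as follows. The text does not name the discrete distribution it uses. This sketch assumes the hypergeometric distribution, which is a common choice for over-representation analysis. The function name and the parameters N, K, n, and k are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """One-sided enrichment p-value P(X >= k), assuming a hypergeometric model.

    N: number of annotated background genes (assumed parameter name)
    K: number of background genes annotated to the GO term
    n: number of differentially expressed genes (DEGs)
    k: number of DEGs annotated to the GO term
    """
    upper = min(K, n)          # the overlap cannot exceed either set size
    total = comb(N, n)         # ways to draw n genes from the background
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers, not taken from any real dataset.
p = hypergeom_enrichment_p(N=10, K=4, n=3, k=2)
print(f"enrichment p-value: {p:.4f}")
```

A small p-value suggests that the GO term is annotated to more DEGs than random sampling from the background would be expected to produce. In practice, one p-value is computed per GO term and the set is then corrected for multiple testing.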

Applications of GO Enrichment Analysis in Biology

In biological research, finding differentially expressed genes and exploring their possible functions is a main purpose of many omics sequencing experiments (such as RNA sequencing). These differentially expressed genes are likely to be related to functional changes. For example, when the tissue expression profiles of diseased and normal individuals are compared, genes with significantly changed expression are often involved in disease- or immune-related biological processes, signaling pathways, and so on, and the imbalance in expression level may be closely linked to the occurrence and development of the disease. GO annotation links the functions of differentially expressed genes to phenotypes, so relevant target genes can be found quickly.

An Example of GO Enrichment Analysis

Figure 1. Gene Ontology (GO) enrichment analysis of the top 10 differentially expressed genes (DEGs) by p-value. (Wang Y, et al. 2019)

  • The horizontal axis represents the number of enriched DEGs.
  • The vertical axis represents the biological description: BP stands for biological process (orange); MF stands for molecular function (blue); CC stands for cellular component (green).
  • Black trend line: -log10(p-adjust)/2, where p-adjust is the adjusted (corrected) p-value.
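The trend-line value in the legend above can be reproduced with a few lines of code. The figure legend does not state which correction method produced p-adjust. This sketch assumes the Benjamini-Hochberg procedure, and both function names are hypothetical.

```python
from math import log10


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def trend_line_value(p_adjust):
    """Height of the trend line for one term: -log10(p-adjust) / 2."""
    return -log10(p_adjust) / 2


# Toy p-values, not taken from the cited study.
raw = [0.01, 0.04, 0.03, 0.005]
for p_raw, p_adj in zip(raw, benjamini_hochberg(raw)):
    print(f"p={p_raw:<6} p-adjust={p_adj:.3f} trend={trend_line_value(p_adj):.3f}")
```

Because the transform is a negative logarithm, smaller adjusted p-values give higher points on the trend line. The division by 2 only rescales the values, presumably so the line fits on the same axis as the gene counts.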

  1. Wang Y, et al. Analysis of key genes and their functions in placental tissue of patients with gestational diabetes mellitus. Reproductive Biology and Endocrinology, 2019, 17(1).

