Chapter 8 Omics analysis

Once you have the filtered ions, the next step is to annotate them. These annotations are the basis for omics studies. Omics analysis tries to combine information from other ‘omics’ layers to answer one specific question. With the annotations in hand, omics analysis can be performed by uploading the data obtained from xcms to other tools or databases.

You will get an updated database list here.

Right now, it is hard to connect different omics databases, such as those for genes, proteins and metabolites, into a complete picture of a given biological process. However, you might select a few metabolites across those databases and find something interesting.

8.1 From Bottom-up to Top-down

Bottom-up analysis builds one model for each metabolite. In this case, we can find out which metabolites are affected by the experimental design. However, because one test is run per metabolite, take care of the multiple comparison issue.

\[ metabolite = f(control/treatment, co-variables) \]

Top-down analysis builds one model for the outcome. In this case, we can evaluate the contribution of each metabolite. Because there are usually many more metabolites than samples, you need variable selection to make a better model.

\[ control/treatment = f(metabolite 1, metabolite 2, ..., metabolite N, co-variables) \]
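
One common way to combine the top-down model with variable selection is an L1-penalized (lasso) logistic regression, which shrinks the coefficients of uninformative metabolites to exactly zero. The Python sketch below fits such a model with proximal gradient descent. The lasso is an assumption made here, since the text does not name a selection method, and the scaled intensities, penalty value and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, penalty=0.1, step=0.5, n_iter=2000):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, p = len(X), len(X[0])
    weights, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residual = predicted probability of "treatment" minus observed label.
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - label
            for row, label in zip(X, y)
        ]
        grad_b = sum(residuals) / n
        grads = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        intercept -= step * grad_b  # the intercept is not penalized
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, grads)
        ]
    return weights, intercept


# Hypothetical scaled intensities: rows are samples, columns are metabolites 1-3.
# Only metabolite 1 differs between control (0) and treatment (1).
X = [
    [-1.2, 1.0, 1.0], [-0.8, 1.0, -1.0], [-1.1, -1.0, -1.0], [-0.9, -1.0, 1.0],
    [0.9, 1.0, 1.0], [1.1, 1.0, -1.0], [0.8, -1.0, -1.0], [1.2, -1.0, 1.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, intercept = fit_l1_logistic(X, y)
selected = [j + 1 for j, w in enumerate(weights) if w != 0.0]
print("coefficients:", [round(w, 3) for w in weights])
print("selected metabolites:", selected)
```

In practice the penalty would be tuned by cross-validation, and co-variables can be added as extra columns.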

For omics studies, you might need to integrate datasets from different sources.

\[ control/treatment = f(metabolites, proteins, genes, miRNA, co-variables) \]
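
Before such a model can be fitted, the layers have to be joined on the samples they share. The following Python sketch shows one simple way to do that, under the assumption that every layer is stored as a table keyed by sample ID; the sample IDs, feature names and the `integrate_layers` helper are hypothetical.

```python
def integrate_layers(layers):
    """Join omics layers on shared sample IDs into one feature table.

    `layers` maps a layer name to {sample_id: {feature_name: value}}.
    Samples missing from any layer are dropped, and feature names are
    prefixed with the layer name so they stay unique after merging.
    """
    shared = set.intersection(*(set(table) for table in layers.values()))
    merged = {}
    for sample in sorted(shared):
        row = {}
        for layer_name, table in layers.items():
            for feature, value in table[sample].items():
                row[f"{layer_name}:{feature}"] = value
        merged[sample] = row
    return merged


# Hypothetical measurements from two layers; sample "s3" lacks metabolite data.
layers = {
    "metabolite": {"s1": {"M1": 10.2, "M2": 3.1}, "s2": {"M1": 8.7, "M2": 4.0}},
    "protein": {"s1": {"P1": 0.5}, "s2": {"P1": 0.9}, "s3": {"P1": 0.7}},
}

merged = integrate_layers(layers)
for sample, row in merged.items():
    print(sample, row)
```

Each layer usually also needs its own scaling before modeling, since intensities from different platforms are not on a common scale.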

8.2 Pathway analysis

Pathway analysis maps annotated data onto known pathways and uses statistical analysis to find the influenced pathways or the compounds with high influence on a certain pathway.
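
A widely used statistic for this step is over-representation analysis with a hypergeometric test: given the annotated background compounds and the subset that changed, it asks whether a pathway contains more changed compounds than expected by chance. The Python sketch below illustrates the calculation. The compound IDs, pathway memberships and function name are hypothetical, and real tools may use other statistics.

```python
from math import comb


def enrichment_pvalue(n_background, n_in_pathway, n_hits, n_hits_in_pathway):
    """Upper-tail hypergeometric p-value for pathway over-representation.

    Probability of seeing at least `n_hits_in_pathway` pathway members when
    `n_hits` compounds are drawn at random from the background.
    """
    upper = min(n_in_pathway, n_hits)
    tail = sum(
        comb(n_in_pathway, i) * comb(n_background - n_in_pathway, n_hits - i)
        for i in range(n_hits_in_pathway, upper + 1)
    )
    return tail / comb(n_background, n_hits)


# Hypothetical annotated compounds and pathway memberships.
background = {f"C{i:02d}" for i in range(1, 11)}
pathways = {
    "pathway A": {"C01", "C02", "C03", "C04"},
    "pathway B": {"C05", "C06", "C07", "C08", "C09"},
}
hits = {"C01", "C02", "C07"}  # compounds that changed between groups

results = {}
for name, members in pathways.items():
    members = members & background  # keep only compounds that were measured
    results[name] = enrichment_pvalue(
        len(background), len(members), len(hits), len(hits & members)
    )
    print(f"{name}: p = {results[name]:.4f}")
```

When many pathways are tested, the resulting p-values need a multiple-comparison adjustment as well.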

8.2.1 Pathway Database

  • SMPDB (The Small Molecule Pathway Database) is an interactive, visual database containing more than 618 small molecule pathways found in humans. More than 70% of these pathways (>433) are not found in any other pathway database. The pathways include metabolic, drug, and disease pathways.

  • KEGG (Kyoto Encyclopedia of Genes and Genomes) is one of the most complete and widely used databases containing metabolic pathways (495 reference pathways) from a wide variety of organisms (>4,700). These pathways are hyperlinked to metabolite and protein/enzyme information. Currently KEGG has >17,000 compounds (from animals, plants and bacteria), 10,000 drugs (including different salt forms and drug carriers) and nearly 11,000 glycan structures.

  • BioCyc is a collection of 14558 Pathway/Genome Databases (PGDBs), plus software tools for exploring them.

  • Reactome is an open-source, open access, manually curated and peer-reviewed pathway database. Its goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education.

  • WikiPathways is a database of biological pathways maintained by and for the scientific community.

8.2.2 Pathway software

  • Pathway Commons provides online tools for pathway analysis.

  • RaMP supports pathway analysis for batch searches.

  • metabox supports pathway analysis.

  • IMPaLA is used for pathway enrichment analysis.

  • Metscape provides network visualization based on the Debiased Sparse Partial Correlation (DSPC) algorithm (Basu et al. 2017).

8.3 Network analysis

Mummichog can perform pathway and network analysis without prior annotation.

MSS is a sequential feature screening procedure that selects important sub-networks and identifies the optimal matching for metabolomics data (Q. Cai et al. 2017).

Metapone is a joint pathway testing package for untargeted metabolomics data (L. Tian et al. 2022).

8.4 Omics integration

  • BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches.

  • The Omics Discovery Index (OmicsDI) provides a knowledge discovery framework across heterogeneous omics data (genomics, proteomics, transcriptomics and metabolomics).

  • Omics Data Integration Project

  • For standardized multi-omics of Earth’s microbiomes, see this GNPS-based work (Shaffer et al. 2022).

  • Windows Scanning Multiomics: Integrated Metabolomics and Proteomics (Shi et al. 2023)

References

Basu, Sumanta, William Duren, Charles R. Evans, Charles F. Burant, George Michailidis, and Alla Karnovsky. 2017. “Sparse Network Modeling and Metscape-Based Visualization Methods for the Analysis of Large-Scale Metabolomics Data.” Bioinformatics 33 (10): 1545–53. https://doi.org/10.1093/bioinformatics/btx012.
Cai, Qingpo, Jessica A. Alvarez, Jian Kang, and Tianwei Yu. 2017. “Network Marker Selection for Untargeted LCMS Metabolomics Data.” Journal of Proteome Research 16 (3): 1261–69. https://doi.org/10.1021/acs.jproteome.6b00861.
Shaffer, Justin P., Louis-Félix Nothias, Luke R. Thompson, Jon G. Sanders, Rodolfo A. Salido, Sneha P. Couvillion, Asker D. Brejnrod, et al. 2022. “Standardized Multi-Omics of Earth’s Microbiomes Reveals Microbial and Metabolite Diversity.” Nature Microbiology 7 (12): 2128–50. https://doi.org/10.1038/s41564-022-01266-x.
Shi, Jiachen, Jialiang Zhao, Yu Zhang, Yanan Wang, Chin Ping Tan, Yong-Jiang Xu, and Yuanfa Liu. 2023. “Windows Scanning Multiomics: Integrated Metabolomics and Proteomics.” Analytical Chemistry, December. https://doi.org/10.1021/acs.analchem.3c03785.
Tian, Leqi, Zhenjiang Li, Guoxuan Ma, Xiaoyue Zhang, Ziyin Tang, Siheng Wang, Jian Kang, Donghai Liang, and Tianwei Yu. 2022. “Metapone: A Bioconductor Package for Joint Pathway Testing for Untargeted Metabolomics Data.” Bioinformatics 38 (14): 3662–64. https://doi.org/10.1093/bioinformatics/btac364.