Chapter 2 Experimental design (DoE)
Before you perform any metabolomics experiment, a clean and meaningful experimental design is the best start. Depending on the research purpose, experimental designs can be classified into homogeneity and heterogeneity studies. Techniques such as isotope-labeled media will not be discussed in this chapter; this paper (Jang, Chen, and Rabinowitz 2018) could be a good starting point.
2.1 Homogeneity study
In a homogeneity study, the research purpose is method validation in most cases. A pooled sample made from multiple samples, or technical replicates from the same population, is used. Variance within the samples should be attributed to factors other than the samples themselves. For example, if we want to know whether the sample injection order affects the intensities of unknown peaks, one pooled sample or technical replicates should be used.
Another experimental design for homogeneity studies uses biological replicates to find the features common to a group of samples. Biological replicates are samples from the same population undergoing the same biological process. For example, if we wanted to know the metabolite profile of a certain species, we could collect many individual samples from the population. Then only the peaks/compounds that appear in all samples are used to describe the metabolite profile of this species, as sketched below. Technical replicates can also be used together with biological replicates.
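A minimal sketch of this filter in R, assuming a hypothetical logical detection matrix `peaks` with features as rows and biological replicates as columns:

set.seed(1)
peaks <- matrix(runif(300) > 0.2, nrow = 50) # simulated detection calls
common <- rowSums(peaks) == ncol(peaks)      # TRUE if detected in every replicate
sum(common)                                  # number of features shared by all samples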
2.2 Heterogeneity study
In a heterogeneity study, the research purpose is to find the differences among samples. You need at least a baseline to perform the comparison. Such a baseline could be generated by a random process, control samples, or background knowledge. For example, outlier detection can be performed to find abnormal samples in an unsupervised manner. Distribution or spatial analysis could be used to find spatial relationships of known and unknown compounds. Temporal trends of metabolite profiles could be found by time series or cohort studies. Clinical trials or randomized controlled trials are also an important class of heterogeneity studies. In this case, you need at least two groups: a treated group and a control group. You could also treat this group information as the primary variable or variables to be explored for certain research purposes. In the following discussion of experimental design, we will use the randomized controlled trial as a model to discuss important issues.
2.3 Power analysis
Suppose we have control and treated groups; the number of samples in each group should be carefully calculated. For each metabolite, such a comparison could be treated as one t-test. You need to perform a power analysis to get the numbers. For example, say we have two groups of samples with 10 samples in each group. We set the power at 0.9, which is one minus the Type II error probability, the standard deviation at 1, and the significance level (Type I error probability) at 0.05. Then the meaningful delta between the two groups should be higher than 1.53367 under this experimental design. Alternatively, we could fix the delta to get the minimum number of samples in each group. To obtain quantities such as the standard deviation or delta for power analysis, you need to perform preliminary or pilot experiments.
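This calculation can be reproduced with power.t.test() from R's built-in stats package; solving for delta with n = 10 per group:

power.t.test(n = 10, sd = 1, sig.level = 0.05, power = 0.9)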
##
## Two-sample t test power calculation
##
## n = 10
## delta = 1.53367
## sd = 1
## sig.level = 0.05
## power = 0.9
## alternative = two.sided
##
## NOTE: n is number in *each* group
##
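Similarly, fixing delta at 5 and solving for the minimum number of samples per group:

power.t.test(delta = 5, sd = 1, sig.level = 0.05, power = 0.9)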
## Two-sample t test power calculation
##
## n = 2.328877
## delta = 5
## sd = 1
## sig.level = 0.05
## power = 0.9
## alternative = two.sided
##
## NOTE: n is number in *each* group
However, since we sometimes cannot perform a preliminary experiment, we could directly compute the power based on false discovery rate control. If the power is lower than a certain value, say 0.8, we simply exclude that peak from the significant features.
In this review (Oberg and Vitek 2009), the authors suggest estimating an average \(\alpha\) according to this equation (Benjamini and Hochberg 1995) and then calculating the sample numbers in the usual way:
\[ \alpha_{ave} \leq (1-\beta_{ave})\cdot q\frac{1}{1+(1-q)\cdot m_0/m_1} \]
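A minimal sketch of this estimate in R, assuming a target FDR \(q = 0.05\), an assumed ratio of true nulls to true alternatives \(m_0/m_1 = 9\), and an average power of 0.8:

q <- 0.05         # target false discovery rate
m0_m1 <- 9        # assumed ratio of true null to true alternative features
beta_ave <- 0.2   # average Type II error, i.e. average power of 0.8
alpha_ave <- (1 - beta_ave) * q / (1 + (1 - q) * m0_m1)
alpha_ave         # about 0.0042; use as sig.level in power.t.test()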
Another study (Blaise et al. 2016) showed a simulation-based method to estimate the sample size. They used the Benjamini-Yekutieli (BY) correction to limit the influence of correlations. Other investigations can be found here (Saccenti and Timmerman 2016; Blaise 2013). However, the nature of omics studies makes it hard for a power analysis to use one number for all metabolites, and all these methods try to find a balance that represents the most peaks with the fewest samples.
MetSizeR: a GUI tool for estimating sample sizes for metabolomics experiments (Nyamundanda et al. 2013).
MSstats: protein/peptide significance analysis (Choi et al. 2014).
enviGCMS: GC/LC-MS data analysis for environmental science (Z. Yu et al. 2017).
2.4 Optimization
One experiment can contain many factors with different levels, and only one set of parameters across those factors will show the best sensitivity or reproducibility for a certain study. To find this set of parameters, Plackett-Burman Design (PBD), Response Surface Methodology (RSM), Central Composite Design (CCD), and Taguchi methods could be used to optimize the parameters of a metabolomics study. The target could be the quality of peaks, the number of peaks, the stability of peak intensities, and/or a statistic combining those targets. You could check these papers for details (Jacyna, Kordalewska, and Markuszewski 2019; Box, Hunter, and Hunter 2005).
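As an illustration, a two-level Plackett-Burman screening design can be generated with the pb() function from the FrF2 package (assuming it is installed); the run and factor counts here are hypothetical:

library(FrF2)                      # provides pb() for Plackett-Burman designs
# screen 7 two-level factors (e.g., flow rate, column temperature) in 12 runs
design <- pb(nruns = 12, nfactors = 7)
design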
2.5 Pooled QC
Pooled QC samples are unique and very important for metabolomics studies. Every 10 or 20 samples, a pooled sample made from all samples in the study and a blank sample should be injected as quality control samples. Pooled QC samples capture the changes during instrumental analysis, and blank samples can tell where the variance comes from. Meanwhile, the start of the sequence should be used to condition the column with pooled QC injections. The injection order of the study samples should be randomized, as sketched below. Those papers (Phapale et al. 2020; Dudzik et al. 2018; Dunn et al. 2012; Broadhurst et al. 2018; Corey D. Broeckling et al. 2023; González-Domínguez et al. 2024) should be read for details.
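A minimal sketch of such a sequence in R, with hypothetical sample IDs, three conditioning QC injections, and a pooled QC after every 10 study samples:

set.seed(42)                          # reproducible randomization
samples <- paste0("S", 1:40)          # 40 hypothetical study samples
randomized <- sample(samples)         # randomize the injection order
blocks <- split(randomized, ceiling(seq_along(randomized) / 10))
sequence <- c(rep("PooledQC", 3),     # condition the column first
              unlist(lapply(blocks, function(b) c(b, "PooledQC")),
                     use.names = FALSE))
sequence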
If there are other co-factors, a linear model or randomization should be applied to eliminate their influence. You need to record the values of those co-factors for further data analysis. Common co-factors in metabolomics studies are age, gender, location, etc.
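For example, a recorded co-factor can enter a linear model as a covariate alongside the group variable; a minimal sketch with simulated data and hypothetical variable names:

set.seed(1)
df <- data.frame(intensity = rnorm(20, mean = 1000, sd = 100),
                 group = rep(c("control", "treated"), each = 10),
                 age = sample(20:60, 20, replace = TRUE))
fit <- lm(intensity ~ group + age, data = df)  # age adjusted along with group
summary(fit)$coefficients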
If you need data correction, some background or calibration samples are required. However, control samples could also be used for data correction under certain experimental designs.
Another important factor is the instrument. High-resolution mass spectrometry is always preferred. As shown in Lukas's study (Najdekr et al. 2016):
the most effective mass resolving powers for profiling analyses of metabolite rich biofluids on the Orbitrap Elite were around 60000-120000 fwhm to retrieve the highest amount of information. The region between 400-800 m/z was influenced the most by resolution.
However, the elimination of peaks with high within-group RSD% is omitted by most studies. Based on a pre-experiment, you could get a description of the RSD% distribution and set a cut-off to keep only stable peaks for further data analysis. To my knowledge, 30% is suitable considering batch effects.
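A minimal sketch of such a cut-off, assuming a hypothetical features-by-QC-injections intensity matrix `qc`:

set.seed(1)
qc <- matrix(abs(rnorm(500, mean = 1000, sd = 200)), nrow = 100)  # simulated
rsd <- apply(qc, 1, function(x) 100 * sd(x) / mean(x))  # RSD% per feature
keep <- rsd <= 30     # the 30% cut-off suggested above
sum(keep)             # number of stable features retained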
Adding certified reference materials or standard reference materials will help to evaluate the data quality of large-scale data collection or of important metabolites (Wise 2022; Wright, Beach, and McCarron 2022).
For long-term quality control, ScreenDB provides a data analysis strategy for HRMS data founded on structured query language (SQL) database archiving (Mardal et al. 2023).
AVIR provides a computational solution to automatically recognize metabolic features with computational variation in a metabolomics data set (Z. Zhang et al. 2024).