pHealth in other fields

Much of the points below came from a review paper by MacEachern and Forkert (2020)

Examples of pHealth in practice

One of the first fields to use a precision medicine approach to treat human disease was transfusion medicine, where the discovery of blood types in the early 1900s revolutionized blood transfusions, allowing for matching of donor and recipient blood types, and avoiding complications associated with mismatched donor and recipient blood Giangrande et al. 2000
pHealth has enabled gene therapy for infants with spinal muscular atrophy (SMA) type I

Genetics is the study of genes and their roles in inheritance
Genomics is the study of a person’s genome and the interactions within the genome and the greater environment
Other fields now incorporate their data in pHealth:
- epigenetics, proteomics (protein), metabolomics (metabolics), radiomics (radiology), pharmacomics (pharmacology), microbiomics (microbiome)
Multi-omics: inclusion of data from multiple domains

Machine learning is a branch of artificial intelligence in which computer-based models are created to learn and identify patterns in high dimensional datasets to create prediction and classification models based on training data
ML is typically divided into supervised, unsupervised and reinforcement learning (RL)
In pHealth, supervised and unsupervised approaches are more common than RL, since the latter requires direct feedback

Unsupervised learning aims to uncover patterns in unlabelled data - in doing so, it can automatically identify clusters of similar cases within a dataset (e.g., variational autoencoders, k-means, pca)
Supervised learning aims to identify patterns in multi-dimensional data based on labelled data (e.g., healthy vs disease). A training dataset with ground truth labels is typically used to build a model and optimize performance of the ML model for the desired outcome. The uncovered (learned) patterns can then be used to classify new datasets or make data-driven, patient-individual predictions. Some popular supervised learning approaches: SVM, random forests, linear models, deep neural nets.

basic make-up: input layer, multiple hidden layers, output layer
number of “neurons” in the input layer is equal to the number of input features while the number of “neurons” in the output layer is equal to the number of classes in case of a classification problem (e.g., patients with/without a disease)
Artificial neural networks with many hidden layers are typically referred to as deep neural networks.
Convolutional neural networks apply convolutional filters in the first layers of the deep neural network that can automatically identify important temporal or spatial features in a dataset such as three-dimensional imaging data. Convolutional filters are optimized in the same fashion as network weights are optimized.

advantages:
- no strict assumption about data distribution (models can combine multi-omics data)
- regularization is widely used in ML models, therefore, these models can usually handle noisy data and large variances
- complex ML models have the ability to identify multi-faceted, non-linear patterns in training data, which might not be observable to simple linear models
limitations:
- the ability to identify non-linear patterns comes with reduced interpretation capabilities, compared to simple linear models (i.e. black box criticism)
- in overfitting training data, models often fail to generalize, this happens more often with complex models. deep neural networks with thousands (millions) of features often fail to generalize

GWAS studies (genome-wide association studies): genomic information is examined to identify variants associated with a trait or phenotype. some work has shown that genotype-based disease risk assessment may be possible for diseases for which single nucleotide polymorphism (SNP) arrays carry a large risk proportion
ML models can also be applied to multi-modal data acquired through electronic medical records (EMRs) to identify patients with conditions that may benefit from early treatment or participation in randomized control trials or novel interventions.