Biologist / Data Scientist / Mathematician / Quantum Chemist
RESEARCH
Abstracts of Published Work
Smartphone-based machine learning model for real-time assessment of medical kidney biopsy
Eigbire-Molen, O., et al. (2024): https://doi.org/10.1016/j.jpi.2024.100385
Background: Kidney biopsy is the gold standard for diagnosing medical renal diseases, but the accuracy of the diagnosis greatly depends on the quality of the biopsy specimen, particularly the amount of renal cortex obtained. Inadequate biopsies, characterized by insufficient cortex or predominant medulla, can lead to inconclusive or incorrect diagnoses, and repeat biopsy. Unfortunately, there has been a concerning increase in the rate of inadequate kidney biopsies, and not all medical centers have access to trained professionals who can assess biopsy adequacy in real time. In response to this challenge, we aimed to develop a machine-learning model capable of assessing the percentage cortex of each biopsy pass using smartphone images of the kidney biopsy tissue at the time of biopsy.
Methods: 747 kidney biopsy cores and corresponding smartphone macro images were collected from 5 unused deceased donor kidneys. Each core was imaged, formalin-fixed, sectioned, and stained with Periodic acid–Schiff (PAS) to determine cortex percentage. The fresh unfixed core images were captured using the macro camera on an iPhone 13 Pro. Two experienced renal pathologists independently reviewed the PAS-stained sections to determine the cortex percentage. For the purpose of this study, the biopsies with less than 30% cortex were labeled as inadequate, while those with 30% or more cortex were classified as adequate. The dataset was divided into training (n=643), validation (n=30), and test (n=74) sets. Preprocessing steps involved converting HEIC iPhone format images to JPEG, normalization, and renal tissue segmentation using a U-Net deep learning model. Subsequently, a classification deep learning model was trained on the renal tissue region of interest and corresponding class label.
Results: The deep learning model achieved an accuracy of 85% on the training data. On the independent test dataset, the model exhibited an accuracy of 81%. For inadequate samples in the test dataset, the model showed a sensitivity of 71%, suggesting its capability to identify cases with inadequate cortical representation. The area under the receiver-operating curve (AUC-ROC) on the test dataset was 0.80.
Conclusion: We successfully developed and tested a machine-learning model for classifying smartphone images of kidney biopsies as either adequate or inadequate, based on the amount of cortex determined by expert renal pathologists. The model's promising results suggest its potential as a smartphone application to assist real-time assessment of kidney biopsy tissue, particularly in settings with limited access to trained personnel. Further refinements and validations are warranted to optimize the model's performance.
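The labeling rule and the reported metrics can be illustrated with a minimal sketch. Only the 30% cortex threshold comes from the abstract; the function names, the cortex percentages, and the predictions below are hypothetical values chosen for illustration and are not data from the study.

```python
# Minimal sketch of the adequacy labeling rule and two of the reported metrics.
# Only the 30% threshold is taken from the abstract; everything else is assumed.

CORTEX_THRESHOLD = 30.0  # percent cortex; biopsies below this are "inadequate"


def label_adequacy(cortex_percent):
    """Label a biopsy core from its pathologist-determined cortex percentage."""
    return "adequate" if cortex_percent >= CORTEX_THRESHOLD else "inadequate"


def accuracy_and_sensitivity(y_true, y_pred, positive="inadequate"):
    """Return overall accuracy and sensitivity for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    positives = [(t, p) for t, p in pairs if t == positive]
    true_positives = sum(p == positive for _, p in positives)
    accuracy = correct / len(pairs)
    sensitivity = true_positives / len(positives) if positives else float("nan")
    return accuracy, sensitivity


# Hypothetical cortex percentages and hypothetical model predictions.
cortex = [10.0, 29.9, 30.0, 55.0, 80.0, 5.0]
truth = [label_adequacy(c) for c in cortex]
predicted = ["inadequate", "adequate", "adequate",
             "adequate", "inadequate", "inadequate"]

acc, sens = accuracy_and_sensitivity(truth, predicted)
print(f"accuracy={acc:.2f}, sensitivity (inadequate)={sens:.2f}")
```

Treating "inadequate" as the positive class mirrors the abstract, which reports sensitivity for inadequate samples.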
Development and Validation of a Multi-Class Model Defining Molecular Archetypes of Kidney Transplant Rejection
Zhang, H., et al. (2023): https://doi.org/10.1016/j.labinv.2023.100304
Gene expression profiling (GEP) from formalin-fixed paraffin-embedded (FFPE) renal allograft biopsies is a promising approach for feasibly providing a molecular diagnosis of rejection. However, large-scale studies evaluating the performance of models using NanoString platform data to define molecular archetypes of rejection are lacking. We tested a diverse retrospective cohort of over 1400 FFPE biopsy specimens, rescored according to Banff 2019 criteria and representing ten of 11 UNOS regions, using the Banff Human Organ Transplant (B-HOT) panel from NanoString and developed a multi-class model from the gene expression data to assign relative probabilities of four molecular archetypes: No Rejection, Antibody-Mediated Rejection (ABMR), T cell-Mediated Rejection (TCMR), and Mixed Rejection. Using LASSO regularized regression with 10-fold cross-validation fitted to 1050 biopsies in the discovery cohort and technically validated on an additional 345 biopsies, our model achieved overall accuracy of 85% in the discovery cohort and 80% in the validation cohort, with ≥75% positive predictive value (PPV) for each class, except for the Mixed Rejection class in the validation cohort (PPV 53%). This study represents the technical validation of the first model built from a large and diverse sample of diagnostic FFPE biopsy specimens to define and classify molecular archetypes of histologically-defined diagnoses as derived from B-HOT panel GEP data.
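As a rough illustration of how overall accuracy and per-class positive predictive value (PPV) are read off a multi-class confusion matrix, consider the sketch below. The four class names come from the abstract; the confusion-matrix counts and function names are hypothetical and do not reproduce the study's results, and the LASSO model itself is not shown.

```python
# Minimal sketch: overall accuracy and per-class PPV from a confusion matrix.
# Class names are from the abstract; the counts are invented for illustration.

CLASSES = ["No Rejection", "ABMR", "TCMR", "Mixed Rejection"]


def per_class_ppv(confusion):
    """PPV per class, where confusion[i][j] counts true class i predicted as j."""
    ppv = {}
    for j, name in enumerate(CLASSES):
        predicted_total = sum(row[j] for row in confusion)
        ppv[name] = (confusion[j][j] / predicted_total
                     if predicted_total else float("nan"))
    return ppv


def overall_accuracy(confusion):
    """Fraction of all samples that fall on the diagonal."""
    total = sum(sum(row) for row in confusion)
    diagonal = sum(confusion[i][i] for i in range(len(confusion)))
    return diagonal / total


# Hypothetical counts: rows are true classes, columns are predicted classes.
toy_confusion = [
    [40, 3, 2, 0],
    [4, 30, 1, 5],
    [3, 1, 20, 4],
    [1, 2, 1, 8],
]

ppv = per_class_ppv(toy_confusion)
acc = overall_accuracy(toy_confusion)
for name in CLASSES:
    print(f"{name}: PPV={ppv[name]:.2f}")
print(f"overall accuracy={acc:.2f}")
```

In this toy matrix the rarest class has the lowest PPV because a few misassigned samples from larger classes make up a large share of its predictions, which is one plausible reading of why a small class can lag the others.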
Connectivity of combination puzzles: Using machine learning on groups
Napier, J. O. H. (2021): ProQuest Dissertations & Theses Global
This study analyzed the structure of the Pocket Cube (2 × 2) and explored machine learning designed to solve the Rubik’s Cube (3 × 3). Categorization of 2 × 2 permutations revealed 158 subtypes, some of which were “dead-end” permutations, incapable of being scrambled further despite not being at the furthest scrambled depth of 11. Markovian analyses of the 2 × 2 determined the mixing time (18.74635), mean time between depths, transition probabilities between layers, and the Monkey number (23,332,546). In building neural networks for the 3 × 3, unique methods were developed to enhance machine learning: warm-encoding for training networks with single inputs and multiple outputs, lambda-layer-mediated outputs to chain cost functions, and color rotations that identify algorithms of the same conjugacy classes. Further investigation was done on the 3 × 3 by counting the correct number of edges, corners, and stickers for generated permutations. This highlighted the non-separability with respect to depth and its potential problems for machine learning.
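The idea of depth layers and "dead-end" permutations can be sketched on a much smaller puzzle. The example below is not the Pocket Cube and not the dissertation's code: it is a hypothetical toy puzzle on four pieces with three moves, and it assumes that a dead end means a state from which no single move increases the depth even though the state is not at the maximum depth.

```python
# Minimal sketch on a hypothetical toy puzzle (not the 2 x 2 cube): breadth-first
# search assigns each permutation its depth (fewest moves from solved), then
# "dead ends" are states below maximum depth that no single move scrambles further.
from collections import Counter, deque

# Assumed move set: a 4-cycle, its inverse, and a swap of the first two pieces.
MOVES = [
    (1, 2, 3, 0),
    (3, 0, 1, 2),
    (1, 0, 2, 3),
]
IDENTITY = (0, 1, 2, 3)


def apply_move(state, move):
    """Rearrange `state` so that position i receives the piece at move[i]."""
    return tuple(state[i] for i in move)


def bfs_depths(start, moves):
    """Map every reachable state to its minimum number of moves from `start`."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for move in moves:
            nxt = apply_move(state, move)
            if nxt not in depths:
                depths[nxt] = depths[state] + 1
                queue.append(nxt)
    return depths


def dead_ends(depths, moves):
    """States below maximum depth whose neighbors are all at equal or lower depth."""
    max_depth = max(depths.values())
    return [
        state for state, d in depths.items()
        if d < max_depth
        and all(depths[apply_move(state, m)] <= d for m in moves)
    ]


depths = bfs_depths(IDENTITY, MOVES)
histogram = Counter(depths.values())  # number of states in each depth layer
dead = dead_ends(depths, MOVES)
print("states per depth:", dict(sorted(histogram.items())))
print("dead-end states:", dead)
```

The same layer counts are the starting point for a Markov-chain view of scrambling, since the fraction of moves leading up, down, or sideways from each layer gives transition probabilities between depths.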
Behavioral and Disease Ecology of Gopher Tortoises (Gopherus polyphemus) Post Exclusion and Relocation with a Novel Approach to Homing Determination
Napier, J. O. H. (2018): University of Central Florida, STARS Electronic Theses and Dissertations
In the wake of human expansion, relocations and the loss of habitat can be stressful to an organism, plausibly leading to population declines. The gopher tortoise (Gopherus polyphemus) is a keystone species that constructs burrows it shares with 362 commensal species. Frequent exclusions and relocations and long generation times have contributed to G. polyphemus being State-designated as Threatened in Florida. Prior studies have indicated that G. polyphemus may possess homing behavior and thus be able to counteract stressors due to relocation and exclusion. I radiotracked a cohort of G. polyphemus for 11 months following excavation, relocation, and exclusion due to a pipeline construction project. In conjunction with analyzing G. polyphemus movement patterns post-release, I developed novel statistical methodologies with broad application for movement analysis and compared them to traditional analyses. I evaluated habitat usage, burrowing behavior, movements, growth, and disease signs among control versus relocated and excluded individuals and among sexes and size classes, forming predictors for behavior and disease risk. I found statistical support that my new methodology is superior to previous statistical tests for movement analyses. I also found that G. polyphemus engages in homing behavior, but only in males. Behavioral differences were also found between the sexes with respect to burrowing behavior. Overall health, disease prevalence, and immune response were unaffected by relocation and exclusion, nor were they statistically correlated. Clinical signs were unreliable indicators of etiological agents and were outperformed by serological detection. I determined that the Sabal Trail pipeline as a potential stressor did not affect the movement behavior, homing, or disease/immune profile of G. polyphemus in this study.
Research Interests
Mathematical Theory
Apollonian Sphere Packing
Correlated Markov Chains
Hypersphere Geodesy
Kissing Number Problem
Lie Algebra
Riemann Hypothesis
Rubik's Cube Mathematics
Statistical Test Development
Applied Mathematics & AI
Alloy Optimization
Differential Gene Expression
Molecular Design
Opsin Class Proteins
Protein Folding
Quantum Material Design
Tissue Classification
Topological Error Correction
Machine Learning Theory
Algorithm Design, Classical
Algorithm Design, Quantum
Convolutional Neural Networks
Error Function Topology
Generative Adversarial Networks
Graph Neural Networks
Pseudo-Autoencoders
Reinforcement Learning