Project Overview
This project was developed at the intersection of medical imaging and applied deep learning, in collaboration with pathologists from Carnegie Mellon University, Rutgers and Tulane. The work focused on building an end-to-end diagnostic system that brings explainable, high-performance ML models into digital pathology pipelines for vascular autoimmune disorders like Giant Cell Arteritis (GCA).
✨ Built a fully automated ML pipeline achieving 92.32% ROI-level accuracy and 0.93 AUC for inflammation detection in digitized biopsy slides.
✨ Trained and deployed a ResNet-34 classifier optimized for low-latency inference (168 ms/16 ROIs) using clinical-grade image preprocessing.
✨ Published and presented results at ARVO 2022, with GradCAM visual validation accepted by board-certified ophthalmic pathologists.
FUNDED BY: NIH U54 GM104942; Oliver and Carroll Dabezies Tulane Endowed Chair
Description
This project automated the detection of Giant Cell Arteritis (GCA) from digitized temporal artery biopsy (TAB) slides, spanning every stage from raw data handling to clinical deployment readiness and publication. I collaborated directly with clinicians, developed the ML pipeline, validated it with expert feedback, and co-authored the peer-reviewed publication.
1. Data Collection & Expert Labeling
- Curated a dataset of 472 high-resolution TAB slides spanning 20 years (2000–2019).
- Participated in quality control and slide review, excluding 192 suboptimal samples.
- Collaborated with ophthalmic pathologists to annotate slides with binary GCA labels, forming the ground truth.
2. Image Preprocessing
- Developed a CV pipeline to automatically detect artery regions from stain intensity.
- Extracted and padded 3,558 ROIs to standardized 512×512 px RGB tiles.
- Applied color jittering and rotation (0°, 90°, 180°, 270°) for augmentation and rotational invariance.
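The preprocessing steps above can be sketched with NumPy. This is a minimal illustration, not the project's actual code: the function names, the `white_threshold` value, the white padding colour, and the centred placement are all assumptions, since the original stain-intensity criterion and padding scheme are not described here.

```python
import numpy as np

TILE = 512  # tile size stated in the text


def tissue_bbox(rgb, white_threshold=220):
    """Return (top, bottom, left, right) of pixels darker than the background.

    white_threshold is a hypothetical value; the pipeline's real
    stain-intensity rule is not specified in this document.
    """
    stained = rgb.mean(axis=2) < white_threshold
    rows = np.flatnonzero(stained.any(axis=1))
    cols = np.flatnonzero(stained.any(axis=0))
    if rows.size == 0:
        return None  # no stained tissue found
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def pad_to_tile(roi, size=TILE, fill=255):
    """Centre an ROI on a size x size canvas filled with `fill` (assumed white)."""
    h, w, _ = roi.shape
    if h > size or w > size:
        raise ValueError("ROI larger than tile; resize or split it first")
    top, left = (size - h) // 2, (size - w) // 2
    pad = ((top, size - h - top), (left, size - w - left), (0, 0))
    return np.pad(roi, pad, constant_values=fill)


def four_rotations(tile):
    """Return the tile rotated by 0, 90, 180 and 270 degrees."""
    return [np.rot90(tile, k) for k in range(4)]
```

For example, `four_rotations(pad_to_tile(slide[top:bottom, left:right]))` would yield four 512×512 RGB variants of one detected region. Color jittering is left out of this sketch.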
3. Model Development
- Fine-tuned an ImageNet-pretrained ResNet-34 model for binary GCA classification.
- Designed a dual-level prediction strategy: ROI-level and slide-level classification.
- Balanced model complexity and latency to enable fast inference (~168 ms for 16 ROIs) on an Apple M1 GPU.
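One way to picture the dual-level strategy is shown below: per-ROI probabilities are thresholded, then rolled up into a single slide label. The aggregation rule here (a slide is positive when a minimum fraction of its ROIs are positive) is an assumption for illustration only, as are the names `predict_slide`, `roi_threshold`, and `min_positive_fraction`. The rule actually used in the project is not stated in this document.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SlidePrediction:
    roi_labels: List[int]      # one 0/1 label per ROI
    positive_fraction: float   # share of ROIs labelled positive
    slide_label: int           # 0/1 label for the whole slide


def predict_slide(
    roi_probs: Sequence[float],
    roi_threshold: float = 0.5,           # hypothetical cut-off
    min_positive_fraction: float = 0.25,  # hypothetical cut-off
) -> SlidePrediction:
    """Turn ROI-level probabilities into ROI-level and slide-level labels."""
    if len(roi_probs) == 0:
        raise ValueError("a slide needs at least one ROI")
    roi_labels = [int(p >= roi_threshold) for p in roi_probs]
    positive_fraction = sum(roi_labels) / len(roi_labels)
    slide_label = int(positive_fraction >= min_positive_fraction)
    return SlidePrediction(roi_labels, positive_fraction, slide_label)
```

Keeping both levels in the result lets a reviewer see which ROIs drove the slide-level call.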
4. Validation & Explainability
- Integrated GradCAM to generate region-wise visual explanations for clinical review.
- Validated that heatmaps aligned with features like lymphocytic infiltrates and giant cells.
- Worked with pathologists to verify that model attention matched diagnostic hotspots in expert-labeled slides.
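For readers unfamiliar with Grad-CAM, the core computation behind such heatmaps can be sketched in a few lines of NumPy. This is the textbook formulation, not the `pytorch-grad-cam` API and not the project's code: it assumes the last convolutional layer's activations and the gradients of the GCA score with respect to them have already been extracted as arrays of shape `(K, H, W)`.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a normalised Grad-CAM map from (K, H, W) activations and gradients.

    Each channel weight is the spatial mean of that channel's gradients; the
    map is the ReLU of the weighted sum of activation channels.
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")
    weights = gradients.mean(axis=(1, 2))              # one weight per channel
    cam = np.tensordot(weights, activations, axes=1)   # weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                         # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam             # scale to [0, 1]
```

The resulting low-resolution map is typically upsampled to the tile size and overlaid on the ROI for review.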
5. Model Optimization
- Benchmarked ResNet-18, ResNet-34, and ResNet-50, selecting ResNet-34 for its 92.32% ROI-level accuracy and 0.93 AUC on the held-out 2019 test set.
- Tuned augmentation, data balancing, and inference pipeline for clinical deployment feasibility.
- Evaluated the model with a time-based split, holding out the 2019 slides as the test set, to support generalization to later cases.
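The time-split evaluation and the two reported metrics can be illustrated as follows. This is a sketch under assumptions: the record layout (dicts with a `"year"` key), the helper names, and the rule that all years before the test year form the training pool are hypothetical, and the AUC here is the plain pairwise (Mann-Whitney) formulation rather than whatever library call the project used.

```python
def time_split(records, test_year=2019):
    """Split records into earlier-year training data and a held-out test year."""
    train = [r for r in records if r["year"] < test_year]
    test = [r for r in records if r["year"] == test_year]
    return train, test


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    hits = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return hits / len(labels)


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. Quadratic in sample count, which is fine
    for a sketch but slow for large test sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Splitting by year rather than at random keeps later slides entirely out of training, which is what makes the test set a check on generalization over time.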
6. Publication & Reporting
- Co-authored and published results in Investigative Ophthalmology & Visual Science (ARVO 2022).
- Designed all pipeline diagrams, GradCAM visualizations, and tables for the final report and poster.
- Created the project poster for dissemination at conferences and institutional reviews.
Tools & Frameworks
Research Focus | Stack / Tools Used |
---|---|
Deep Learning Pipelines | PyTorch, Torchvision, ImageNet, Transfer Learning, scikit-learn |
Medical Image Preprocessing | OpenSlide, PIL, OpenCV, skimage, IBM H&E Normalization, ROI Extraction, optparse |
Explainability & Debugging | GradCAM, pytorch-grad-cam, Matplotlib, Seaborn |
High-Performance Inference | Batch Processing, Multi-threading, GPU Training, WSI Multiprocessing |
Deployment Readiness | Pipeline Automation (CMD), PDF Reporting, Clinical Evaluation Integration |