Suraj Pai

Research Scientist & Machine Learning Engineer

Harvard Medical School · Mass General Cancer Center

Boston, MA  ·  331-806-7744  ·  surajpai.tech

I’m a researcher and ML engineer working at the intersection of AI and medicine. I recently defended my PhD in Biomedical Engineering and spend most of my time building foundation models for medical imaging and wrangling LLMs into clinical workflows. I am currently a Research Fellow at Harvard Medical School, where I lead projects on multimodal AI for radiation oncology. I care a lot about open science, and a few of my tools have found their way into other labs’ pipelines, which is always a nice feeling. 20+ papers, 700+ citations, h-index 10.

 

Experience

Brigham and Women's Hospital, Harvard Medical School  ·  Research Fellow
Sep 2021 – Present  ·  Boston, MA
  • Architecting a code-first agentic multimodal AI framework integrating vision-language models with LLM workflows (RAG, tool use, MCP servers) for radiation therapy planning to reduce planning time from weeks to hours.
  • Developed a multi-stage multimodal foundation model combining 3D medical imaging encoders with language models for segmentation of referring objects, improving personalized patient care through clinical context-driven contouring.
  • Leading research direction and managing a team of 2 PhD students and 2 interns in building multimodal foundation models, collaborating with a cross-functional team of medical physicists, radiation oncologists, and dosimetrists, resulting in 3 conference abstracts and 2 in-progress journal submissions.
  • Developed domain-specific implementations of self-supervised learning (SimCLR, VICReg, MAE, DINOv2) using 11k+ 3D CT scans, creating robust vision foundation models with 20% performance improvements on lung cancer tasks under limited labelled data constraints.
  • Built an AI model to measure immune health using data from 27,000 patients; it has been adopted by 5 clinical studies at Mass General for more personalized patient care.
  • Co-created Lighter, an open-source YAML-driven deep learning framework, achieving 200+ GitHub stars and 30% faster iteration cycles.
Maastro Clinic  ·  Research Assistant in Medical Imaging
Dec 2019 – Jul 2021  ·  Maastricht, The Netherlands
  • Researched CycleGAN architectures for medical image enhancement, implementing custom frequency-domain losses and invertible networks for 3D medical imaging to facilitate more accurate radiation delivery.
  • Developed Ganslate, an open-source PyTorch framework for image-to-image translation adopted by multiple research collaborators.
  • Successfully integrated research prototypes into radiotherapy workflows, demonstrating practical impact of ML in a clinical setting.
BlinkIN  ·  Machine Learning Research & Development
Jun 2018 – Aug 2019  ·  Hyderabad, India
  • As a founding ML engineer, built DAG-based workflows for sequential decision-making using detection and classification events to model user interactions for customer support scenarios via an AR-based platform.
  • Architected automated ML deployment system (GCP) with cloud triggers for model training and serving, reducing deployment time by 50%.
  • Developed data annotation web app and ETL pipelines integrated with Kurento media server for real-time inference.
Upwork  ·  Computer Vision & ML Engineer (Freelance)
Dec 2017 – Aug 2019  ·  Remote
  • Implemented YOLO object detection as a C++ socket service with Node.js client, achieving 5× speedup over baseline through optimized OpenCV inference.
  • Deployed edge AI systems (Raspberry Pi + Intel Neural Compute Stick) for real-time detection with custom CNN models for facial analysis and customer engagement scoring.
Cognitive Machines Software Solutions  ·  Associate Engineer, Machine Learning
Sep 2017 – Apr 2018  ·  Bangalore, India
  • Built cross-platform CV library combining CNNs with classical methods (Hough transforms, edge detection), deployed on Android, iOS, and Raspberry Pi for automated inventory tracking.
  • Developed Kinect-based pose estimation using CNN models for low-resource gait monitoring of patients undergoing physical recovery.

 

Education

PhD in Biomedical Engineering  ·  Sep 2021 – Nov 2025  ·  Defended ✓
Maastricht University  ·  Conducted at Mass General & Harvard Medical School  ·  Boston, MA
Thesis: Representation Learning in Radiology and Cancer Imaging
MSc in Artificial Intelligence  ·  Sep 2019 – Jul 2021
Maastricht University  ·  GPA 8.78/10  ·  Graduated Cum Laude  ·  Maastricht, The Netherlands
BTech in Electronics and Communications Engineering  ·  Jan 2012 – Dec 2016
Manipal Institute of Technology  ·  GPA 8.43/10  ·  Manipal, India

 

Skills

ML & Deep Learning
PyTorch  ·  TensorFlow  ·  Transformers (HF)  ·  SimCLR  ·  MAE  ·  DINOv2  ·  ViT  ·  SAM  ·  UNet  ·  YOLO  ·  CycleGAN  ·  Foundation Models
LLMs & Agentic AI
DSPy  ·  Ollama  ·  smolagents  ·  Google ADK  ·  RAG Pipelines  ·  Prompt Engineering  ·  Tool Use  ·  MCP Servers
MLOps & Engineering
Python  ·  Git  ·  Docker  ·  CI/CD  ·  Weights & Biases  ·  MLflow  ·  Hydra  ·  MONAI  ·  Linux
Cloud & Infrastructure
AWS  ·  Azure  ·  GCP  ·  HPC / Slurm  ·  Distributed Training
Data Science
SQL  ·  R  ·  Scikit-Learn  ·  Statistical Modeling  ·  Data Visualization

 

Achievements

  • 700+ citations, h-index 10, 20+ publications, including in top-tier journals
  • First-author publication in Nature Machine Intelligence; two first-author papers under review at Nature and two at Nature Communications
  • Research featured in major media outlets including NYT, Science Magazine, and Science Daily
  • Reviewed 19 manuscripts for Nature Scientific Reports, npj Breast Cancer, Nature Biomedical Engineering, QIMS, MICCAI, ML4H, JOSS
  • Invited Speaker at DL IndabaX, Microsoft Health Futures, Novartis, Fred Hutchinson Cancer Research Center
  • 200+ GitHub stars across self-led open-source projects
  • Awarded Brigham Research Institute Microgrant
  • Active contributor to NVIDIA’s Project MONAI, nnunet, nnssl

 

Publications

Foundation Models & Representation Learning

Medical Image Synthesis & Translation

Clinical AI & Biomarkers

Open Science & Tools