Machine Learning Boot Camp: Analyzing Biomedical and Health Data

Returning Summer 2025 | Hybrid training

The next Machine Learning Boot Camp will be held in Summer 2025. Sign up below to hear when registration opens!

The Machine Learning Boot Camp is a two-day intensive boot camp of seminars combined with hands-on R labs and data applications to provide an overview of statistical concepts, techniques, and data analysis methods with applications in biomedical research. 

Subscribe for updates on registration and scholarship dates, deadlines, and announcements.

Summer 2025 dates: TBD; 10am - ~5pm EDT. Hybrid training (in-person with a simultaneous livestream for remote attendees).

This two-day intensive training will provide a broad introduction to machine learning methodology with applications in biomedical research. Taught by a team of biostatisticians, the boot camp will integrate seminar lectures with hands-on R lab sessions to put concepts into practice. Emphasis will be given to supervised (e.g., penalized methods, classification and decision trees, survival forests) and unsupervised methods (e.g., clustering algorithms, dimensionality reduction) with numerous case studies and biomedical applications. The workshop will conclude with an overview and demonstration of ‘deep learning’ algorithms.

By the end of the boot camp, participants will be familiar with the following topics: 

  • Penalized Regression Methods (Ridge and Lasso) 
  • Classification Models (e.g., Support Vector Machines) 
  • Tree-Based Methods (Decision/Regression Trees) 
  • Clustering Algorithms 
  • Principal Component Analysis (PCA) 
  • Deep Learning – Introduction to dense and convolutional neural networks 

Audience and Requirements

Investigators from any institution and from all career stages are welcome to attend, and we particularly encourage trainees and early-stage investigators to participate. There are three requirements to attend this training:

  1. Each participant must have an introductory background in statistics (i.e., linear and logistic regression).
  2. Each participant must be familiar with R. The main platform used for the workshop will be RStudio Cloud, so we strongly recommend that participants have a basic understanding of R/RStudio prior to attending the training.
  3. Each participant is required to have a personal laptop and a free, basic RStudio Cloud account prior to the first day of the workshop. All lab sessions will be done on this platform.

If you have any specific questions about R and RStudio in the context of the Machine Learning Boot Camp, please email the Machine Learning team.


The Summer 2025 instructing team is being finalized but will be comparable to the 2023 lineup below.

Cody Chiuzan, PhD, Associate Professor, Institute of Health System Science, Feinstein Institutes for Medical Research, Northwell Health. Dr. Chiuzan’s research interests concern the development of adaptive early-phase designs for oncology trials, and using real-world evidence to improve clinical outcomes and guide the transfer of knowledge among non-experimental studies.

Jean Feng, PhD, Assistant Professor, Department of Epidemiology and Biostatistics, University of California, San Francisco. Dr. Feng's research interests include the interpretability and reliability of machine learning methods for biomedical applications, particularly those involving black-box models. 

Noah Simon, PhD, Associate Professor, Department of Biostatistics, School of Public Health, University of Washington. Dr. Simon’s methodological interests include computationally efficient methods for predictive modeling with high-dimensional, complex data, and the design of adaptive clinical trials. 


Training scholarships are not available for this year's training.


Summer 2025: The Machine Learning Boot Camp will be a hybrid training, held in person at the Columbia University Irving Medical Campus in NYC and simultaneously livestreamed for remote attendees. Please note this training is not a self-paced, pre-recorded online training. All training start and end times are in EDT.

More information on travel, lodging, and getting around NYC.


"As a neurologist focused on research, the boot camp was a superb introduction to machine learning, and it gave me the tools I need to start creating my own models in my research and to improve my critical thinking." - PhD candidate at HM CINAC Integral A.C. Neuroscience Center, 2023

"As an online attendee, the machine learning boot camp was well organized. The trainers accommodated all the attendees at their various levels of understanding of machine learning. It was well worth the time spent training." - Senior lecturer at University of Nairobi, 2023

"Such a great experience with enthusiastic instructors. I had dabbled in machine learning in the past but this course helped to contextualize the methods and got sufficiently into the mathematics. I feel confident in being able to generate questions and answer them with appropriate machine learning methods." - Postdoctoral fellow at Stanford/VA, 2023

Additional Testimonials

"It was very enriching. I have acquired knowledge that will be useful in my research." - Lecturer at the University of Nairobi, 2022

"This was a fantastic boot camp and so well organized.  The lectures provided the background required for the labs and the labs are extremely helpful for working through problems that can then be applied to our own work!" - Assistant Professor at UCSF, 2022

"Excellent and an eye opener. Very good introduction." - Associate Professor at the University of Nairobi, 2022

"Very well done and well organized.  All of the tools worked on the first try and there were no excessive tangents.  One of the best and most useful workshops I've attended." - Assistant Research Professor at the University of Rhode Island, 2022

"This was a fantastic training with a very engaged group of students and teachers. There was a perfect balance between theoretical underpinnings and practical advice."- Faculty Member from Harvard University, August 2021 Virtual Training 

"The ML boot camp was a wonderful experience that provide a thorough review from the basics to advanced techniques for sophisticated analysis of data. I found it extremely accessible as a non-expert in the field. " - Postdoc at Memorial Sloan Kettering Cancer Center, August 2021 Virtual Training

"This was really a phenomenal course, I don't think I could have had a better experience in terms of learning a wide range of machine learning techniques." - Postdoc at The University of Pittsburg Medical Center August 2021 Virtual Training

"The ML Boot Camp is a wonderful introduction to machine learning in a way that is accessible to non-computer scientists." - Postdoc at the Wistar Institute, August 2021 Virtual Training

"This bootcamp is an excellent introduction to machine learning. It covers many important fundamental concepts of machine learning and its nuances are well-taught during the lab sessions. The instructors are also well-versed in the field and do a great job explaining these complex concepts." - Faculty member from Harvard University, August 2020 virtual training

"The workshop introduces up-to-date concepts and provide training using timely examples with well-integrated insights." - Postdoc from MIT, June 2020 virtual training

"Great instructors, interactive, and an excellent short course!" - Corporate Staff member at Johnson & Johnson, August 2020 virtual training

"The bootcamp was an informative graduate-level introduction to ML methods and covered a breadth of topics in a short amount of time. The instructors gave very clear presentations and coding demos, and I appreciated how approachable they were in terms of answering questions and offering advice." - Postdoc at the NIH, August 2020 virtual training

"It was a great introduction to ML and it provided me with the right tools to apply these techniques in my own research." - Sujith R., Faculty member from University of Mississippi, 2019

Registration Fees

This is a hybrid training: attendees can choose to attend the live training either 1) in person in NYC or 2) virtually via Zoom livestream. Fees in the table below are in-person costs. Virtual registration fees are $200 less.

Category                                                             Early-Bird Rate*  Regular Rate*  Columbia Discount**
Student/Postdoc/Trainee                                              $1,195            $1,395         10%
Faculty/Academic Staff/Non-Profit Organizations/Government Agencies  $1,395            $1,595         10%
Corporate/For-Profit Organizations                                   $1,595            $1,795         NA


*Virtual Livestream Fee: Fees in the table above are in-person costs. Those attending virtually via Zoom livestream instead of in person on campus pay $200 less in each category. Pricing will be reflected during the registration process based on category and time of registration.
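
As a rough illustration of how the in-person and virtual fees relate, the short Python sketch below looks up a fee from the table and applies the $200 virtual reduction. The category keys (`student`, `faculty`, `corporate`) and the function name are hypothetical labels chosen for this example, and the Columbia discount is deliberately not modeled, since the page does not say how it combines with the virtual reduction. The registration system remains the authoritative source for pricing.

```python
# In-person fees in USD, copied from the table above.
# The short category keys are illustrative labels, not official names.
IN_PERSON_FEES = {
    "student": {"early_bird": 1195, "regular": 1395},    # Student/Postdoc/Trainee
    "faculty": {"early_bird": 1395, "regular": 1595},    # Faculty/Academic Staff/Non-Profit/Government
    "corporate": {"early_bird": 1595, "regular": 1795},  # Corporate/For-Profit
}

# Virtual attendees pay $200 less in every category.
VIRTUAL_REDUCTION = 200


def registration_fee(category: str, rate: str, virtual: bool = False) -> int:
    """Return the listed fee in USD, before any Columbia discount."""
    fee = IN_PERSON_FEES[category][rate]
    return fee - VIRTUAL_REDUCTION if virtual else fee


if __name__ == "__main__":
    print(registration_fee("student", "early_bird"))                # in-person
    print(registration_fee("student", "early_bird", virtual=True))  # virtual
```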

**Columbia Discount: This discount is valid for any active student, postdoc, staff, or faculty at Columbia University. If paying by credit card, use your Columbia email address during the registration process to automatically have the discount applied. If paying by internal transfer within Columbia, submit this Columbia Internal Transfer Request form to receive further instructions. Please note: filling out this form is not the same as registering for a training and does not guarantee a training seat.  

Invoice Payment: If you would prefer to pay by invoice/check, please submit this Invoice Request form to receive further instructions. Please note: filling out this form is not the same as registering for a training and does not guarantee a training seat.

Registration Fee:

  • In-Person Registration Fee includes course material, breakfast, and lunch on training days. Course material will be available to all attendees during and after the workshop. Lodging and transportation are not included.
  • Virtual Registration Fee includes course material, which will be made available to all participants both during and after the conclusion of the training.  

Cancellations: Cancellation notices must be received via email at least 30 days prior to the training start date in order to receive a full refund, minus a $75 administrative fee. Cancellation notices received via email 14-29 days prior to the training will receive a 75% refund, minus a $75 administrative fee. Please send your cancellation notice by email. Due to workshop capacity and preparation, we regret that we are unable to refund registration fees for cancellations received fewer than 14 days prior to the training.
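
To make the refund tiers concrete, here is a minimal Python sketch of the cancellation policy as worded above. It assumes that "75% refund, minus a $75 administrative fee" means 75% of the fee paid with $75 then subtracted; the function name and parameters are hypothetical, and actual refunds are determined by the training organizers.

```python
ADMIN_FEE = 75.0  # administrative fee in USD, per the policy above


def refund_amount(fee_paid: float, days_before_start: int) -> float:
    """Estimate the refund for a cancellation notice received
    `days_before_start` days before the training start date."""
    if days_before_start >= 30:
        # Full refund, minus the administrative fee.
        refund = fee_paid - ADMIN_FEE
    elif days_before_start >= 14:
        # Assumed reading: 75% of the fee paid, then minus the administrative fee.
        refund = 0.75 * fee_paid - ADMIN_FEE
    else:
        # Fewer than 14 days before the training: no refund.
        refund = 0.0
    return max(refund, 0.0)


if __name__ == "__main__":
    # Example: a $1,395 registration cancelled at different lead times.
    for days in (45, 20, 10):
        print(days, refund_amount(1395, days))
```

Under this reading, a $1,395 registration cancelled 20 days out would return 0.75 × 1,395 − 75 = $971.25.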

If you are unable to attend the training, we encourage you to send a substitute within the same registration category. Please inform us of the substitute via email at least one week prior to the training to include them on attendee communications, updated registration forms, and materials. Should the substitute fall within a different registration category, your credit card will be credited or charged accordingly. Please send substitute inquiries by email. In the event Columbia must cancel the event, your registration fee will be fully refunded.

Additional Information

The Machine Learning Boot Camp is hosted by Columbia University's Department of Environmental Health Sciences and Department of Biostatistics in the Mailman School of Public Health, and the Irving Institute for Clinical and Translational Research: Biostatistics, Epidemiology, and Research Design (BERD) Educational Resource.
