Winter 2026 | Michigan Data Science Team
299 heart failure patient records with 13 clinical features. Binary outcome: survival or death during the follow-up period.
Two features (serum creatinine & ejection fraction) alone achieve strong predictive performance comparable to models using all 13 features.
9 weeks of hands-on lessons covering data exploration, statistics, ML, ensemble methods, deep learning, and model interpretation with real clinical data.
Introduction to the dataset, exploratory data analysis, and visualization techniques using Pandas, Seaborn, and Matplotlib.
Hypothesis testing (t-test, Mann-Whitney U), correlation analysis, FDR correction, and variance inflation factor (VIF) for multicollinearity detection.
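As a rough illustration of why FDR correction matters when many features are tested at once, here is a minimal sketch of the Benjamini-Hochberg procedure in plain Python. The function name `benjamini_hochberg` and the p-values are made up for this example and do not come from the lesson notebooks.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, one per feature tested (illustrative only)
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
rejected = benjamini_hochberg(p_values)
print(rejected)
```

With these made-up inputs, four p-values fall below 0.05 before correction, but only the two smallest survive the Benjamini-Hochberg threshold.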
Dimensionality reduction with PCA, K-Means clustering, hierarchical clustering, silhouette analysis, and the elbow method.
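The sketch below shows one way these pieces might fit together with scikit-learn: scale the data, project it with PCA, then compare K-Means solutions by silhouette score. It runs on synthetic data with two deliberately separated groups, not on the heart failure records, and the range of k values is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data: two well-separated groups with 5 features each
X = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(4, 1, (100, 5))])

# Standardize first so that no single feature dominates the principal components
X_scaled = StandardScaler().fit_transform(X)
X_2d = PCA(n_components=2).fit_transform(X_scaled)

# Silhouette analysis: score each candidate number of clusters
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_2d)
    scores[k] = silhouette_score(X_2d, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On real clinical data the silhouette scores are usually far less clear-cut than on this toy example, so they are best read alongside the elbow plot rather than on their own.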
Classification algorithms including Logistic Regression, Random Forest, SVM, and KNN. Train/test splitting with stratification and model evaluation metrics.
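A minimal sketch of a stratified split followed by a baseline classifier, assuming scikit-learn. The data is synthetic: the sample and feature counts mirror the dataset description above, but the class weights, the 80/20 split, and the three metrics shown are assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the clinical table (weights are hypothetical)
X, y = make_classification(
    n_samples=299, n_features=13, n_informative=4,
    weights=[0.7, 0.3], random_state=0,
)

# stratify=y keeps the class ratio the same in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(metrics)
```

Stratification matters most with small, imbalanced datasets, where a purely random split can leave the test set with too few positive cases to evaluate reliably.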
Advanced tuning techniques: GridSearchCV, Random Search, and Bayesian Optimization with Optuna for efficient hyperparameter space search.
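To make the contrast between the first two strategies concrete, the sketch below runs a grid search and a random search over the same kind of parameter space with scikit-learn. The Random Forest estimator, the parameter ranges, and the synthetic data are assumptions chosen for illustration; Optuna is not shown here.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in data (not the heart failure records)
X, y = make_classification(n_samples=299, n_features=13, random_state=0)

# Grid search: tries every combination (3 x 2 = 6 candidates)
grid = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]},
    cv=3,
    scoring="roc_auc",
)
grid.fit(X, y)

# Random search: samples a fixed budget of candidates from distributions
rand = RandomizedSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_distributions={
        "max_depth": randint(2, 10),        # integers 2..9
        "min_samples_leaf": randint(1, 10),  # integers 1..9
    },
    n_iter=6,
    cv=3,
    scoring="roc_auc",
    random_state=0,
)
rand.fit(X, y)

print(grid.best_params_, grid.best_score_)
print(rand.best_params_, rand.best_score_)
```

Both searches evaluate six candidates here, but the random search draws from a wider range of values, which is the usual argument for preferring it when the grid would otherwise grow too large.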
From Random Forest to Gradient Boosting to LightGBM. Hyperparameter tuning with Optuna, evaluation with four metrics, and hands-on exercises.
Why fewer features often beat more: Lasso (L1), Elastic Net (L1+L2), and MRMR filter method. Learn to identify which clinical features truly drive survival prediction.
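As a sketch of how an L1 penalty performs feature selection, the example below fits an L1-regularized logistic regression with scikit-learn and lists the features whose coefficients stay non-zero. The data is synthetic, with only the first two columns carrying signal, and the regularization strength `C=0.05` is an arbitrary value chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data: with shuffle=False, columns 0 and 1 are the informative ones
X, y = make_classification(
    n_samples=299, n_features=13, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, shuffle=False, random_state=0,
)
# L1 penalties are scale-sensitive, so standardize the features first
X = StandardScaler().fit_transform(X)

# A small C means strong regularization, which pushes weak coefficients to zero
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
model.fit(X, y)

coefs = model.coef_.ravel()
selected = np.flatnonzero(coefs)  # indices of features with non-zero weight
print("non-zero coefficients at columns:", selected)
```

Lowering `C` further shrinks the selected set, so in practice the value is usually chosen by cross-validation rather than fixed by hand.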
Introduction to deep learning with Convolutional Neural Networks (CNNs) and Long Short-Term Memory networks (LSTMs). Learn neural network fundamentals, backpropagation, activation functions, and how to apply deep learning to clinical prediction tasks.
Interpretability and explainability in machine learning. Partial Least Squares Discriminant Analysis (PLS-DA) for supervised dimensionality reduction, and SHAP (SHapley Additive exPlanations) for understanding model predictions and feature importance in black-box models.
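To show the idea behind Shapley values without relying on the SHAP library, here is a minimal sketch that computes them exactly by enumerating feature subsets. This brute-force approach is only feasible for a handful of features. The `toy_risk` model, its coefficients, and the patient and baseline values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the instance's values; the rest stay at baseline
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    # Hypothetical risk score with an interaction between creatinine and age
    creatinine, ejection_fraction, age = features
    return 0.5 * creatinine - 0.02 * ejection_fraction + 0.01 * age + 0.01 * creatinine * age


patient = [2.0, 30.0, 70.0]    # made-up values
baseline = [1.0, 40.0, 60.0]   # made-up reference patient
phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that gives SHAP its name.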
Interactive Jupyter notebooks to help you set up and get started
Learn Git version control: installation, core concepts, essential commands, branching, and workflows.
Master Python virtual environments (venv): setup, activation, package management, and troubleshooting.
This project is based on the paper by Chicco & Jurman (2020): "Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone".