How to use meirnm13/Olympic-Athlete-Weight-Prediction with Scikit-learn:

```python
from huggingface_hub import hf_hub_download
import joblib

# Only load pickle files from sources you trust.
# Read more about it here: https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("meirnm13/Olympic-Athlete-Weight-Prediction", "sklearn_model.joblib")
)
```
Olympic Athlete Weight Prediction
From 120 Years of History to Predictive ML Insights
This data science project analyzes over a century of Olympic data to understand athlete physiology and to predict athlete weight, both as an exact value in kg and as a weight category, using machine learning.
1. The Story of the Data (EDA)
Before building models, we explored how the Olympics have evolved. The graph below shows the surge in participation over time, clearly highlighting the cancellations during WWI and WWII.
We also identified a strong correlation (0.8) between Height and Weight, which served as the core predictor for our models.
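As a minimal sketch of how such a correlation can be checked, the snippet below computes the Pearson correlation with pandas. The data is synthetic, and the column names `Height` and `Weight` are assumptions for illustration, not necessarily the names used in the notebook.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Olympic data (NOT the real dataset).
# Column names "Height" and "Weight" are assumed for illustration.
rng = np.random.default_rng(0)
height = rng.normal(175, 10, size=500)                    # cm
weight = 0.9 * height - 87 + rng.normal(0, 6, size=500)   # kg, noisy linear link
athletes = pd.DataFrame({"Height": height, "Weight": weight})

# Series.corr defaults to the Pearson correlation coefficient.
r = athletes["Height"].corr(athletes["Weight"])
print(f"Height/Weight correlation: {r:.2f}")
```

On the real data, rows with missing height or weight would need to be handled first; `Series.corr` skips pairs that contain NaN.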
🧠 2. Smart Feature Engineering
To improve predictions, we didn't just use raw data. We used Unsupervised Learning (K-Means) to cluster athletes into 4 distinct "Physical Archetypes" (based on Height and Age) and visualized them using PCA.
This allows the model to "understand" body types beyond simple linear relationships.
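The clustering step can be sketched roughly as follows. This is an illustration on synthetic data, not the notebook's exact code: the feature scaling, the random seeds, and the variable names are our assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features: Height (cm) and Age (years).
rng = np.random.default_rng(42)
X = np.column_stack([
    rng.normal(175, 10, size=300),
    rng.normal(25, 5, size=300),
])

# Scaling is an assumption: K-Means is distance-based, so features
# on different scales are usually standardized first.
X_scaled = StandardScaler().fit_transform(X)

# Cluster athletes into 4 "physical archetypes".
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
archetype = kmeans.fit_predict(X_scaled)

# Append the cluster label as an extra feature for downstream models.
X_with_archetype = np.column_stack([X, archetype])

# Project the scaled features with PCA to get 2D plotting coordinates.
coords = PCA(n_components=2).fit_transform(X_scaled)
print(X_with_archetype.shape, coords.shape)
```

The cluster label is a categorical value, so for linear models it would typically be one-hot encoded; tree-based models such as Random Forest can use the integer label directly.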
🎯 3. Regression Results (Predicting Exact Weight)
We trained three models to predict weight in kg. The Random Forest Regressor was the clear winner, outperforming linear models by capturing non-linear patterns in different sports.
- Winner: Random Forest
- R² Score: ~0.71
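A comparison of this kind can be sketched as below. The data is synthetic with a deliberately non-linear term, the feature set and hyperparameters are assumptions, and only two of the models are shown, so the scores it prints are not the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Olympic dataset).
rng = np.random.default_rng(0)
n = 1000
height = rng.normal(175, 10, size=n)   # cm
age = rng.normal(25, 5, size=n)        # years
# Target in kg with a mild non-linear term plus noise.
weight = (70 + 0.9 * (height - 175)
          + 0.02 * (height - 175) ** 2
          + rng.normal(0, 5, size=n))

X = np.column_stack([height, age])
X_train, X_test, y_train, y_test = train_test_split(
    X, weight, test_size=0.2, random_state=0
)

# Hyperparameters here are illustrative defaults, not tuned values.
models = {
    "LinearRegression": LinearRegression(),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Scoring on a held-out split, rather than on the training data, is what makes the R² values comparable across models.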
⚖️ 4. Classification Results (Predicting Weight Class)
We recast the problem as a three-class classification task (Low / Medium / High weight). The Random Forest Classifier achieved the best accuracy of the models we compared.
- Accuracy: ~74.4%
- Precision Focus: We prioritized precision to minimize "False Positives" in potential scouting scenarios.
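The classification setup can be sketched along these lines. The tercile binning via `pd.qcut`, the synthetic data, and the hyperparameters are assumptions for illustration; the notebook may define the three classes differently.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the Olympic dataset).
rng = np.random.default_rng(1)
n = 1000
height = rng.normal(175, 10, size=n)   # cm
age = rng.normal(25, 5, size=n)        # years
weight = 70 + 0.9 * (height - 175) + rng.normal(0, 5, size=n)  # kg

# Assumption: classes are equal-sized terciles of weight.
labels = ["Low", "Medium", "High"]
y = pd.qcut(pd.Series(weight), q=3, labels=labels).astype(str).to_numpy()
X = np.column_stack([height, age])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

clf = RandomForestClassifier(n_estimators=100, random_state=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Per-class precision: of the athletes predicted as a class,
# the fraction that truly belong to it (fewer false positives = higher).
per_class_precision = precision_score(
    y_test, y_pred, labels=labels, average=None, zero_division=0
)
print(f"accuracy = {accuracy:.3f}")
for label, p in zip(labels, per_class_precision):
    print(f"precision[{label}] = {p:.3f}")
```

Reporting precision per class, rather than a single average, shows which weight class produces the most false positives.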
Bonus: 3D Visualization
Using Plotly, we mapped the decision boundaries in 3D space (Height vs. Age vs. Weight), confirming the complexity of the data structure.
Project Files
- Student_Notebook.ipynb - The complete Python code.
- random_forest_regressor.pkl - Trained Regression Model.
- random_forest_classifier.pkl - Trained Classification Model.
Created for the Data Science Course, 2025.