What if you could predict how many calories you'll burn on your next bike ride using machine learning?
I just built a full-stack ML app that does exactly that.
👇 Here's how I did it.
💡 The idea
As a road cycling enthusiast and aspiring ML Engineer, I wanted to create something useful, personal, and technically solid.
Calories burned during a ride depend on several factors: weight, distance, duration, heart rate, and more.
So I decided to build a regression model trained on realistic synthetic data, deploy it with Streamlit, and make it available as a public web app.
⚙️ The stack
- Data generation: 500-row synthetic dataset using realistic physiological formulas (Zillman’s equation)
- Modeling: RandomForestRegressor from scikit-learn
- Web App: Streamlit UI to enter data and display the prediction
- Deployment: Hosted on Streamlit Cloud
📊 The dataset
The dataset was generated with simulated values for:
- Weight (50–100 kg)
- Duration (30–180 min)
- Distance (10–150 km)
- Age (16–65 years)
- Heart Rate (as % of age-based max HR)
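Then I calculated calories burned from heart rate (bpm), weight (kg), age (years), and duration (min):
calories_per_min = (
    -55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age
) / 4.184  # dividing by 4.184 converts kJ to kcal
calories = calories_per_min * duration_min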
This gave me realistic values from ~300 to 1500 kcal per ride.
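Here's a minimal sketch of how a dataset like this could be generated. It is not the script from the repo. The formula and the value ranges come from the post. The `220 - age` max-HR rule, the 60–90% intensity band, the uniform sampling, and the names `calories_burned` and `make_ride` are my assumptions for illustration.

```python
import random


def calories_burned(heart_rate, weight, age, duration_min):
    """Apply the post's formula: a per-minute rate in kcal, times the duration."""
    calories_per_min = (
        -55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age
    ) / 4.184  # kJ -> kcal
    return calories_per_min * duration_min


def make_ride(rng):
    """Sample one synthetic ride within the ranges listed in the post."""
    age = rng.randint(16, 65)
    max_hr = 220 - age  # assumption: the post only says "age-based max HR"
    heart_rate = max_hr * rng.uniform(0.60, 0.90)  # assumption: intensity band
    ride = {
        "weight": rng.uniform(50, 100),
        "duration_min": rng.uniform(30, 180),
        # Distance is sampled independently; the formula itself never uses it.
        "distance_km": rng.uniform(10, 150),
        "age": age,
        "heart_rate": heart_rate,
    }
    ride["calories"] = calories_burned(
        heart_rate, ride["weight"], age, ride["duration_min"]
    )
    return ride


rng = random.Random(42)  # fixed seed so the dataset is reproducible
dataset = [make_ride(rng) for _ in range(500)]

# Worked example: 70 kg rider, 30 years old, 140 bpm average, 60 minutes.
example = calories_burned(heart_rate=140, weight=70, age=30, duration_min=60)
print(f"Example ride: {example:.0f} kcal")
```

Calling `calories_burned` on a single ride is a quick sanity check on the formula before generating all 500 rows.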
🧠 The model
I trained a RandomForestRegressor on the dataset and compared it to a basic linear regression baseline.
- Linear Regression: MAE ≈ 153 kcal
- Random Forest: MAE ≈ 76 kcal
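The random forest roughly halves the error of the linear baseline. That gap makes sense: calories are a per-minute rate multiplied by duration, an interaction a plain linear model can't represent. Pretty solid results considering the variability of heart rate and ride durations.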
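A comparison like this can be sketched in a few lines of scikit-learn. Treat it as an illustration, not the code from the repo: the data sampling (uniform draws, `220 - age` max HR, 60–90% intensity), the 80/20 split, the feature order, and `n_estimators=200` are all assumptions. Because of that, the MAE values it prints will not match the 153 and 76 kcal reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic rides within the post's ranges (sampling scheme is assumed).
rng = np.random.default_rng(42)
n = 500
age = rng.integers(16, 66, size=n)  # upper bound is exclusive, so 16..65
weight = rng.uniform(50, 100, size=n)
duration = rng.uniform(30, 180, size=n)
distance = rng.uniform(10, 150, size=n)
heart_rate = (220 - age) * rng.uniform(0.60, 0.90, size=n)  # assumed HR model

# Target computed with the formula from the post.
calories = (
    -55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age
) / 4.184 * duration

X = np.column_stack([weight, duration, distance, age, heart_rate])
X_train, X_test, y_train, y_test = train_test_split(
    X, calories, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=42),
}

# Fit each model on the training split and score it on the held-out rides.
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {mae[name]:.0f} kcal")
```

Scoring on a held-out split matters here, since a random forest can fit its own training data almost perfectly and would look misleadingly good otherwise.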
🌐 The app
I built a clean, simple Streamlit interface where you can enter:
- Your weight (kg)
- Ride duration (min)
- Distance (km)
- Average heart rate (bpm)
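A form with those four inputs could look roughly like the sketch below. It is not the deployed app's code. I'm assuming the model takes exactly these four values in this column order, and the page title, default values, heart-rate bounds, and the names `make_feature_row` and `render_app` are all hypothetical.

```python
def make_feature_row(weight_kg, duration_min, distance_km, heart_rate_bpm):
    """Pack the four form inputs into one row (column order is an assumption)."""
    return [[weight_kg, duration_min, distance_km, heart_rate_bpm]]


def render_app(model):
    """Draw the input form and show a prediction from a fitted model."""
    # Imported here so make_feature_row stays usable without Streamlit installed.
    import streamlit as st

    st.title("Cycling calories predictor")  # illustrative title
    weight = st.number_input(
        "Your weight (kg)", min_value=50.0, max_value=100.0, value=70.0
    )
    duration = st.number_input(
        "Ride duration (min)", min_value=30.0, max_value=180.0, value=60.0
    )
    distance = st.number_input(
        "Distance (km)", min_value=10.0, max_value=150.0, value=30.0
    )
    # Heart-rate bounds are illustrative; the post does not state them.
    heart_rate = st.number_input(
        "Average heart rate (bpm)", min_value=60.0, max_value=220.0, value=140.0
    )

    # Only predict once the user clicks, so the page loads without a result.
    if st.button("Predict"):
        row = make_feature_row(weight, duration, distance, heart_rate)
        kcal = model.predict(row)[0]  # any object with a scikit-learn style predict()
        st.success(f"Estimated calories burned: {kcal:.0f} kcal")
```

To try it, save this as app.py, load a fitted model (for instance with `joblib.load` on a hypothetical model.joblib file), add a final `render_app(model)` call, and start it with `streamlit run app.py`.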
👉 Try it live here:
🔗 https://caloriespredictor-vdzysn7hjkwcxm8sss5wyu.streamlit.app/
🧑‍💻 Why I built this
I’m currently transitioning into machine learning & data engineering.
This project was my way to:
- Learn end-to-end project flow
- Combine my passion (cycling) with code
- Show my skills in data science, ML, and deployment
The code is open-source and available here:
🔗 https://github.com/arnaudstdr/calories_predictor
📈 What’s next?
I’m already working on a follow-up project to predict fatigue or overtraining risk using physiological and training data.
If you’re interested in:
- ML in sport & health
- Physiological data modeling
- Human-centered AI apps
Let’s connect! I’d love to discuss ideas, get feedback, or collaborate.
Thanks for reading!
👉 Drop a comment, star the repo, or try the app 🚀