What if you could predict how many calories you'll burn on your next bike ride using machine learning?

I just built a full-stack ML app that does exactly that.

👇 Here's how I did it.


💡 The idea

As a road cycling enthusiast and aspiring ML Engineer, I wanted to create something useful, personal, and technically solid.

Calories burned during a ride depend on several factors: weight, distance, duration, heart rate, and more.

So I decided to build a regression model trained on realistic synthetic data, deploy it with Streamlit, and make it available as a public web app.


⚙️ The stack

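  • Data generation: 500-row synthetic dataset built from a heart-rate-based energy expenditure formula (called Zillman’s equation here; the coefficients match the male-subject equation published by Keytel et al., 2005)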
  • Modeling: RandomForestRegressor from scikit-learn
  • Web App: Streamlit UI to enter data and display prediction
  • Deployment: Hosted on Streamlit Cloud

📊 The dataset

The dataset was generated with simulated values for:

  • Weight (50–100 kg)
  • Duration (30–180 min)
  • Distance (10–150 km)
  • Age (16–65 years)
  • Heart Rate (as % of age-based max HR)

Then I calculated calories burned using:

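# Energy expenditure in kJ/min, converted to kcal/min (1 kcal = 4.184 kJ)
# heart_rate in bpm, weight in kg, age in years
calories_per_min = (
  -55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age
) / 4.184
# Total for the whole ride
calories = calories_per_min * duration_min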

This gave me realistic values from ~300 to 1500 kcal per ride.
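To make the generation step concrete, here's a minimal sketch of how such a dataset could be built with NumPy. It is not the project's actual script: the uniform sampling, the `220 - age` rule for max heart rate, the 60–90% intensity band, the seed, and the function names are all assumptions made for illustration, so the resulting calorie range won't necessarily match the one above.

```python
import numpy as np

N_ROWS = 500  # dataset size stated in the article


def max_heart_rate(age):
    """Age-based max heart rate. ASSUMPTION: the common 220 - age rule."""
    return 220.0 - age


def generate_rides(n_rows=N_ROWS, seed=42):
    """Sample ride features within the article's ranges and compute calories."""
    rng = np.random.default_rng(seed)

    # ASSUMPTION: every feature is drawn uniformly and independently.
    weight = rng.uniform(50, 100, n_rows)        # kg
    duration_min = rng.uniform(30, 180, n_rows)  # minutes
    distance_km = rng.uniform(10, 150, n_rows)   # km
    age = rng.uniform(16, 65, n_rows)            # years

    # ASSUMPTION: average effort between 60% and 90% of max heart rate.
    hr_fraction = rng.uniform(0.60, 0.90, n_rows)
    heart_rate = hr_fraction * max_heart_rate(age)  # bpm

    # Calorie formula from the article (kJ/min converted to kcal/min).
    calories_per_min = (
        -55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age
    ) / 4.184
    calories = calories_per_min * duration_min

    return {
        "weight": weight,
        "duration_min": duration_min,
        "distance_km": distance_km,
        "age": age,
        "heart_rate": heart_rate,
        "calories": calories,
    }


rides = generate_rides()
print(f"rows: {len(rides['calories'])}")
print(f"calories: {rides['calories'].min():.0f} to {rides['calories'].max():.0f} kcal")
```

Returning a plain dict of arrays keeps the sketch dependency-light; wrapping it in a pandas DataFrame would be a natural next step for saving to CSV.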


🧠 The model

I trained a RandomForestRegressor on the dataset and compared it to a basic linear regression baseline, scoring both by mean absolute error (MAE):

  • Linear Regression: MAE ≈ 153 kcal
  • Random Forest: MAE ≈ 76 kcal

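The random forest roughly halves the baseline's error, which is a pretty solid result given how much heart rate and ride duration vary across the dataset. The gap makes sense: calories are a per-minute rate multiplied by duration, an interaction a plain linear model can't represent.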
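For readers who want to reproduce this kind of comparison, here's a rough sketch with scikit-learn. It regenerates a small synthetic dataset inline so it runs on its own. The 80/20 split, the seeds, the `220 - age` max heart rate rule, the 60–90% intensity band, and the forest's hyperparameters are assumptions, not the project's settings, so the MAE values it prints will differ from the ones reported above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic features within the article's ranges (ASSUMPTION: uniform sampling).
rng = np.random.default_rng(0)
n = 500
weight = rng.uniform(50, 100, n)     # kg
duration = rng.uniform(30, 180, n)   # minutes
distance = rng.uniform(10, 150, n)   # km
age = rng.uniform(16, 65, n)         # years
# ASSUMPTION: heart rate is 60-90% of a (220 - age) max heart rate.
heart_rate = rng.uniform(0.60, 0.90, n) * (220.0 - age)

# Target computed with the article's formula.
y = (-55.0969 + 0.6309 * heart_rate + 0.1988 * weight + 0.2017 * age) / 4.184 * duration
X = np.column_stack([weight, duration, distance, age, heart_rate])

# ASSUMPTION: hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and record its mean absolute error on the held-out rows.
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {mae[name]:.0f} kcal")
```

Evaluating both models on the same held-out rows is what makes the two MAE figures directly comparable.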


🌐 The app

I built a clean, simple Streamlit interface where you can enter:

  • Your weight (kg)
  • Ride duration (min)
  • Distance (km)
  • Average heart rate (bpm)
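A form like this takes only a few lines of Streamlit. The sketch below is an illustration, not the deployed app's code: `predict_calories` is a hypothetical stand-in that applies the calorie formula directly instead of loading the trained model, the default age of 35 is an assumption (the form has no age field), and the input bounds and defaults are made up for the example.

```python
def predict_calories(weight_kg, duration_min, distance_km, heart_rate_bpm,
                     age_years=35.0):
    """Stand-in for the trained model (ASSUMPTION, not the app's real code).

    Applies the article's calorie formula directly. distance_km is accepted
    to mirror the form but the formula does not use it, and age_years is a
    hypothetical default because the form has no age field.
    """
    calories_per_min = (
        -55.0969 + 0.6309 * heart_rate_bpm + 0.1988 * weight_kg
        + 0.2017 * age_years
    ) / 4.184
    return calories_per_min * duration_min


def main():
    # Imported here so predict_calories can be tested without Streamlit.
    import streamlit as st

    st.title("Cycling calorie predictor")

    # One numeric input per field listed in the article.
    weight = st.number_input(
        "Weight (kg)", min_value=50.0, max_value=100.0, value=70.0)
    duration = st.number_input(
        "Ride duration (min)", min_value=30.0, max_value=180.0, value=60.0)
    distance = st.number_input(
        "Distance (km)", min_value=10.0, max_value=150.0, value=30.0)
    heart_rate = st.number_input(
        "Average heart rate (bpm)", min_value=60.0, max_value=220.0, value=150.0)

    # Only compute and show a prediction once the user clicks the button.
    if st.button("Predict"):
        kcal = predict_calories(weight, duration, distance, heart_rate)
        st.metric("Estimated calories burned", f"{kcal:.0f} kcal")


if __name__ == "__main__":
    main()
```

Saved as, say, `app.py` (a hypothetical filename), this would be launched with `streamlit run app.py`. In a real app the stand-in function would be replaced by a call to the trained model's `predict` method.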

👉 Try it live here:
🔗 https://caloriespredictor-vdzysn7hjkwcxm8sss5wyu.streamlit.app/


🧑‍💻 Why I built this

I’m currently transitioning into machine learning & data engineering.
This project was my way to:

  • Learn end-to-end project flow
  • Combine my passion (cycling) with code
  • Show my skills in data science, ML, and deployment

The code is open-source and available here:
🔗 https://github.com/arnaudstdr/calories_predictor


📈 What’s next?

I’m already working on a follow-up project to predict fatigue or overtraining risk using physiological and training data.

If you’re interested in:

  • ML in sport & health
  • Physiological data modeling
  • Human-centered AI apps

Let’s connect! I’d love to discuss ideas, get feedback, or collaborate.


Thanks for reading!

👉 Drop a comment, star the repo, or try the app 🚀