Machine learning typically means servers, GPUs, and Python scripts running in the cloud — but what if you could ship a self-updating ML model directly inside a static website? This post shows how to embed a tiny TensorFlow.js model into a static frontend, and use background fetches to update it over time — no backend needed.

Use Cases


  • Offline-friendly personalization (e.g., recommender tweaks)
  • Client-side anomaly detection or scoring
  • Privacy-preserving inference at the edge

Step 1: Train and Convert Your Model


Use TensorFlow to train a model locally, then convert it with the tensorflowjs converter (installed with pip install tensorflowjs):


tensorflowjs_converter \
  --input_format=tf_saved_model \
  ./saved_model \
  ./web_model

This generates files like model.json and binary weight shards.
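For a small model, the output directory typically looks like this (larger models split their weights across more shards):

web_model/
├── model.json            (network topology + weights manifest)
└── group1-shard1of1.bin  (binary weight shard)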

Step 2: Embed the Model in Your Static Site


Copy the web_model directory into your build output, then include TensorFlow.js from a CDN:


<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
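Then load the model in your page script. Because the converter was given a tf_saved_model, the output is a graph model, so it loads with tf.loadGraphModel rather than tf.loadLayersModel. A minimal sketch, assuming the converted files are served at /web_model/:

let model;

async function loadModel() {
  // model.json references its weight shards relative to its own URL
  model = await tf.loadGraphModel('/web_model/model.json');
}
loadModel();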

Step 3: Add Update Mechanism via Fetch


You can periodically check for new models and swap them in live:


async function updateModelIfAvailable() {
  // cache: 'no-cache' revalidates with the server, so a newly deployed model.json is picked up
  const newModel = await tf.loadGraphModel(
    tf.io.browserHTTPRequest('/web_model/model.json', {
      requestInit: { cache: 'no-cache' }
    })
  );
  const oldModel = model;
  model = newModel; // hot-swap
  if (oldModel) oldModel.dispose(); // release the old model's weight tensors
}
setInterval(updateModelIfAvailable, 86400000); // every 24h
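Re-downloading the full model every day even when nothing has changed is wasteful. One refinement is to publish a small version file next to the model on each deploy and only reload when it changes; point the setInterval above at this function instead. A sketch, assuming a hypothetical /web_model/version.json like {"version": "3"}:

let currentVersion = null;

async function updateModelIfChanged() {
  // version.json is a hypothetical one-field file you update on each deploy
  const res = await fetch('/web_model/version.json', { cache: 'no-cache' });
  const { version } = await res.json();
  if (version === currentVersion) return; // already up to date; skip the download

  const newModel = await tf.loadGraphModel(
    tf.io.browserHTTPRequest('/web_model/model.json', {
      requestInit: { cache: 'no-cache' }
    })
  );
  const oldModel = model;
  model = newModel;
  currentVersion = version;
  if (oldModel) oldModel.dispose();
}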

Step 4: Run Predictions Client-Side


Once the model is loaded, run predictions like this:


// Shape [1, 3]: a batch of one sample with three features; match your model's input shape
const input = tf.tensor2d([[0.1, 0.2, 0.3]]);
const prediction = model.predict(input);
prediction.print();
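print() writes to the console, which is handy for debugging. To use the result in application code, read the values back asynchronously (inside an async function) and dispose tensors when done, since they hold memory outside the JavaScript heap:

// data() resolves to a TypedArray (e.g. Float32Array) of output values
const values = await prediction.data();
console.log(values[0]);

prediction.dispose();
input.dispose();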

Pros and Cons

✅ Pros


  • No server or cloud infrastructure required
  • Offline-ready and CDN-cacheable
  • Improves over time with passive updates

⚠️ Cons


  • Model size must be small for web delivery
  • Security concerns with hot-swapped logic
  • No GPU acceleration on low-end clients

🚀 Alternatives


  • WebAssembly inference runtimes for compiled models
  • Remote API inference via a small proxy backend
  • ONNX.js as a cross-framework alternative

Summary


By embedding a TensorFlow.js model directly in your frontend and updating it via background fetches, you get a flexible way to ship intelligent, adaptive behavior without cloud infrastructure. Ideal for edge AI, web personalization, and hobbyist ML experiments.

If this was useful, you can support me here: buymeacoffee.com/hexshift