Manoeuvring TF Lite on Edge Devices

What is TF Lite?

TF Lite is a lightweight, production-ready, cross-platform framework for deploying ML models to run inference on edge devices like mobile phones and microcontrollers.

Ideal Audience

ML engineers looking for ways to optimize models for deployment.

Suppose you have created and trained a model, and now you want to run inference on edge devices like smartphones, the Raspberry Pi, or the Jetson Nano.

To get good predictions on those devices, your model should meet the following criteria:
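As a concrete starting point, here is a minimal sketch of converting a trained Keras model into the TF Lite format before deploying it to such a device. The tiny Sequential model below is a stand-in for your own trained model, and enabling default optimizations is my own choice for illustration:

```python
import tensorflow as tf

# Stand-in for your own trained model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

# Convert the Keras model to a TF Lite flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization
tflite_model = converter.convert()

# Save the .tflite file for deployment on the edge device
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting `model.tflite` file is what you ship to the phone or microcontroller and load with the TF Lite interpreter.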

Flutter ❤️

Why Flutter?

Flutter enables fast development, and with a single codebase we can build apps for multiple platforms, i.e. Android, iOS, Ubuntu, macOS, and Windows. It also provides flexibility for building a custom UI, and hot reloading makes the development process smooth.

What is Provider?

Provider is the most popular package for state management in Flutter. It is highly recommended for beginners who want to learn state management.

So, you might be wondering: what is Provider?

Provider is a state management helper. It is a widget that makes a value, such as a state model object, available to the widgets below it.


In this blog, we will understand the concept of weight pruning with Keras. Weight pruning is a model optimization technique that gradually zeroes out model weights during the training process to achieve model sparsity.

This technique brings improvements via model compression and is widely used to shrink models and reduce their latency.

I will implement weight pruning on the Fashion MNIST dataset and compare a normally trained model against a pruned one.

The example I will be implementing requires TensorFlow version 2.4 as well as
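The full tutorial presumably relies on the TensorFlow Model Optimization Toolkit (`tfmot.sparsity.keras.prune_low_magnitude`). As a dependency-free illustration of the underlying idea, here is a hand-rolled sketch of magnitude-based weight pruning in NumPy: the smallest-magnitude weights are zeroed out until a target sparsity is reached. The `prune_weights` helper and the 80% sparsity target are my own choices, not from the original:

```python
import numpy as np

def prune_weights(w, sparsity):
    """Zero out the smallest-magnitude entries of w until `sparsity`
    fraction of the entries are zero (magnitude-based weight pruning)."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # Threshold = magnitude of the k-th smallest entry
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    mask = np.abs(w) > threshold
    return w * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(128, 64))        # a stand-in weight matrix
pruned = prune_weights(w, sparsity=0.8)
print("achieved sparsity:", np.mean(pruned == 0))
```

In the toolkit, the same masking is applied gradually over training steps via a pruning schedule, so the remaining weights can adapt as sparsity increases.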


In the previous article, we discussed saving our TensorFlow models in the TF Lite format. Now let's understand why it is important to optimize models.

  1. Latency is reduced. (Latency is the amount of time the model takes to produce a single inference on an edge device.) Lower latency also reduces power consumption on edge devices.
  2. For some hardware-accelerated devices like the Edge TPU, the model…
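To make the latency point concrete, here is a minimal sketch of timing a single inference with the TF Lite `Interpreter` in Python. The tiny model built inline is a placeholder for your own `.tflite` file, and on a real edge device you would average over many runs after a warm-up:

```python
import time
import numpy as np
import tensorflow as tf

# Placeholder model; in practice, load your own .tflite file instead
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()

interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(1, 4).astype(np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()  # warm-up run

start = time.perf_counter()
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
latency_ms = (time.perf_counter() - start) * 1000
print(f"single-inference latency: {latency_ms:.3f} ms")
```

The same measurement on-device (rather than on your development machine) is what actually matters for deployment decisions.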

Sayan Nath

I am currently among the top GitHub contributors from India, ranked #136. I am an aspiring Junior Data Scientist at Codebugged AI.
