Why On-Device Machine Learning?

Machine Learning is an exciting area that lets you build new AI-powered features for your apps or sites. The process of Machine Learning involves using models that have either been pre-trained for you, or that you train yourself, to give your app or site functionality in areas such as Computer Vision, Natural Language Processing, and more.

Traditionally, ML models ran only on powerful servers in the cloud. On-device machine learning means performing inference with models directly on a device (e.g. in a mobile app or web browser). The ML model processes input data (such as images, text, or audio) on the device rather than sending that data to a server for processing.

Frameworks such as TensorFlow Lite make it possible to run machine learning models on Android and iOS devices, desktops, and other consumer devices. Large improvements in compute power (CPUs, GPUs, and dedicated ML accelerator blocks such as NPUs) bring on-device performance close to what could be achieved on dedicated servers.

The advantages of on-device ML

Running machine learning on-device has many advantages, including:

Low latency

  • There is no round trip to the server and no wait for results to come back.
  • You can leverage hardware acceleration (e.g. GPU or TPU) on the device for faster results.
  • Low-latency inference unlocks new use cases such as real-time video processing: for example, using a segmentation model to remove the background in a video call, or tracking objects with the camera to power visual search.
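
To make the real-time argument concrete, here is a rough sketch of the time budget per video frame. The 30 fps frame rate, the 15 ms inference time, and the 80 ms network round trip are illustrative assumptions, not measurements, and `frame_budget_ms` and `fits_in_budget` are hypothetical helper names. The sketch also assumes each frame is processed synchronously, which is a simplification.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to process one frame at the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def fits_in_budget(inference_ms: float, round_trip_ms: float, fps: float) -> bool:
    """True if inference plus any network round trip fits within one frame."""
    return inference_ms + round_trip_ms <= frame_budget_ms(fps)


# Assumed, illustrative numbers (not measurements).
FPS = 30.0
INFERENCE_MS = 15.0           # hypothetical model inference time
NETWORK_ROUND_TRIP_MS = 80.0  # hypothetical round trip to a server

budget = frame_budget_ms(FPS)
on_device_ok = fits_in_budget(INFERENCE_MS, 0.0, FPS)
server_ok = fits_in_budget(INFERENCE_MS, NETWORK_ROUND_TRIP_MS, FPS)

print(f"Per-frame budget at {FPS:.0f} fps: {budget:.1f} ms")
print(f"On-device fits: {on_device_ok}, server fits: {server_ok}")
```

Under these assumed numbers the budget is about 33 ms per frame, so the same model fits when it runs on the device but not once a network round trip is added.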


Privacy

  • With on-device ML, data processing happens on your user’s device. This means that you can also apply machine learning to sensitive data that you would not want to leave the device.
  • If you apply end-to-end encryption to data such as messages, on-device machine learning still makes it possible to provide smart, data-powered features.

Works offline

  • Running ML models on-device means that you can build ML-powered features that don’t rely on network connectivity.
  • Because all input data is processed on the device, you also reduce the usage of your user’s mobile data plan.

No cost

  • On-device inference uses the processing power of the device, so you don’t need to maintain additional servers or pay for cloud compute cycles.

Some limitations of on-device ML

On-device machine learning does have some limitations. Trained models can become rather large. Although this is not a problem when they run on a server, it can be limiting when they run on a client, because mobile devices are more restricted in storage, memory, compute resources, and power consumption.
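
As a rough illustration of how quickly size grows, a model's weight storage can be estimated as the number of parameters times the bytes used per parameter. The parameter counts below and the default of 4 bytes per parameter (32-bit floats) are assumptions chosen for illustration, and `model_size_mb` is a hypothetical helper. Real model files also carry metadata and may store weights in other formats.

```python
def model_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    if num_params < 0 or bytes_per_param <= 0:
        raise ValueError("num_params must be >= 0 and bytes_per_param > 0")
    return num_params * bytes_per_param / 1_000_000


# Assumed, illustrative parameter counts.
small_model_mb = model_size_mb(4_000_000)      # a few million parameters
large_model_mb = model_size_mb(1_000_000_000)  # a billion parameters

print(f"Small model: ~{small_model_mb:.0f} MB")
print(f"Large model: ~{large_model_mb:.0f} MB")
```

With these assumptions, the smaller model needs about 16 MB for its weights, while the larger one needs about 4,000 MB, which is more than many apps can reasonably download or hold in memory.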

As a result, on-device models need to be much smaller than their server counterparts, which typically means that they are less powerful. Your use case will determine whether on-device models are a good fit.