BlazeFace hello world

BlazeFace is a neural network model that detects faces in images. It’s designed to be fast: to run at 30fps on mobile GPUs. There is a TensorFlow.js library for BlazeFace, which downloads the model, runs it in WebGL using TensorFlow.js, and wraps the raw model input/output with a friendly, semantic API. This page’s demo captures and displays your webcam, runs BlazeFace against frames as often as possible, and draws the detected face landmarks on top of the webcam stream.

Here’s what I get when I run it against my own face:

Basic usage of this API comes down to one function call, a pure function from input image to predicted faces:

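<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs@2.4"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/blazeface@0.0.5"></script>
<script>
  (async () => {
    // Assumes webcamVideoEl is a <video> element already playing the webcam stream
    const model = await blazeface.load();
    const predictions = await model.estimateFaces(webcamVideoEl, false); // false: return plain arrays, not tensors
    console.log(predictions);
  })();
</script>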

This will log an array looking something like:

[
  { // One detected face
    topLeft: [162.84341430664062,153.98446655273438],  // [x,y] coordinates
    bottomRight: [422.3966979980469,348.6485900878906],
    landmarks:[
      [238.96787643432617,204.8737621307373], // right eye
      [328.2145833969116,205.4714870452881],  // left eye
      [273.6716037988663,252.84512042999268], // nose
      [280.01041051000357,293.8540989160538], // mouth
      [206.6525173187256,226.03596210479736], // right ear
      [386.57989501953125,226.02698922157288] // left ear
    ],
    probability: [0.9997807145118713]  // Always a one-element array; a bit odd
  }
]
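
Given predictions of this shape, drawing them over the webcam stream takes only a few canvas calls. The following is a minimal sketch, not the demo's actual code: the function name `drawFaces` is made up for illustration, and it assumes `ctx` is a 2D canvas context whose canvas has the same pixel dimensions as the video frame passed to `estimateFaces`.

```javascript
// Sketch: draw each face's bounding box and landmarks onto a 2D canvas context.
// `drawFaces` is a hypothetical helper; `ctx` is assumed to be a
// CanvasRenderingContext2D sized to match the analyzed video frame.
function drawFaces(ctx, predictions) {
  for (const face of predictions) {
    const [x1, y1] = face.topLeft;
    const [x2, y2] = face.bottomRight;
    // Outline the bounding box reported by BlazeFace
    ctx.strokeRect(x1, y1, x2 - x1, y2 - y1);
    // Mark each of the landmarks with a small filled circle
    for (const [x, y] of face.landmarks) {
      ctx.beginPath();
      ctx.arc(x, y, 3, 0, 2 * Math.PI);
      ctx.fill();
    }
  }
}
```

In a browser this could be called once per frame, for example as `drawFaces(overlayCanvasEl.getContext("2d"), predictions)`, where `overlayCanvasEl` is an assumed canvas element positioned over the video.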

The topLeft and bottomRight coordinates define a “bounding box”, but I don’t know exactly what it’s supposed to bound. Certainly, it seems to always contain the six detected landmarks, but it is not a tight fit around them. My eventual goal is to draw a boundary around the head; the default bounding box is not necessarily helpful for this.
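
One way to probe the relationship between the box and the landmarks is to compute the tightest box around the landmarks and compare it with the reported one. This is a rough sketch; the helper names `landmarkBounds` and `boxContainsLandmarks` are invented here, and the sample face uses the values from the example output above, rounded to two decimals.

```javascript
// Sketch: compare BlazeFace's reported box with the tight box around the landmarks.
// Both helper names are hypothetical; they operate on one prediction object.
function landmarkBounds(face) {
  const xs = face.landmarks.map(([x]) => x);
  const ys = face.landmarks.map(([, y]) => y);
  return {
    topLeft: [Math.min(...xs), Math.min(...ys)],
    bottomRight: [Math.max(...xs), Math.max(...ys)],
  };
}

function boxContainsLandmarks(face) {
  const [x1, y1] = face.topLeft;
  const [x2, y2] = face.bottomRight;
  return face.landmarks.every(([x, y]) => x >= x1 && x <= x2 && y >= y1 && y <= y2);
}

// Sample face: the example output above, rounded to two decimals
const sampleFace = {
  topLeft: [162.84, 153.98],
  bottomRight: [422.4, 348.65],
  landmarks: [
    [238.97, 204.87], [328.21, 205.47], [273.67, 252.85],
    [280.01, 293.85], [206.65, 226.04], [386.58, 226.03],
  ],
  probability: [0.9998],
};

console.log(boxContainsLandmarks(sampleFace)); // true
console.log(landmarkBounds(sampleFace));
// topLeft [206.65, 204.87], bottomRight [386.58, 293.85]
```

For this one sample the reported box extends well beyond the landmarks on every side, but neither box is a head outline.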

Under certain conditions, BlazeFace consistently recognized a face in my forehead, and was extremely confident about it:

Probably one of my weirder debugging sessions pic.twitter.com/AubMIkM1kI

— Jim Fisher (@MrJamesFisher) September 20, 2020

The bug seems to only happen when I use my high-resolution webcam feed. BlazeFace performs much more reliably with a lower-resolution webcam feed. This is very strange, because I believe the library resizes the input to 128×128 pixels before analyzing it. I’ll do a future post on the internals of this library, and how to use TensorFlow.js. That should help explain the weird forehead bug.
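
One way to experiment with the resolution effect, without changing the webcam settings, is to downscale each frame onto a canvas before handing it to BlazeFace, then map the results back to the original coordinates. This is a sketch under assumptions, not a confirmed fix for the forehead bug: the helpers `fitWithin`, `rescaleFace` and `estimateFacesDownscaled` are invented for illustration, and the 640-pixel default is an arbitrary choice. It relies on `estimateFaces` accepting a canvas element as input.

```javascript
// Sketch: run BlazeFace on a downscaled copy of the video frame.
// All three helper names are hypothetical.

// Largest size with the same aspect ratio that fits in maxSide x maxSide (never upscales)
function fitWithin(width, height, maxSide) {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Map one prediction from downscaled coordinates back to original coordinates
function rescaleFace(face, sx, sy) {
  const up = ([x, y]) => [x * sx, y * sy];
  return {
    ...face,
    topLeft: up(face.topLeft),
    bottomRight: up(face.bottomRight),
    landmarks: face.landmarks.map(up),
  };
}

// Browser-only: draws the current video frame to a small canvas and analyzes that
async function estimateFacesDownscaled(model, videoEl, maxSide = 640) {
  const { width, height } = fitWithin(videoEl.videoWidth, videoEl.videoHeight, maxSide);
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  canvas.getContext("2d").drawImage(videoEl, 0, 0, width, height);
  const faces = await model.estimateFaces(canvas, false);
  // Scale factors from the small canvas back up to the full video frame
  const sx = videoEl.videoWidth / width;
  const sy = videoEl.videoHeight / height;
  return faces.map((face) => rescaleFace(face, sx, sy));
}
```

Calling `estimateFacesDownscaled(model, webcamVideoEl)` in place of `model.estimateFaces(webcamVideoEl, false)` would make it possible to compare the two behaviours on the same high-resolution feed.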

Tagged #programming, #web, #tensorflow, #machine-learning.


This page copyright James Fisher 2020. Content is not associated with my employer.