Yoti facial age estimation – newest model evaluation by NIST

Erlend Davidson 7 min read
An image of a person holding their phone up and performing a facial age estimation.

We are delighted to share our latest evaluation by the National Institute of Standards and Technology (NIST) for our newest facial age estimation model. We have seen notable performance improvements across a number of metrics. 

NIST’s evaluation is extremely thorough: it draws on over 20 million images, and NIST refines its testing methodology over time. This helps to highlight models that are robust across multiple datasets and scenarios.

Our strategy for developing our model is not “data dependent”. Machine learning models can benefit greatly from large quantities of data at the initial stage, but once a model reaches maturity, adding more and more data brings sharply diminishing returns. We do not use data from our checks (for example, our facial age estimation images), nor do we scrape the web for images.

We do continue to add consented training data, but that is not the primary source of improvements we see over time. Instead, we have built our model to learn various characteristics of an image based on specific outcomes. This means we can scale our model more effectively, react to moving threat vectors, improve efficacy and work to reduce bias.

For example, we can train our model not to rely on misleading markers:

  • A wrinkled forehead could be a frown rather than a result of the ageing process.
  • Glasses may be more prevalent in older age groups but are not themselves indicative of age, so the model should take less notice of them.
  • A beard would generally indicate that a male is over about 18 years of age, but could also be an easy presentation attack.

Balancing these differing signals is a challenge. Our strategy from the start has been to build a model that performs best against the real-world requirements of businesses and regulators. This latest NIST evaluation neatly demonstrates this.

 

Key takeaways from our latest NIST evaluation

We have seen a number of improvements from our previously submitted model – the key takeaways are:

Mean Absolute Error (MAE) has improved from 3.102 to 2.615 on the NIST mugshot dataset

NIST uses multiple datasets, each of which has different characteristics. In particular, they vary in image quality:

NIST image sizes used in the evaluation of Yoti's Facial Age Estimation

This has a statistically significant effect on model performance. Note that the mugshot dataset is only tagged by year of age (not by month or day of birth).

  • We consider the mugshot dataset the closest to our real-world use case due to its higher image quality. The mobile images we typically use are of even higher quality, at 720 x 800 pixels.
  • This result moves us from 12th to 3rd place among companies ranked by MAE.
  • The gap to first place is now just 0.2 years of MAE, a very small margin that equates to just over two months.
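
For readers less familiar with the metric, here is a minimal Python sketch of how a mean absolute error is calculated and how a gap of 0.2 years converts to months. The function names and the sample ages are made up for illustration; this is not NIST's evaluation code or Yoti's.

```python
def mean_absolute_error(true_ages, estimated_ages):
    """Average absolute gap, in years, between estimated and true ages."""
    if len(true_ages) != len(estimated_ages):
        raise ValueError("inputs must have the same length")
    if not true_ages:
        raise ValueError("inputs must not be empty")
    total = sum(abs(est - true) for true, est in zip(true_ages, estimated_ages))
    return total / len(true_ages)


def years_to_months(years):
    """Convert a gap expressed in years into months."""
    return years * 12


# Made-up ages for illustration only; not NIST or Yoti data.
true_ages = [18, 25, 31, 47]
estimated_ages = [20, 24, 34, 43]

print(mean_absolute_error(true_ages, estimated_ages))  # (2 + 1 + 3 + 4) / 4 = 2.5
print(round(years_to_months(0.2), 1))  # 2.4
```

An MAE of 2.5 here means the estimates are, on average, two and a half years away from the true age, in either direction.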

 

Yoti facial age estimation improvement over time

Yoti has submitted four models for NIST evaluation. The table below shows how our MAE has improved over time across all datasets, and for the mugshot dataset specifically from 3.78 to 2.615. The table shows the MAE improvement for the 18-30 age group, stratified by age and gender.

A table showing the mean absolute errors of Yoti's four facial age estimation models submitted to NIST.

How did we do this?

Not the easy way. The easy way to reduce MAE over an entire dataset would be to reduce the average error for older demographics, where it is much higher. But accuracy for older age groups isn’t critical with respect to current and impending legislation.

Our approach to improving our overall MAE was to reduce the error for the 18-30 age group, without materially sacrificing accuracy for older and younger people.
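
To make the difference between overall and per-group MAE concrete, here is a small Python sketch that breaks the error down by age band. The 18-30 and 39+ bands come from this article; the other band boundaries, the function names and the sample data are assumptions for illustration only.

```python
from collections import defaultdict

# Illustrative age bands. The 18-30 and 39+ groups are mentioned in the text;
# the remaining boundaries are assumptions made for this sketch.
AGE_BANDS = [("under 18", 0, 17), ("18-30", 18, 30), ("31-38", 31, 38), ("39+", 39, 200)]


def band_for(age):
    """Return the label of the band that contains a true age."""
    for label, low, high in AGE_BANDS:
        if low <= age <= high:
            return label
    raise ValueError(f"age {age} is outside every band")


def mae_by_band(pairs):
    """Map each band label to the MAE of the (true_age, estimated_age) pairs in it."""
    errors = defaultdict(list)
    for true_age, estimated_age in pairs:
        errors[band_for(true_age)].append(abs(estimated_age - true_age))
    return {label: sum(errs) / len(errs) for label, errs in errors.items()}


def overall_mae(pairs):
    """MAE over every pair, ignoring bands."""
    return sum(abs(est - true) for true, est in pairs) / len(pairs)


# Made-up (true_age, estimated_age) pairs; not NIST or Yoti data.
pairs = [(16, 18), (22, 23), (27, 26), (35, 38), (50, 56), (64, 58)]

print(overall_mae(pairs))
print(mae_by_band(pairs))
```

In this made-up sample the 39+ band accounts for 12 of the 19 years of total absolute error, so trimming error there would lower the overall MAE fastest while doing nothing for the 18-30 band.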

The 18-30 age group is the focus of many new and existing legislative developments, and accuracy here is what our clients need in order to comply. As you can see below, results for ages 39+ are very close between the two models, while younger ages show significant improvement.

An image showing the mean absolute error of Yoti's previous 003 model compared to its current 004 model.

Bias across gender and skin tone

NIST uses geographical regions as a proxy for skin tone, whilst acknowledging this is not a perfect solution.

Below you can see Yoti’s performance across gender and region, showing where the MAE is highest and lowest. The closer to zero (the smaller the bar), the better; consistently small bars across the chart are desirable.

A chart showing how Yoti's current model performs, by year of age, according to demographic group.

As you can see, we underperform for females aged 14 to 16. This is why independent testing is critical: it shows us where we need to improve our model for specific demographics.

Here, as an example, is a model from another vendor that is also among the top 5:

A chart showing how another anonymous vendor's model performs, by year of age, according to demographic group.

Minimising bias is not just one of our principles, but is something regulators and businesses demand. We have always publicly released our own accuracy testing across age, gender and skin tone. Our goal is to minimise bias for everyone: we recognise the demographics where we need to improve, and focus resources on resolving those issues.
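
The kind of per-group comparison described in this section can be sketched in a few lines of Python. The record layout, the placeholder region labels and the "spread" measure (worst group MAE minus best group MAE) are assumptions chosen for illustration; they are not NIST's methodology.

```python
from collections import defaultdict


def mae_by_group(records):
    """Map (gender, region) to the MAE of the records in that group.

    Each record is a (gender, region, true_age, estimated_age) tuple.
    """
    errors = defaultdict(list)
    for gender, region, true_age, estimated_age in records:
        errors[(gender, region)].append(abs(estimated_age - true_age))
    return {group: sum(errs) / len(errs) for group, errs in errors.items()}


def mae_spread(group_mae):
    """Gap between the worst and best group MAE; smaller means more even performance."""
    return max(group_mae.values()) - min(group_mae.values())


# Made-up records with placeholder region labels; not NIST or Yoti data.
records = [
    ("female", "region A", 15, 18),
    ("female", "region A", 16, 19),
    ("male", "region A", 15, 16),
    ("male", "region A", 16, 15),
    ("female", "region B", 25, 27),
    ("male", "region B", 25, 24),
]

group_mae = mae_by_group(records)
for group, value in sorted(group_mae.items()):
    print(group, value)
print("spread:", mae_spread(group_mae))  # 3.0 - 1.0 = 2.0
```

A single overall MAE would hide the fact that one group in this sample is three times worse than another, which is why a per-group breakdown is useful.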

 

Most robust model

Our models are relatively robust (invariant) to facial expressions. This is an early experimental test with NIST, but an important and indicative one. It involves adding and removing glasses and changing facial expression (smiling, frowning, talking and keeping a neutral face), and it measures the difference in age estimation across those scenarios.

The charts below show how the age estimate varies over a video of a person changing their facial expressions, then putting on glasses and repeating the same expressions. The primary goal for a robust model is to stay close to the blue line (the actual age of the individual). A secondary goal is to have as little variance as possible across the test.

Here, Yoti comes top with the lowest noise. Yoti’s latest model (yoti-004) is bottom left. 

A series of charts showing how each vendor's model varies when estimating the age of one person who is changing their expression.
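
The two goals described above, staying close to the true age and varying as little as possible, can each be summarised with one number. The Python sketch below is an assumption about how one might do that; the article does not state which noise measure NIST uses, so the standard deviation and the sample estimates here are purely illustrative.

```python
from statistics import mean, pstdev


def robustness_summary(true_age, frame_estimates):
    """Summarise per-frame age estimates for one person of known age.

    Returns (offset, spread):
      offset - mean estimate minus true age (closer to zero is better)
      spread - population standard deviation of the estimates (lower is steadier)
    """
    if not frame_estimates:
        raise ValueError("need at least one estimate")
    offset = mean(frame_estimates) - true_age
    spread = pstdev(frame_estimates)
    return offset, spread


# Made-up estimates across expressions and glasses; not NIST or Yoti data.
estimates = [29.0, 31.0, 30.0, 32.0, 28.0]
offset, spread = robustness_summary(30, estimates)
print(offset)            # 0.0
print(round(spread, 3))  # 1.414
```

A model can have an offset near zero and still be noisy, so the two numbers are worth reading together.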

Summary 

Balancing model performance across all of these metrics is a challenge, but an important one: for online safety, for businesses that need to meet their obligations, and for regulators who need to be satisfied that high assurance age checks are being performed well and in a balanced way.

At Yoti, it is one of our principles that we build technology that works for all – we strive to build technology that performs fairly. That has led us to build a model from the start in a way that doesn’t just rely on increasing amounts of data to improve. All of this data is publicly available on the NIST website. 

Combined with our world-leading liveness and SICAP, businesses can have confidence that they are using a world-class age check solution.

To learn more about facial age estimation, you can read our white paper or get in touch.

 

– Erlend, Head of Research & Development 
