Face recognition and comparison for onboarding
Published on: 2024-08-10 18:48:28
Face recognition has changed fast in recent years. Deep learning drove that shift. It made remote identity verification practical. Face comparison against identity documents, which used to be manual, is now moving online.
Humans are good at recognizing and comparing faces because the brain has areas built for that task. Beyond face recognition, the brain also uses other signals to identify people. These include clothes, gender, location, context, and how someone walks or moves.
Getting artificial intelligence to reach similar results in full-person recognition would require multiple methods, so here we focus on the narrower problem of face recognition. Running face recognition online requires a process for validating inputs, running models, and making decisions from the data collected.
Photos used for decision making in face recognition usually fall into these types:
- Simple user-uploaded photo,
- Selfie taken during the recognition process,
- Photo captured during the liveness detection process.
The person's photo is then matched against another source. That source may be a government service such as NCIIC in China or Dukcapil in Indonesia, or a photo from an identity document such as an ID card, driving license, or passport.
Data input validation
The main methods for validating the authenticity of digital photos are:
- Error level analysis (ELA),
- Metadata analysis,
- Last saved quality analysis.
Only two of these can usually be automated well without creating too many false positives. Error level analysis is limited because someone often still needs to inspect the result; it works more as a visual tool. It also misses some manipulations that are simple but effective: for example, a screenshot of a manipulated photo will often not be flagged by ELA.
Metadata analysis provides useful information, including the camera used, timestamps, the location of objects in the photo, and sometimes even geolocation. That helps when you need to confirm the photo was taken in the right place, such as a point of sale, was taken recently, or was not edited in Photoshop or other software. If metadata is stripped, treat that as a warning sign. If metadata is missing from every photo, check with your developers how the photos are captured, processed, and stored.
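As a minimal sketch of these checks, assume the EXIF tags have already been extracted into a dict (for example with a library such as Pillow or exifread). The function name and the seven-day recency cutoff are illustrative assumptions, not from any specific product:

```python
from datetime import datetime, timedelta

def metadata_warnings(exif: dict, max_age_days: int = 7) -> list:
    """Hypothetical warning-sign checks on already-extracted EXIF tags.

    Keys follow standard EXIF tag names ("Software", "DateTimeOriginal").
    """
    warnings = []
    if not exif:
        # Stripped metadata is itself a warning sign.
        return ["metadata missing or stripped"]
    software = exif.get("Software", "")
    if "photoshop" in software.lower():
        warnings.append(f"edited with: {software}")
    taken = exif.get("DateTimeOriginal")
    if taken:
        # EXIF timestamps use this fixed "YYYY:MM:DD HH:MM:SS" format.
        dt = datetime.strptime(taken, "%Y:%m:%d %H:%M:%S")
        if datetime.now() - dt > timedelta(days=max_age_days):
            warnings.append("photo is too old")
    return warnings
```

Any non-empty result would feed into the warning handling described above, such as asking the developers how photos are captured and stored.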
Last saved quality is tied to compression, because stored or edited photos are often compressed and no longer kept at the original quality produced by the device.
Many fake-photo risks can be reduced by using liveness detection in the process. That usually means a mobile app runs liveness detection algorithms and captures a photo during the flow, which is then sent for recognition. At that point, the likely attack vector is the API that uploads the photo to the server. To reduce that risk, use more than standard hashing and encryption. Additional hardening methods can make API endpoints harder to abuse with fake data.
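One such hardening method is signing each upload with a per-device secret, so the server can reject payloads that were not produced by the app. A minimal sketch using an HMAC over the photo bytes and a timestamp; the field names and shared-secret scheme here are assumptions for illustration, not a specific protocol:

```python
import hashlib
import hmac
import time

def sign_upload(photo_bytes: bytes, device_secret: bytes, ts=None) -> dict:
    """Build a signed upload payload; the server recomputes the same HMAC."""
    ts = int(time.time()) if ts is None else ts
    msg = str(ts).encode() + hashlib.sha256(photo_bytes).digest()
    sig = hmac.new(device_secret, msg, hashlib.sha256).hexdigest()
    return {"ts": ts, "sha256": hashlib.sha256(photo_bytes).hexdigest(), "sig": sig}

def verify_upload(photo_bytes: bytes, device_secret: bytes, payload: dict,
                  max_skew: int = 300, now=None) -> bool:
    """Reject stale or tampered uploads; compare_digest avoids timing leaks."""
    now = int(time.time()) if now is None else now
    if abs(now - payload["ts"]) > max_skew:
        return False  # replayed or delayed request
    msg = str(payload["ts"]).encode() + hashlib.sha256(photo_bytes).digest()
    expected = hmac.new(device_secret, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, payload["sig"])
```

Binding the signature to both the photo hash and a timestamp means a captured request cannot be replayed later with a different image.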
Running comparison
Once the incoming data is trustworthy, the next step is comparison. Government services often perform well. As for building your own model, that now makes little sense in most cases because deep learning models need large amounts of training data, and third-party services are already advanced and relatively cheap.
You can select a service on several dimensions: price, speed, and comparison quality. It is also worth checking performance across the racial profile of the people being compared. For example, Microsoft services often perform well on Caucasian faces, but can perform poorly on Asian faces. For Asian faces, I have seen strong results from Face++. On Caucasian faces, those same services can sometimes miss finer facial detail.
In most cases, I recommend using two services for face recognition. One handles analysis, and one handles comparison. Some teams run comparison only and skip analysis. That is a mistake. Analysis helps check what is being compared. Sometimes algorithms are clearly wrong, such as classifying someone as male when the image obviously shows a female.
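A sanity check of this kind can be sketched as a gate in front of the comparison score. The field names below (`face_count`, `gender`) are assumed response shapes, not the schema of any real vendor API:

```python
def analysis_consistent(doc_analysis: dict, selfie_analysis: dict) -> list:
    """Flag obviously wrong inputs before the comparison score is trusted."""
    problems = []
    # Only one person should appear in each photo.
    for name, a in (("document", doc_analysis), ("selfie", selfie_analysis)):
        if a.get("face_count", 0) != 1:
            problems.append(f"{name}: expected exactly one face")
    # The detected gender should match between the two photos.
    if doc_analysis.get("gender") != selfie_analysis.get("gender"):
        problems.append("gender mismatch between photos")
    return problems
```

If this returns any problems, the comparison score from the second service should not be trusted on its own, no matter how high it is.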
Final decisioning
A practical process for face recognition decision making looks like this:
- Data validation - incoming data can be trusted
- Outlier or strange-result check - use analysis results for trouble detection
- Final decision - compare confidence results
Recommended detection rules for incoming data are as follows:
- No Photoshop or other software
- Camera make matches the phone model, cross-checked against other metadata such as the browser user agent
- Geolocation matches the expected location
- Photo is not too old
- Image metadata is present
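The geolocation rule above needs a concrete notion of "matches": one option is a great-circle distance check against the expected location, such as a point of sale. A minimal sketch using the haversine formula; the 5 km radius is an illustrative choice, not a recommendation from any standard:

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def geolocation_ok(photo_latlon, expected_latlon, max_km=5.0):
    """True when the photo's EXIF geolocation lies within the allowed radius."""
    return haversine_km(*photo_latlon, *expected_latlon) <= max_km
```

A photo geotagged in another city than the expected point of sale would fail this rule and trigger the same warning handling as stripped metadata.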
Photo analysis
- Gender match
- Only one person detected in the photo
Comparison
- The comparison result is high, for example 99%
- The confidence threshold follows the vendor recommendation. Usually:
  - 80%+ high confidence that it is the same person
  - 60-80% some certainty
  - <60% not the same person
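Putting the thresholds above together, the final decision step can be sketched as a small band function. The function name and the `review` outcome for the middle band are my naming choices; the cutoffs are the ones listed above:

```python
def decide(confidence: float, warnings: list) -> str:
    """Map a vendor confidence score (0-100) plus earlier warnings to a decision."""
    if warnings:
        # Any validation or analysis warning forces a manual look,
        # regardless of the comparison score.
        return "review"
    if confidence >= 80:
        return "match"      # high confidence it is the same person
    if confidence >= 60:
        return "review"     # some certainty - escalate to a human
    return "no_match"       # likely not the same person
```

Keeping the warnings as an input makes the point of the post concrete: a 99% comparison score still ends in manual review when the input data could not be trusted.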
Final words…
The goal of this post is to explain the face recognition process in broad terms. It does not cover everything. At most, it should be the start of a policy or strategy for decision making in identity verification, not the end. Human identification is a multi-part problem. If you focus only on calling cognitive services and setting a cutoff, you can end up with poor decision making because you assumed the input data was better than it really was.