3DiVi News

πŸ₯ 3DiVi Experts on a Massive Facial Recognition Trial in NZ Stores

The New Zealand Office of the Privacy Commissioner (OPC) recently released the results of an inquiry into an experimental facial recognition technology (FRT) system used by Foodstuffs North Island (FSNI) across 25 supermarkets, scanning over 225.9 million faces.

The inquiry examined the trial’s privacy impact, assessed its compliance with the Privacy Act, and evaluated whether FRT was an effective tool in reducing serious retail crime compared to less privacy-intrusive alternatives.

To help interpret the findings and highlight the most noteworthy insights for facial recognition in retail, we asked 3DiVi Sales Manager Mikhaylo Pavlyuk to give his commentary on the report’s key points and implications.

Key Takeaways from the OPC’s Report

The report outlines key elements of FSNI’s operational model that allowed the company to claim compliance with the Privacy Act (para. 8 of the β€œFindings relating to the trial” section). Several features of FSNI’s operational approach, as applied during the trial, support this conclusion.

A clear and limited purpose

FSNI used facial recognition only to identify individuals previously involved in serious incidents at stores: physical or verbal abuse, threats, aggressive behavior, and major thefts. Other uses were strictly prohibited.
Mikhaylo Pavlyuk: β€œThis restriction on who can be added to the watchlist makes sense for several reasons. For one, law enforcement typically isn’t interested in minor incidents, and without police involvement, retailers have limited options to act.”

The system was effective in addressing that purpose

Independent evaluations and interviews with store employees suggested that FRT was generally an effective tool for reducing the number of serious repeat offenses during the trial period.
Mikhaylo Pavlyuk: β€œIt’s worth noting how β€˜effectiveness’ is quantified. In different countries and studies, a 10–20% reductionβ€”or moreβ€”is often considered the threshold for recognizing a method or program as effective.”

Fit for purpose technology

FSNI selected a system proven in real-world conditionsβ€”without staged photos. The technology was not trained on a New Zealand dataset, as no such dataset exists. However, it was trained on similar groups in Australiaβ€”including Maori and Pacific Islander populationsβ€”which helped minimize the risk of technical bias.
Mikhaylo Pavlyuk: β€œThis is a complex point, and the justification is somewhat stretched. Training on similar populations in Australia does not guarantee fairness in New Zealand. Proving this would require extensive testing and validation locally.”

No use of images for system training

The software provider was explicitly prohibited from using the collected images for training purposes. The Privacy Commissioner supports the idea of creating a New Zealand-specific training dataset, but only with explicit consent from individuals.
Mikhaylo Pavlyuk: β€œNo issues hereβ€”this is a strong and responsible approach.”

Immediate deletion of most images

Images that did not match the watchlistβ€”the vast majority of captured facesβ€”were deleted almost instantly.
Mikhaylo Pavlyuk: β€œIt’s good practice. The question is whether the biometric descriptors were deleted too, which affects auditability.”

Rapid deletion of images where no action taken

Matches with no follow-up were deleted by midnight of the same day.
Mikhaylo Pavlyuk: β€œThe term β€˜matches’ is used here, but the same question appliesβ€”what happens to the biometric template? Is it deleted as well?”
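
Taken together, the two deletion rules above amount to a simple retention policy: non-matching images are discarded on the spot, and matches nobody acted on survive only until end of day. A minimal sketch of that logic (the class and function names are hypothetical, not FSNI’s actual data model):

```python
from datetime import datetime

# Hypothetical record of a captured face; not FSNI's real schema.
class Capture:
    def __init__(self, image, matched, actioned=False, captured_at=None):
        self.image = image
        self.matched = matched        # True if the face matched a watchlist entry
        self.actioned = actioned      # True if staff followed up on the match
        self.captured_at = captured_at or datetime.now()

def handle_capture(capture, store):
    """First rule: images that don't match the watchlist are deleted immediately."""
    if not capture.matched:
        capture.image = None          # discard the image right away
        return
    store.append(capture)             # matches are kept for possible follow-up

def midnight_purge(store):
    """Second rule: matches with no follow-up are deleted by midnight."""
    return [c for c in store if c.actioned]
```

Whether the corresponding biometric templates are purged on the same schedule is exactly the open question raised above.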

Watchlists were generally of reasonable quality and carefully controlled

Each store maintained its own watchlist. Only trained staff could add individuals, strictly following criteria for serious offences. Adding children or young people under 18, the elderly, or persons with known mental health conditions was prohibited.
Mikhaylo Pavlyuk: β€œFrom a facial recognition system provider’s perspective, this kind of strict control over who can add entries to a watchlist is ideal, though it’s unusual for a business to adopt such rigorous discipline.”

Retention of watchlist information was limited

Primary offenders could remain on the watchlist for no more than 2 years, while accomplices were listed for a maximum of 3 months. This approach helped ensure that information stayed relevant and reduced long-term consequences for individuals.
Mikhaylo Pavlyuk: β€œThe mention of β€˜accomplices’ is confusing hereβ€”it seems to contradict point (a), which limited inclusion strictly to serious offenders.”
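
The two retention windows translate directly into a role-dependent expiry rule. A rough illustration, assuming the report’s two categories (the constant and function names are invented for this sketch):

```python
from datetime import date, timedelta

# Retention windows stated in the report: 2 years for primary
# offenders, a maximum of 3 months for accomplices.
RETENTION = {
    "primary": timedelta(days=365 * 2),
    "accomplice": timedelta(days=90),
}

def is_expired(role, added_on, today=None):
    """Return True once a watchlist entry has outlived its retention window."""
    today = today or date.today()
    return today - added_on > RETENTION[role]
```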

Watchlists are not shared between stores

All watchlists were store-specific and not shared across other FSNI locations. This ensured that individuals were not automatically barred from every store in the network, allowing them continued access to food and other essentials.
Mikhaylo Pavlyuk: β€œFrom a facial recognition vendor’s perspective, this store-by-store watchlist policy is ideal. But it’s surprising that the business agreed to such a model.”

Accuracy levels were acceptable, once adjusted in response to problems

Initially, matches triggered alerts at 90% confidence. After two identification errors, FSNI raised the threshold to 92.5% and improved image quality, verification procedures, and staff training. No further incidents occurred.
Mikhaylo Pavlyuk: β€œThis part is not entirely clear. The report uses terminology that seems closer to marketing language than technical precisionβ€”likely because the research was carried out by a marketing agency. The term β€˜confidence’ probably refers to a score, but to properly assess performance, we’d need details on false positives and false negatives at that threshold.”
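
If β€˜confidence’ does refer to a similarity score, the adjustment FSNI made is simply a raised alert threshold. A minimal, hypothetical sketch of that gating step:

```python
ALERT_THRESHOLD = 0.925  # raised from the initial 0.90 after two misidentifications

def should_alert(score, threshold=ALERT_THRESHOLD):
    """Trigger an alert only when the match score clears the threshold;
    the alert then goes to trained staff for manual review."""
    return score >= threshold
```

As noted above, the threshold alone says little: judging performance properly would require the false-positive and false-negative rates observed at that operating point.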

Alerts were checked by two trained staff

Before the system was launched, staff were informed that it was not infallible. An alert required confirmation from at least two cameras, after which it was reviewed by two trained employees, who then decided whether to intervene or contact the police.
Mikhaylo Pavlyuk: β€œManual verification is already a good practice. Having a double layer of manual checks is even better.”

Reasonable degree of transparency that the FRT trial was operating

Stores displayed large A1/A0 signs at entrances to inform customers that the trial was in operation, with additional signage inside. Information was also published on the FSNI website, and further details were available at the customer information desk upon request. Staff involved in the trial were trained to answer questions, while other employees were trained to direct inquiries appropriately.
Mikhaylo Pavlyuk: β€œThis is a very valuable measure. Unfortunately, not every organization implementing FRT is willing to invest in proper staff training.”

No apparent bias or discrimination in how discretion was exercised

Based on sample checks, the Privacy Commissioner’s compliance team found no apparent bias or discrimination in how watchlists were created, how alerts were verified, or how intervention decisions were made.
Mikhaylo Pavlyuk: β€œThis point is debatable. According to NIST reports, every algorithm shows some variation in accuracy across ethnic groups. The real question is the frequency and scale of such errors. The report does not provide methodology or supporting data here, which makes it difficult to validate the claim.”

Processes for requests and complaints

Individuals who believed they had been misidentified or wrongly added to a watchlist were able to file complaints. If an error was confirmed, their information was corrected or removed.
Mikhaylo Pavlyuk: β€œThis is a very important element. A solid feedback and correction protocol is half the success of any such system.”

Security processes in place to protect information

Only authorised personnel had access to the system and the secure room where equipment was located. The FRT system was not automatically linked to the store’s incident reporting platform; any information had to be transferred manually according to strict criteria. All access was logged and reviewed by the Loss Prevention Manager. FRT alerts could only be received on authorised devices operating within the in-store network.
Mikhaylo Pavlyuk: β€œHonestly, the argumentation here is not very strong. Let’s put it down to a lack of deep cybersecurity expertise.”

Good record-keeping about system operation

FSNI created records on key events: numbers of matches, alerts, outcomes of interventions (including reasons for action or inaction), customer reactions, and whether interventions prevented or escalated harmful behaviour. This record-keeping allowed FSNI to monitor the effectiveness of the system.
Mikhaylo Pavlyuk: β€œThis is a must-have function for any system of this kind, yet surprisingly many projects skip it. Ideally, there should also be a set of standard automatic performance metrics β€” hopefully FSNI had those in place.”
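
Record-keeping of this kind can be as simple as an append-only event log plus a roll-up into standard metrics. A toy sketch (the field names are illustrative, not FSNI’s schema):

```python
from collections import Counter

# Entirely hypothetical event log of FRT alerts and their outcomes.
events = [
    {"type": "alert", "outcome": "intervention", "escalated": False},
    {"type": "alert", "outcome": "no_action", "escalated": False},
    {"type": "alert", "outcome": "intervention", "escalated": True},
]

def summarise(events):
    """Roll the raw log up into simple aggregate performance metrics."""
    outcomes = Counter(e["outcome"] for e in events)
    escalations = sum(e["escalated"] for e in events)
    return {"alerts": len(events),
            "outcomes": dict(outcomes),
            "escalations": escalations}
```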

Stores have good security infrastructure and are committed to privacy measures

Stores were equipped with adequate CCTV coverage and dedicated rooms for FRT equipment. FSNI implemented policies and assigned personnel responsible for ensuring compliance with privacy requirements.
Mikhaylo Pavlyuk: β€œHaving CCTV alone does not necessarily indicate robust security infrastructure, and it certainly doesn’t guarantee strong attention to privacy. Still, we’ll leave that assessment to information security experts.”
Following the trial, FSNI completed a detailed Privacy Impact Assessment (PIA) with the OPC, identifying key risks and implementing mitigation processes.

Further Improvements Needed

While the model largely complies with the Privacy Act, the inquiry highlighted several areas that need attention before FSNI can commit to long-term use or expand FRT:

Update the match algorithm so an alert is triggered at a higher accuracy level

Right now, the system flags matches at 90%, but staff are trained not to intervene below 92.5%. This gap needs to be fixed technically, and it might even make sense to aim higherβ€”perhaps 94%.
Mikhaylo Pavlyuk: β€œJust keep in mind: raising the score threshold increases the risk of Type II errorsβ€”cases where a person is in the database but the system fails to flag them.”
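
The trade-off warned about here is easy to demonstrate: on a fixed set of scores, raising the threshold lowers the false-positive rate but raises the false-negative rate. A toy example with entirely made-up scores:

```python
# Synthetic match scores, for illustration only.
genuine = [0.96, 0.94, 0.93, 0.91, 0.89]    # person really is on the watchlist
impostor = [0.93, 0.91, 0.88, 0.85, 0.80]   # person is not on the watchlist

def error_rates(threshold):
    """False-positive and false-negative rates at a given alert threshold."""
    fpr = sum(s >= threshold for s in impostor) / len(impostor)
    fnr = sum(s < threshold for s in genuine) / len(genuine)
    return fpr, fnr

# At 0.90, two impostors trigger alerts but only one genuine match is missed;
# at 0.925, false alerts halve while missed matches double.
```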

Watchlist criteria should remain consistent with store practice during the trial that targeted genuinely harmful behaviour

FRT should target only serious offensesβ€”violence, aggression, or major thefts. It must not be used for minor incidents or β€œproblematic” individuals.
Mikhaylo Pavlyuk: β€œFrom my experience, most losses come from repeated minor thefts. This is something that should be discussed with the business and carefully documented. Individually small incidents can quickly add up to a significant total loss.”

Conclusion

Mikhaylo Pavlyuk: β€œOverall, excellent work has been done. As far as I know, it’s the first publicly available document of its kind, and it will be invaluable for retailers. My only notes: tighten up the facial recognition terminology and clarify some of the labels to avoid confusion. Small tweaks, but they’ll make a big difference in clarity.”
πŸ‘‰ Curious how biometrics can level up your business? Book your free consultation today.