Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
Face recognition systems have become ubiquitous across consumer electronics, access control, border management, and financial services. However, their widespread adoption has made them an attractive target for presentation attacks: attempts to bypass the system using photographs, videos, masks, or digital replays of a legitimate user’s face. ISO/IEC 29143:2022, as Part 6 of the biometric presentation attack detection series, specifically addresses the detection of presentation attacks targeting face recognition modalities.
The standard classifies face presentation attacks into several fundamental categories. Photo attacks involve printed or displayed still images. Video replay attacks use pre-recorded video of the genuine user displayed on a screen. Mask attacks employ 3D facial approximations made from materials like silicone, latex, or resin. Digital injection attacks bypass the physical camera entirely by injecting manipulated frames directly into the processing pipeline. Each category demands distinct countermeasure strategies.
Texture analysis leverages the fundamental differences between live skin and artificial materials. Live facial tissue exhibits characteristic micro-texture patterns including fine wrinkles, pores, and skin ridges that are difficult to replicate accurately. Techniques such as Local Binary Patterns (LBP), Binarized Statistical Image Features (BSIF), and Gray-Level Co-occurrence Matrix (GLCM) extract these textural signatures. The standard specifies performance baselines for texture-based methods across varying image resolutions and lighting conditions.
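To make the texture idea concrete, here is a minimal sketch of the basic 3x3 Local Binary Pattern descriptor in plain Python. The function name `lbp_histogram` and the toy images are illustrative assumptions, not part of the standard; a real system would typically use an optimized library implementation and feed the histogram to a trained classifier.

```python
def lbp_histogram(image):
    """Return a normalized 256-bin histogram of basic 3x3 LBP codes.

    `image` is a list of rows of grayscale intensities.
    Border pixels are skipped because they lack a full 8-neighbourhood.
    """
    # Neighbour offsets (row, col), clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy inputs (hypothetical): a featureless patch and a high-contrast checkerboard.
flat = [[10] * 5 for _ in range(5)]
checker = [[((r + c) % 2) * 255 for c in range(5)] for r in range(5)]
flat_hist = lbp_histogram(flat)
checker_hist = lbp_histogram(checker)
```

The featureless patch puts all of its mass in a single bin, while the checkerboard splits it across two codes. Real skin, prints, and screens produce richer distributions, and the difference between those distributions is what a texture-based detector learns.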
Motion analysis exploits the difference between natural facial movements and the rigid motion of a photograph or screen. Techniques include optical flow analysis to detect 3D facial structure from parallax, challenge-response protocols requiring specific facial actions such as blinking or head rotation, and physiological signal analysis that detects the subtle skin-color changes caused by blood flow through remote photoplethysmography (rPPG). ISO/IEC 29143:2022 provides detailed guidance on evaluating the statistical robustness of motion-based detection under various attack scenarios.
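As an illustration of the photoplethysmography idea, the sketch below scans a plausible heart-rate band for the strongest periodic component in a per-frame brightness trace. It assumes the caller has already averaged the green channel over a face region for each frame; the function name `dominant_pulse_hz`, the 0.7 to 4.0 Hz band, and the synthetic signal are assumptions made for this example, not values taken from the standard.

```python
import math


def dominant_pulse_hz(green_means, fps, lo_hz=0.7, hi_hz=4.0, step_hz=0.05):
    """Return (frequency, power) of the strongest component in the band.

    `green_means` holds one mean green-channel value per video frame.
    Returns (None, 0.0) when the trace has no variation at all.
    """
    mean = sum(green_means) / len(green_means)
    signal = [g - mean for g in green_means]  # remove the DC offset
    best_freq, best_power = None, 0.0
    steps = int(round((hi_hz - lo_hz) / step_hz))
    for k in range(steps + 1):
        freq = lo_hz + k * step_hz
        # Direct evaluation of one DFT bin at an arbitrary frequency.
        re = sum(s * math.cos(2 * math.pi * freq * i / fps)
                 for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * freq * i / fps)
                 for i, s in enumerate(signal))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq, best_power


# Synthetic 10 s trace at 30 fps with a 1.2 Hz (72 bpm) pulse component.
fps = 30
live_trace = [100.0 + 0.5 * math.sin(2 * math.pi * 1.2 * i / fps)
              for i in range(300)]
static_trace = [100.0] * 300  # e.g. a printed photo under constant light

live_freq, live_power = dominant_pulse_hz(live_trace, fps)
static_freq, static_power = dominant_pulse_hz(static_trace, fps)
```

A practical detector would compare the peak power against the noise floor rather than trust the peak location alone, since camera noise and lighting flicker also produce spectral peaks.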
| Detection Technique | Attack Types Detected | Strengths | Limitations |
|---|---|---|---|
| Texture (LBP / BSIF) | Photo, video replay | Low computational cost, works on single frames | Vulnerable to high-quality prints and masks |
| Motion (optical flow) | Photo, rigid mask | Robust to material quality variations | Requires multiple frames, sensitive to lighting |
| Challenge-response | Photo, video replay | Simple to implement and interpret | User experience friction, predictable patterns |
| PPG / remote photoplethysmography | Photo, video, mask | Detects physiological liveness signals | Requires good lighting, longer capture time |
| Multi-spectral imaging | Photo, mask | Material-discriminant beyond visible spectrum | Requires specialized NIR / SWIR camera hardware |
| Depth sensing (ToF / stereo) | Photo, video | Direct 3D structure measurement | Limited resolution at distance, sensor cost |
Multi-spectral imaging extends liveness detection beyond the visible spectrum. Live skin exhibits distinctive reflectance properties in near-infrared (NIR) and short-wave infrared (SWIR) bands that differ markedly from printed ink, display emissions, and silicone. The standard defines test protocols for evaluating spectral discriminability across different materials and skin tones. Additionally, depth sensing using time-of-flight (ToF) cameras or structured light projectors provides direct 3D geometry measurements that can distinguish a live face from a planar photograph or curved mask surface.
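The planar-versus-3D distinction can be sketched as a plane fit over depth samples: a flat photograph or screen leaves almost no residual, while a surface with relief does. The helper names `det3` and `plane_fit_rms`, the units, and the toy surfaces below are assumptions for illustration; any decision threshold would have to be calibrated for the actual sensor.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_fit_rms(points):
    """RMS residual of (x, y, z) points to the least-squares plane z = a*x + b*y + c."""
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem.
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    coeffs = []
    for col in range(3):  # Cramer's rule: swap one column for the right-hand side
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    sq = sum((z - (a * x + b * y + c)) ** 2 for x, y, z in points)
    return math.sqrt(sq / n)


# Toy depth samples (hypothetical units): a tilted flat surface and a shallow dome.
grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
flat_photo = [(x, y, 0.5 + 0.01 * x - 0.02 * y) for x, y in grid]
dome = [(x, y, 0.5 - 0.002 * (x * x + y * y)) for x, y in grid]
flat_rms = plane_fit_rms(flat_photo)
dome_rms = plane_fit_rms(dome)
```

Note that the tilted flat surface still fits a plane almost perfectly, so holding a photo at an angle does not help the attacker under this check.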
Deploying a production-grade face PAD system requires careful consideration of computational constraints, environmental variability, and adversarial adaptation. The standard emphasizes that PAD performance must be validated across diverse demographic groups, lighting conditions, and camera hardware configurations to avoid biased or brittle implementations.
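One simple way to look for biased or brittle behaviour is to break error rates down by evaluation group. The sketch below is an assumption about how such a report could be tabulated, not a procedure from the standard; the record layout, the group labels, and the function name `per_group_error_rates` are all hypothetical.

```python
from collections import defaultdict


def per_group_error_rates(records):
    """Tabulate error rates per group.

    Each record is (group, is_attack, flagged_as_attack).
    Returns {group: {"attack_accept_rate": ..., "bona_fide_reject_rate": ...}},
    with None where a group has no samples of that kind.
    """
    counts = defaultdict(lambda: {"attacks": 0, "attacks_accepted": 0,
                                  "bona_fide": 0, "bona_fide_rejected": 0})
    for group, is_attack, flagged in records:
        c = counts[group]
        if is_attack:
            c["attacks"] += 1
            c["attacks_accepted"] += int(not flagged)  # missed attack
        else:
            c["bona_fide"] += 1
            c["bona_fide_rejected"] += int(flagged)  # false alarm
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "attack_accept_rate": (c["attacks_accepted"] / c["attacks"]
                                   if c["attacks"] else None),
            "bona_fide_reject_rate": (c["bona_fide_rejected"] / c["bona_fide"]
                                      if c["bona_fide"] else None),
        }
    return rates


# Hypothetical evaluation log grouped by lighting condition.
records = [
    ("indoor", True, True), ("indoor", True, False),
    ("indoor", False, False), ("indoor", False, False),
    ("outdoor", True, True),
    ("outdoor", False, True), ("outdoor", False, False),
]
rates = per_group_error_rates(records)
```

Large gaps between groups in either rate are the signal to investigate; the same tabulation works for demographic groups or camera models.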
From an implementation perspective, the standard recommends three architectural patterns. The first is a cascade, in which computationally inexpensive texture analysis filters out obvious attacks early and passes only borderline cases to more expensive motion or depth analysis. The second is ensemble scoring, which fuses decisions from multiple independent detectors, each weighted by its modality-specific confidence. The third is an adversarial retraining pipeline that periodically updates detection models with newly collected attack samples from deployed systems.
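The cascade and ensemble patterns can be combined in a few lines. The sketch below is one possible arrangement under stated assumptions: scores are liveness scores in [0, 1], the cheap texture stage exits early only on clear attacks, and the thresholds and function names (`fuse_scores`, `cascade_decision`) are hypothetical values chosen for the example.

```python
def fuse_scores(scores, confidences):
    """Confidence-weighted mean of liveness scores in [0, 1]."""
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one detector must have positive confidence")
    return sum(s * w for s, w in zip(scores, confidences)) / total


def cascade_decision(texture_score, expensive_detectors,
                     reject_below=0.2, fused_threshold=0.5):
    """Return (label, score), running expensive detectors only when needed.

    `expensive_detectors` is a list of zero-argument callables that each
    return (liveness_score, confidence) when invoked.
    """
    # Stage 1: the cheap texture score rejects obvious attacks immediately.
    if texture_score < reject_below:
        return "attack", texture_score
    # Stage 2: run the costly detectors and fuse all evidence.
    results = [detector() for detector in expensive_detectors]
    scores = [texture_score] + [s for s, _ in results]
    weights = [1.0] + [w for _, w in results]  # texture weight fixed at 1.0 here
    fused = fuse_scores(scores, weights)
    label = "bona fide" if fused >= fused_threshold else "attack"
    return label, fused


calls = []


def motion_detector():
    """Stand-in for an expensive motion analysis stage (hypothetical output)."""
    calls.append("motion")
    return 0.8, 2.0


early = cascade_decision(0.1, [motion_detector])
calls_after_early = list(calls)
late = cascade_decision(0.6, [motion_detector])
```

In the first call the motion stage never runs, which is where the cascade saves compute. In the second call the borderline texture score is combined with the more confident motion score before the decision is made.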
Latency budgets for face PAD in real-time access control typically range from 200 ms to 500 ms for the complete capture-and-decision pipeline. The standard provides guidance for measuring and reporting per-component latency contributions to facilitate system-level optimization.
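Per-component latency can be collected with a small timing helper. The class below is a minimal sketch using only the Python standard library; the class name `LatencyBudget`, the stage names, and the 500 ms budget are assumptions for illustration, and the `time.sleep` calls stand in for real pipeline work.

```python
import time
from contextlib import contextmanager


class LatencyBudget:
    """Accumulates wall-clock time per pipeline stage against a total budget."""

    def __init__(self, budget_ms):
        self.budget_ms = budget_ms
        self.stages = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            # Repeated stages (e.g. per-frame work) add up under one name.
            self.stages[name] = self.stages.get(name, 0.0) + elapsed_ms

    def total_ms(self):
        return sum(self.stages.values())

    def within_budget(self):
        return self.total_ms() <= self.budget_ms

    def shares(self):
        """Fraction of the measured total spent in each stage."""
        total = self.total_ms()
        return {name: ms / total for name, ms in self.stages.items()} if total else {}


budget = LatencyBudget(budget_ms=500)
with budget.stage("capture"):
    time.sleep(0.01)   # placeholder for camera capture
with budget.stage("texture"):
    time.sleep(0.002)  # placeholder for texture analysis
stage_shares = budget.shares()
```

Reporting the per-stage shares alongside the total makes it clear which component to optimize first when the pipeline exceeds its budget.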