computer vision · biometric auth

FRS VERSION 3.0

Face Recognition System

A production-grade face authentication module combining MTCNN detection, FaceNet embedding, anti-spoofing CNN, and gaze tracking into a single composable Python class.

512

Embedding dims

3

Guard layers

0.75

Cosine threshold

<1s

Auth latency

👤

liveness: PASS ✓

gaze: FORWARD ✓

similarity: 0.847

authentication flow

Recognition Pipeline

face_recognize_from_image(frame, threshold=0.75)

00 · input · 🖼️ BGR Frame (cv2 ndarray)
→ 01 · detect · 🔲 MTCNN (P→R→O net)
→ 02 · gaze · 👁️ Gaze Check (iris x ∈ [.25, .75])
→ 03 · liveness · 🛡️ Anti-Spoof (CNN ensemble)
→ 04 · encode · 🧠 FaceNet (512-d embed)
→ 05 · match · 📐 Cosine Sim (top-K score)
→ 06 · out · ✅ name | None (str or None)
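The stage order above can be read as a chain of early-exit checks. The following is a minimal sketch of that control flow only, not the module's actual code: the `Stages` container, its field names, and the `recognize` function are hypothetical stand-ins for the MTCNN, gaze, liveness, encoder, and matcher components.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Stages:
    """Hypothetical bundle of stage callables; the names are illustrative only."""
    detect: Callable[[Any], Optional[Any]]        # frame -> face crop, or None
    looks_forward: Callable[[Any], bool]          # frame -> gaze check result
    is_live: Callable[[Any], bool]                # frame -> liveness result
    encode: Callable[[Any], Any]                  # face crop -> embedding
    match: Callable[[Any, float], Optional[str]]  # embedding, threshold -> name


def recognize(frame: Any, stages: Stages, threshold: float = 0.75) -> Optional[str]:
    """Run the stages in pipeline order and stop at the first failed guard."""
    face = stages.detect(frame)
    if face is None:
        return None                                # 01: no face found
    if not stages.looks_forward(frame):
        return None                                # 02: gaze guard failed
    if not stages.is_live(frame):
        return None                                # 03: liveness guard failed
    embedding = stages.encode(face)                # 04: encode
    return stages.match(embedding, threshold)      # 05: match, 06: name or None
```

Passing the stages in as callables keeps the sketch independent of any model library; in the real module each one would wrap the corresponding class.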

security architecture

Three Guard Layers

👁️ Gaze Detection

MediaPipe Iris Tracking

Class: GazeDetector

Uses FaceMesh with refine_landmarks=True to extract 478 facial landmarks including iris keypoints.

Left iris: landmarks 474–477 · Right iris: 469–472

Computes the average iris center X for each eye. If either value falls outside the normalized range [0.25, 0.75], the user is judged not to be looking forward and authentication is denied immediately. This blocks attacks that use a photo held at an angle.
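The range check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the `GazeDetector` implementation: it assumes `xs` is a list of normalized x-coordinates indexed by landmark id (how they are read out of FaceMesh is not shown), and `iris_x_in_range` is a hypothetical helper name.

```python
from typing import Sequence

# Iris contour landmark indices as stated in the text above.
LEFT_IRIS = (474, 475, 476, 477)
RIGHT_IRIS = (469, 470, 471, 472)


def iris_center_x(xs: Sequence[float], indices: Sequence[int]) -> float:
    """Average the normalized x-coordinates of one iris's landmarks."""
    return sum(xs[i] for i in indices) / len(indices)


def iris_x_in_range(xs: Sequence[float], lo: float = 0.25, hi: float = 0.75) -> bool:
    """Return True only if both iris centers fall inside [lo, hi]."""
    centers = (iris_center_x(xs, LEFT_IRIS), iris_center_x(xs, RIGHT_IRIS))
    return all(lo <= c <= hi for c in centers)
```

A frame where either eye's center drifts past 0.75 or below 0.25 fails the check, which matches the "either eye" rule in the description.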

🛡️ Liveness Detection

Silent-Face CNN Ensemble

Class: predict_liveness

Loads all models from resources/anti_spoof_models/. For each, parses h_input, w_input, scale from filename via parse_model_name().

Crops face at each scale, runs inference, accumulates scores. Final label = argmax of summed predictions.

Label 1 = Real ✓ · Label 0 = Spoof ✗. Guards against photo, screen replay, and 3D masks.
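The filename parsing and score accumulation can be sketched as follows. This is a hedged illustration, not the project's `parse_model_name()` or its inference loop: the filename layout `<scale>_<h>x<w>_<type>.pth` is an assumption, the per-model prediction vectors are supplied by the caller rather than produced by a CNN, and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def parse_model_name_sketch(filename: str) -> Tuple[int, int, float]:
    """Parse an assumed '<scale>_<h>x<w>_<type>.pth' name into (h_input, w_input, scale)."""
    parts = filename.rsplit(".", 1)[0].split("_")  # drop extension, split fields
    h_input, w_input = parts[-2].split("x")        # e.g. '80x80'
    return int(h_input), int(w_input), float(parts[0])


def ensemble_label(predictions: Sequence[Sequence[float]]) -> int:
    """Sum the per-model class scores and return the index of the largest total."""
    totals: List[float] = [sum(column) for column in zip(*predictions)]
    return max(range(len(totals)), key=totals.__getitem__)


def is_real(predictions: Sequence[Sequence[float]]) -> bool:
    """Label 1 means a real face; any other label is treated as a spoof."""
    return ensemble_label(predictions) == 1
```

Summing before taking the argmax lets a confident model outweigh an uncertain one, which a simple majority vote would not do.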

🧠 Face Embedding

FaceNet / InceptionResnetV1

Class: FaceRecognition

MTCNN(keep_all=False) detects primary face and returns aligned tensor.

InceptionResnetV1(pretrained='vggface2') encodes it into 512 dims. Runs under torch.no_grad().

Compared against all .npy files via cosine_similarity. Best match above 0.75 is returned. All scores are logged.
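The matching step can be illustrated with a small dependency-free sketch. It uses plain Python lists instead of PyTorch tensors so that it runs anywhere; `best_match` and the `enrolled` dictionary are hypothetical names, and loading the .npy files is left out.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(
    query: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = 0.75,
) -> Tuple[Optional[str], float]:
    """Score every enrolled user and return (name, score) for the best one.

    The name is None when the best score does not exceed the threshold.
    """
    scores = {name: cosine_similarity(query, emb) for name, emb in enrolled.items()}
    if not scores:
        return None, 0.0
    name = max(scores, key=scores.get)
    score = scores[name]
    return (name if score > threshold else None), score
```

Returning the score alongside the name makes it easy to log every comparison, as the description says the module does.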

code structure

Class Overview

# Face Recognition System Version 3.0: three composable classes

class GazeDetector:
    # MediaPipe FaceMesh · iris landmarks 469-477
    is_looking_forward(frame, threshold=0.05) → bool
    checks iris center X ∈ [0.25, 0.75]

class predict_liveness:
    # Silent-Face Anti-Spoofing · model ensemble
    test(image) → bool
    loops models/ · crops at scale · sums predictions
    label 1 = real · label 0 = spoof

class FaceRecognition:
    # MTCNN detector + InceptionResnetV1 encoder
    set_face_embedding(name) → saves {name}.npy
    get_face_image(image) → aligned face tensor
    get_embedding(face_img) → 512-d torch.Tensor
    compare_embeddings(emb1, emb2, thr) → bool
    face_recognize(video_index=0) → name | None (CLI mode)
    face_recognize_from_image(frame, thr) → name | None (API mode)

technical specs

Model Details

512

FaceNet dims

478

FaceMesh landmarks

3

MTCNN stages

N+

Anti-spoof ensemble

Component	Model / Library	Role	Output
Face Detector	MTCNN — facenet_pytorch	P-Net → R-Net → O-Net cascade. Detects, aligns, and crops face region from raw BGR frame	aligned tensor
Face Encoder	InceptionResnetV1 — VGGFace2	Deep CNN pretrained on VGGFace2 dataset. Extracts 512-dim discriminative face representation	512-d vector
Gaze Tracker	FaceMesh — MediaPipe	Refine landmark mode. Tracks iris keypoints 469–477. Checks forward-facing constraint	bool
Liveness	AntiSpoofPredict — Silent-Face	Multi-scale face crop + CNN ensemble. Accumulates prediction scores across all loaded models	bool
Similarity	cosine_similarity — PyTorch	Scores all enrolled users simultaneously. Best match above threshold 0.75 wins	float + name

usage modes

CLI vs API Mode

🖥️ face_recognize() — CLI

Input: webcam (video_index)

Capture: 5s countdown

Gaze check: ✓ yes

Liveness: ✓ yes

Threshold: cosine 0.6

Use case: terminal / local

🌐 face_recognize_from_image() — API

Input: cv2 frame from server

Capture: browser WebRTC

Gaze check: ✓ yes

Liveness: ✓ yes

Threshold: cosine 0.75 (stricter)

Use case: FastAPI / SonDrive
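The difference between the two modes comes down to a handful of settings, which can be written as data. This is a sketch under assumptions: `ModeConfig`, `CLI_MODE`, `API_MODE`, and `accepts` are hypothetical names, and only the values listed above are taken from the module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeConfig:
    """Hypothetical settings record for one usage mode."""
    source: str
    threshold: float          # minimum cosine similarity for a match
    gaze_check: bool = True
    liveness_check: bool = True


CLI_MODE = ModeConfig(source="webcam", threshold=0.6)
API_MODE = ModeConfig(source="server frame", threshold=0.75)


def accepts(score: float, mode: ModeConfig) -> bool:
    """Return True when a similarity score clears the mode's threshold."""
    return score > mode.threshold
```

A similarity of 0.70 would pass in CLI mode and be rejected in API mode, which is what "stricter" means here.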

enrollment

Face Registration

📷

Capture via Webcam

5-second countdown lets user position face. capture_face_5s(video_index) reads from OpenCV VideoCapture.

cap = cv2.VideoCapture(video_index)

🔲

MTCNN Face Detection

MTCNN detects face in frame. Returns an aligned, normalized face tensor ready for encoding.

face = self.mtcnn(image) # → aligned tensor

🧠

FaceNet Encoding

InceptionResnetV1 encodes face into 512-d embedding vector. Runs under torch.no_grad().

emb = model(face.unsqueeze(0)) # → [1, 512]

💾

Save Embedding

Saved as {name}.npy in embedded_face/. Loaded at auth time via NumPy for comparison.

np.save(f"embedded_face/{name}.npy", emb.numpy())  # f-string so {name} is substituted
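The save and load halves of enrollment can be sketched as a round trip. This is an illustration, not the module's `set_face_embedding`: the helper names `save_embedding` and `load_embeddings` are hypothetical, and a zero-filled array stands in for a real FaceNet embedding.

```python
import tempfile
from pathlib import Path
from typing import Dict

import numpy as np


def save_embedding(name: str, emb: np.ndarray, folder: str) -> Path:
    """Write one embedding to <folder>/<name>.npy, creating the folder if needed."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{name}.npy"
    np.save(path, np.asarray(emb, dtype=np.float32))
    return path


def load_embeddings(folder: str) -> Dict[str, np.ndarray]:
    """Load every .npy file in the folder, keyed by the enrolled user's name."""
    return {p.stem: np.load(p) for p in sorted(Path(folder).glob("*.npy"))}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_embedding("alice", np.zeros((1, 512)), tmp)
        print({name: emb.shape for name, emb in load_embeddings(tmp).items()})
```

Keying the loaded embeddings by file stem means the file name is the only place a user's identity is stored, so renaming a file renames the enrolled user.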
