Design and Optimization of a Privacy-Preserving Speaker Identification System Using Pre-Trained Deep Learning Embeddings and Cancelable Biometrics

Khaldi, Belal; BOUSNINA, IMANE

Please use this identifier to cite or link to this item: https://dspace.univ-ouargla.dz/jspui/handle/123456789/40052

Title:	Design and Optimization of a Privacy-Preserving Speaker Identification System Using Pre-Trained Deep Learning Embeddings and Cancelable Biometrics
Authors:	Khaldi, Belal; BOUSNINA, IMANE
Keywords:	Speaker identification; cancelable biometrics; pre-trained deep learning; ECAPA-TDNN; DRPE
Issue Date:	2025
Publisher:	UNIVERSITY OF KASDI MERBAH OUARGLA
Citation:	FACULTY OF NEW TECHNOLOGIES OF INFORMATION AND COMMUNICATION
Abstract:	Biometric authentication is increasingly adopted for secure system access, replacing traditional passwords with unique physiological or behavioral traits. Raw biometrics, however, carry a risk of permanent identity theft: once compromised, they cannot be changed. Cancelable biometrics mitigate this risk by transforming traits into revocable, non-invertible templates, thereby enhancing security and privacy. This thesis develops a privacy-preserving speaker identification system that integrates Mel-spectrogram preprocessing to convert audio into time-frequency representations, pre-trained SpeechBrain ECAPA-TDNN embeddings to extract 192-dimensional feature vectors, and a Support Vector Machine (SVM) classifier with a linear kernel (C=10.0) for user classification. Privacy is reinforced through Double Random Phase Encoding (DRPE), which encrypts speaker templates using two random phase masks, one applied in the spatial domain and one in the frequency domain, via Fast Fourier Transform (FFT) and inverse FFT operations. The methodology comprises metadata extraction from a 240-user dataset, audio resampling to 16,000 Hz, embedding generation and validation, SVM training with 7,200 training and 3,120 test samples, template creation by averaging embeddings, DRPE encryption, and identity verification using cosine similarity scores computed on encrypted pairs. Analysis supported by confusion matrices and Receiver Operating Characteristic (ROC) curves highlights challenges such as voice similarity between users. The system achieves a preliminary accuracy of approximately 99%, and threshold optimization is ongoing. Unlike conventional methods, the system combines pre-trained deep learning embeddings with cancelable biometrics, balancing security and reliability. Future work will refine encryption parameters, address user-specific variability, and explore deep learning enhancements for spectrogram processing.
Description:	Artificial Intelligence and Data Science
URI:	https://dspace.univ-ouargla.dz/jspui/handle/123456789/40052
Appears in Collections:	Département d'informatique et technologie de l'information - Master
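
The template and encryption steps named in the abstract (averaging embeddings, DRPE with two random phase masks, cosine similarity on encrypted pairs) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. The function names, the seeds, the orthonormal FFT scaling, the 1-D formulation, and the use of the real part of the complex inner product for cosine similarity are all assumptions made for illustration, and the embeddings are synthetic random vectors rather than ECAPA-TDNN outputs.

```python
import numpy as np

EMBED_DIM = 192  # embedding size stated in the abstract


def make_phase_masks(dim, seed):
    """Return two unit-modulus random phase masks (hypothetical helper).

    The seed plays the role of a revocable key: a new seed yields new masks.
    """
    rng = np.random.default_rng(seed)
    mask1 = np.exp(2j * np.pi * rng.random(dim))  # applied in the input domain
    mask2 = np.exp(2j * np.pi * rng.random(dim))  # applied in the frequency domain
    return mask1, mask2


def make_template(embeddings):
    """Average a speaker's enrollment embeddings into one template."""
    return np.mean(embeddings, axis=0)


def drpe_encrypt(template, mask1, mask2):
    """DRPE: mask, FFT, mask again, inverse FFT (orthonormal scaling assumed)."""
    spectrum = np.fft.fft(template * mask1, norm="ortho")
    return np.fft.ifft(spectrum * mask2, norm="ortho")


def drpe_decrypt(cipher, mask1, mask2):
    """Undo DRPE when both masks are known (used here only as a sanity check)."""
    spectrum = np.fft.fft(cipher, norm="ortho") * np.conj(mask2)
    return np.fft.ifft(spectrum, norm="ortho") * np.conj(mask1)


def cosine_similarity(a, b):
    """Cosine similarity for complex vectors (real part of the inner product)."""
    return float(np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-ins for real embeddings (assumption: random Gaussian vectors).
rng = np.random.default_rng(0)
enrolled = rng.normal(size=(5, EMBED_DIM))
template = make_template(enrolled)
probe = template + 0.1 * rng.normal(size=EMBED_DIM)

# Encrypt the template and the probe with the same masks, then compare.
m1, m2 = make_phase_masks(EMBED_DIM, seed=42)
enc_template = drpe_encrypt(template, m1, m2)
enc_probe = drpe_encrypt(probe, m1, m2)
score = cosine_similarity(enc_template, enc_probe)

# Reissuing the masks produces a different encrypted template.
m1_new, m2_new = make_phase_masks(EMBED_DIM, seed=7)
reissued_template = drpe_encrypt(template, m1_new, m2_new)
mismatch_score = cosine_similarity(reissued_template, enc_probe)
```

Under these assumptions each step (unit-modulus mask, orthonormal FFT) preserves inner products, so the score computed on a pair encrypted with the same masks matches the score on the unencrypted pair, while a template reissued under new masks no longer matches probes encrypted under the old ones. Whether the thesis uses this exact scaling or similarity definition is not stated in the abstract.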

Files in This Item:

File	Description	Size	Format
BOUSNINA.pdf	Artificial Intelligence and Data Science	1,09 MB	Adobe PDF
