Please use this identifier to cite or link to this item: https://dspace.univ-ouargla.dz/jspui/handle/123456789/35678
Title: Embedding techniques and their application to Computer vision
Authors: KHERFI, Mohammed Lamine
CHERIET, Abdelhakim
ALLAOUI, Mebarka
Keywords: machine learning
dimensionality reduction
embedding
clustering
joint learning
Issue Date: 2023
Publisher: UNIVERSITY OF KASDI MERBAH OUARGLA
Abstract: Dimensionality reduction is widely used in machine learning and data analytics because it aids the analysis and representation of large, high-dimensional datasets. It is particularly valuable for improving tasks such as data clustering and classification. Recently, embedding methods have emerged as a promising direction for improving clustering accuracy, so robust embedding and clustering techniques can be used to solve real-life problems. The contribution of the present dissertation is fourfold: (1) We explored ways to enhance the performance of several clustering algorithms by employing one of the most effective embedding techniques available. Our central hypothesis is that the chosen embedding technique can enable the discovery of the optimal clusterable embedding manifold. Consequently, we used it as a preprocessing step prior to clustering to improve the performance of the clustering algorithms. (2) We performed embedding and clustering simultaneously through an original formulation that preserves the data's original structure in the embedding space and produces a better clustering assignment. The unified manifold embedding and clustering (UEC) algorithm is based on a bi-objective loss function that combines data embedding and clustering and is optimized in three different ways: (i) Comma Variant, (ii) Plus Variant, and (iii) Light Plus Variant. (3) The onset of the COVID-19 pandemic made it increasingly difficult for researchers to keep abreast of the latest scientific advances, given the rapid influx of scientific articles. To address this issue, we introduced an intelligent tool based on machine learning.
This tool automatically organizes a vast repository of scientific literature on COVID-19 and presents it in a manner that facilitates easy navigation and swift document retrieval. First, the documents are preprocessed and transformed into numerical features. Next, these features are reduced to a 2D space by a deep denoising autoencoder followed by Uniform Manifold Approximation and Projection (UMAP). The projected data are then clustered using the agglomerative clustering algorithm. Finally, we employ Latent Dirichlet Allocation (LDA) for topic modeling, assigning a label to each cluster. (4) We propose an innovative deep learning framework designed to structure the extensive collection of scientific literature on COVID-19. The core idea of this architecture is to train the autoencoder with a two-term objective function: the first term assesses the latent representation, while the second produces the clustering assignments. Afterward, Latent Dirichlet Allocation (LDA) is used for topic modeling.
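The clustering step of the third contribution (agglomerative clustering of the 2D-projected documents) can be illustrated with a minimal standard-library sketch. It is not the dissertation's implementation. The average-linkage criterion, the function names, and the sample coordinates are all assumptions made for illustration, and the upstream autoencoder and UMAP stages are replaced here by hand-written 2D points.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b):
    """Mean pairwise Euclidean distance between two clusters of points."""
    return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))


def agglomerative(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Returns one cluster label per input point. Average linkage is an
    assumption; the abstract does not name the linkage criterion.
    """
    # Start with every point in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average linkage.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(
                [points[k] for k in clusters[ij[0]]],
                [points[k] for k in clusters[ij[1]]],
            ),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i stays valid
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for k in members:
            labels[k] = label
    return labels


# Hypothetical 2D coordinates standing in for the projected documents.
projected = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (9.0, 0.0)]
labels = agglomerative(projected, n_clusters=3)
print(labels)
```

This brute-force version recomputes every linkage at each merge, so it is only suitable for small illustrations; a library implementation would be the practical choice for a real document collection.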
Description: Computer Vision
URI: https://dspace.univ-ouargla.dz/jspui/handle/123456789/35678
Appears in Collections: Département d'informatique et technologie de l'information - Doctorat

Files in This Item:
File                            Description   Size      Format
Mebarka ALLAOUI-Doctorat.pdf                  1,28 MB   Adobe PDF

