Please use this identifier to cite or link to this item: https://dspace.univ-ouargla.dz/jspui/handle/123456789/40056
Title: Exploring the Use of Large Language Models for Lossless Text Compression
Authors: Mechalkh, Charaf Eddine
Fennouh, Marya Douniazad
Keywords: Large Language Models
LLMs
Context-Aware
Text Compression
Compression
Issue Date: 2025
Publisher: UNIVERSITY OF KASDI MERBAH OUARGLA
Citation: FACULTY OF NEW TECHNOLOGIES OF INFORMATION AND COMMUNICATION
Abstract: The rapid growth in data generation has led to an increasing demand for efficient data compression techniques. Traditional compression methods, such as Huffman coding, LZ-based algorithms, and arithmetic coding, have proven effective in reducing file sizes. However, these techniques often fail to account for the contextual nature of data, which can limit their performance when handling complex, variable-length content such as text, images, or multimodal data. In recent years, Large Language Models (LLMs) have demonstrated impressive capabilities in understanding and generating human-like text, making them a promising candidate for enhancing compression techniques through context-awareness. LLMs, with their ability to process large amounts of sequential data and recognize patterns, offer significant potential in improving compression by leveraging context in a more dynamic and adaptive manner. Unlike traditional methods that rely on fixed algorithms, LLM-based compression could adjust to the content being compressed, leading to more efficient encoding and potentially higher compression ratios. This thesis explores the potential of LLMs in context-aware compression. We investigate how LLMs, specifically GPT-like models, can be integrated into compression pipelines to optimize encoding strategies based on the context within the data. Our objectives are to assess the advantages of LLM-enhanced compression methods compared to traditional techniques and demonstrate how context-awareness can lead to more efficient compression, particularly in complex or varied datasets. The results of our study show that LLM-based approaches can outperform traditional methods in certain scenarios, offering promising avenues for future research and practical applications in data compression.
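The abstract's core claim can be illustrated with a minimal sketch that is not from the thesis itself: a predictor that conditions on context assigns higher probabilities to the next symbol, and under arithmetic coding each symbol costs roughly -log2(p) bits, so better prediction means fewer bits. Here an order-1 character model with add-one smoothing stands in for an LLM's next-token distribution, compared against a fixed order-0 frequency model; the sample string is invented for demonstration.

```python
import math
from collections import Counter

def bits_static(text):
    # Order-0 model: fixed per-symbol frequencies estimated over the whole
    # text (an idealized bound for context-free Huffman/arithmetic coding).
    freq = Counter(text)
    n = len(text)
    return sum(-math.log2(freq[c] / n) for c in text)

def bits_contextual(text):
    # Order-1 model: predicts each symbol from the preceding one, a toy
    # stand-in for a context-aware (LLM-style) next-symbol distribution.
    # Add-one smoothing keeps every probability nonzero.
    alphabet = sorted(set(text))
    k = len(alphabet)
    pair = Counter(zip(text, text[1:]))   # (previous, current) counts
    ctx = Counter(text[:-1])              # how often each context occurs
    total = math.log2(k)                  # first symbol: uniform cost
    for prev, cur in zip(text, text[1:]):
        p = (pair[(prev, cur)] + 1) / (ctx[prev] + k)
        total += -math.log2(p)
    return total

sample = "the theory there then the other three"
print(f"static:     {bits_static(sample):.1f} bits")
print(f"contextual: {bits_contextual(sample):.1f} bits")
```

On repetitive text like the sample, the contextual model spends markedly fewer bits because transitions such as "t"→"h" are nearly deterministic; this gap is exactly what an LLM-driven predictor aims to exploit at scale.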
Description: Artificial Intelligence and Data Science
URI: https://dspace.univ-ouargla.dz/jspui/handle/123456789/40056
Appears in Collections:Département d'informatique et technologie de l'information - Master

Files in This Item:
File: FENNOUH.pdf
Description: Artificial Intelligence and Data Science
Size: 896,24 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.