LOSSLESS TEXT COMPRESSION FOR UNICODE TAMIL DOCUMENTS

Journal Title: ICTACT Journal on Soft Computing - Year 2018, Vol 8, Issue 2

Abstract

Compression of text in the world's languages, including Indian languages, is in high demand. Tamil is one of the longest-surviving classical languages in the world. The use of Tamil for communication and storage has increased with the digitization of government documents and orders. Lossless text compression for Tamil documents involves substituting an ASCII character for each Unicode Tamil character, since an ASCII character occupies one byte whereas a Unicode character occupies between 1 and 4 bytes, depending on the encoding used for file storage. Decompression reverses the compression technique, i.e., it replaces the ASCII characters with the corresponding Unicode characters. This paper describes the architecture of the compression and decompression processes for Tamil text documents.
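
The substitution idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' mapping, which the abstract does not specify. The sketch assumes the document contains only 7-bit ASCII and code points from the Unicode Tamil block (U+0B80 to U+0BFF), and it uses a hypothetical table that sends each Tamil code point to one byte in the range 0x80 to 0xFF, so the output is an 8-bit single-byte encoding rather than strict ASCII. The names `compress` and `decompress` are illustrative.

```python
TAMIL_START = 0x0B80  # first code point of the Unicode Tamil block
TAMIL_END = 0x0BFF    # last code point of the Unicode Tamil block


def compress(text: str) -> bytes:
    """Map each character to one byte (assumed mapping, not the paper's)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # 7-bit ASCII passes through unchanged.
            out.append(cp)
        elif TAMIL_START <= cp <= TAMIL_END:
            # The 128 Tamil-block code points fill the byte range 0x80-0xFF.
            out.append(0x80 + (cp - TAMIL_START))
        else:
            # Anything else cannot be represented losslessly in this sketch.
            raise ValueError(f"unsupported character U+{cp:04X}")
    return bytes(out)


def decompress(data: bytes) -> str:
    """Reverse the mapping: one byte back to one Unicode character."""
    return "".join(
        chr(b) if b < 0x80 else chr(TAMIL_START + (b - 0x80)) for b in data
    )


# The word "Tamil" in Tamil script (five code points), followed by ASCII text.
sample = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd text"
packed = compress(sample)
assert decompress(packed) == sample  # round trip is lossless
print(len(sample.encode("utf-8")), "bytes as UTF-8 ->", len(packed), "bytes packed")
```

Each Tamil code point takes three bytes in UTF-8, so under these assumptions the Tamil portion of a UTF-8 file shrinks to about one third of its size, while ASCII characters stay at one byte each.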

Authors and Affiliations

Vijayalakshmi B, Sasirekha N

  • EP ID EP532420
  • DOI 10.21917/ijsc.2018.0227

How To Cite

Vijayalakshmi B, Sasirekha N (2018). LOSSLESS TEXT COMPRESSION FOR UNICODE TAMIL DOCUMENTS. ICTACT Journal on Soft Computing, 8(2), 1635-1640. https://europub.co.uk/articles/-A-532420