Language Identification by Using SIFT Features

Abstract

Two novel techniques are presented for language identification of both machine-printed and handwritten document images. Language identification is the procedure in which the language of a given document image is recognized and the appropriate language label is returned. In the proposed approaches, the main body size of the characters in each document image is determined, and a sliding window is used accordingly to extract the SIFT local features. Once a large number of features have been extracted from the training set, a visual vocabulary is created by clustering the feature space. Data clustering is performed using either K-means or Gaussian Mixture Models with the Expectation-Maximization algorithm. For each document image, a Bag of Visual Words or Fisher Vector representation is constructed from the visual vocabulary and the features extracted from that image. Finally, a multi-class Support Vector Machine classification scheme is used to score the system. Experiments are performed on well-known databases, and comparative results with another established technique are also given.
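
The vocabulary and Bag of Visual Words steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `bovw_histogram`) are hypothetical, the 2-D toy points stand in for real 128-dimensional SIFT descriptors, and SIFT extraction, the Gaussian Mixture Model / Fisher Vector variant, and the SVM classifier are deliberately left out.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, descriptor):
    """Index of the center (visual word) closest to the descriptor."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(centers[i], descriptor))


def kmeans(descriptors, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; the returned centers act as the visual vocabulary."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest(centers, d)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bovw_histogram(descriptors, vocabulary):
    """Normalized count of how often each visual word is the nearest one."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest(vocabulary, d)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Assumption: toy 2-D "descriptors" forming two well-separated groups.
training_descriptors = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (10.0, 10.1), (9.9, 10.0), (10.2, 9.8),
]
vocabulary = kmeans(training_descriptors, k=2)

# One "document image": three descriptors near the first group, one near the second.
document_descriptors = [(0.1, 0.1), (0.0, 0.0), (0.15, 0.05), (10.1, 10.0)]
histogram = bovw_histogram(document_descriptors, vocabulary)
print(histogram)
```

In a full pipeline, one such fixed-length histogram per document image would be the input vector to the multi-class classifier; the order of the histogram bins depends on the K-means initialization, so only the set of values is meaningful here.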

Authors and Affiliations

Nikos Tatarakis, Ergina Kavallieratou

Download PDF file
  • EP ID EP101173
  • DOI 10.14569/IJARAI.2015.041206

How To Cite

Nikos Tatarakis, Ergina Kavallieratou (2015). Language Identification by Using SIFT Features. International Journal of Advanced Research in Artificial Intelligence (IJARAI), 4(12), 34-43. https://europub.co.uk/articles/-A-101173