A Hybrid Approach for Complex Layout Detection of Newspapers in Gurumukhi Script Using Deep Learning

Journal Title: International Journal of Experimental Research and Review - Year 2023, Vol 35, Issue 6

Abstract

Layout analysis is a crucial stage in any newspaper recognition system, because better layout analysis leads to better recognition results. The complexity of newspaper layouts poses a formidable challenge for digitization: the intricate arrangement of text, images, and sections within a page demands sophisticated algorithms for accurate layout detection. The paper surveys a diverse set of methodologies from the existing literature, highlighting the evolution of techniques for newspaper layout analysis. We then present a novel hybrid method for detecting the complex layout of newspapers in the Gurumukhi script. The method consists of two parts. In the first part, we propose an algorithm that removes pictures and graphics from Punjabi newspaper images using image preprocessing steps based on binarization, contour finding, and erosion. This algorithm removes pictures even from complex non-Manhattan layouts. Tested on 100 newspaper images, it achieved an accuracy of 96.22%. In the second part, we created a dataset of 500 newspaper images labeled with five classes and trained a deep-learning model based on a convolutional neural network (CNN) to detect the columns of text in newspapers. We used four different CNN architectures and compared their performance using precision, recall, and F1 score. Tested on a number of newspapers in the Gurumukhi script, this approach achieved an accuracy of 95.53%.
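The picture-removal stage named in the abstract (binarization, erosion, and locating the surviving regions) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the fixed threshold of 128, the square erosion window, and the use of connected-component bounding boxes as a stand-in for contour finding are all assumptions made for illustration, written in plain Python so that it runs without any image library.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Return a binary grid: 1 for dark (ink) pixels, 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def erode(binary, radius=1):
    """Keep a pixel only if its whole square neighbourhood is ink.

    Thin text strokes disappear; large solid regions (pictures) survive.
    Pixels closer than `radius` to the border are always cleared.
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            if all(binary[y + dy][x + dx]
                   for dy in range(-radius, radius + 1)
                   for dx in range(-radius, radius + 1)):
                out[y][x] = 1
    return out


def component_boxes(binary):
    """Bounding boxes (top, left, bottom, right) of 4-connected ink regions.

    Assumption: this stands in for the contour-finding step.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            top, left, bottom, right = sy, sx, sy, sx
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def remove_graphics(gray, threshold=128, radius=1, background=255):
    """Blank out every region that survives erosion; leave thin strokes alone."""
    h, w = len(gray), len(gray[0])
    eroded = erode(binarize(gray, threshold), radius)
    cleaned = [row[:] for row in gray]
    for top, left, bottom, right in component_boxes(eroded):
        # Erosion shrank each blob by `radius` per side, so grow the box back.
        for y in range(max(0, top - radius), min(h, bottom + radius + 1)):
            for x in range(max(0, left - radius), min(w, right + radius + 1)):
                cleaned[y][x] = background
    return cleaned
```

Calling `remove_graphics(gray)` on a grayscale image stored as a list of rows of 0 to 255 values returns a copy in which large solid regions are painted with the background value. Blanking whole bounding boxes is a simplification that suits rectangular pictures; a non-Manhattan layout of the kind the abstract mentions would need the region's actual outline rather than its box.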

Authors and Affiliations

Atul Kumar, Gurpreet Singh Lehal

  • EP ID EP724725
  • DOI 10.52756/ijerr.2023.v35spl.004

How To Cite

Atul Kumar, Gurpreet Singh Lehal (2023). A Hybrid Approach for Complex Layout Detection of Newspapers in Gurumukhi Script Using Deep Learning. International Journal of Experimental Research and Review, 35(6), -. https://europub.co.uk/articles/-A-724725