Combined method for scanned documents images segmentation using sequential extraction of regions

Abstract

We propose a combined method for segmenting images of scanned documents. In contrast to known methods, it first separates the graphics and photograph regions from the text regions and the background. This step relies on an analysis of connected components, which differ for graphics, photographs, and text regions. A block method is then used to classify the selected regions as photographs or graphics. Splitting only these regions into blocks was found to degrade segmentation quality less than applying the block method directly to the original image. To extract the text regions, whose shapes are more complex, from the background, the neighborhood of each pixel is processed.

To detect the boundaries of illustrations in images of scanned documents, we applied the Bloomberg method. To classify an illustration as a photograph or a graphic, we propose splitting it into blocks of pixels. Each block is described by a vector of two features: the mean value of the local gradient magnitude, and the mean value of a function that localizes linear objects (graphics and text characters) in images of scanned documents. The resulting feature vectors were classified using a support vector machine.

When extracting the text regions, we applied low-pass filtering and thresholding.

The combined method was applied to segment test images of scanned newspaper articles from the MediaTeam document database at Oulu University (Finland). It was established that the combined method increases segmentation speed while maintaining high processing quality.
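
As a rough illustration of two steps named in the abstract, the sketch below computes the mean local gradient magnitude for each block of pixels and extracts a text mask by low-pass filtering followed by thresholding. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the block size, the box filter, its radius, and the threshold value are all hypothetical choices. The second block feature (the function that localizes linear objects), the Bloomberg boundary detection, and the support vector machine are not reproduced here.

```python
import numpy as np


def block_gradient_features(image, block_size=16):
    """Mean local gradient magnitude for each non-overlapping block.

    block_size is an assumed value; the abstract does not state one.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)  # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    rows = img.shape[0] // block_size
    cols = img.shape[1] // block_size
    # Crop to a whole number of blocks, then average inside each block.
    cropped = magnitude[:rows * block_size, :cols * block_size]
    blocks = cropped.reshape(rows, block_size, cols, block_size)
    return blocks.mean(axis=(1, 3))


def box_lowpass(image, radius=1):
    """Low-pass filter: mean over a (2*radius+1) x (2*radius+1) neighborhood.

    A box filter is an assumed stand-in for the unspecified filter.
    """
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(img)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)


def text_mask(image, radius=1, threshold=128.0):
    """Mark pixels that stay darker than the (assumed) threshold after smoothing."""
    return box_lowpass(image, radius) < threshold
```

Each entry returned by `block_gradient_features` would form one component of a block's two-element feature vector; the full vectors could then be passed to any support vector machine implementation for the photograph/graphics decision.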

Authors and Affiliations

Marina Polyakova, Alesya Ishchenko, Natalya Volkova, Oleg Pavlov

Download PDF file
  • EP ID EP528151
  • DOI 10.15587/1729-4061.2018.142735

How To Cite

Marina Polyakova, Alesya Ishchenko, Natalya Volkova, Oleg Pavlov (2018). Combined method for scanned documents images segmentation using sequential extraction of regions. Восточно-Европейский журнал передовых технологий, 5(2), 6-15. https://europub.co.uk/articles/-A-528151