Efficient Feature Selection for Product Labeling over Unstructured Data

Abstract

This paper introduces a novel, efficient feature selection algorithm for labeling identical products collected from online web resources. Product labeling is important for clustering similar or identical products. Products crawled blindly from web sources, such as online sellers, yield unstructured data because their features are expressed in different representations and formats. Such data result in feature vectors whose representation is unknown and whose length is non-uniform. Product labeling is therefore a challenging problem that requires efficient selection of the features that best describe the products. Hierarchical clustering is used with state-of-the-art similarity metrics to assess the performance of the proposed algorithm. The results show that the proposed algorithm increases the performance of product labeling significantly. Furthermore, the method can be applied to any clustering algorithm that works on unstructured data.
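
As a rough illustration of the setting described in the abstract, the sketch below clusters product titles of differing lengths using hierarchical (single-linkage) clustering over a set-based similarity. It is not the authors' algorithm: the abstract does not specify the feature selection step or the similarity metrics used, so the whitespace tokenizer, the Jaccard similarity, the 0.5 threshold, the sample titles, and all function names are assumptions chosen only for illustration.

```python
from itertools import combinations


def tokenize(title):
    """Split a raw product title into a set of lowercase tokens (assumed representation)."""
    return set(title.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets; works for sets of different sizes."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def single_linkage(token_sets, threshold):
    """Agglomerative clustering: repeatedly merge the two most similar clusters
    until the best similarity drops below the threshold. Returns one label per item."""
    clusters = [[i] for i in range(len(token_sets))]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for x, y in combinations(range(len(clusters)), 2):
            # Single linkage: cluster similarity is that of the closest member pair.
            sim = max(jaccard(token_sets[i], token_sets[j])
                      for i in clusters[x] for j in clusters[y])
            if sim > best:
                best, best_pair = sim, (x, y)
        if best < threshold:
            break
        x, y = best_pair
        clusters[x].extend(clusters[y])
        del clusters[y]
    labels = [0] * len(token_sets)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Made-up titles standing in for crawled, unstructured product records.
titles = [
    "Apple iPhone 7 32GB Black free shipping",
    "iphone 7 32gb black apple best price",
    "Samsung Galaxy S8 64GB Silver free shipping",
    "samsung galaxy s8 64gb silver best price",
]
labels = single_linkage([tokenize(t) for t in titles], threshold=0.5)
print(labels)
```

In this toy data, seller-specific tokens such as "free shipping" or "best price" create spurious overlap between different products. A feature selection step of the kind the abstract proposes would presumably aim to keep only the tokens that describe the product itself before clustering.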

Authors and Affiliations

Zeki YETGIN, Abdullah ELEWI, Furkan GÖZÜKARA

  • EP ID EP260430
  • DOI 10.14569/IJACSA.2017.080750

How To Cite

Zeki YETGIN, Abdullah ELEWI, Furkan GÖZÜKARA (2017). Efficient Feature Selection for Product Labeling over Unstructured Data. International Journal of Advanced Computer Science & Applications, 8(7), 376-381. https://europub.co.uk/articles/-A-260430