Efficient Feature Selection for Product Labeling over Unstructured Data

Abstract

This paper introduces a novel feature selection algorithm for labeling identical products collected from online web resources. Product labeling is important for clustering similar or identical products. Products crawled blindly from web sources, such as online sellers, yield unstructured data because their features are expressed in different representations and formats. Such data result in feature vectors whose representation is unknown and whose length is non-uniform. Product labeling is therefore a challenging problem that requires efficient selection of the features that best describe the products. In this paper, an efficient feature selection algorithm is proposed for the product labeling problem. Hierarchical clustering is used with state-of-the-art similarity metrics to assess the performance of the proposed algorithm. The results show that the proposed algorithm increases the performance of product labeling significantly. Furthermore, the method can be applied to any clustering algorithm that works on unstructured data.

Authors and Affiliations

Zeki YETGIN, Abdullah ELEWI, Furkan GÖZÜKARA

  • EP ID EP260430
  • DOI 10.14569/IJACSA.2017.080750

How To Cite

Zeki YETGIN, Abdullah ELEWI, Furkan GÖZÜKARA (2017). Efficient Feature Selection for Product Labeling over Unstructured Data. International Journal of Advanced Computer Science & Applications, 8(7), 376-381. https://europub.co.uk/articles/-A-260430