Product Matching using Sentence-BERT: A Deep Learning Approach to E-Commerce Product Deduplication

Journal Title: Engineering and Technology Journal - Year 2024, Vol 9, Issue 12

Abstract

Product matching on e-commerce platforms is difficult because product titles, descriptions, and categorizations vary across vendors. This paper presents a lightweight approach to product matching using Sentence-BERT (SBERT), specifically the all-MiniLM-L6-v2 variant. The method combines text preprocessing, training pair generation, and threshold-based similarity matching to achieve high-accuracy product matching while remaining computationally efficient. Evaluated on the Pricerunner dataset, the system achieved 98.10% accuracy, 100% precision, and 91.84% recall. The implementation uses a modular architecture that simplifies maintenance and updates, and the similarity threshold gives direct control over the precision-recall trade-off. Our results suggest that carefully designed preprocessing and training strategies, combined with lightweight transformer models, can achieve state-of-the-art performance in product matching without complex model architectures or extensive computational resources.
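The threshold-based matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 3-dimensional vectors, the labels, and the two threshold values are all assumptions made for illustration. In practice the vectors would be sentence embeddings of product titles (for example from the all-MiniLM-L6-v2 model); here they are hard-coded so the example runs with the standard library alone.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_matches(pairs, threshold):
    """Label a pair as a match when its similarity reaches the threshold."""
    return [cosine_similarity(u, v) >= threshold for u, v in pairs]


def precision_recall_accuracy(predicted, actual):
    """Compute precision, recall, and accuracy for boolean predictions."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(actual)
    return precision, recall, accuracy


# Hypothetical stand-ins for product-title embeddings (made-up values).
toy_pairs = [
    ([1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),  # very similar
    ([0.0, 1.0, 0.0], [0.0, 0.8, 0.6]),  # moderately similar
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal
    ([0.0, 0.0, 1.0], [0.8, 0.0, 0.6]),  # moderately similar
]
# Hypothetical ground truth: only the first two pairs are the same product.
toy_labels = [True, True, False, False]

for threshold in (0.9, 0.5):
    predictions = predict_matches(toy_pairs, threshold)
    precision, recall, accuracy = precision_recall_accuracy(predictions, toy_labels)
    print(f"threshold={threshold}: precision={precision:.2f}, "
          f"recall={recall:.2f}, accuracy={accuracy:.2f}")
```

On this toy data, the stricter threshold yields no false positives but misses one true match, while the looser threshold recovers every true match at the cost of one false positive. This is the same kind of trade-off that the reported 100% precision and 91.84% recall reflect.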

Authors and Affiliations

Heribertus Yulianton, Rina Candra Noor Santi

  • EP ID EP753137
  • DOI 10.47191/etj/v9i12.14

How To Cite

Heribertus Yulianton, Rina Candra Noor Santi (2024). Product Matching using Sentence-BERT: A Deep Learning Approach to E-Commerce Product Deduplication. Engineering and Technology Journal, 9(12), -. https://europub.co.uk/articles/-A-753137