Smart Cloud Document Clustering and plagiarism checker using TF-IDF Based on Cosine Similarity

Journal Title: GRD Journal for Engineering - Year 2017, Vol 2, Issue 5

Abstract

This paper presents the results of an experimental study of the conventional document clustering techniques used in commercial settings so far. In particular, we compare the two main approaches to document clustering: agglomerative hierarchical clustering and K-means. Through this paper, we develop and implement checker algorithms that detect duplication between the content of a document and the rest of the documents in the cloud. We also develop an algorithm for classifying the cloud data. Classification in this algorithm is based on the date the data was uploaded and [...]. We take the ratio of both vectors and generate a score that rates the document within the classification.
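The abstract does not spell out the checker itself, so the following is only a minimal sketch of how a TF-IDF and cosine-similarity duplicate check is commonly built. It is not the authors' implementation. The function names (`tokenize`, `tfidf_vectors`, `cosine_similarity`, `find_duplicates`), the whitespace tokenizer, the smoothed IDF formula, and the 0.8 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in tf.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(docs, threshold=0.8):
    """Return (i, j, score) for every document pair scoring at or above threshold."""
    vectors = tfidf_vectors(docs)
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine_similarity(vectors[i], vectors[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    sample = [
        "the cat sat on the mat",
        "the cat sat on the mat",
        "stock prices fell sharply today",
    ]
    for i, j, score in find_duplicates(sample):
        print(f"documents {i} and {j} look duplicated (score {score:.2f})")
```

In this sketch a newly uploaded document would be compared against every stored document, which costs one similarity computation per pair. A real cloud deployment would likely need an index or a clustering step to limit the number of comparisons.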

Authors and Affiliations

Sudhir Sahani, Rajat Goyal, Saurabh Sharma, Shaili Gupta

How To Cite

Sudhir Sahani, Rajat Goyal, Saurabh Sharma, Shaili Gupta (2017). Smart Cloud Document Clustering and plagiarism checker using TF-IDF Based on Cosine Similarity. GRD Journal for Engineering, 2(5), 331-333. https://europub.co.uk/articles/-A-224420