Study on Efficient Way to Identify User Aware Rare Sequential Pattern Matching in Document Stream

Abstract

The Internet is a source of a large number of textual documents that are created by users and distributed in various forms. Most existing work focuses on topic modelling and the evolution of individual topics, while the sequential relations of topics in successive documents published by a specific user are ignored. In this paper, in order to characterize and detect personalized and abnormal behaviours of Internet users, we propose Sequential Topic Patterns (STPs) and formulate the problem of mining User-aware Rare Sequential Topic Patterns (URSTPs) in document streams on the Internet. These patterns are rare on the whole but relatively frequent for specific users, so they can be applied in many real-life scenarios, such as real-time monitoring of abnormal user behaviours. We present a group of algorithms that solve this mining problem in three phases: preprocessing to extract probabilistic topics and identify sessions for different users; generating all STP candidates with (expected) support values for each user by pattern-growth; and selecting URSTPs through a user-aware rarity analysis of the derived STPs. Twitter is a good real-time example of a document stream from which abnormal user behaviour can be discovered. This approach offers an effective and efficient way to find rare patterns in document streams.
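The idea of a pattern that is "rare on the whole but relatively frequent for specific users" can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes each document has already been reduced to a single deterministic topic label (the paper uses probabilistic topics and expected support), it enumerates candidates by brute force rather than pattern-growth, and the function names and the two thresholds `min_user_support` and `max_global_support` are hypothetical choices made only for illustration.

```python
from itertools import combinations


def is_subsequence(pattern, session):
    """Return True if `pattern` occurs in `session` in order (gaps allowed)."""
    it = iter(session)
    return all(topic in it for topic in pattern)


def support(pattern, sessions):
    """Fraction of sessions that contain `pattern` as a subsequence."""
    if not sessions:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in sessions) / len(sessions)


def candidate_patterns(sessions, max_len=2):
    """Brute-force stand-in for pattern-growth: all ordered topic
    subsequences of length 1..max_len seen in the given sessions."""
    found = set()
    for s in sessions:
        for n in range(1, max_len + 1):
            found.update(combinations(s, n))
    return found


def user_aware_rare_patterns(user_sessions, min_user_support=0.5,
                             max_global_support=0.3, max_len=2):
    """Keep, per user, patterns that are frequent in that user's sessions
    but infrequent across all sessions (assumed simplified rarity test)."""
    all_sessions = [s for sessions in user_sessions.values() for s in sessions]
    result = {}
    for user, sessions in user_sessions.items():
        hits = [
            p for p in candidate_patterns(sessions, max_len)
            if support(p, sessions) >= min_user_support
            and support(p, all_sessions) <= max_global_support
        ]
        result[user] = sorted(hits)
    return result


# Toy data (invented): each inner list is one session of topic labels.
sessions_by_user = {
    "alice": [["sports", "weather"], ["sports", "weather"]],
    "bob": [["sports", "weather"], ["sports", "news"]],
    "carol": [["finance", "travel"], ["finance", "travel"], ["sports", "news"]],
}
rare = user_aware_rare_patterns(sessions_by_user)
```

In this toy data, the pattern finance followed by travel appears in two of carol's three sessions but in only two of the seven sessions overall, so it is kept for carol, while sports followed by weather is common across users and is discarded for everyone.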

Authors and Affiliations

Swati V. Mengje, Prof. R R Shelke

  • EP ID EP23108
  • DOI 10.22214/ijraset.2017.2019

How To Cite

Swati V. Mengje, Prof. R R Shelke (2017). Study on Efficient Way to Identify User Aware Rare Sequential Pattern Matching in Document Stream. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 5(2), -. https://europub.co.uk/articles/-A-23108