Study on Efficient Way to Identify User Aware Rare Sequential Pattern Matching in Document Stream
Journal Title: International Journal for Research in Applied Science and Engineering Technology (IJRASET) - Year 2017, Vol 5, Issue 2
Abstract
The Internet is a source of a large number of textual documents that are created by users and distributed in various forms. Most existing work concentrates on topic modelling and the evolution of individual topics, while the sequential relations of topics in successive documents published by a specific user are ignored. In this paper, in order to characterize and detect personalized and abnormal behaviours of Internet users, we propose Sequential Topic Patterns (STPs) and formulate the problem of mining User-aware Rare Sequential Topic Patterns (URSTPs) in document streams on the Internet. These patterns are rare on the whole but relatively frequent for specific users, so they can be applied in many real-life scenarios, such as real-time monitoring of abnormal user behaviours. We present a group of algorithms that solve this mining problem in three phases: preprocessing to extract probabilistic topics and identify sessions for different users; generating all STP candidates with (expected) support values for each user by pattern-growth; and selecting URSTPs through user-aware rarity analysis on the derived STPs. Twitter is a representative real-time example, from which users' abnormal behaviour can be discovered. The approach offers an effective and efficient way to find rare patterns in document streams.
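The second and third phases can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each document is a probability distribution over topics, that each document takes exactly one topic independently of the others, and that a pattern counts as rare when its expected support over all sessions is low while its expected support for one user is high. The function names and the thresholds `max_global` and `min_user` are hypothetical, and candidate generation by pattern-growth is omitted.

```python
from typing import Dict, List, Sequence

Doc = Dict[str, float]   # topic -> probability for one document
Session = List[Doc]      # time-ordered documents of one user session


def occurrence_probability(pattern: Sequence[str], session: Session) -> float:
    """Probability that `pattern` occurs as a subsequence of the session's topics.

    Assumes each document independently takes one topic from its distribution.
    state[j] is the probability that exactly the first j pattern topics
    have been matched so far (earliest-match semantics).
    """
    m = len(pattern)
    state = [0.0] * (m + 1)
    state[0] = 1.0
    for doc in session:
        # Iterate backwards so one document advances a match by at most one step.
        for j in range(m - 1, -1, -1):
            moved = state[j] * doc.get(pattern[j], 0.0)
            state[j + 1] += moved
            state[j] -= moved
    return state[m]


def expected_support(pattern: Sequence[str], sessions: List[Session]) -> float:
    """Average occurrence probability of `pattern` over the given sessions."""
    if not sessions:
        return 0.0
    return sum(occurrence_probability(pattern, s) for s in sessions) / len(sessions)


def users_with_rare_pattern(
    pattern: Sequence[str],
    sessions_by_user: Dict[str, List[Session]],
    max_global: float = 0.2,  # hypothetical threshold: "rare on the whole"
    min_user: float = 0.5,    # hypothetical threshold: "frequent for a user"
) -> List[str]:
    """Users for whom `pattern` is globally rare but individually frequent."""
    all_sessions = [s for ss in sessions_by_user.values() for s in ss]
    if expected_support(pattern, all_sessions) > max_global:
        return []
    return [
        user
        for user, ss in sessions_by_user.items()
        if expected_support(pattern, ss) >= min_user
    ]


if __name__ == "__main__":
    data = {"alice": [[{"x": 1.0}, {"y": 1.0}]]}
    for name in ("bob", "carol", "dave", "erin", "frank"):
        data[name] = [[{"z": 1.0}]]
    print(users_with_rare_pattern(["x", "y"], data))
```

In this toy data only one of six sessions contains the topic sequence x, y, so the pattern is rare overall but fully supported for that one user. Real thresholds and the definition of relative rarity would have to come from the paper itself.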
Authors and Affiliations
Swati V. Mengje, Prof. R R Shelke
Macs: A Highly Customizable Low-Latency Communication Architecture
Networks-on-chips (NoCs) are an increasingly popular communication infrastructure in single chip VLSI design for enhancing parallelism and system scalability. Processing elements (PEs) connect to a communication topolog...
An Efficient Technique for Fingerprint Features Protection and Person Identification Using Wavelet Transform
Fingerprint recognition is a widely used technique for person identification. The two major techniques for fingerprint identification are minutiae based technique and non-minutiae based technique. In this paper we propo...
Staff Locating and Notifying System using RFID technology
In global perspective, it is a difficult job to contact human beings in big institutions in spite of the prohibition in mobile network. Presently, global system positioning and Zigbee are employed in locating and tracki...
Study on Compressive Strength of M30 Grade Concrete with Partial Replacement Of C.A with Electrical ARC Furnace Slag
In this research we replaced different proportions of normal aggregate with Electric Arc Furnace Slag aggregate and compared the results with conventional concrete. The compressive strength and tensile strength test i...
Increasing Network Lifetime by Using Secure Clustering With Reliable Node Disjoint Multi-path Routing in Wireless Sensor Networks
In order to increase the network latency and resolve the security bottlenecks induced by the camouflaged malicious nodes in Wireless Sensor Networks, the residual energy and trust values are used to form a secured clust...