Improving Data Collection on Article Clustering by Using Distributed Focused Crawler
Journal Title: Data Science: Journal of Computing and Applied Informatics - Year 2017, Vol 1, Issue 1
Abstract
Collecting or harvesting data from the Internet is often done with a web crawler. A general web crawler can be developed to focus on a certain topic; this type of web crawler is called a focused crawler. A focused crawler alone is not enough to improve data collection performance, because its benefit lies in the efficient use of network bandwidth and storage capacity. This research proposes a distributed focused crawler in order to improve web crawler performance while remaining efficient in network bandwidth and storage capacity. The distributed focused crawler implements crawl scheduling, site ordering to determine the URL queue, and a focused crawler that uses Naïve Bayes. This research also tests web crawling performance with multithreading and observes CPU and memory utilization. The conclusion is that web crawling performance decreases when too many threads are used: CPU and memory utilization become very high, while the performance of the distributed focused crawler becomes low.
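As an illustration of the focused-crawling idea, the following is a minimal sketch of how a Naïve Bayes classifier could decide which candidate URLs are on topic. It is not the authors' implementation: the class `NaiveBayes`, the helper `filter_frontier`, the topic labels, the training snippets, and the example URLs are all hypothetical, and the abstract does not say which features or smoothing the paper uses (Laplace smoothing over whitespace tokens is assumed here).

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        self.doc_counts[label] += 1
        counts = self.word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            self.vocab.add(word)

    def log_score(self, text, label):
        # log P(label) + sum of log P(word | label)
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / denom)
        return score

    def predict(self, text):
        return max(self.doc_counts, key=lambda lab: self.log_score(text, lab))


def filter_frontier(classifier, candidates, topic):
    """Keep only URLs whose associated text is classified as on-topic."""
    return [url for url, text in candidates
            if classifier.predict(text) == topic]


# Hypothetical training data for two labels.
clf = NaiveBayes()
clf.train("football match goal league", "sports")
clf.train("team wins football cup", "sports")
clf.train("stock market shares fall", "other")
clf.train("bank raises interest rates", "other")

# Hypothetical candidate links with their surrounding text.
candidates = [
    ("http://example.com/a", "football league cup final"),
    ("http://example.com/b", "interest rates and shares"),
]
on_topic = filter_frontier(clf, candidates, "sports")
```

In a distributed setting, each crawler node could apply such a filter before adding links to its URL queue, so that off-topic pages never consume bandwidth or storage.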
Authors and Affiliations
Dani Gunawan, Amalia Amalia, Atras Najwan
Using random search and brute force algorithm in factoring the RSA modulus
Abstract. The security of the RSA cryptosystem is directly proportional to the size of its modulus, n. The modulus n is a multiplication of two very large prime numbers, notated as p and q. Since modulus n is public, a c...
The Determining Gender Using Facial Recognition Based On Neural Network With Backpropagation
One area of science that can apply facial recognition applications is artificial intelligence. The algorithms used in facial recognition are quite numerous and varied, but they all have the same three basic stages, face...
Data Analysis as a Method to gather Data to study the Relation between Fundamental Rights and Rule of Law
The research described here involves the relation between fundamental rights and the rule of law. In the end we want to find out what meaning is attributed to ‘rule of law’ by the European Court of Human Rights. In this...
Efficiency of Local Government Units in North Western Philippines as to the Attainment of the Millennium Development Goals
This study entitled “Efficiency of Local Government Units in Northwestern Philippines as to the Attainment of the Millennium Development Goals” determined the performance of the four provinces and eight cities in Region...
On Factoring The RSA Modulus Using Tabu Search
It is intuitively clear that the security of RSA cryptosystem depends on the hardness of factoring a very large integer into its two prime factors. Numerous studies about integer factorization in the field of number theo...