An Emperical Study of Clustering Algorithms to extract Knowledge from PubMed Articles

Journal Title: Transactions on Machine Learning and Artificial Intelligence - Year 2017, Vol 5, Issue 3

Abstract

Extracting useful information from biomedical literature has become a major research thrust, because almost all articles are now available on the web in electronic form. Information retrieval (IR) from biomedical literature means finding useful patterns in an unstructured text corpus that satisfy an information need. In this paper, intelligent text analysis is carried out on PubMed articles related to the influenza virus. Various algorithms are discussed that reveal information from these articles, such as the year-wise count of articles containing influenza-related terms (e.g., H1N1, H5N1, and H7N1) and the publication count per country, which indicates outbreaks of the disease in those countries. The articles can be grouped by searching for the "influenza virus strain" keyword pattern with the help of regular expressions. Automatic text categorization is another challenging issue in text mining. We applied the k-means, fuzzy c-means, and fuzzy c-shell algorithms to the automatic categorization of text articles. The association between words is computed from their co-occurrence, which in turn helps to categorize the documents. The basic k-means clustering algorithm is applied first to cluster the documents; then, to handle the fuzzy nature of words that may belong to more than one cluster, fuzzy c-means clustering is applied to form more accurate clusters. Because the fuzzy c-means method clusters documents that lie in linear spaces but not in circular, spherical, or ellipsoidal spaces, a new method is proposed here that considers clusters of documents within the radius of a circle.
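
The regular-expression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern `STRAIN_RE`, the function `strain_counts_by_year`, and the sample records are all hypothetical, and the sketch assumes strains are written in the H<number>N<number> form used in the abstract (H1N1, H5N1, H7N1).

```python
import re
from collections import Counter

# Assumed pattern: influenza A subtypes written as H<number>N<number>, e.g. H1N1.
STRAIN_RE = re.compile(r"\bH(\d{1,2})N(\d{1,2})\b", re.IGNORECASE)


def strain_counts_by_year(records):
    """Count, per (year, strain), the articles mentioning that strain at least once.

    `records` is an iterable of (year, text) pairs; this layout is an assumption.
    """
    counts = Counter()
    for year, text in records:
        # A set ensures each article contributes at most once per strain.
        strains = {f"H{h}N{n}" for h, n in STRAIN_RE.findall(text)}
        for strain in strains:
            counts[(year, strain)] += 1
    return counts


# Invented sample records, for illustration only.
records = [
    (2009, "Pandemic H1N1 spread rapidly; h1n1 vaccines followed."),
    (2009, "Comparison of H1N1 and H5N1 virulence."),
    (2013, "Surveillance of avian H7N1 in poultry."),
]
counts = strain_counts_by_year(records)
for (year, strain), n in sorted(counts.items()):
    print(year, strain, n)
```

Counting articles rather than raw matches keeps a paper that repeats a strain name many times from dominating the year-wise totals; counting by country would follow the same shape with a different key.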

Authors and Affiliations

Deepak Agnihotri, Kesari Verma, Priyanka Tripathi

  • EP ID EP275516
  • DOI 10.14738/tmlai.53.3106

How To Cite

Deepak Agnihotri, Kesari Verma, Priyanka Tripathi (2017). An Emperical Study of Clustering Algorithms to extract Knowledge from PubMed Articles. Transactions on Machine Learning and Artificial Intelligence, 5(3), 13-27. https://europub.co.uk/articles/-A-275516