Big Data Analysis using R and Hadoop
Journal Title: International Journal of Computational Engineering and Management IJCEM - Year 2014, Vol 17, Issue 5
Abstract
Big data (heavy in volume, highly volatile, vastly varied, and complex) has entered our lives in such a way that it is becoming more difficult day by day to manage it and to gain business advantage from it. This paper describes what big data is and how to process it by applying tools and techniques to analyze it, visualize it, and predict future market trends. The tools and techniques described in this paper combine the R language, described here as the future of statistics, with Hadoop, which provides parallel processing of the data, so that the best data model can be run over big data in parallel. Integrating R and Hadoop gives a new environment in which R code can be written and deployed in Hadoop without any data movement. Using R and Hadoop helps organizations resolve scalability issues and carry out predictive analysis with high performance. Combining R and Hadoop allows a much deeper dive into big data.
Authors and Affiliations
Anju Gahlawat