Real-Time Analysis of Students’ Activities on an E-Learning Platform based on Apache Spark
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 7
Abstract
Real-time analytics is the capacity to extract valuable insights from data that arrives continuously from activities on the web or from network sensors. It is widely used in web-based business to drive decisions based on users’ experiences, such as dynamic pricing and personalized advertising. Many universities have adopted web-based learning in their learning process. They use data-mining techniques to better understand students’ behavior, but most of the tools developed rely on historical, stored data and do not allow real-time reactivity. Learners’ online activities generate, at high speed, a huge amount of data in the form of user interactions, which have all the characteristics of Big Data. Dealing with the volume and velocity of these data, in order to inform decision-makers and enable them to act at the right time, leads us to use new methods to capture E-Learning data and process it in real time. This paper focuses on the design and implementation of a modern, hybrid real-time data pipeline architecture that uses Apache Flume to collect data, Apache Spark as a unified computation engine for performing analytics on students’ activity data, and Apache Hive as a data warehouse for storing the processed data and serving various reporting tools. To design this platform, we conduct an experiment on a Moodle database source.
Authors and Affiliations
Abdelmajid Chaffai, Larbi Hassouni, Houda Anoun