Detailed Information


Big Data Processing on Single Board Computer Clusters: Exploring Challenges and Possibilities

Authors
Lee, Eunseo; Oh, Hyunju; Park, Dongchul
Issue Date
Oct-2021
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Keywords
Economic indicators; Big Data; Servers; Media; Sparks; Universal Serial Bus; Power demand; Raspberry Pi; big data; Hadoop; Spark; UFS; SBC; single board computer; cluster
Citation
IEEE ACCESS, v.9, pp 142551 - 142565
Pages
15
Journal Title
IEEE ACCESS
Volume
9
Start Page
142551
End Page
142565
URI
https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/146366
DOI
10.1109/ACCESS.2021.3120660
ISSN
2169-3536
Abstract
For more than a decade, "big data" has been an industry and academia buzz phrase. Over this time, many companies adopted Apache Hadoop and Spark frameworks for their massive data storage and analysis efforts, using powerful, energy-hungry, general-purpose servers as their big data processing platforms. But not all industry or academic fields want, or even need, such large systems. Moreover, capital costs aside, power consumption has also become a primary data center concern. Consequently, lower-cost, lower-power microservers have emerged as viable alternatives in many settings. Now, the latest generation Raspberry Pi (RPi), model 4B, exhibits significant computational performance improvements over its predecessors, and is presently considered a sufficiently powerful single board computer (SBC) to run many mainstream operating systems and accommodate heavy workloads. This paper reexamines SBC cluster big data processing possibilities by integrating the most powerful (presently) RPi model, the RPi 4B with 4 gigabytes (GB) of main memory. We examine external storage's impact on such an SBC cluster's big data processing performance by employing three different external storage solutions with measurably distinct performance characteristics. Moreover, we discuss challenges we encountered and identify further SBC cluster performance optimizations. We perform several representative big data application benchmarks and measure various key performance metrics such as execution time, power consumption, throughput, and performance-per-dollar. Our extensive experiments and comprehensive studies conclude that this current, fourth-generation RPi has evolved to become the first generation to effectively run massive (i.e., more than 100 GB) workloads in big data processing applications.
Appears in Collections
College of Engineering > Division of Software > 1. Journal Articles


