WorldCist'18 - 6th World Conference on Information Systems and Technologies

Redundant Independent Files (RIF): A Technique for Reducing Storage and Resources in Big Data Replication

Most cloud computing storage systems use a distributed file system (DFS), such as the Hadoop Distributed File System (HDFS) or the Google File System (GFS), to store big data. A DFS achieves high reliability and availability by replicating data and storing it as multiple copies. However, replication increases storage and resource consumption. This paper addresses these issues by presenting a decentralized hybrid model. The model, called CPRIF, combines a cloud provider (CP) with a proposed service that we call Redundant Independent Files (RIF). The CP provides HDFS without replication, and the RIF acts as a service layer that splits data into three parts and uses the XOR operation to generate a fourth part as parity. These four parts are stored as independent HDFS files on the CP. The generated parity file not only guarantees the security and reliability of the data but also reduces storage space, resource consumption, and operational costs. It also improves write and read performance. The proposed model was implemented on a cloud storage system that we built using three physical servers (Dell T320) running a total of 12 virtual nodes. The TeraGen benchmark tool and Java code were used to test the model. Experimental results show that the proposed model decreased storage space by 35% compared to other models and improved data writing and reading performance by about 34%.
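The split-and-parity step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (which the abstract says was tested with Java code on HDFS); it is a plain in-memory Python illustration, and the function names `xor_bytes`, `split_with_parity`, and `recover`, as well as the zero-byte padding scheme, are assumptions made for the example.

```python
from functools import reduce


def xor_bytes(*blocks):
    """XOR any number of equal-length byte blocks, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data):
    """Split data into three equal parts and append an XOR parity part.

    Assumption for this sketch: the data is padded with zero bytes so that
    its length is a multiple of three.
    """
    part_len = -(-len(data) // 3)  # ceiling division
    padded = data.ljust(part_len * 3, b"\0")
    parts = [padded[i * part_len:(i + 1) * part_len] for i in range(3)]
    return parts + [xor_bytes(*parts)]


def recover(parts, original_len):
    """Rebuild the original data when at most one of the four parts is None."""
    missing = [i for i, part in enumerate(parts) if part is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can rebuild only one missing part")
    parts = list(parts)
    if missing:
        # XOR of the three surviving parts equals the missing one.
        survivors = [part for part in parts if part is not None]
        parts[missing[0]] = xor_bytes(*survivors)
    # Drop the parity part and the padding.
    return b"".join(parts[:3])[:original_len]


data = b"big data stored as four independent files"
parts = split_with_parity(data)

damaged = list(parts)
damaged[1] = None  # simulate the loss of one stored file
restored = recover(damaged, len(data))
```

In this sketch each of the four parts is one third of the padded input, so any single lost part can be rebuilt from the other three without keeping a full copy of the data.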

Mostafa Kaseb
Faculty of Computers and Information, Fayoum University
Egypt

Mohamed Khafagy
Faculty of Computers and Information, Fayoum University
Egypt

Ihab Ali
Faculty of Engineering, Helwan University
Egypt

ElSayed Saad
Faculty of Engineering, Helwan University
Egypt

 
