COSTECH Integrated Repository

Performance evaluation of NoSQL databases on streaming data

dc.creator Mfungo, Dani
dc.date 2018-10-15T17:15:46Z
dc.date 2017
dc.date.accessioned 2022-10-20T13:46:54Z
dc.date.available 2022-10-20T13:46:54Z
dc.identifier Mfungo, D. (2017). Performance evaluation of NoSQL databases on streaming data. Dodoma: The University of Dodoma.
dc.identifier http://hdl.handle.net/20.500.12661/505
dc.identifier.uri http://hdl.handle.net/20.500.12661/505
dc.description Dissertation (MSc Computer Science)
dc.description The main purpose of this dissertation was to evaluate the performance of Cassandra and HBase, two NoSQL databases in the column-oriented category, in handling streaming data. The dataset used for the evaluation was constructed with the help of the Twitter Streaming API. The evaluation environment was Apache Spark, with its ability to plot streaming data from the source using Spark-R. Several studies were reviewed to derive the evaluation metrics, which include computation time, memory used, bytes read and written, and CPU usage. Both column-family NoSQL databases (Cassandra and HBase) were benchmarked. The researcher benchmarked four different implementations, setting the time interval to 5 seconds, 10 seconds, 5 minutes, and 10 minutes, for 10 iterations over 20 days. The performance of the two NoSQL databases was evaluated in terms of computation time, with throughput and latency as the metrics. Cassandra appears to have better overall write performance than HBase as the streaming workload increases, while HBase shows lower overall performance because of its high average latencies, particularly in write operations. To improve accuracy, the results of each test were averaged.
dc.language en
dc.publisher The University of Dodoma
dc.subject Database
dc.subject Database Performance Evaluation
dc.subject Data
dc.subject Streaming data
dc.subject Cassandra
dc.subject HBase
dc.subject Data set
dc.subject Database performance
dc.subject NoSQL databases
dc.title Performance evaluation of NoSQL databases on streaming data
dc.type Dissertation


Files in this item

File: DANI MFUNGO - REPORT.pdf
Size: 1.603Mb
Format: application/pdf
