CRN 15393 Database Systems and Approaches Assignment Sample 2023
Introduction
The recent growth of web applications, together with advances in distributed and cloud computing, has produced data sets so large that they are difficult to manage with a simple database system. Distributed, non-relational storage is a good way to provide high availability, scalability and improved performance; Amazon Dynamo and Google's Bigtable are examples of such storage.
Records were originally maintained manually. Advances in technology made this easier by allowing records to be stored in a database and retrieved with queries. A database can store anything from simple text to complex data, and it is periodically refined to delete redundant, unwanted and inconsistent data so that operations run effectively.
The relational model then made it possible to split a single data store into smaller tables to avoid redundancy. The information in those tables can be combined and retrieved using the Structured Query Language (SQL). The relational database management system (RDBMS) is the most commonly used kind of database because of its simplicity. It supports operations such as insertion, retrieval, modification, deletion, joins and aggregate functions on existing tables, and it can retrieve data stored in different tables and present the result as if it came from a single table. SQL Server, Oracle Database and MySQL are examples of RDBMS products.
The rapid growth of data that is not uniform in structure is a major challenge for relational databases. Non-relational databases emerged in recent years to address this problem. NoSQL databases gained ground because they can store large amounts of data while remaining highly scalable and efficient.
Comparison of RDBMS with NoSQL
An RDBMS can store three kinds of data: structured, semi-structured and unstructured. Structured data is easy to handle because it is already in the required format, but storing semi-structured and unstructured data efficiently requires more compromise and effort. Semi-structured data must first be converted into relational form, and unstructured data is stored as a binary large object (BLOB).
The need to handle non-uniform data led to the creation of NoSQL databases; the name is also read as "Not only SQL". They emerged as an alternative to relational databases and are chosen when requirements such as availability, scalability and fault tolerance matter most. Data in a NoSQL database is not stored in table form, that is, it does not use the row-and-column approach. Because it is a distributed, non-relational database, it supports horizontal scalability: capacity is increased by adding servers rather than by upgrading the hardware of a single server, as in the vertical scaling typical of an RDBMS.
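The difference between adding servers and upgrading one server can be illustrated with a minimal sketch. It assumes a simple hash-based partitioning scheme, which is only one of several ways a distributed store might spread keys across servers; the node names and the helpers `server_for` and `distribute` are hypothetical and do not belong to any particular product.

```python
import hashlib

def server_for(key: str, servers: list[str]) -> str:
    """Pick a server for a key by hashing the key (illustrative scheme only)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]

def distribute(keys: list[str], servers: list[str]) -> dict[str, list[str]]:
    """Group keys by the server that would store them."""
    placement = {name: [] for name in servers}
    for key in keys:
        placement[server_for(key, servers)].append(key)
    return placement

keys = [f"user:{i}" for i in range(1000)]

# Horizontal scaling: capacity grows by adding a server, not by upgrading one.
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

for name, stored in four_nodes.items():
    print(name, len(stored))
```

With plain modulo hashing, adding a node changes the placement of many existing keys. Real systems often use consistent hashing or a similar technique to limit how much data has to move, which this sketch does not attempt.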
Importance of NoSQL
NoSQL databases were developed to manage large, varied data sets that change continuously. They are mainly used in distributed systems and cloud databases, and they were created as an alternative to relational databases. Limitations such as a rigid schema are removed: a NoSQL database does not enforce a strict schema as an RDBMS does. Scalability, fault tolerance and availability are the most important factors that make NoSQL an alternative to RDBMS. NoSQL can handle big data efficiently, coping with both the velocity and the variety of complex data. Because it is horizontally scalable, load can be balanced by adding new nodes to the cluster. Because the data is distributed and stored with built-in redundancy, the failure of one server does not bring the system down; the remaining servers take over the work of the faulty node, which gives it fault tolerance. NoSQL databases are classified into four types based on their properties:
- Graph database: based on graph theory. Examples: Neo4j and Titan.
- Key-value store: data is stored in two parts, a key and a value. Examples: Redis, Riak and DynamoDB.
- Column store: data is stored in sections of columns. Examples: HBase, Bigtable and Cassandra.
- Document database: an extension of the key-value store in which each value is a document with a complex structure. Examples: MongoDB and CouchDB.
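As a rough illustration of the four categories above, the same small data set can be sketched with plain Python structures. This is only a conceptual, simplified model: it does not use the API of Neo4j, Redis, HBase, MongoDB or any other product, and all keys, field names and the helper `followers_of` are invented for the example.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"user:1": '{"name": "Asha", "city": "Leeds"}'}

# Document store: the value is a structured, possibly nested document
# whose individual fields can be addressed.
documents = {
    "user:1": {
        "name": "Asha",
        "city": "Leeds",
        "orders": [{"item": "book", "qty": 2}],
    },
}

# Column store: values are grouped by column first, then by row key.
columns = {
    "name": {"user:1": "Asha", "user:2": "Ben"},
    "city": {"user:1": "Leeds", "user:2": "York"},
}

# Graph database: nodes plus labelled edges between them.
nodes = {"user:1": {"name": "Asha"}, "user:2": {"name": "Ben"}}
edges = [("user:1", "FOLLOWS", "user:2")]

def followers_of(node_id):
    """Return the ids of nodes that have a FOLLOWS edge pointing at node_id."""
    return [src for src, label, dst in edges if label == "FOLLOWS" and dst == node_id]

print(documents["user:1"]["orders"][0]["item"])
print(columns["city"]["user:2"])
print(followers_of("user:2"))
```

The point of the sketch is the shape of the data: the key-value entry can only be fetched as a whole, the document exposes nested fields, the column layout makes it cheap to read one attribute for many rows, and the graph makes relationships first-class.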
The limitations of these databases are described by the CAP theorem, where C stands for consistency, A for availability and P for partition tolerance. The theorem, also known as Brewer's theorem, states that a shared-data system can provide at most two of these three properties at the same time.
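The trade-off stated by the theorem can be made concrete with a toy model. The sketch below assumes two replicas and a single switch that simulates a network partition; the classes `Replica` and `ToyStore` and the mode labels "CP" and "AP" are hypothetical and greatly simplify how real systems behave.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class ToyStore:
    """Two replicas with a fixed policy for writes during a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if not self.partitioned:
            replica.data[key] = value
            other.data[key] = value  # replicate to the other copy
            return True
        if self.mode == "CP":
            return False  # refuse the write: stays consistent, not available
        replica.data[key] = value  # accept locally: available, may diverge
        return True

    def consistent(self):
        return self.a.data == self.b.data

for mode in ("CP", "AP"):
    store = ToyStore(mode)
    store.write(store.a, "x", 1)
    store.partitioned = True  # the replicas can no longer reach each other
    accepted = store.write(store.a, "x", 2)
    print(mode, "accepted:", accepted, "consistent:", store.consistent())
```

Under the partition, the "CP" store rejects the write and its replicas stay identical, while the "AP" store accepts it and the replicas disagree until they can be reconciled.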
Relational databases are the most commonly used because of their simplicity. In an RDBMS, data is divided into multiple tables by applying normalization so that it can be handled more efficiently. The data can be reassembled as the user requires with the Structured Query Language (SQL), whose statements are grouped into categories that include DDL (Data Definition Language), DML (Data Manipulation Language) and DCL (Data Control Language). NoSQL databases, by contrast, typically support transactions on a single record only, rely on replication, and offer eventual consistency; transactions in NoSQL are commutative.
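The relational workflow described here can be sketched with Python's built-in `sqlite3` module and an in-memory database. The table and column names are hypothetical and chosen only for illustration. DCL statements such as GRANT and REVOKE are left out because SQLite does not implement them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define two normalized tables instead of one table that repeats
# the customer's details on every order.
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        city TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item        TEXT NOT NULL
    );
""")

# DML: insert rows into each table.
conn.execute("INSERT INTO customers (id, name, city) VALUES (1, 'Asha', 'Leeds')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "book"), (1, "pen")],
)

# A join reassembles the data as if it were stored in a single table.
rows = conn.execute("""
    SELECT c.name, c.city, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(rows)
conn.close()
```

The customer's name and city are stored once, yet every row returned by the join carries them, which is the reassembly that normalization relies on.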
Comparison of NoSQL databases based on their functional features
S.No | Feature | Key-value store | Document store | Wide-column store | Graph store |
1 | Denormalization | Supported | Not supported | Supported | Supported |
2 | Single aggregate | Supported | Applicable | Supported | Not supported |
3 | Atomicity | Supported | Applicable | Supported | Not supported |
4 | Unordered keys | Supported | Not supported | Supported | Not supported |
5 | Derived table | Not supported | Not supported | Supported | Not supported |
6 | Composite key | Not supported | Not supported | Applicable | Not supported |
7 | Aggregation | Supported | Applicable | Not supported | Not supported |
8 | Adjacency lists | Supported | Applicable | Not supported | Not supported |
9 | Nested sets | Supported | Applicable | Not supported | Not supported |
10 | Join | Not supported | Not supported | Not supported | Not supported |
The table above shows the functional features of the four categories of NoSQL database. A key-value store should avoid derived tables, composite keys and joins. A document store should avoid denormalization, unordered keys, derived tables, composite keys and joins. A wide-column store should avoid aggregation, adjacency lists, nested sets and joins. A graph store supports only denormalization. Joins are not supported by any of the NoSQL categories. An RDBMS supports all the functional features listed in the table.
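The reading of the table given above can be checked mechanically. The sketch below transcribes the table into a small Python dictionary and lists, for each category, the features marked as not supported. The one-letter encoding and the helper `unsupported` are assumptions made for this example.

```python
STORES = ("Key-value", "Document", "Wide-column", "Graph")

# One letter per store, in the order of STORES:
# S = supported, A = applicable, N = not supported.
FEATURES = {
    "Denormalization":  "SNSS",
    "Single aggregate": "SASN",
    "Atomicity":        "SASN",
    "Unordered keys":   "SNSN",
    "Derived table":    "NNSN",
    "Composite key":    "NNAN",
    "Aggregation":      "SANN",
    "Adjacency lists":  "SANN",
    "Nested sets":      "SANN",
    "Join":             "NNNN",
}

def unsupported(store):
    """Return the features the table marks as not supported for a store."""
    col = STORES.index(store)
    return [feature for feature, marks in FEATURES.items() if marks[col] == "N"]

for store in STORES:
    print(store, "->", ", ".join(unsupported(store)))
```

For the key-value column this yields derived table, composite key and join, which matches the summary in the text; the other three columns can be compared in the same way.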
Influence of NoSQL in organizations
Many organizations are now moving towards big data analytics to predict their future needs and to find ways to improve their business. Data analysts use the available data to make useful predictions and to offer suggestions to the organization. With advances in technology, most organizations have moved their data to the cloud and rely on web applications, so they are migrating to NoSQL because data collected through the web takes many different forms.
Journals on data analytics
Three journals that cover emerging technology and ideas in big data analytics are Big Data Research, the International Journal of Data Science and Analytics, and IEEE Transactions on Big Data.
Conclusion
SQL databases rely on vertical scaling (more powerful hardware), while NoSQL databases support horizontal scaling (more servers). NoSQL emerged recently as an alternative to RDBMS that supports big data. RDBMS is commonly used for structured data storage. NoSQL is widely used in web-based and IoT applications, where the volume of generated data is huge and its structure varies. The growth of web and IoT applications has made NoSQL a common choice for big data because it handles unstructured data more flexibly.