Complex SQL-NoSQL Query Translation for Data Lake Management
[Abstract] A data lake is an extremely large data repository. Data lakes store large amounts of data and use advanced analytics to combine data from multiple sources containing structured, semi-structured, and unstructured information. NoSQL databases such as MongoDB, Redis, Neo4j, and Cassandra are non-tabular: they store data in forms other than relational tables. Depending on their data model, NoSQL databases fall mainly into four types: document, key-value, wide-column, and graph. Compared with traditional relational databases, NoSQL offers simpler scalability and higher overall performance. NoSQL databases can store different types of data, but because they are non-relational they cannot fully support Atomicity, Consistency, Isolation, and Durability (ACID) features, such as trigger functions in multi-transaction management. Thus, an interpreter is necessary for SQL-to-NoSQL queries. We used ANTLR (ANother Tool for Language Recognition), whose workflow has five main stages: the input SQL query, the tokenizer, the parser, the parse tree, and finally the generation of the query results in NoSQL. The tool lets users write a flexible multi-pass language parser, which is expected to solve problems in querying complex ACID functions and other problems with complex queries in NoSQL databases. In the measurement, analysis, and evaluation of the translation results, each NoSQL criterion was compared against a relational database management system (MySQL), and the scores obtained were as follows. The performance criterion achieved the highest score (98.40%), obtained with the MongoDB database, followed by scalability (97.40%) and accuracy (97.00%). The criterion with the lowest score was complexity (91.65%).
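The five stages named in the abstract (input SQL query, tokenizer, parser, parse tree, NoSQL generation) can be illustrated with a minimal sketch. This is not the authors' ANTLR grammar: it is a hand-written Python stand-in that assumes a tiny SQL subset (`SELECT ... FROM ... WHERE ... AND ...`) and a MongoDB-style target. All names here (`tokenize`, `Parser`, `to_mongo`, `translate`) are hypothetical, and the output is a plain dictionary shaped like a MongoDB filter and projection rather than a call to any real driver.

```python
import re

# Stage 2 (tokenizer): patterns are tried in order, so keywords win over identifiers.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r"'[^']*'"),
    ("KEYWORD", r"\b(?:SELECT|FROM|WHERE|AND)\b"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"<=|>=|<>|!=|=|<|>"),
    ("COMMA", r","),
    ("STAR", r"\*"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.IGNORECASE)

# Assumed mapping from SQL comparison operators to MongoDB query operators.
MONGO_OPS = {"=": "$eq", "!=": "$ne", "<>": "$ne",
             "<": "$lt", "<=": "$lte", ">": "$gt", ">=": "$gte"}


def tokenize(sql):
    """Split the SQL text into (kind, text) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {sql[pos]!r} at position {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        tokens.append((kind, text.upper() if kind == "KEYWORD" else text))
    return tokens


class Parser:
    """Stages 3 and 4: recursive-descent parser that builds a small parse tree."""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else ("EOF", "")

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1] or 'end of input'}")
        self.i += 1
        return tok[1]

    def parse_select(self):
        self.expect("KEYWORD", "SELECT")
        if self.peek()[0] == "STAR":
            self.expect("STAR")
            columns = []  # empty list stands for "all columns"
        else:
            columns = [self.expect("IDENT")]
            while self.peek()[0] == "COMMA":
                self.expect("COMMA")
                columns.append(self.expect("IDENT"))
        self.expect("KEYWORD", "FROM")
        table = self.expect("IDENT")
        conditions = []
        if self.peek() == ("KEYWORD", "WHERE"):
            self.expect("KEYWORD", "WHERE")
            conditions.append(self.parse_condition())
            while self.peek() == ("KEYWORD", "AND"):
                self.expect("KEYWORD", "AND")
                conditions.append(self.parse_condition())
        self.expect("EOF")
        return {"columns": columns, "table": table, "where": conditions}

    def parse_condition(self):
        field = self.expect("IDENT")
        op = self.expect("OP")
        kind, text = self.peek()
        if kind == "NUMBER":
            value = float(text) if "." in text else int(text)
        elif kind == "STRING":
            value = text[1:-1]  # strip the surrounding single quotes
        else:
            raise SyntaxError(f"expected a literal, got {text or 'end of input'}")
        self.i += 1
        return (field, op, value)


def to_mongo(tree):
    """Stage 5: walk the parse tree and emit a MongoDB-style query description."""
    clauses = [{f: {MONGO_OPS[op]: v}} for f, op, v in tree["where"]]
    if not clauses:
        filt = {}
    elif len(clauses) == 1:
        filt = clauses[0]
    else:
        filt = {"$and": clauses}
    # None means "no projection", i.e. return every field (SELECT *).
    projection = {c: 1 for c in tree["columns"]} or None
    return {"collection": tree["table"], "filter": filt, "projection": projection}


def translate(sql):
    """Stage 1 is the input string; the remaining stages are chained here."""
    return to_mongo(Parser(tokenize(sql)).parse_select())
```

Under these assumptions, `translate("SELECT name, age FROM users WHERE age >= 21 AND city = 'Taipei'")` returns `{"collection": "users", "filter": {"$and": [{"age": {"$gte": 21}}, {"city": {"$eq": "Taipei"}}]}, "projection": {"name": 1, "age": 1}}`. A grammar-driven tool such as ANTLR replaces the hand-written tokenizer and parser with generated ones, which is what makes joins, nested queries, and the complex cases the abstract targets tractable; this sketch covers none of those.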
[Subject classification] Computer Science (General)
[Keywords] Translation; SQL; NoSQL; Data Lake; ANTLR