Analysis Ready Data in Analytics Optimized Data Stores for Analysis of Big Earth Data in the Cloud
[Abstract] Cloud computing offers the possibility of making the analysis of Big Data approachable for a wider community due to affordable access to computing power, an ecosystem of usable tools for parallel processing, and migration of many large datasets to archives in the cloud, allowing data-proximal computing. Generally, data analysis acceleration in the cloud comes from running multiple nodes in a split-apply-combine strategy. Data systems such as the Earth Observing System Data and Information System are in a position to "pre-split" the data by storing them in a data store that is optimized for data-parallel computing, i.e., an Analytics-Optimized Data Store (AODS). A variety of approaches to AODS are possible, from highly scalable databases to scalable filesystems to data formats optimized for cloud access (e.g., Zarr and cloud-optimized datasets), with the optimal choice dependent on both the types of analysis and the geospatial structure of the data. A key question is how much preprocessing of the data to do, both before splitting and as the first part of the apply step. Again, the geospatial structure of the data and the analysis type influence the decision, with the added complexity of the user type. Trans-disciplinary users who are not well-versed in the nuances of quality-filtering and georeferencing of remote sensing orbit/swath/scene data tend to ask for more highly processed data, relying on the data provider to make sensible decisions on preprocessing parameters. (This accounts for the popularity of "Level 3" gridded data, despite the lower spatial resolution it provides.) In this case, data can be preprocessed before the split, resulting in higher performance in the rest of the "apply" step, which can be transformative for use cases such as interactive data exploration at scale.
Discipline researchers who are experienced with remote sensing data often prefer more flexibility in customizing the preprocessing of data into Analysis Ready Data, resulting in more need for on-the-fly preprocessing.
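The split, apply, and combine steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`split`, `apply_chunk`, `combine`, `chunked_mean`), the fill value of -9999.0, and the tiny in-memory grid are all assumptions chosen for illustration, and a thread pool stands in for multiple cloud nodes. The quality filter runs inside the apply step, which corresponds to the on-the-fly preprocessing case; preprocessing before the split would remove that work from each worker.

```python
from concurrent.futures import ThreadPoolExecutor

FILL_VALUE = -9999.0  # hypothetical missing-data marker, assumed for this sketch


def split(grid, rows_per_chunk):
    """Split a 2-D grid (a list of rows) into row-wise chunks."""
    return [grid[i:i + rows_per_chunk] for i in range(0, len(grid), rows_per_chunk)]


def apply_chunk(chunk):
    """Quality-filter one chunk and return a partial (sum, count)."""
    valid = [v for row in chunk for v in row if v != FILL_VALUE]
    return sum(valid), len(valid)


def combine(partials):
    """Merge the partial sums and counts into a global mean."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else float("nan")


def chunked_mean(grid, rows_per_chunk=2, workers=4):
    """Run split, apply (in parallel), and combine over the grid."""
    chunks = split(grid, rows_per_chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(apply_chunk, chunks))
    return combine(partials)


grid = [
    [1.0, 2.0, 3.0],
    [4.0, FILL_VALUE, 6.0],
    [7.0, 8.0, 9.0],
    [FILL_VALUE, 11.0, 12.0],
]
print(chunked_mean(grid))
```

In an analytics-optimized store the `split` call would be unnecessary at analysis time, because the data would already be stored in chunks that each worker can read independently.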
[Publication date] 2019-12-09 [Issuing organization] 
[Level of authority]  [Subject classification] Software
[Keywords] PREPROCESSING;REMOTE SENSING;DATA SYSTEMS;DATA STORAGE;PARALLEL PROCESSING (COMPUTERS);FUNCTIONAL DESIGN SPECIFICATIONS;INFORMATION SYSTEMS;MIGRATION [Timeliness] 