Time-Based Data Streams: Fundamental Concepts for a Data Resource for Streams
[Abstract] Real-time data, which we call data streams, are readings from instruments or from environmental, bodily, or building sensors. They are generated at regular intervals and, because of their volume, often need to be processed in real time. Often only a single pass can be made over the data, and the decision to discard or keep an instance is made on the spot. Moreover, the stream is for all practical purposes indefinite, so decisions must be made on incomplete knowledge. This notion of a data stream raises a different set of issues from, for instance, a file that is byte-streamed to a reader. The file is finite, so the byte stream is a processing convenience more than a fundamentally different kind of data. Over the course of the project we examined three aspects of streaming data: first, techniques to handle streaming data in a distributed system organized as a collection of web services; second, the notion of the dashboard and real-time controllable analysis constructs in the context of the Fermi Tevatron Beam Position Monitor; and third, provenance collection for stream processing, such as might occur as raw observational data flows from the source and undergoes correction, cleaning, and quality control. The impact of this work is severalfold. We were among the first to argue that streams have little value unless aggregated, a notion that is now gaining general acceptance. We were also among the first groups to grapple with the provenance of stream data.
[Publication date] 2009-10-10 [Issuing organization] 
[Validity level]  [Subject classification] Mathematics (general)
[Keywords] distributed systems; data management [Timeliness] 