Tools and theory to improve data analysis
[Abstract] This thesis proposes a scientific model to explain the data analysis process. I argue that data analysis is primarily a procedure to build understanding and, as such, it dovetails with the cognitive processes of the human mind. Data analysis tasks closely resemble the cognitive process known as sensemaking. I demonstrate how data analysis is a sensemaking task adapted to use quantitative data. This identification highlights a universal structure within data analysis activities and provides a foundation for a theory of data analysis. The model identifies two competing challenges within data analysis: the need to make sense of information that we cannot know and the need to make sense of information that we cannot attend to. Classical statistics provides solutions to the first challenge, but has little to say about the second. However, managing attention is the primary obstacle when analyzing big data. I introduce three tools for managing attention during data analysis. Each tool is built upon a different method for managing attention. ggsubplot creates embedded plots, which transform data into a format that can be easily processed by the human mind. lubridate helps users automate sensemaking outside of the mind by improving the way computers handle date-time data. Visual Inference Tools develop expertise in young statisticians that can later be used to efficiently direct attention. The insights of this thesis are especially helpful for consultants, applied statisticians, and teachers of data analysis.
[Publication date]  [Publishing institution] Rice University
[Validity level] analysis [Subject classification] 
[Keywords]  [Timeliness] 