Clustering Survival Data using Random Forest and Persistent Homology Open Access
[Abstract] Survival data are most often analyzed with the Cox proportional hazards model to identify factors associated with patients' survival time. Recently, however, the random survival forest (RSF), a non-parametric ensemble estimation method constructed by bagging classification trees for survival data, has been used as an alternative for better survival prediction and for ranking the importance of the covariates associated with it. In addition to identifying variable importance for survival prediction, this thesis explores clusters in survival data using the variables identified as important in the RSF analysis. Clustering of survival data (patients) to assess their survival experience was investigated using random forest clustering based on partitioning around medoids and persistent homology (PH), a topological data analysis (TDA) technique, for cluster identification in the lowest dimension (dimension zero). With both methods, we identified groups of patients with different survival experience, accounting for the covariates most important in determining that experience. The clusters formed were tested for significant differences in survival experience (log-rank test) and were found to differ. PH was then applied to explore more detailed characteristic features of patients at a higher dimension (dimension one). Both clustering methods offer a promising way to explore groups of patients, giving insight into patient handling and valuable information for providing quality service to patients who need more attention. All analyses in this thesis were carried out on two datasets: the kidney dataset and the liver dataset.
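As a rough illustration of what cluster identification at dimension zero amounts to, the sketch below computes the dimension-zero persistence of a Vietoris-Rips filtration on a toy point cloud. It is not the thesis's code: the six points, the Euclidean distance, the function names `single_linkage` and `largest_gap_scale`, and the largest-gap rule for choosing a scale are all assumptions made for the example. In dimension zero every point is born as its own component at scale 0, and a component dies when an edge first merges it into another, which is the same computation as Kruskal's minimum spanning tree algorithm.

```python
import math
from itertools import combinations


def single_linkage(points, max_scale=math.inf):
    """Merge components along edges no longer than max_scale.

    Returns (deaths, labels): the scales at which dimension-zero classes
    die, in ascending order, and a component label for each point.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (an assumed metric).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        if length > max_scale:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj        # two components merge: one class dies
            deaths.append(length)

    # Relabel component representatives as 0, 1, 2, ... in order of appearance.
    relabel = {}
    labels = [relabel.setdefault(find(i), len(relabel)) for i in range(n)]
    return deaths, labels


def largest_gap_scale(deaths):
    """Pick a scale in the middle of the widest gap between death times."""
    _, low, high = max((b - a, a, b) for a, b in zip(deaths, deaths[1:]))
    return (low + high) / 2


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

deaths, _ = single_linkage(points)           # full filtration
scale = largest_gap_scale(deaths)            # long-lived classes survive this
_, labels = single_linkage(points, scale)    # clusters at the chosen scale
print(deaths)
print(labels)
```

With real patient data, the points would be replaced by whatever dissimilarity the analysis uses, and the resulting cluster labels could then be compared with a log-rank test.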
[Publication date]  [Publishing institution] University of Alberta
[Validity level] Survival [Subject classification] 
[Keywords]  [Timeliness] 