Modeling users' powertrain preferences

[Abstract] Our goal is to construct a system that can determine a driver's preferences and goals and take appropriate actions to help the driver achieve his goals and improve the quality of his road behavior. Because the recommendation problem can be solved effectively once the driver's intention is known, this thesis addresses the problem of determining the driver's preferences. A supervised learning approach has already been applied to this problem. However, because that approach classifies one small interval at a time and is memoryless, it does not perform well for our goal. Instead, we need a new approach with the following characteristics. First, it should consider the entire stream of measurements. Second, it should be tolerant of the environment. Third, it should be able to distinguish various intentions. In this thesis, two different approaches, Bayesian hypothesis testing and inverse reinforcement learning, are used to classify and estimate the user's preferences. Bayesian hypothesis testing classifies the driver as one of several driving types. It assumes that the probability distributions of the features (i.e., average, standard deviation) over a short period of measurement differ among the driving types, maintains a belief distribution over the driving types, and updates it online as more measurements become available. Inverse reinforcement learning, on the other hand, estimates the user's preferences as a linear combination of driving types. It assumes that the driver maximizes a reward function while driving, and that this reward function is a linear combination of raw or expert features.
Based on the observed trajectories of representative drivers, apprenticeship learning first calculates the reward function of each driving type using raw features, and these reward functions then serve as expert features. Next, given the observed trajectories of a new driver, the same algorithm calculates that driver's reward function using expert features rather than raw features, and thereby estimates the preferences of any driver in the space of driving types.
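The online belief update behind Bayesian hypothesis testing can be illustrated with a minimal sketch. It is not the thesis's implementation: the driving-type names, the single scalar window feature, and the Gaussian likelihood parameters below are all hypothetical, chosen only to show how a belief over driving types sharpens as windows of measurements arrive.

```python
import math

# Hypothetical Gaussian models (mean, std) of one window feature,
# e.g. average acceleration over a short measurement window.
# These names and numbers are illustrative assumptions only.
DRIVING_TYPES = {
    "eco": (0.5, 0.2),
    "normal": (1.0, 0.3),
    "sporty": (2.0, 0.5),
}


def gaussian_logpdf(x, mean, std):
    """Log-density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def update_log_belief(log_belief, feature):
    """One Bayes step in log space: posterior is proportional to prior times likelihood."""
    log_post = {
        t: log_belief[t] + gaussian_logpdf(feature, *DRIVING_TYPES[t])
        for t in log_belief
    }
    # Normalize with the log-sum-exp trick to avoid underflow.
    m = max(log_post.values())
    log_z = m + math.log(sum(math.exp(lp - m) for lp in log_post.values()))
    return {t: lp - log_z for t, lp in log_post.items()}


def classify(features):
    """Start from a uniform prior and fold in each window feature in turn."""
    n = len(DRIVING_TYPES)
    log_belief = {t: -math.log(n) for t in DRIVING_TYPES}
    for f in features:
        log_belief = update_log_belief(log_belief, f)
    belief = {t: math.exp(lb) for t, lb in log_belief.items()}
    return max(belief, key=belief.get), belief


if __name__ == "__main__":
    label, belief = classify([1.9, 2.1, 2.2, 1.8])
    print(label, belief)
```

Keeping the belief in log space means a driving type whose probability underflows to zero does not cause an error on the next update. Unlike a memoryless per-interval classifier, the belief here carries evidence from every earlier window.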

[Publication date] [Publishing institution] Massachusetts Institute of Technology

[Validity level] [Subject classification]

[Keywords] [Timeliness]
