概率多智能体系统中的模型检验
Model Checking for Probabilistic Multiagent Systems
-
摘要:研究背景 在多智能体系统中,智能体往往并不具有整个系统的完全信息。因此一个自然的限制就是智能体在选择动作时仅能依靠自己的局部信息,这种仅根据局部信息进行动作选择的策略称为一致策略。考虑一致策略使得多智能体系统的分析与验证变得困难。在非概率的多智能体系统中,不完全信息下的分析与验证已经有很多研究工作,但是在概率多智能体系统中,相关研究工作仅停留在理论层面,而且主要是复杂度分析,没有可用的算法。目的 本文旨在研究概率多智能体系统中不完全信息场景下的模型检验问题,研究的系统包含四种因素:概率行为、非确定性动作选择、不完全信息以及多智能体参与,研究的逻辑包含概率计算树逻辑和认知逻辑,研究目的是提供可用的算法。方法 对于无记忆的一致策略下的模型检验问题,本文提出了两种算法,一种是将模型检验问题转换为非线性整数规划问题(MINLP),然后通过SMT(Satisfiability Modulo Theories)求解器解决。另一种是近似算法,基于UCT(the Upper Confidence bounds applied to Trees)算法,这是一种基于蒙特卡洛树搜索的算法。结果 实验结果表明,基于MINLP的算法能够给出精确的结果,但能处理的模型规模较小,且表现严重依赖SMT求解器。而基于UCT的算法能够处理的模型规模较大,能够在更短时间内给出结果,而且它是一个任意时间算法,即能够在运行过程中给出当前发现的结果。基于UCT算法的局限性在于它是近似算法,表现可能会随着随机数的改变发生变化,本文建议增大时间限制以缓解这个问题。结论 本文研究了概率多智能体系统中的概率认知时序逻辑的模型检验问题,并对可判定的子类问题提出了两种算法,包括一个精确算法和一个近似算法。实验结果证明了这两种算法的有效性。本文工作的局限在于未考虑交替时序逻辑(Alternating Temporal Logic)和随机化策略(Randomised Schedulers),在下一步工作中,我们将进行这些方向上的尝试。Abstract: In multiagent systems, agents usually do not have complete information of the whole system, which makes the analysis of such systems hard. The incompleteness of information is normally modelled by means of accessibility relations, and the schedulers consistent with such relations are called uniform. In this paper, we consider probabilistic multiagent systems with accessibility relations and focus on the model checking problem with respect to the probabilistic epistemic temporal logic, which can specify both temporal and epistemic properties. However, the problem is undecidable in general. We show that it becomes decidable when restricted to memoryless uniform schedulers. Then, we present two algorithms for this case: one reduces the model checking problem into a mixed integer non-linear programming (MINLP) problem, which can then be solved by Satisfiability Modulo Theories (SMT) solvers, and the other is an approximate algorithm based on the upper confidence bounds applied to trees (UCT) algorithm, which can return a result whenever queried. These algorithms have been implemented in an existing model checker and then validated on experiments. The experimental results show the efficiency and extendability of these algorithms, and the algorithm based on UCT outperforms the one based on MINLP in most cases.