Journal of Computer Science and Technology


Shapelet Based Two-step Time Series Positive and Unlabeled Learning

Han-Bo Zhang (张翰博), Peng Wang* (王鹏), Member, CCF, Ming-Ming Zhang (张明明), and Wei Wang (汪卫), Member, CCF

  1. School of Computer Science, Fudan University, Shanghai, China, 200438
  • Received: 2021-01-25; Revised: 2022-12-16; Accepted: 2022-12-22
  • Contact: Peng Wang
  • About author: Peng Wang received his Ph.D. degree from Fudan University, Shanghai, in 2007. He is now a professor in the School of Computer Science, Fudan University, Shanghai. His research interests include databases, data mining, and series data processing. He has published more than 30 papers in refereed international journals and conference proceedings.

In the last decade, there has been significant progress in time series classification. However, in real-world industrial settings, it is expensive and difficult to obtain high-quality labeled data. As a result, the positive and unlabeled learning (PU-learning) problem has attracted growing attention in recent years. Current PU-learning approaches for time series data suffer from low accuracy because no negatively labeled time series are available. In this paper, we propose a novel shapelet based two-step (2STEP) PU-learning approach. In the first step, we generate shapelet features based on the positive time series, and use them to select a set of negative examples. In the second step, based on both the positive and the negative time series, we select the final features and build the classification model. The experimental results show that our 2STEP approach improves the average F1 score on 15 datasets by 9.1% compared with the baseline, and achieves the highest F1 score on 10 out of 15 time series datasets.


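1. Research background (context):
Then, using a procedure similar to the one used to find the P-shapelets, the N series are concatenated to generate an N-shapelet set that represents the set of N (negative) examples. This set is merged with the P-shapelets to form the full pool of shapelet candidates for later use.
For each shapelet in the candidate pool, the minimum distance to each time series sample in DS is computed (a process called shapelet transformation), which produces the time series features. These features are filtered by our CSI score to obtain the training set. P and N serve as the training set for building an SVM classifier, and the remaining examples in the U set serve as the test set.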
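The shapelet transformation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain Euclidean distance over sliding windows without normalization, and the names `shapelet_distance` and `shapelet_transform` and the toy data are hypothetical. The CSI-based feature filtering and the SVM training are not shown, because this page does not define them.

```python
import math
from typing import List, Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed distance measure)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best_sq = math.inf  # best squared distance found so far
    for start in range(len(series) - m + 1):
        sq = 0.0
        for offset in range(m):
            diff = series[start + offset] - shapelet[offset]
            sq += diff * diff
            if sq >= best_sq:  # early abandoning: this window cannot win
                break
        best_sq = min(best_sq, sq)
    return math.sqrt(best_sq)


def shapelet_transform(
    dataset: Sequence[Sequence[float]],
    candidates: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Map each series to a feature vector with one distance per candidate."""
    return [[shapelet_distance(s, c) for c in candidates] for s in dataset]


# Toy data (hypothetical): two series and two shapelet candidates.
dataset = [[0.0, 1.0, 2.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
candidates = [[1.0, 2.0, 1.0], [0.0, 0.0]]
features = shapelet_transform(dataset, candidates)
```

Each row of `features` corresponds to one time series and each column to one shapelet candidate, so the resulting matrix could be passed to a feature selection step and then to a classifier such as an SVM.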
4. Results (Result & Findings):

Key words: positive unlabeled learning; time series; shapelet

