Citation: Qian TW, Wang Y, Xu YJ et al. A model-agnostic hierarchical framework towards trajectory prediction. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 40(2): 322–339, Mar. 2025. DOI: 10.1007/s11390-023-3013-4
Predicting the future trajectories of multiple agents is essential for many real-world applications, such as surveillance systems, autonomous driving, and social robots. The trajectory prediction task is influenced by many factors, including each agent's historical trajectory, interactions between agents, and the uncertain nature of the observed agents’ motion. While existing methods have made great progress on trajectory prediction, they treat all of this information uniformly, which limits how effectively it can be used. To this end, we propose a model-agnostic framework that views the information at two hierarchical levels. The first level is the inter-trajectory view. At this level, we observe that the difficulty of prediction varies across trajectory samples, so we define trajectory difficulty and train the proposed framework in an “easy-to-hard” order. The second level is the intra-trajectory view. We find that the factors influencing a particular trajectory can be divided into two parts. The first part is global features, which remain stable within a trajectory, e.g., the expected destination. The second part is local features, which change over time, e.g., the current position. We believe that the two types of information should be handled in different ways. This hierarchical view makes it possible to exploit the information in a fine-grained way. Experimental results validate the effectiveness of the proposed model-agnostic framework.
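The “easy-to-hard” idea at the inter-trajectory level can be illustrated with a small sketch. The abstract does not say how trajectory difficulty is defined, so the score below is purely an assumption for illustration: the mean error of a constant-velocity extrapolation against the true future positions. The function names `difficulty` and `curriculum_stages` are likewise hypothetical and are not taken from the paper.

```python
from math import hypot


def difficulty(traj, n_obs):
    """Hypothetical difficulty score (an assumption, not the paper's definition).

    traj is a list of (x, y) positions; the first n_obs points are observed
    and the rest are the future to be predicted. The score is the mean
    distance between a constant-velocity extrapolation and the true future.
    """
    (x0, y0), (x1, y1) = traj[n_obs - 2], traj[n_obs - 1]
    vx, vy = x1 - x0, y1 - y0  # velocity from the last two observed points
    future = traj[n_obs:]
    errors = [
        hypot(x1 + vx * k - fx, y1 + vy * k - fy)
        for k, (fx, fy) in enumerate(future, start=1)
    ]
    return sum(errors) / len(errors)


def curriculum_stages(trajs, n_obs, n_stages):
    """Yield growing training subsets, easiest samples first."""
    ranked = sorted(trajs, key=lambda t: difficulty(t, n_obs))
    for stage in range(1, n_stages + 1):
        # Each stage exposes a larger share of the ranked samples.
        cutoff = max(1, len(ranked) * stage // n_stages)
        yield ranked[:cutoff]
```

Because the sketch only reorders the training samples, it can wrap any underlying predictor, which is one way to read the "model-agnostic" claim. A training loop would fit the model on each yielded subset in turn, so the hardest samples are only seen in the final stage.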