Pajon Q, Serre S, Wissocq H et al. Balancing accuracy and training time in federated learning for violence detection in surveillance videos: A study of neural network architectures. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 39(5): 1029−1039 Sept. 2024. DOI: 10.1007/s11390-024-3702-7.

Balancing Accuracy and Training Time in Federated Learning for Violence Detection in Surveillance Videos: A Study of Neural Network Architectures

More Information
  • Author Bio:

    Quentin Pajon obtained his engineering degree in information technology, with a focus on artificial intelligence and data science, from Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, in 2023. Currently, he works as a data scientist at Capgemini, where he specializes in using machine learning and artificial intelligence to address complex industrial challenges. His primary research interests lie in the development and application of AI and data science techniques to enhance industrial processes and drive innovation.

    Swan Serre obtained his engineering degree in information technology, with a focus on artificial intelligence and data science, from Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, in 2023. Passionate about artificial intelligence, he is now poised to apply his expertise in the fields of AI and data science, leveraging his strong background in programming languages.

    Hugo Wissocq obtained his engineering degree in information technology, with a focus on artificial intelligence and data science, from Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, in 2023. His research interests include deep learning, Python programming, and the development of AI models for practical applications.

    Léo Rabaud obtained his engineering degree in information technology, with a focus on artificial intelligence and data science, from Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, in 2023. His research interests include data analysis and business intelligence.

    Siba Haidar obtained her Master's degree in computer science from the Faculty of Sciences of the Lebanese University, Beirut, in 2001, and her Ph.D. degree in computer science, focusing on multimedia processing, from University Paul Sabatier, Toulouse, in 2005. She is currently an associate professor at Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, where her research is primarily in machine learning, video analysis, and intelligent tutoring systems.

    Antoun Yaacoub received his Master's degree in computer science from the Lebanese University, Beirut, and University Paul Sabatier, Toulouse, in 2008, followed by his Ph.D. degree in artificial intelligence from the University of Toulouse, Toulouse, in 2012. He currently serves as a lecturer and researcher at Ecole Supérieure d'Informatique Electronique Automatique (ESIEA), Paris, specializing in AI. His current research interests span various facets of AI, with a particular focus on anomaly detection.

  • Corresponding author:

    antoun.yaacoub@esiea.fr

  • Received Date: August 21, 2023
  • Accepted Date: June 26, 2024
  • This paper presents an original investigation into the domain of violence detection in videos, introducing an innovative approach tailored to the unique challenges of a federated learning environment. The study encompasses a comprehensive exploration of machine learning techniques, leveraging spatio-temporal features extracted from benchmark video datasets. In a notable departure from conventional methodologies, we introduce a novel architecture, the “Diff Gated” network, designed to streamline preprocessing and training while simultaneously enhancing accuracy. Our exploration of advanced machine learning techniques, such as super-convergence and transfer learning, expands the horizons of federated learning, offering a broader range of practical applications. Moreover, our research introduces a method for seamlessly adapting centralized datasets to the federated learning context, bridging the gap between traditional machine learning and federated learning approaches. The outcome of this study is a remarkable advancement in the field of violence detection, with our federated learning model consistently outperforming state-of-the-art models, underscoring the transformative potential of our contributions. This work represents a significant step forward in the application of machine learning techniques to critical societal challenges.
