Journal of Computer Science and Technology

   

Multi-Feature Fusion Based Structural Deep Neural Network for Predicting Answer Time on Stack Overflow

Shi-Kai Guo¹ (郭世凯), Member, CCF; Si-Wen Wang¹,² (王思文); Hui Li¹,* (李辉), Member, CCF; Yu-Long Fan¹ (范玉龙); Ya-Qing Liu¹ (刘亚清); and Bin Zhang³ (张斌)

  ¹ College of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
  ² Navigation College, Dalian Maritime University, Dalian 116026, China
  ³ College of Computer Science and Technology, Shandong Technology and Business University, Yantai 264005, China
  • Received: 2021-03-15; Revised: 2023-02-09; Accepted: 2023-02-24
  • Contact: Hui Li, E-mail: li_hui@dlmu.edu.cn
  • About author: Hui Li received his bachelor's, master's, and doctoral degrees from Northeastern University, Shenyang, in 2006, 2008, and 2013, respectively. He is an associate professor at the College of Information Science and Technology, Dalian Maritime University. His research interests include software testing, artificial intelligence, and complex networks.

Stack Overflow provides a platform where developers seek solutions by asking questions and receiving answers on a wide range of topics. However, many questions are not answered quickly enough. Because questioners want to know how long they will wait for an answer, reporting an expected answer time for each question becomes an important task for Stack Overflow. To address this issue, we propose a model for predicting the answer time of questions, named Predicting Answer Time (PAT), which consists of two parts: a feature acquisition and fusion model, and a deep neural network model. The framework uses a variety of features mined from Stack Overflow questions, including the question description, question title, question tags, the creation time of the question, and other temporal features. These features are fused and fed into the deep neural network to predict the answer time of the question. As a case study, post data from Stack Overflow are used to assess the model. As baselines we use traditional regression algorithms: Linear Regression, K-Nearest Neighbors Regression, Support Vector Regression, Multilayer Perceptron Regression, and Random Forest Regression. Experimental results show that the PAT model predicts the answer time of questions more accurately than the traditional regression algorithms, reducing the error of the predicted answer time by nearly 10 hours.
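The fusion step described above can be pictured with a minimal sketch. It is not the authors' implementation: the hashing-trick text encoding, the cyclic time encoding, the vector sizes, and the names `hashed_bag_of_words`, `time_features`, and `fuse_features` are all assumptions made for illustration. The sketch only shows how title, description, tags, and creation time might each become a numeric vector and then be concatenated into one input for a neural network.

```python
import math
import zlib
from datetime import datetime, timezone


def hashed_bag_of_words(text, dim):
    """Map free text to a fixed-length count vector via the hashing trick."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    return vec


def time_features(created):
    """Encode hour-of-day and day-of-week cyclically so 23:00 sits near 00:00."""
    hour_angle = 2 * math.pi * created.hour / 24
    day_angle = 2 * math.pi * created.weekday() / 7
    return [math.sin(hour_angle), math.cos(hour_angle),
            math.sin(day_angle), math.cos(day_angle)]


def fuse_features(title, body, tags, created, text_dim=16, tag_dim=8):
    """Concatenate the per-field vectors into one fused input vector."""
    return (hashed_bag_of_words(title, text_dim)
            + hashed_bag_of_words(body, text_dim)
            + hashed_bag_of_words(" ".join(tags), tag_dim)
            + time_features(created))


# Hypothetical question, invented for this example.
fused = fuse_features(
    title="How do I reverse a list in Python?",
    body="I have a list and want its elements in reverse order.",
    tags=["python", "list"],
    created=datetime(2021, 3, 15, 6, 0, tzinfo=timezone.utc),
)
print(len(fused))  # 16 + 16 + 8 + 4 = 44
```

Concatenation is the simplest form of fusion; the fused vector would then be passed to a regression network that outputs a single value, the predicted answer time.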


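Chinese Abstract

1. Context
Developers often run into technical problems during software development, and posting a specific question to get a targeted answer from online experts is one of the most common ways to resolve them. How long a question waits for a response, however, depends on many factors, including how the question is phrased, how detailed its description is, the number of categories it belongs to, and the number of users who are online and interested in it. Existing work focuses on predicting whether a question will be answered within a given time interval; predicting the specific response time has not yet been reported. If the response time can be predicted accurately and efficiently, and users are given a clear expectation of it, developers can plan their work more sensibly, which improves both their productivity and their experience of the platform.
2. Objective
Our goal is to predict the response time of questions on Q&A websites. Current online Q&A websites have an obvious shortcoming: they give no explicit estimated response time for the questions users post. Providing such an estimate would help users plan their time more sensibly and work more efficiently, improve their experience of the platform, and make the Q&A website more popular with users.
3. Method
We propose the PRT model, which combines a deep neural network with multi-feature fusion. It extracts and analyzes multiple features of a question, fuses the relevant features, and feeds them to a fully connected neural network to predict the response time of questions on Q&A websites. The performance of the PRT model is measured by the average relative error. We run experiments on question data from the Stack Overflow platform to demonstrate the effectiveness of the PRT model.
4. Results and Findings
For questions posted on Q&A websites, the response time predicted by the PRT model differs from the actual response time by about 5.5 hours on average, whereas the prediction error of traditional regression models is about 15 hours. The PRT model reduces the average relative error by nearly 10 hours, so it outperforms traditional regression algorithms in predicting the response time of questions on Stack Overflow.
5. Conclusions
We treat response-time prediction as a regression problem: we identify the set of features that affect a question's response time and combine feature fusion with a deep neural network to predict the specific response time. For a newly posted question, the model directly predicts its specific response time, and users can decide based on the prediction whether to pursue another solution or keep waiting for an acceptable answer, which helps them manage their time better. A series of experiments shows that the proposed framework for predicting question response time performs well. We also discuss other potential directions for improvement, such as replacing the fully connected neural network with a convolutional neural network or a recurrent neural network, and improving performance through model refinement and parameter optimization.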
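Because the errors above are reported in hours, the comparison between the proposed model and the regression baselines can be read as a difference between two average errors. The sketch below assumes the error is the mean absolute gap between predicted and actual response times; the abstract does not give the exact formula, and the function name and all numbers are invented for illustration rather than taken from the paper.

```python
def mean_absolute_error_hours(predicted, actual):
    """Average absolute gap, in hours, between predicted and actual times."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Toy values in hours, invented for illustration; not data from the paper.
actual_hours = [2.0, 10.0, 30.0, 4.0]
model_a_hours = [3.0, 8.0, 26.0, 5.0]     # hypothetical model predictions
model_b_hours = [12.0, 1.0, 12.0, 16.0]   # hypothetical baseline predictions

mae_a = mean_absolute_error_hours(model_a_hours, actual_hours)
mae_b = mean_absolute_error_hours(model_b_hours, actual_hours)
print(mae_a, mae_b, mae_b - mae_a)  # 2.0 12.25 10.25
```

Under this reading, a reduction of "nearly 10 hours" corresponds to the difference between the baseline's average error and the model's average error on the same set of questions.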

Keywords: answer time; neural network; Stack Overflow


ISSN 1000-9000 (Print), 1860-4749 (Online)
CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 