SRNET: A Shallow Skip Connection Based Convolutional Neural Network Design for Resolving Singularities


    Abstract: Convolutional neural networks (CNNs) have shown tremendous progress and performance in recent years. Since their emergence, CNNs have exhibited excellent performance in most classification and segmentation tasks. Currently, the CNN family includes various architectures that dominate major vision-based recognition tasks. However, building a neural network (NN) by simply stacking convolution blocks inevitably limits its optimization ability and introduces overfitting and vanishing-gradient problems. One of the key causes of these issues is network singularities, which produce degenerate manifolds in the loss landscape, leading to a slow learning process and lower performance. In this scenario, skip connections have become an essential unit of CNN design for mitigating network singularities. The idea of this research is to introduce skip connections into an NN architecture to augment information flow, mitigate singularities, and improve performance. This research experiments with different levels of skip connections and proposes a placement strategy for these links in any CNN. To test the proposed hypothesis, we designed an experimental CNN architecture, named Shallow Wide ResNet or SRNet, as it uses a wide residual network as its base design. We performed numerous experiments to assess the validity of the proposed idea. Two well-known datasets, CIFAR-10 and CIFAR-100, are used for training and testing the CNNs. The final empirical results show promising outcomes in terms of performance, efficiency, and reduction of network-singularity issues.
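The role a skip connection plays in the abstract above can be illustrated with a minimal NumPy sketch. This is not the SRNet implementation; `conv_block` is a hypothetical stand-in for a stack of convolution layers, reduced to a linear map plus ReLU for brevity:

```python
import numpy as np

def conv_block(x, weight):
    # Hypothetical stand-in for a convolution stack:
    # a linear transform followed by ReLU.
    return np.maximum(0.0, x @ weight)

def residual_block(x, weight):
    # Skip connection: the input is added to the block's output,
    # giving information (and gradients) a direct path around
    # the transformation.
    return conv_block(x, weight) + x

x = np.ones((1, 4))
w = np.zeros((4, 4))            # degenerate weights: the plain block kills the signal
plain = conv_block(x, w)        # output is all zeros
skipped = residual_block(x, w)  # input still passes through via the skip path
```

With degenerate weights the plain block maps everything to zero, whereas the residual block still propagates the input, which is the intuition behind skip connections easing the singularity and vanishing-gradient problems the abstract describes.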

     

