TY - JOUR
T1 - Designing architectures of convolutional neural networks to solve practical problems
AU - Ferreira, Martha Dais
AU - Corrêa, Débora Cristina
AU - Nonato, Luis Gustavo
AU - de Mello, Rodrigo Fernandes
PY - 2018/3/15
Y1 - 2018/3/15
N2 - The Convolutional Neural Network (CNN) figures among the state-of-the-art Deep Learning (DL) algorithms due to its robustness to data shift and scale variations, and its capability of extracting relevant information from large-scale input data. However, setting appropriate parameters to define CNN architectures is still a challenging issue, particularly when tackling real-world problems. A typical approach consists in empirically assessing different CNN settings in order to select the most appropriate one. This procedure has clear limitations, including the choice of suitable predefined configurations as well as the high computational cost involved in evaluating each of them. This work presents a novel methodology to tackle these issues, providing mechanisms to estimate effective CNN configurations, including the size of convolutional masks (convolutional kernels) and the number of convolutional units (CNN neurons) per layer. Based on False Nearest Neighbors (FNN), a well-known tool from the area of Dynamical Systems, the proposed method helps estimate CNN architectures that are less complex and produce good results. Our experiments confirm that architectures estimated through the proposed approach are as effective as the complex ones defined by empirical and computationally intensive strategies.
AB - The Convolutional Neural Network (CNN) figures among the state-of-the-art Deep Learning (DL) algorithms due to its robustness to data shift and scale variations, and its capability of extracting relevant information from large-scale input data. However, setting appropriate parameters to define CNN architectures is still a challenging issue, particularly when tackling real-world problems. A typical approach consists in empirically assessing different CNN settings in order to select the most appropriate one. This procedure has clear limitations, including the choice of suitable predefined configurations as well as the high computational cost involved in evaluating each of them. This work presents a novel methodology to tackle these issues, providing mechanisms to estimate effective CNN configurations, including the size of convolutional masks (convolutional kernels) and the number of convolutional units (CNN neurons) per layer. Based on False Nearest Neighbors (FNN), a well-known tool from the area of Dynamical Systems, the proposed method helps estimate CNN architectures that are less complex and produce good results. Our experiments confirm that architectures estimated through the proposed approach are as effective as the complex ones defined by empirical and computationally intensive strategies.
KW - Architecture assessment
KW - Convolutional neural network
KW - Dynamical systems
KW - Face recognition
KW - Handwritten digit recognition
KW - Object recognition
UR - http://www.scopus.com/inward/record.url?scp=85032942184&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2017.10.052
DO - 10.1016/j.eswa.2017.10.052
M3 - Article
AN - SCOPUS:85032942184
SN - 0957-4174
VL - 94
SP - 205
EP - 217
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -