Mathematical Foundations of Symmetrized Hyperbolic Tangent Activation in Neural Networks
Authors/Creators
Description
This paper presents a theoretical analysis of neural network operators activated by the symmetrized hyperbolic tangent function, focusing on robustness and convergence in multivariate function approximation. The symmetrized hyperbolic tangent has attracted interest as an activation function because it can improve approximation accuracy. The paper first sets out the mathematical formulation of the activation function and its associated density functions, whose symmetry properties underpin the analysis.

It then introduces theorems on five aspects of these operators: convergence in general function spaces, robustness under adversarial perturbations, stability in high-dimensional spaces, uniform convergence in Sobolev spaces, and adaptive robustness under noise. According to these results, the operators achieve enhanced convergence rates in general function spaces, remain stable in high-dimensional settings, and are robust to adversarial perturbations and noise, which makes them suitable for real-world applications where data integrity cannot be guaranteed.

By providing this theoretical foundation, the work supports the development of more reliable and efficient neural network models for complex approximation tasks, and it may inform the design of neural network architectures and training algorithms.
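The description does not state the activation function or its density explicitly. As a minimal sketch, the Python below assumes one formulation found in the literature on parametrized, deformed hyperbolic tangent operators: a q-deformed tanh `g`, a density `M` built from shifted copies of `g`, and a symmetrized density `phi` averaging the `q` and `1/q` versions. The names `g`, `M`, `phi` and the parameters `q` and `lam` are illustrative assumptions, not notation confirmed by this record.

```python
import math


def g(x, q=2.0, lam=1.5):
    """Assumed q-deformed tanh: (e^{lam x} - q e^{-lam x}) / (e^{lam x} + q e^{-lam x}).

    Dividing numerator and denominator by sqrt(q) shows this equals
    tanh(lam * x - ln(q) / 2), which is the numerically stable form used here.
    """
    return math.tanh(lam * x - 0.5 * math.log(q))


def M(x, q=2.0, lam=1.5):
    """Assumed density built from g: M(x) = (g(x + 1) - g(x - 1)) / 4."""
    return 0.25 * (g(x + 1.0, q, lam) - g(x - 1.0, q, lam))


def phi(x, q=2.0, lam=1.5):
    """Assumed symmetrized density: the average of the q and 1/q densities.

    Because M_q(-x) = M_{1/q}(x), this average is an even function of x.
    """
    return 0.5 * (M(x, q, lam) + M(x, 1.0 / q, lam))


if __name__ == "__main__":
    x = 0.7
    # Symmetry check: phi(x) and phi(-x) should agree up to rounding.
    print("phi(x), phi(-x):", phi(x), phi(-x))
    # Integer shifts of phi should sum to about 1 (telescoping sum of g).
    total = sum(phi(0.3 - i) for i in range(-60, 61))
    print("sum over integer shifts:", total)
```

Under these assumptions, the two checks illustrate the symmetry property mentioned in the description and the fact that integer shifts of the density sum to one, which is what allows such densities to serve as weights in approximation operators.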
Files
document.pdf (357.6 kB, md5:3584eeb3747af322ed06ef1d72c452b5)
References
- Cybenko, George. "Approximation by superpositions of a sigmoidal function." *Mathematics of Control, Signals and Systems* 2.4 (1989): 303-314. https://doi.org/10.1007/BF02551274
- Hornik, Kurt, Maxwell Stinchcombe, and Halbert White. "Multilayer feedforward networks are universal approximators." *Neural Networks* 2.5 (1989): 359-366. https://doi.org/10.1016/0893-6080(89)90020-8
- Nair, Vinod, and Geoffrey E. Hinton. "Rectified linear units improve restricted Boltzmann machines." *Proceedings of the 27th International Conference on Machine Learning* (ICML-10). 2010.
- Maas, Andrew L., Awni Y. Hannun, and Andrew Y. Ng. "Rectifier nonlinearities improve neural network acoustic models." *Proc. ICML.* Vol. 30. No. 1. 2013.
- He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification." *Proceedings of the IEEE International Conference on Computer Vision*. 2015.
- Ramachandran, Prajit, Barret Zoph, and Quoc V. Le. "Searching for activation functions." *arXiv preprint arXiv:1710.05941* (2017).
- Yarotsky, Dmitry. "Error bounds for approximations with deep ReLU networks." *Neural Networks* 94 (2017): 103-114. https://doi.org/10.1016/j.neunet.2017.07.002
- Hanin, Boris. "Universal function approximation by deep neural nets with bounded width and ReLU activations." *Mathematics* 7.10 (2019): 992. https://doi.org/10.3390/math7100992
- Kidger, Patrick, and Terry Lyons. "Universal approximation with deep narrow networks." *Conference on Learning Theory*. PMLR, 2020.
- Lu, Jianfeng, et al. "Deep network approximation for smooth functions." *SIAM Journal on Mathematical Analysis* 53.5 (2021): 5465-5506. https://doi.org/10.1137/20M134695X
- Bengio, Yoshua, Yann LeCun, and Geoffrey Hinton. "Deep learning for AI." *Communications of the ACM* 64.7 (2021): 58-65. https://doi.org/10.1145/3448250