Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality
Description
Inserting an SVD meta-layer into neural networks tends to make the
covariance matrix ill-conditioned, which can harm the model's training
stability and generalization ability. In this paper, we systematically
study how to improve the covariance conditioning by enforcing
orthogonality on the Pre-SVD layer. We first investigate existing
orthogonality treatments of the weights; these techniques improve the
conditioning but hurt performance. To avoid this side effect, we propose
the Nearest Orthogonal Gradient (NOG) and the Optimal Learning Rate
(OLR). The effectiveness of our methods is validated in two
applications: decorrelated Batch Normalization (BN) and Global
Covariance Pooling (GCP). Extensive experiments on visual recognition
demonstrate that our methods can simultaneously improve the covariance
conditioning and generalization. Moreover, combining them with
orthogonal weights further boosts performance.
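The abstract does not define NOG in detail, but the standard way to obtain the nearest orthogonal matrix to a given matrix (in the Frobenius norm) is to take the polar factor UVᵀ from its SVD. The sketch below illustrates that construction with NumPy; the function name `nearest_orthogonal` and its use as a stand-in for a gradient projection are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def nearest_orthogonal(G):
    """Return the closest orthogonal matrix to G in the Frobenius norm.

    This is the polar factor U @ Vt from the SVD G = U S Vt. An
    orthogonal matrix has condition number 1, i.e. it is perfectly
    conditioned. (Illustrative sketch, not the paper's code.)
    """
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))   # stand-in for a weight gradient
Q = nearest_orthogonal(G)

print(np.allclose(Q @ Q.T, np.eye(4)))   # Q is orthogonal
print(np.linalg.cond(Q))                 # condition number is ~1
```

Projecting each gradient onto the nearest orthogonal matrix in this manner keeps the update well-conditioned, which is consistent with the abstract's claim that NOG improves covariance conditioning without the performance penalty of constraining the weights themselves.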
Files

136840352 (1).pdf (6.8 MB)
md5:cf7d0235819eff07d1e9e1c73076b0e6