Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling?
Description
Global Covariance Pooling (GCP) aims to exploit the second-order statistics of convolutional features. Its effectiveness in boosting the classification performance of Convolutional Neural Networks (CNNs) has been demonstrated. Singular Value Decomposition (SVD) is used in GCP to compute the matrix square root. However, the approximate matrix square root calculated via the Newton-Schulz iteration [14] outperforms the accurate one computed via SVD [15]. We empirically analyze the reason behind this performance gap from the perspectives of data precision and gradient smoothness. Various remedies for computing smooth SVD gradients are investigated. Based on our observations and analyses, a hybrid training protocol is proposed for SVD-based GCP meta-layers so that they achieve performance competitive with the Newton-Schulz iteration. Moreover, we propose a new GCP meta-layer that uses SVD in the forward pass and Padé approximants in backward propagation to compute the gradients. The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performance on both large-scale and fine-grained datasets.
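For reference, the coupled Newton-Schulz iteration mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the function name, the iteration count, and the Frobenius-norm pre-normalization (needed for convergence) are illustrative choices.

```python
import numpy as np

def newton_schulz_sqrt(A, num_iters=10):
    """Approximate the square root of an SPD matrix A via the coupled
    Newton-Schulz iteration (illustrative sketch, not the paper's code)."""
    n = A.shape[0]
    I = np.eye(n)
    # Pre-normalize so the iteration converges (||I - A/norm|| < 1 for SPD A).
    norm = np.linalg.norm(A, "fro")
    Y = A / norm        # Y_k -> sqrt(A / norm)
    Z = I.copy()        # Z_k -> inverse sqrt of A / norm
    for _ in range(num_iters):
        T = 0.5 * (3.0 * I - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    # Undo the normalization.
    return Y * np.sqrt(norm)

# Usage on a covariance-like SPD matrix, as produced by a GCP layer:
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))
A = X.T @ X / 64 + 1e-3 * np.eye(8)   # well-conditioned covariance matrix
S = newton_schulz_sqrt(A)
print(np.allclose(S @ S, A, atol=1e-4))
```

Unlike SVD, the iteration uses only matrix multiplications, which keeps its gradients smooth and GPU-friendly; that contrast is the starting point of the analysis above.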
Files (4.8 MB)

Name | Size
---|---
Song_Why_Approximate_Matrix_Square_Root_Outperforms_Accurate_SVD_in_Global_ICCV_2021_paper (2).pdf | 4.8 MB

md5: c94ad16e9d650b95c6bc38b5554d4d2c