Conference paper Open Access

Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification

Gao, Yingbo; Herold, Christian; Wang, Weiyue; Ney, Hermann

Prominently used in support vector machines and logistic regression, kernel functions (kernels) implicitly map data points into high-dimensional spaces and make it easier to learn complex decision boundaries. In this work, we explore the use of kernels for contextual word classification by replacing the inner product function in the softmax layer. To compare the individual kernels, we conduct experiments on standard language modeling and machine translation tasks. We observe a wide range of performance across different kernel settings. Building on these results, we look at the gradient properties, investigate various mixture strategies, and examine the disambiguation abilities.
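The core idea of the abstract, swapping the inner product in the softmax layer for a kernel, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the `gamma` parameter, and the choice of an RBF kernel (a standard kernel, not confirmed by this abstract to be one of those the authors evaluate). It is not the authors' implementation.

```python
import math

def inner_product(x, w):
    # Standard softmax score: dot product of context vector and word vector.
    return sum(a * b for a, b in zip(x, w))

def rbf_kernel(x, w, gamma=1.0):
    # Illustrative kernel (assumption): exp(-gamma * squared Euclidean distance).
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, w))
    return math.exp(-gamma * sq_dist)

def kernel_softmax(context, word_vectors, kernel):
    # Score each word with the chosen kernel, then normalize with softmax.
    scores = [kernel(context, w) for w in word_vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy data: one 2-d context vector and three word vectors.
context = [1.0, 0.0]
word_vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

linear_probs = kernel_softmax(context, word_vectors, inner_product)
rbf_probs = kernel_softmax(context, word_vectors, rbf_kernel)
```

Passing `inner_product` recovers the usual softmax layer, so the kernel is the only component that changes between settings.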

Files: IWSLT2019_paper_15.pdf (394.9 kB, md5:01f629061ed819d11ef2f96bc6169d5b)