Published June 16, 2023 | Version v1
Dataset Open

KuaiSAR: A Unified Search And Recommendation Dataset

  • 1. Renmin University of China
  • 2. Kuaishou Technology Co., Ltd.

Description

The confluence of Search and Recommendation (S&R) services is a vital aspect of online content platforms such as Kuaishou and TikTok. Integrating S&R modeling is an intuitive approach that industry practitioners have already adopted. However, little research on it has been conducted in academia, primarily because publicly available datasets are lacking. Consequently, a substantial gap has emerged between academia and industry in this field. To bridge this gap, we introduce KuaiSAR, the first large-scale, real-world dataset of integrated Search And Recommendation behaviors, collected from Kuaishou, a leading short-video app in China with over 300 million daily active users. Previous research in this field has predominantly relied on publicly available datasets that are semi-synthetic or simulated, with artificially fabricated search behaviors. Unlike those datasets, KuaiSAR records genuine user behaviors, whether each interaction occurred within the search service or the recommendation service, and users' transitions between the two services. This work supports joint modeling of S&R, as well as the use of search data for recommenders and of recommendation data for search engines. Additionally, thanks to the diverse feedback labels on user-video interactions, KuaiSAR supports a wide range of other tasks, including intent recommendation, multi-task learning, and long sequential multi-behavior modeling. We believe this dataset will facilitate innovative research and enrich our understanding of S&R service integration in real-world applications.

Files

KuaiSAR.zip

Files (1.3 GB)

Checksum Size
md5:daea8cbf605db6bd5841740f0e4a12d9 488.1 MB
md5:ed630f47fb013c5b31bb020c483ef262 845.1 MB
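
The MD5 checksums in the file listing above can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library. The helper names `md5_of_file` and `matches_listing` are illustrative and not part of the dataset. Because the listing does not show which file name belongs to which checksum, the sketch only checks whether a file matches either listed value.

```python
import hashlib

# Checksums copied from the file listing (without the "md5:" prefix).
LISTED_MD5 = {
    "daea8cbf605db6bd5841740f0e4a12d9",
    "ed630f47fb013c5b31bb020c483ef262",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """True if the file's MD5 digest equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `matches_listing("KuaiSAR.zip")` would return `True` for an intact download, assuming the archive was saved under that local path (the path is an assumption; adjust it to wherever the file was stored).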