Published March 28, 2018 | Version v1
Journal article Open

Linking User Online Behavior across Domains with Internet Traffic

  • 1. Beijing University of Posts and Telecommunications, Beijing, China
  • 2. China Electronics Technology Group Corp., Shenzhen, China
  • 3. Aisino Corporation, Beijing, China

Description

We are facing an era of Online With Offline (OWO) in the smart city: almost everyone uses various online services to connect with friends, watch videos, listen to music, download resources, and so on. Our online behaviors are separated across different domains, which causes serious problems for cross-domain recommendation, advertising, and criminal tracking in the online and offline worlds, because linking the online behaviors that belong to the same natural person is a very challenging task. Existing methods usually tackle the user online behavior linkage problem by estimating the similarity of profile content between two different online services. However, profile content in heterogeneous online services is unreliable or misaligned, and these methods are limited to several services in a specific domain. To link an individual's online behavior across domains, in this paper we propose user Online Behavior Linkage across Domains (OBLD), a novel hybrid model that links user online behavior across domains using Internet traffic. It derives several significant attributes from users' online behaviors, such as user digital identity, various fingerprints of terminals and browsers, and the spatio-temporal behavior of users, and leverages a supervised classification method to discover the relationships between users' online behaviors. The proposed model also has an unsupervised setting for datasets with no or few labeled data, provided that a certain percentage of user digital identities can be extracted from the original dataset. Using real-world network traffic collected from two large provinces in China, we evaluate the OBLD model; the linkage precision reaches 89% and 97.9% on the two datasets, respectively. Notably, the inputs of OBLD, i.e., network traffic flows, cover all online behavior of users who connect to the Internet through the monitored networks, which makes it possible to link the online behaviors of users across the whole online world.
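
As a rough illustration of the kind of pipeline the description outlines, the sketch below turns a pair of behavior records into a feature vector and derives a positive label when both records expose the same digital identity. It is not the OBLD model itself: the `Session` fields, the three features, and the function names are hypothetical choices made for this example, and the abstract does not specify how OBLD encodes its attributes or which classifier it uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Session:
    """One observed online-behavior record (all fields are illustrative assumptions)."""
    service: str             # online service or domain, e.g. "video"
    identity: Optional[str]  # digital identity seen in the traffic, if any
    fingerprint: str         # terminal/browser fingerprint
    cell: str                # coarse location label
    hour: int                # hour of day, 0-23


def pair_features(a: Session, b: Session) -> List[float]:
    """Build a feature vector describing how similar two records are."""
    same_fingerprint = 1.0 if a.fingerprint == b.fingerprint else 0.0
    same_cell = 1.0 if a.cell == b.cell else 0.0
    gap = abs(a.hour - b.hour)
    circular_gap = min(gap, 24 - gap)          # hours wrap around midnight
    time_closeness = 1.0 - circular_gap / 12.0  # 1.0 = same hour, 0.0 = 12 h apart
    return [same_fingerprint, same_cell, time_closeness]


def pseudo_positive(a: Session, b: Session) -> Optional[bool]:
    """Return True when both records carry the same extracted identity.

    Returns None otherwise: differing or missing identities do not prove
    that two records belong to different people.
    """
    if a.identity is not None and a.identity == b.identity:
        return True
    return None


video = Session("video", "user-123", "fp-a", "cell-7", 9)
social = Session("social", "user-123", "fp-a", "cell-7", 9)
music = Session("music", None, "fp-b", "cell-2", 21)

features_linked = pair_features(video, social)
features_unknown = pair_features(video, music)
label_linked = pseudo_positive(video, social)
label_unknown = pseudo_positive(video, music)
```

Feature vectors labeled this way could then be passed to any supervised classifier; how negative examples are obtained is left open here, since the description does not say.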

Files

jucs_article_23072.pdf

md5:6016a18161754362af3e284ac49b121a
1.0 MB