Published February 23, 2025
| Version v1
Dataset
Open
Artifacts for "Did I Vet You Before? Assessing the Chrome Web Store Vetting Process through Browser Extension Similarity"
Description
This record contains the artifacts for the aforementioned paper.
All artifacts are provided in Apache Parquet format. We recommend using Python's pandas library to open these files.
The following is a summary of the provided artifacts:
ground-truth.parquet
: Contains pairs of extensions used as ground truth for evaluating the pipeline employed in our research.
infringing-extensions.parquet
: Metadata of vetted extensions labeled by Google and infringing extensions found by our pipeline. Includes the cluster assigned by HDBSCAN and the UMAP 2D coordinates used for visualizations.
embeddings.parquet
: Vector embeddings of infringing extensions generated by our pipeline.
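As a minimal sketch of the recommended pandas workflow, the snippet below reads each artifact into a DataFrame and prints its shape and column types. The helper names `artifact_paths` and `load_artifact` and the download directory `"."` are assumptions made for illustration; the record does not document the column names, so the sketch inspects the schema instead of assuming one. `pd.read_parquet` needs a Parquet engine such as pyarrow or fastparquet to be installed.

```python
from pathlib import Path

import pandas as pd

# File names as listed in this record.
ARTIFACTS = (
    "ground-truth.parquet",
    "infringing-extensions.parquet",
    "embeddings.parquet",
)


def artifact_paths(data_dir):
    """Map each artifact name to its expected path under data_dir (hypothetical helper)."""
    base = Path(data_dir)
    return {name: base / name for name in ARTIFACTS}


def load_artifact(path):
    """Read one Parquet artifact into a pandas DataFrame (hypothetical helper)."""
    # pandas delegates to an installed Parquet engine (pyarrow or fastparquet).
    return pd.read_parquet(path)


if __name__ == "__main__":
    # "." is an assumed download location; adjust it to where the files were saved.
    for name, path in artifact_paths(".").items():
        if not path.exists():
            print(f"{name}: not found at {path}")
            continue
        df = load_artifact(path)
        # Inspect the schema rather than assuming column names.
        print(name, df.shape)
        print(df.dtypes)
```

The largest file in this record is 1.1 GB, so loading it fully into memory may require a machine with enough RAM.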
Files
(1.1 GB)
Checksum | Size |
---|---|
md5:52697059fb22af793ab99270d59f5258 | 1.1 GB |
md5:d37b9041400bec087a1183e9e33190a0 | 29.8 kB |
md5:bb7ee81958630f15dcfd8465301ca35b | 7.2 MB |
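The MD5 checksums listed for the files can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `matches_checksum` are hypothetical helpers, not part of the artifacts.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:<hex>'."""
    # Accept the value with or without the 'md5:' prefix.
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

Calling `matches_checksum(path, "md5:<hex>")` with the checksum shown for a downloaded file returns `True` when the file is intact. This listing does not pair checksums with file names, so take the expected value from the corresponding file's entry on the record page.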