- ember_with_avclass_dataset.csv: metadata for the EMBER 2018 dataset with AVClass2 run. Columns: ['sha256', 'appeared', 'label', 'avclass_prev', 'subset', 'vt_detections', 'avclass_curr']
- avclass_tag_co_occurrence.alias: tag co-occurrence information given AVClass2 run on the EMBER 2018 dataset
- sim_test_vs_train_test.csv: xgboost-based leaf similarity. Query: test. Knowledge base: train & test
- sim_unlabelled_vs_train.csv: xgboost-based leaf similarity. Query: train. Knowledge base: train
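The column list given for ember_with_avclass_dataset.csv can be checked at load time. The following is a minimal sketch, not part of the dataset's tooling. It assumes the file is comma-separated with a header row, which the listing does not confirm. The helper name `count_by_subset` and the sample rows (including the `train`/`test` subset values) are hypothetical and only illustrate the shape of the data.

```python
import csv
import io
from collections import Counter

# Column names as listed for ember_with_avclass_dataset.csv.
EXPECTED_COLUMNS = ["sha256", "appeared", "label", "avclass_prev",
                    "subset", "vt_detections", "avclass_curr"]


def count_by_subset(handle):
    """Check the header, then count rows per `subset` value.

    Assumption: comma-separated text with a header row.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return Counter(row["subset"] for row in reader)


# Made-up rows for illustration only; not real dataset content.
sample = io.StringIO(
    "sha256,appeared,label,avclass_prev,subset,vt_detections,avclass_curr\n"
    "aaa,2018-01,1,famA,train,40,famA\n"
    "bbb,2018-11,0,,test,0,\n"
    "ccc,2018-02,-1,,train,3,\n"
)
counts = count_by_subset(sample)
print(dict(counts))
```

For the real file, the same helper could be called as `count_by_subset(open("ember_with_avclass_dataset.csv", newline=""))`, again assuming the CSV layout described above.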