Published June 3, 2021 | Version 1.0
Presentation Open

Developing a novel feature space for sequence data analysis; a use-case on SARS-CoV-2 data

  • 1. Institute of Applied Biosciences, Centre for Research and Technology, Hellas

Description

This work creates a novel feature space based on k-mers that can be retrieved from unaligned sequence data. The purpose of these new features is to facilitate the effective application of machine learning algorithms in various scenarios. The method examines all values of k within a user-defined range: it starts from the lower k-values, assigns scores to the k-mers, keeps those with the highest scores, and then proceeds to higher k-values (pruning trees).
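The iterative procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record does not say how k-mers are scored or how pruning works in detail, so the sketch makes three assumptions: the score is the fraction of sequences that contain a k-mer, a fixed number of top-scoring k-mers is kept per level, and a longer k-mer is examined only if its (k-1)-length prefix was kept at the previous level. The names `score_kmers`, `build_feature_space`, and `top_n` are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def score_kmers(sequences, k, allowed_prefixes=None):
    """Score each k-mer by the fraction of sequences containing it.

    ASSUMPTION: this score is a placeholder; the source does not define one.
    If allowed_prefixes is given, k-mers whose first k-1 characters are not
    in that set are skipped (the pruning step).
    """
    counts = Counter()
    for seq in sequences:
        seen = set()
        for kmer in kmers(seq, k):
            if allowed_prefixes is not None and kmer[:-1] not in allowed_prefixes:
                continue
            seen.add(kmer)
        counts.update(seen)  # count each k-mer at most once per sequence
    total = len(sequences)
    return {kmer: count / total for kmer, count in counts.items()}


def build_feature_space(sequences, k_min, k_max, top_n):
    """Return {k: [(kmer, score), ...]} for every k in [k_min, k_max].

    ASSUMPTION: keeping the top_n k-mers per level is an illustrative
    pruning rule, not the one used in the original work.
    """
    features = {}
    kept = None  # no pruning at the lowest k
    for k in range(k_min, k_max + 1):
        scores = score_kmers(sequences, k, kept)
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        features[k] = ranked
        kept = {kmer for kmer, _ in ranked}
        if not kept:
            break  # nothing left to extend
    return features


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGTTT", "GGACGT"]  # toy sequences, not real data
    for k, ranked in build_feature_space(toy, k_min=2, k_max=3, top_n=2).items():
        print(k, ranked)
```

Because only extensions of retained k-mers are considered, the candidate set at each level stays bounded by the number of kept k-mers instead of growing with the alphabet size raised to the power k.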

Files

md5:790162aa970068935d9e69c1dcaac1a5 (7.4 MB)

Additional details

Related works

Describes
Journal article: 10.3389/fgene.2021.618170 (DOI)