Published January 1, 2024 | Version v1
Conference paper Open

PentaPen: Combining Penalized Models to Identify Important SNPs on Whole-genome Arabidopsis thaliana Data

Description

In the rapidly advancing field of genomics, the identification of Single Nucleotide Polymorphisms (SNPs) plays a crucial role in understanding complex phenotypic traits. This study introduces "PentaPen", an innovative computational workflow that combines the strengths of five penalized models to improve the accuracy of SNP detection. We compare the performance of PentaPen with existing models, highlighting its advantages in settings where the number of predictors exceeds the number of samples. Beyond model comparison, we provide insights into PentaPen's effectiveness in utilizing all SNPs as input, streamlining data pre-processing, and leveraging parallel computation, which together mark a considerable stride in SNP detection. Furthermore, a thorough evaluation and comparison of computational complexities indicates the workflow's competitive edge over individual penalized models. As future research directions, we propose applying PentaPen to plant-specific characteristics and suggest further explorations to assess the robustness of its findings. In summary, this manuscript presents the genomics community with a tool that combines computational efficiency with high-precision SNP detection, making a strong contribution to the field of genomic research.
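The description does not say which five penalized models are used or how their selections are merged. Purely as an illustration of the general idea of pooling SNPs chosen by several penalized models, the sketch below counts how many models kept each SNP (a nonzero coefficient) and retains those that reach a vote threshold. The function names, the `min_votes` parameter, the model labels, and the coefficient values are all hypothetical; this is not the PentaPen implementation, and the numbers are not results from the paper.

```python
from collections import Counter


def nonzero_snps(coefs, tol=1e-8):
    """Return the SNP ids whose coefficient was not shrunk to (near) zero."""
    return {snp for snp, beta in coefs.items() if abs(beta) > tol}


def combine_selections(model_coefs, min_votes=1):
    """Pool SNPs selected by several penalized models (hypothetical rule).

    min_votes=1 yields the union of all selections;
    min_votes=len(model_coefs) yields their intersection.
    """
    votes = Counter()
    for coefs in model_coefs.values():
        votes.update(nonzero_snps(coefs))
    return sorted(snp for snp, n in votes.items() if n >= min_votes)


# Toy, made-up coefficients for three placeholder models (assumption, not data).
toy = {
    "model_1": {"snp_a": 0.8, "snp_b": 0.0, "snp_c": -0.3},
    "model_2": {"snp_a": 0.5, "snp_b": 0.0, "snp_c": 0.0},
    "model_3": {"snp_a": 0.0, "snp_b": 0.1, "snp_c": -0.2},
}

union = combine_selections(toy, min_votes=1)      # kept by at least one model
majority = combine_selections(toy, min_votes=2)   # kept by at least two models
consensus = combine_selections(toy, min_votes=3)  # kept by every model
```

A low threshold favors sensitivity (any model can nominate a SNP), while a high threshold favors agreement across models; which trade-off the workflow actually makes is not stated in this record.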

Files

kohli-2024-pentapen.pdf

Size: 631.7 kB
md5:0a0baf704ac768491fcd5b1b6333a444

Additional details