Published February 3, 2026
| Version v1
Journal article
Open
Synthetic-Data Generation for Enhancing Malware and Phishing Determining Performance
Description
ML applications such as malware and phishing detection require security datasets of sufficient quantity, quality, and diversity. In practice, real-world datasets often lack future (zero-day) or evasive variants, are imbalanced, and raise privacy concerns. Synthetic-Data Generation (SDG), including Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), and transformer or large language model (LLM) generation, can be used to expand training corpora, simulate obscure variants, and enable privacy-preserving collaboration. The proposed research model covers the literature background, recent developments (2021-2025), an experimental design, guidelines, ethics and threat assessment, and the expected outcomes. Recent studies, including Mal Data Gen, malware benchmarks, LLM-based phishing synthesis, and GAN-based improvements, are cited to support these claims.
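As a rough illustration of how synthetic samples can offset class imbalance, the sketch below pads a minority class with generated feature vectors. It is not the method of the article: a per-feature Gaussian stands in for a VAE or GAN purely to keep the example dependency-free, and the feature names, toy values, and function names are all hypothetical.

```python
import random
import statistics


def fit_gaussian(rows):
    """Estimate per-feature mean and standard deviation from real samples."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    return means, stdevs


def sample_synthetic(means, stdevs, n, rng):
    """Draw n synthetic feature vectors from independent per-feature Gaussians."""
    return [[rng.gauss(m, s) for m, s in zip(means, stdevs)] for _ in range(n)]


def balance_with_synthetic(benign, malicious, seed=0):
    """Pad the minority (malicious) class with synthetic rows until classes match."""
    rng = random.Random(seed)
    deficit = max(0, len(benign) - len(malicious))
    means, stdevs = fit_gaussian(malicious)
    synthetic = sample_synthetic(means, stdevs, deficit, rng)
    return malicious + synthetic, synthetic


# Hypothetical toy features per sample: [url_length, digit_ratio]
benign = [[20.0, 0.05], [25.0, 0.02], [18.0, 0.04],
          [30.0, 0.03], [22.0, 0.01], [27.0, 0.06]]
malicious = [[75.0, 0.30], [90.0, 0.42]]

augmented, synthetic = balance_with_synthetic(benign, malicious)
print(len(benign), len(augmented), len(synthetic))  # 6 6 4
```

An independent Gaussian ignores correlations between features, which is one reason learned generators such as VAEs or GANs are considered for this task. In a setup like this, synthetic rows would normally be added to the training split only, with evaluation kept on real held-out data.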
Files
| Name | Size | MD5 |
|---|---|---|
| synthetic-data-generation-for-enhancing-malware-and-phishing-determining-performance-IJERTV15IS010504.pdf | 574.9 kB | 917d77ace04a8ca6e28142ccf4271ab3 |
Additional details
Related works
- Is identical to
- Journal article: https://www.ijert.org/synthetic-data-generation-for-enhancing-malware-and-phishing-determining-performance (URL)