Image dataset to train a deep learning model to decode Leetspeak obfuscated characters
Creators
- 1. Mondragon Unibertsitatea
- 2. Instituto Universitário de Lisboa (ISCTE-IUL)
- 3. University of Vigo
 
Description
The dataset contains 18,981 images that can be used to train a deep learning model to recognize characters. We have successfully used it to build a model that identifies characters encoded in Leetspeak. The original dataset is available in the Mondragon Unibertsitatea repository: https://gitlab.danz.eus/datasharing/ski4spam
The training dataset consists of:
- Alphabetic letters (a-z) written using different fonts and styles (regular, cursive, bold, cursive+bold)
- Handwritten letters: English handwriting samples from the Chars74k dataset [2], available at http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/
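
As a starting point for working with the data, the following minimal sketch lists the image files inside a downloaded copy of images.zip and counts them per folder. The internal layout of the archive is not documented here, so the set of image extensions, the idea that folders may group images by character, and the helper names `list_images` and `count_by_folder` are all assumptions made for illustration.

```python
import zipfile
from collections import Counter
from pathlib import Path, PurePosixPath

# Assumption: the images use one of these common extensions.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".gif"}


def list_images(zip_path):
    """Return the names of all archive entries that look like image files."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
            and PurePosixPath(name).suffix.lower() in IMAGE_EXTENSIONS
        ]


def count_by_folder(names):
    """Count entries per parent folder (hypothetically one folder per class)."""
    return Counter(str(PurePosixPath(name).parent) for name in names)


# Assumed local download location; nothing happens if the file is absent.
archive_path = Path("images.zip")
if archive_path.exists():
    images = list_images(archive_path)
    print(f"{len(images)} image files found")
    for folder, count in sorted(count_by_folder(images).items()):
        print(f"{folder}: {count}")
```

If the total differs from the 18,981 images stated in the description, the extension set above probably needs adjusting to match the actual archive contents.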
Files
| Name | Size | Checksum |
|---|---|---|
| images.zip | 22.1 MB | md5:fb400473248296b9a0497a2f072b77a9 |
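
The MD5 checksum listed above can be used to confirm that a download of images.zip is complete and uncorrupted. The sketch below is one way to do that with the Python standard library; the local file name `images.zip` and the helper name `md5_of_file` are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Checksum published in the file listing for images.zip.
EXPECTED_MD5 = "fb400473248296b9a0497a2f072b77a9"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until handle.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed local download location; nothing happens if the file is absent.
archive_path = Path("images.zip")
if archive_path.exists():
    actual = md5_of_file(archive_path)
    if actual == EXPECTED_MD5:
        print("checksum OK")
    else:
        print(f"checksum mismatch: {actual}")
```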