Published January 22, 2025 | Version v1
Project deliverable
Restricted
From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
Description
This repository contains the source code, datasets, generated content, and scripts for the paper "From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models."
In this paper, we present an in-depth evaluation of how well vision language models (VLMs) interpret hateful memes. To do so, we curate a dataset of 39 hateful memes and collect over 12,000 responses from seven representative VLMs using carefully designed prompts. We also assess how malicious users could systematically exploit VLMs and hateful memes to generate hateful content.