# Python wrapper for the webis Twitter sentiment evaluation ensemble

This is a Python wrapper around the Java implementation of a Twitter sentiment evaluation framework presented by [Hagen et al. (2015)](http://www.aclweb.org/anthology/S15-2097). The example script fetches Tweets from a PostgreSQL database, uses [PyJnius](https://github.com/kivy/pyjnius/tree/master/jnius) to call the Java modules to evaluate the sentiment, and saves results to a table in the same database.

### Dependencies

The script is written in Python 3 and depends on the Python modules [PyJnius](https://github.com/kivy/pyjnius/tree/master/jnius), [pandas](https://pandas.pydata.org/) and [emojientities](https://gitlab.com/christoph.fink/python-emoji-range). 

On top of that, a Java Runtime Environment (JRE) is required, plus a matching Java Development Kit (JDK). We used Java 8, but other versions might work just as well. [OpenJDK](https://openjdk.java.net/) works fine.

To install all dependencies on a Debian-based system, run:

```shell
apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv cython3 openjdk-8-jdk-headless openjdk-8-jre-headless ca-certificates-java
```

### Installation

- *using `pip` or similar:*

```shell
pip3 install webis
```

- *OR: manually:*

    - Clone this repository

    ```shell
    git clone https://gitlab.com/christoph.fink/python-webis.git
    ```

    - Change to the cloned directory    
    - Use the Python `setuptools` to install the package:

    ```shell
    cd python-webis
    python3 ./setup.py install
    ```

- *OR: (Arch Linux only) from AUR:*

```shell
# e.g. using yaourt
yaourt python-webis
```

### Usage

First, make sure the environment variable `JAVA_HOME` is set and points to your Java installation. For instance, add the following line to `~/.bashrc`:

```shell
export JAVA_HOME="$(readlink -f $(which javac) 2>/dev/null | sed "s:/bin/javac::")"
```
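If editing the shell configuration is not an option, the same lookup can be sketched in Python. This is a minimal sketch, not part of *python-webis*: `find_java_home()` is a hypothetical helper, and it assumes that `JAVA_HOME` is only read once `webis` is imported, so that setting it earlier in the same script is sufficient.

```python
import os
import shutil


def find_java_home(environ=None):
    """Return a usable JAVA_HOME path, or None if none can be determined.

    Hypothetical helper, not part of python-webis.
    """
    environ = os.environ if environ is None else environ

    # Prefer an existing JAVA_HOME if it points to a real directory.
    java_home = environ.get("JAVA_HOME")
    if java_home and os.path.isdir(java_home):
        return java_home

    # Otherwise locate javac on the PATH, like the shell line above.
    javac = shutil.which("javac", path=environ.get("PATH"))
    if javac is None:
        return None

    # Resolve symlinks, then strip the trailing /bin/javac.
    return os.path.dirname(os.path.dirname(os.path.realpath(javac)))


# Possible usage, before `import webis` (assumption: the variable is read at import time):
#
#     java_home = find_java_home()
#     if java_home is None:
#         raise RuntimeError("No JDK found: install one or set JAVA_HOME")
#     os.environ.setdefault("JAVA_HOME", java_home)
```

Failing early with a clear message tends to be easier to debug than an error raised from inside the Java bridge.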

Import the `webis` module in a Python 3 script. On first run, *python-webis* will download and compile the Java backend – this might take a few minutes.

Then instantiate a `webis.SentimentIdentifier` object and call its `identifySentiment()` method, passing in either a list of tuples (`[(tweetId, tweetText), (tweetId, tweetText), … ]`) or a `pandas.DataFrame` (the first column is treated as the identifier, the second as the tweet text).

It returns a list of dicts (`[{"tweetId": tweetId, "sentiment": sentiment}, … ]`) or a data frame (first column id, second column sentiment) containing only the rows for which a sentiment was successfully identified.

```python
import webis

sentimentIdentifier = webis.SentimentIdentifier()

tweets = [
    (1, "What a beautiful morning! There’s nothing better than cycling to work on a sunny day 🚲."),
    (2, "Argh, I hate it when you find seven (7!) cars blocking the bike lane on a five-mile commute")
]

sentimentIdentifier.identifySentiment(tweets)
# [(1, "positive"), (2, "negative")]

import pandas
tweets = pandas.DataFrame(tweets)
sentimentIdentifier.identifySentiment(tweets)
#   sentiment tweetId
# 0  positive       1
# 1  negative       2


```
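Because only rows with an identified sentiment are returned, the result can be shorter than the input. The following is a minimal sketch of joining the results back onto the original list so that unclassified tweets stay visible. `attach_sentiments()` is a hypothetical helper, not part of *python-webis*; it accepts both the dict form described above and the tuple form shown in the example comment.

```python
def attach_sentiments(tweets, results):
    """Pair every input tweet with its sentiment, or None if none was returned.

    Hypothetical helper, not part of python-webis.
    tweets:  list of (tweetId, tweetText) tuples
    results: list of {"tweetId": ..., "sentiment": ...} dicts
             or (tweetId, sentiment) tuples
    """
    sentiments = {}
    for row in results:
        if isinstance(row, dict):
            sentiments[row["tweetId"]] = row["sentiment"]
        else:
            tweet_id, sentiment = row
            sentiments[tweet_id] = sentiment

    # Keep the input order; tweets without a result get None.
    return [
        (tweet_id, text, sentiments.get(tweet_id))
        for tweet_id, text in tweets
    ]
```

With the list example above, this would be called as `attach_sentiments(tweets, sentimentIdentifier.identifySentiment(tweets))`.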
