Small sentiment analyser and bot that posts the plots on Mastodon.
.gitignore made hedonodon server ready 2023-01-27 21:08:25 +01:00
CRUDManager.py add code documentation 2023-03-17 20:06:01 +01:00
DbSetup.py add code documentation 2023-03-17 20:06:01 +01:00
Main.py add code documentation 2023-03-17 20:06:01 +01:00
MastodonAccountManager.py add code documentation 2023-03-17 20:06:01 +01:00
README.md Fixed some typos 2023-03-17 21:29:16 +01:00
requirements.txt implement rough wordcount 2023-03-15 14:27:07 +01:00
SentiTooter.py take the large spacy model 2023-03-17 21:25:44 +01:00
Tables.py add code documentation 2023-03-17 20:06:01 +01:00
TootCrawler.py add code documentation 2023-03-17 20:06:01 +01:00

Hedonodon

Prerequisites

Install the dependencies with `python -m pip install -r requirements.txt`. Install spaCy's NLP model with `python -m spacy download en_core_web_lg`. If the automatic download of the twitter-roberta-base-sentiment model and tokenizer fails, go to the model pages on Hugging Face (see the Models section) and download them to the respective folder (cardiffnlp/twitter-roberta-base-sentiment).

Purpose

Hedonodon fetches toots from fedihum.org, calculates the sentiments, the mean sentiment and the word frequencies for each day, and creates fancy diagrams from the data.

Motivation

This tool was created to understand how sentiment analysis and NLP methods work, so it may not always use the models properly.

Models

It uses ["germansentiment"](https://huggingface.co/oliverguhr/german-sentiment-bert) for German toots, ["twitter-roberta-base-sentiment"](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment) for English toots, and "vaderSentiment" for other languages. For the word counts I first translate the toots to English with the GoogleTranslator from deep_translator and then use spaCy's NLP model "en_core_web_lg" to calculate the word frequencies.
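
The per-language routing described here could be sketched roughly as follows. This is only an illustration: the function name `pick_sentiment_backend`, the lookup table and the use of ISO 639-1 language codes are assumptions, and the actual code in SentiTooter.py may detect languages and call the models differently.

```python
# Hypothetical routing table: language code -> sentiment backend named in this README.
# The two-letter codes are an assumption about how the toot language is represented.
BACKENDS = {
    "de": "germansentiment",
    "en": "twitter-roberta-base-sentiment",
}

# Every other language falls back to vaderSentiment.
FALLBACK = "vaderSentiment"


def pick_sentiment_backend(lang_code: str) -> str:
    """Return the name of the backend to use for a toot in the given language."""
    return BACKENDS.get(lang_code.lower(), FALLBACK)


if __name__ == "__main__":
    for code in ("de", "en", "fr"):
        print(code, "->", pick_sentiment_backend(code))
```

Keeping the mapping in a plain dictionary would make it easy to plug in another language-specific model later without touching the fallback.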

Weaknesses

Since some modules do not return sentiment compound scores, I have to use the nominal sentiment values (positive, neutral, negative) to calculate the mean of the day, which is statistically not okay (;-_-).
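
To make the problem concrete, here is a minimal sketch of such a daily mean. The numeric coding (-1, 0, 1) and the function name `daily_sentiment_mean` are assumptions for illustration and not necessarily what Hedonodon does internally.

```python
from statistics import mean

# Assumed numeric coding of the nominal labels; the real project may code them differently.
LABEL_SCORES = {"negative": -1, "neutral": 0, "positive": 1}


def daily_sentiment_mean(labels):
    """Average the coded labels of one day; return None if there are no toots."""
    if not labels:
        return None
    return mean(LABEL_SCORES[label] for label in labels)


if __name__ == "__main__":
    polarised_day = ["positive", "negative", "positive", "negative"]
    calm_day = ["neutral", "neutral", "neutral", "neutral"]
    # Both days end up with the same mean although the toots are very different.
    print(daily_sentiment_mean(polarised_day), daily_sentiment_mean(calm_day))
```

The two example days show the weakness: a polarised day and an entirely neutral day get the same value, because the coding pretends that the distances between the labels are meaningful numbers.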