This is a quick exploration of Twitter reactions to the right-wing terrorist shootings in Hanau, Germany, on 19 February 2020. I played around with this data as an exercise to get a first feel for Twitter analysis with R for sociological research. Most of the code is based on this tutorial.

Load Packages

# Data Import
library(jsonlite)
# Data Wrangling and Visualization
library(magrittr) # for piped function calls with %>% and the assignment pipe %<>%
library(plotly)
library(tidyverse)
# Date & Time Manipulation.
library(hms)
library(lubridate)
# Text Mining
library(tidytext)
library(wordcloud)
#Set notebook directory.
MAIN.DIR <- here::here()

Read the data

Load the data dump from the JSON file.

all_tweets <- jsonlite::stream_in(file("../../../r-playground/data/hashtag_hanau_2020-02-19_2020_02_29_dump.json"), verbose = FALSE)

We have a total of 80.5k tweets with the hashtag #Hanau in our dataframe.

nrow(all_tweets)
## [1] 80549
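
As a side note, `stream_in()` expects newline-delimited JSON, that is, one complete JSON object per line. The following is a minimal sketch with two made-up records; the field names `datetime` and `text` mirror the columns selected later on, but the values are hypothetical and not taken from the dump.

```r
library(jsonlite)

# Two hypothetical records, one JSON object per line (NDJSON).
ndjson_lines <- c(
  '{"datetime": "2020-02-19 21:05:12", "text": "#Hanau"}',
  '{"datetime": "2020-02-19 21:06:40", "text": "#Hanau #Rassismus"}'
)

# stream_in() reads the lines one by one and returns a data frame.
toy_tweets <- stream_in(textConnection(ndjson_lines), verbose = FALSE)
nrow(toy_tweets)
```

Each line becomes one row, so the row count of the result equals the number of records in the file.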

Grab a subset of the data.

tweets_raw <- all_tweets %>%
  select(datetime, text) %>%
  filter(!str_detect(string = text, pattern = '@')) %>% # Drop tweets that mention an account (contain '@')
  as_tibble()

Let’s parse the date string into a datetime in order to do a timeline analysis.

tweets_raw %<>% 
  mutate(
    datetime = datetime %>% 
    parse_date_time(orders = ' %Y-%m-%d %H%M%S') # Parse date.
  )
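
To illustrate what the parsing step does, here is a small sketch on made-up strings. The timestamp format shown is an assumption for illustration; the actual format in the dump may differ.

```r
library(lubridate)

# Hypothetical timestamp strings (the real format in the dump may differ).
raw_strings <- c("2020-02-19 21:05:12", "2020-02-20 13:04:45")

# 'Ymd HMS' describes the order of the date and time components.
parsed <- parse_date_time(raw_strings, orders = "Ymd HMS", tz = "UTC")

class(parsed)  # POSIXct date-times, ready for arithmetic and plotting
```

Strings that do not match any of the given orders come back as `NA`, so it is worth checking `sum(is.na(parsed))` after parsing real data.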

Timeline Analysis

Let’s first have a look at when the reactions were posted.

# Set the time from UTC to CET (+1h).
tweets_raw %<>% 
  mutate(datetime = datetime + 1*60*60)

# Round to the nearest minute, so we can get a meaningful plot.
tweets_raw %<>% 
  mutate(created = datetime %>% round(units = 'mins') %>% as.POSIXct())

# Let's plot it.
plot <- tweets_raw %>% 
  count(created) %>% 
  ggplot(mapping = aes(x = created, y = n)) +
    theme_light() +
    geom_line() +
    xlab(label = 'Date') +
    ylab(label = NULL) +
    ggtitle(label = 'Number of Tweets per Minute')

plot %>% ggplotly()
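
Adding a fixed hour works for February, when Germany is on CET, but it leaves the timestamps labelled as UTC and would be off by an hour during summer time. A possible alternative, sketched here on hypothetical timestamps, is to let lubridate convert the time zone with `with_tz()` and to truncate to the minute with `floor_date()`. Note that `floor_date()` cuts off the seconds, whereas `round()` in the code above rounds to the nearest minute.

```r
library(lubridate)

# Hypothetical UTC timestamps (not taken from the dataset).
utc_times <- ymd_hms(c("2020-02-19 21:00:30", "2020-02-20 13:04:45"), tz = "UTC")

# Show the same instants in German local time; the instants are unchanged.
local_times <- with_tz(utc_times, tzone = "Europe/Berlin")

# Truncate to the full minute (drops the seconds instead of rounding).
per_minute <- floor_date(local_times, unit = "minute")

format(per_minute, "%Y-%m-%d %H:%M")
```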

The shooting happened at 22:00 CET. A first peak of 27 tweets per minute follows a few hours later, at 01:08 CET. The biggest peak in the data, 80 tweets per minute, occurs at 14:04 CET on 2020-02-20, the day after the shooting. On 2020-02-21 the number of tweets per minute has more than halved, and it continues to decrease over the next few days.
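
Peaks like these can be read off the interactive plot, but they can also be extracted programmatically. The following is a minimal sketch that finds the busiest minute of each day; the toy timestamps are hypothetical stand-ins for the `created` column.

```r
library(dplyr)
library(lubridate)

# Hypothetical per-minute timestamps standing in for tweets_raw$created.
toy <- tibble(
  created = ymd_hm(c(
    "2020-02-20 01:08", "2020-02-20 01:08", "2020-02-20 14:04",
    "2020-02-20 14:04", "2020-02-20 14:04", "2020-02-21 09:30"
  ))
)

peaks <- toy %>%
  count(created) %>%                            # tweets per minute
  mutate(day = as_date(created)) %>%            # calendar day of each minute
  group_by(day) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%    # busiest minute of each day
  ungroup()

peaks
```

With `with_ties = FALSE` only one minute per day is returned, even if several minutes share the same maximum.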

Hashtag Count

The hashtags that were combined with #Hanau can shed some light on how the public evaluated and framed the event. So let’s inspect that.

# Function that gets hashtags of a tweet.
getHashtags <- function(tweet) {
  hashtag.vector <- str_extract_all(string = tweet, pattern = '#\\S+', simplify = TRUE) %>%
    as.character()

  hashtag.string <- NA
  if (length(hashtag.vector) > 0) {
    hashtag.string <- hashtag.vector %>% str_c(collapse = ', ')
  }
  return(hashtag.string)
}

# Get the hashtags from our raw data.
hashtags <- tibble(
  Hashtags = tweets_raw$text %>% map_chr(.f = ~ getHashtags(tweet = .x))
)

# Bind the hashtags to our raw tweets.
tweets_raw %<>% bind_cols(hashtags)

# Unnest the hashtags.
hashtags.unnested.df <- tweets_raw %>%
  select(created, Hashtags) %>%
  unnest_tokens(input = Hashtags, output = hashtag)
 
# Count the usage of specific hashtags. 
hashtags.unnested.count <- hashtags.unnested.df %>%
  filter(hashtag != "hanau") %>% # Filter out #hanau, because that's the hashtag we used to scrape our data.
  count(hashtag) %>%
  drop_na()

# Plot the wordcloud.
wordcloud(
  words = str_c('#', hashtags.unnested.count$hashtag),
  freq = hashtags.unnested.count$n,
  min.freq = 200,
  colors = brewer.pal(8, 'Dark2')
)

The tag cloud sheds light on the most used hashtags in the reactions to Hanau. Many of the reactions use hashtags that call the shooting by its name, either as right-wing terrorism or extremism (e.g. #rechtsterrorismus, #rechterterror or #nazisraus) or as racism (e.g. #rassismus or #rassismustötet). The two top-ranked hashtags link the far-right AfD party to the event, as visible in this table of top hashtags.

hashtags.unnested.count %>%
  arrange(desc(n)) %>%
  head(20)
## # A tibble: 20 × 2
##    hashtag               n
##    <chr>             <int>
##  1 afd                4304
##  2 noafd              2509
##  3 hanaushooting      2500
##  4 rassismus          2117
##  5 rechterterror      1882
##  6 halle              1688
##  7 germany            1229
##  8 rechtsterrorismus  1121
##  9 nazisraus          1092
## 10 fckafd             1019
## 11 terror              839
## 12 fcknzs              835
## 13 nsu                 749
## 14 hanauattack         695
## 15 deutschland         678
## 16 cdu                 673
## 17 merkel              590
## 18 volkmarsen          588
## 19 hamburg             559
## 20 hamburgwahl         544
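
The extract, unnest and count steps used for the hashtag table can also be written as a single pipeline. This is a sketch of one possible alternative, not the approach used above: it extracts hashtags with `str_extract_all()` and unnests the resulting list column with `tidyr::unnest()` instead of going through `unnest_tokens()`. The example tweets are made up.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical example tweets (not from the dataset).
toy <- tibble(text = c(
  "Trauer in #Hanau. #Rassismus benennen.",
  "#hanau #NoAfD #rassismus",
  "Kein Hashtag hier"
))

hashtag_counts <- toy %>%
  # Lowercase first so that #Hanau and #hanau count as the same tag.
  mutate(hashtag = str_extract_all(str_to_lower(text), "#\\w+")) %>%
  select(hashtag) %>%
  unnest(hashtag) %>%                              # one row per hashtag; tweets without any drop out
  mutate(hashtag = str_remove(hashtag, "^#")) %>%
  filter(hashtag != "hanau") %>%                   # the hashtag used to collect the data
  count(hashtag, sort = TRUE)

hashtag_counts
```

Using `\\w+` instead of `\\S+` keeps trailing punctuation such as the full stop in `#Hanau.` out of the match.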

Conclusion

The results suggest that attention on Twitter was highest on the day after the attack and then declined steadily, which corresponds to a common attention pattern for such events. The hashtag analysis also suggests that the public interpreted the attack as an expression of right-wing extremism and linked it directly to the AfD.