This repository contains the analysis code for the project "Towards a Monitoring of Instagram". The project was led by AlgorithmWatch and funded by SIDN Fonds.
The analysis was designed and coded by Boris Reuderink.
This project is a follow-up to the Monitoring Instagram project, which was funded by the European Data Journalism Network.
We asked volunteers to install a browser add-on that scans their Instagram newsfeeds at regular intervals. Each data donor was told to follow three accounts of Dutch politicians or political parties.
We recorded two things: what politicians posted on Instagram, and what volunteers saw at the top of their newsfeed. This way, we could see when a volunteer encountered a post by a politician, and when they did not.
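The comparison described above can be sketched as follows. This is a minimal illustration, not the code of the analysis: the data structures, identifiers, and the function name `exposure_table` are all assumptions made for the example, and the real database may be organized differently.

```python
# Hypothetical data, invented for illustration only.
# Posts published by monitored accounts: (post_id, author).
posts = [
    ("post_1", "politician_a"),
    ("post_2", "politician_b"),
    ("post_3", "politician_c"),
]

# Accounts each data donor follows.
follows = {
    "donor_1": {"politician_a", "politician_b"},
    "donor_2": {"politician_b", "politician_c"},
}

# (donor, post_id) pairs recorded at the top of a donor's newsfeed.
seen = {("donor_1", "post_1"), ("donor_2", "post_2")}


def exposure_table(posts, follows, seen):
    """Return one row per (donor, post) the donor could have encountered.

    A donor could encounter a post only if they follow its author.
    Each row is (donor, post_id, was_seen).
    """
    rows = []
    for donor, accounts in follows.items():
        for post_id, author in posts:
            if author in accounts:
                rows.append((donor, post_id, (donor, post_id) in seen))
    return rows


rows = exposure_table(posts, follows, seen)
n_seen = sum(1 for _, _, was_seen in rows if was_seen)
n_not_seen = len(rows) - n_seen
print(f"seen: {n_seen}, not seen: {n_not_seen}")
```

Restricting the table to posts from followed accounts keeps posts that a donor could never have encountered out of the "not seen" count.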
The browser plugin was developed by Édouard Richard.
The preparatory steps for the analysis can be read in doc/analysis_plan.pdf.
The code of the analysis can be consulted at notebooks/TAMI Dutch politics.ipynb.
The graphs used in the article use the following data:
The ratio of posts seen vs. not seen comes from cell [19] of the analysis from Feb 22.
The odds ratio of a post being seen, based on the results of the model predicting visibility, is calculated as exp(x), where x is the log-odds given in cell [26] of the analysis from Feb 22.
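The conversion from log-odds to an odds ratio can be sketched in a few lines. The category names and values below are placeholders invented for the example, not results from the notebook.

```python
import math

# Hypothetical log-odds values, NOT taken from the analysis.
log_odds = {
    "category_a": 0.5,
    "category_b": -0.3,
    "category_c": 0.0,
}


def odds_ratio(x):
    """Convert a log-odds value x into an odds ratio, exp(x)."""
    return math.exp(x)


odds_ratios = {name: odds_ratio(x) for name, x in log_odds.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: {ratio:.2f}")
```

An odds ratio above 1 means the odds of being seen are higher for that category, a value below 1 means they are lower, and a log-odds of 0 maps to exactly 1 (no difference).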
The graphs used in the article focus on the categories for which we can formulate hypotheses regarding why they would be favored. Categories were generated automatically using Latent Dirichlet allocation (LDA) and were sometimes difficult to interpret. For instance, an 'indoor' category seems to be favored by Instagram's algorithm in some analyses, but not all.
Upon reception of the database, process the data for the analysis:
make all
Build collages of images to help the analysis:
make images
Run the analysis:
jupyter notebook
Save the current notebook and the collages of images:
make archive date=current_date