Logs

Notes from conversations and meetings and to self.
Gems found along the way waiting to be further sorted or put to use.


Jan 15

Call with the SABIO stakeholders:

Forwarded by Marieke:

  • Google Article on addressing gender bias in Google Translate

  • the CRAPL License for software & related resources written in academic contexts

Mrinalini has mentioned:

Julia has mentioned:

Jan 20

Chat with Corey:

Jan 27

Meeting with Richard van Alphen from Tropenmuseum:

  • item descriptions are curated when digitised; provenance trail is however available
  • item titles are digitised without change, so their time of creation likely matches the item's indicated year
  • will talk to his people about getting access to their DB to be able to query it directly and get back to us

SABIO/AI:CULT Kick-off meeting (meeting notes):

  • Jesse de Vos and Johan from B&G are part of CaptureBias; he had the following notes:
    • what about the bias that manifests in the gaps of the museums' collections? bias as gaps?
    • they found it useful to speak of framing rather than of bias and to distinguish thematic and episodic framing
  • Markus Bakenhol, postdoc in ethnology at Meertens, is a domain expert
  • Johan mentioned the Inward Outward symposium as an outlet and place to get in touch with others
  • Johan asked if we could find any published benchmarks or competitions (perhaps at Kaggle?), or even publish one ourselves? => benchmarking is a good point!
  • the Europeana Challenge which we'll potentially go for (shared document?)
  • abstract to be submitted at the LIBER Conference; this shared doc contains the abstract

Jan 28

Mattia's (cultural analyst) intuition, depending on definitions of course, is that bias is opposed to polyvocality: bias is nurtured by the lack of voices and, conversely put, a diversity of voices makes bias unlikely to prevail
=> Marieke agrees with the gist of this position; this article from an initiative could be interesting

The Portrait of a Lady: Close and distant reading of media gender bias: abstract for a paper by Laura Hollink on quantifying (binary) gender bias in Dutch newspapers; the approach takes advantage of the lexicalised marking of gender in Dutch (hij/zij) and measures bias as the degree to which an algorithm can predict gender; prediction is not done on the pure text but on features extracted based on previous conceptual work

Jan 29

Marieke mentioned ARIAS, a 'Platform for Research through the Arts and Sciences', that has grants for work sitting at the intersection of art and science

there is the Stanford Encyclopedia of Philosophy, it has entries e.g. on Feminist Epistemology, race, implicit bias => really good reference, commonly used by philosophers and very accessible

Feb 01

Found the term "Marrons" in the collection RDF (URI https://hdl.handle.net/20.500.11840/206868) -> interesting first task: track the term and its usage across the graph, find adjacent terms, etc
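A first pass at this task could look like the sketch below. It is only an illustration: the graph is a toy list of triples in plain Python rather than the actual collection RDF, and every identifier except the handle above (the `ex:` subjects, the `ex:subject` predicate, the other terms) is a made-up placeholder. 'Adjacent' is taken here to mean 'attached to the same object', which is one possible reading among several.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object). Only TERM comes from the notes;
# all other identifiers are hypothetical placeholders, not collection data.
TERM = "https://hdl.handle.net/20.500.11840/206868"
TRIPLES = [
    ("ex:object1", "ex:subject", TERM),
    ("ex:object1", "ex:subject", "ex:termA"),
    ("ex:object2", "ex:subject", TERM),
    ("ex:object2", "ex:subject", "ex:termB"),
    ("ex:object3", "ex:subject", "ex:termA"),
]


def usages(triples, term):
    """Return (subject, predicate) for every triple whose object is `term`."""
    return [(s, p) for s, p, o in triples if o == term]


def adjacent_terms(triples, term):
    """Count the other objects attached to subjects that also use `term`."""
    subjects = {s for s, _ in usages(triples, term)}
    counts = defaultdict(int)
    for s, _, o in triples:
        if s in subjects and o != term:
            counts[o] += 1
    return dict(counts)


print(usages(TRIPLES, TERM))
print(adjacent_terms(TRIPLES, TERM))
```

On the real data the same two questions would more likely be asked with an RDF library or SPARQL; the plain lists are used here only to keep the sketch self-contained.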

Feb 02

Andrei has mentioned From Cartography To Cookbooks, "a web of Dutch colonialism", an online exhibition which shines a light on issues such as race and gender in the context of colonialism with maps and cookbooks as media (Allard Pierson page for the exhibition)

Feb 03

Victor has mentioned this blog post by MIT - yet another example of racist and sexist image processing AI

Feb 04

Cindy mentioned:

  • 'facets' (of which there are 5) categorise terms in the thesaurus by semantic function -> could be seen as 'prescribed' associations
  • the term 'Bosneger' (an old, derogatory label for 'Marron') is in the Words Matter publication -> compare associations from there with those from DB
  • the AAT (Art & Architecture Thesaurus) by Getty
    • thesauruses made entirely from Western museum perspectives -> bias is deep
    • could be a valuable source of shared knowledge (Wereldculturen thesaurus is rather specific)
  • histories of museums' objects after acquisition could provide useful 'grounding facts' (represent some of the materialised, as opposed to linguistic, aspects of bias) -> e.g. royal objects end up in Rijksmuseum, others in the heritage museums
  • religious bias as interesting case?
  • Rembrandt labelled as painting, buddhist work labelled as decorative art -> clustering based on properties (i.e. 'semantics') should reveal this
  • 'Rapanui' (a natively used term) not in DB, Rapa Nui and Paaseiland are -> labels are difficult even if not obviously discriminatory

Feb 11

Mattia has mentioned Johannes Fabian (UvA Emeritus), specifically his 'Time and the Other'
-> to quote Mattia: 'Different situated notions of time could constitute another vector of bias'; idea: use the fact that Dutch marks time, i.e. look at verb tenses

Feb 12

Feb 15

  • SemEval2020 Task 12 on the identification of offensive language could be an interesting test case

Feb 18

~~Andrei mentioned:

  • Zotero, a library management tool; we could all share our libraries through that~~

Feb 20

through an INDELAB connection found this blog post about decolonising AI
-> could contain useful pointers

Feb 23

books on how current AI reinforces biases and inequalities and how to do it differently:

Feb 24

Johan put us in touch with someone who's working on a similar project which deals with biases in meta-data of heritage collections

Feb 27

Niels ten Oever mentioned:

scholar search on social identity & categorisation returned interesting-looking results

March 02

CulturalAI Meeting:

  • Wereldculturen data:
    • Jacco mentioned that Wereldculturen should eventually have their own data exposure process for researchers (& others) -> make sure that Cindy is aware that SABIO is building essentially that, so that they could perhaps use some of it (+ the process we went through to get the data)

March 04

SABIO meeting:

  • Cindy mentioned:
    • she's working on the Pressing Matter project
    • metaphor of a funnel for the program -> related to my own thoughts on bias detection as search, but a nicer metaphor
    • bias as absence: the systematic absence of people in the data or the absence of fine-grained attributes for people is an instance of bias -> choice of words can indicate that, too (e.g. the choice of identifier for a person, cf. 'a man has a name') => this is closely related to, if not the same as, silencing
  • MVP: limit use cases to professionals
  • Marieke mentioned a nice idea: heat maps on the collection/subgraphs/etc to direct users' attention in a non-binary way, to visualise/uncover patterns -> talk to Werner about this
  • user should be able to input cues -> not only: detect bias in a given text/object/collection, but also: find everything in a given collection that is similar to a given cue

March 05

March 08

  • Jesse mentioned Philo van Kemenade, who works for Beeld & Geluid (and who is in an AI4GLAM task force at Europeana)
  • Jesse mentioned Tobias Blanke, recently appointed professor at UvA and ILLC, who works on the epistemological implications of AI (and generally the interface of philosophy and computer science)
  • Mrinalini mentioned Michel-Rolph Trouillot who conceived the term 'silencing' (most notably in his book Silencing the Past: Power and the Production of History (1995))

March 09

  • Jelle mentioned that he is part of a project that has something to do with bias detection?

  • Marieke mentioned this tweet for the website (where the Q&A is on)

  • Jacco mentioned the FAccT Conference which has interesting papers

March 10

meeting with Richard:

  • CollectionConnection is the tool the NMVW used to convert their databases to RDF
    -> Richard will share the schema they used for the conversion -> can we maybe use the schema to do the conversion ourselves?
  • the Objects table has a field title, but the table ObjectTitles was created since objects can have multiple titles (either replacing each other or living side-by-side) -> the table TitleType contains information that could make it possible to reconstruct a version order of the titles
  • Richard thinks that the procedures from the database to ML-ready input could be interesting for future and general use -> potentially make processing scripts and procedures reusable for publication

March 11

meeting with the Goethe Institute (of Finland and of NL):

  • website of Artificially Correct, where they address bias in (machine) translation
  • poco.lit, a Berlin-based platform for postcolonial literature -> collaborators, have written articles for them
  • Workshop for translation practitioners on 23 & 24 April
  • Hackathon planned to develop tools to reduce/detect biases in MT at some point in autumn

Marieke mentioned Nexus Linguarum, a platform to promote synergies between European linguistic data science practitioners

meeting with Vendela (university homepage & personal website):

  • is part of the project The Politics of Metadata
  • shared her slides on her investigation into the representation of Sámi heritage in a (which?) Swedish museum (attached in an email)
  • the Politics of Metadata project is a part of the Metadata Culture research group

March 18

  • Cindy mentioned: Decolonize the Museum Conference by FramerFramed (there's also a document on Decolonizing Museums which could be a valuable resource)
  • Jelle Zuidema is part of the Bias Barometer, to quote: "We explore the relationship between what we read on (social) media and the effects on our (stereotypical) beliefs and actions."

March 26

  • Julia mentioned a talk on The Logic of Decoloniality by Jonathan Chimakonam, who does philosophical research on decolonising research (see links for papers); the point of such research is that in the tripartition of content, method and foundation, the foundation needs to be decolonised alongside content and method (which is what is usually focussed on)
    => this line provides a good guideline for how the field (cultural AI) as a whole should evolve

March 29

Meeting with Andrei & Ryan:

  • idea: correlate sentiment of words with their contentiousness (as labelled in ConConCor) -> can answer the question: 'is sentiment a good predictor of whether a word is contentious?' -> perhaps use BERTje's word embeddings for sentiment (or sentiment analysis)

  • idea: do the analyses of semantic change from Jurafsky's paper on semantic change in the context of the ConConCor -> does contentiousness correlate with factors of semantic change? can we predict contentiousness from semantic change?

  • idea: phrase annotation task for ConConCor in terms of the participants themselves: "how comfortable would you feel saying this word/sentence in public/private/in your head?" / "would you feel hurt if someone said this word/sentence to you?" -> contentiousness is an emotional matter -> getting people's embodied perspective is necessary

  • this video talks about re-designing Bayes' theorem into: O(D|+) = O(D) * P(+|D)/P(+| not D), where D is the RV of whether or not a disease is present and + is the RV of whether a given test was positive =>
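The odds form in the last bullet can be checked numerically. The sketch below is a worked example with made-up numbers (the prevalence, sensitivity and false-positive rate are assumptions for illustration, not taken from the video); it computes the posterior odds via O(D|+) = O(D) * P(+|D)/P(+| not D) and cross-checks the result against the standard form of Bayes' theorem.

```python
def posterior_odds(prior_prob, p_pos_given_d, p_pos_given_not_d):
    """Odds form: O(D|+) = O(D) * P(+|D) / P(+|not D)."""
    prior_odds = prior_prob / (1 - prior_prob)
    bayes_factor = p_pos_given_d / p_pos_given_not_d
    return prior_odds * bayes_factor


def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def posterior_prob_classic(prior_prob, p_pos_given_d, p_pos_given_not_d):
    """Standard form P(D|+) = P(+|D) P(D) / P(+), used as a cross-check."""
    evidence = p_pos_given_d * prior_prob + p_pos_given_not_d * (1 - prior_prob)
    return p_pos_given_d * prior_prob / evidence


# Illustrative assumptions: 1% prevalence, 90% sensitivity, 9% false positives.
prior, sensitivity, false_positive_rate = 0.01, 0.90, 0.09
odds = posterior_odds(prior, sensitivity, false_positive_rate)
print(odds, odds_to_prob(odds))
print(posterior_prob_classic(prior, sensitivity, false_positive_rate))
```

With these numbers the Bayes factor is 10, so the prior odds of 1:99 become 10:99, i.e. a posterior probability of roughly 9%. The update is a single multiplication, which is what makes the odds form convenient.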

April 01

Meeting with Marieke:

  • her PhD student (Philipp) tried stereotype detection on KGs
  • possible publication at the CLARIN Conference (3-4 page abstract due on April 14):
    • the infrastructure/procedure from the Wereldculturen database to data set for cultural AI/AI4GLAM (use-case: bias detection system in colonial contexts)
    • procedure and analysis of questionnaires for heritage professionals: defining the tasks and approach of the professionals in order to automate and enhance with ML
  • idea: concordance: get concordances of words (word pairs): for a given word (based on PMI, or other measures), find and expose the other contexts it occurs in; there is also a statistical measure of concordance
  • paper by Jacco and others: model transparency through interface and presentation and empirical study of its impact when historians work with ML
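The concordance idea above could be prototyped along the following lines. This is a minimal sketch under assumptions: the tokenised sentences are invented toy data, 'context' is a fixed token window (keyword-in-context), and the association measure is sentence-level pointwise mutual information, which is only one of the 'other measures' one might pick.

```python
import math


def kwic(tokens, keyword, window=2):
    """Keyword-in-context: (left, keyword, right) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, tok, right))
    return hits


def pmi(sentences, w1, w2):
    """Sentence-level PMI: log2( P(w1, w2) / (P(w1) * P(w2)) )."""
    n = len(sentences)
    sets = [set(s) for s in sentences]
    c1 = sum(w1 in s for s in sets)
    c2 = sum(w2 in s for s in sets)
    c12 = sum(w1 in s and w2 in s for s in sets)
    if c12 == 0:
        # The words never co-occur (or one of them never occurs at all).
        return float("-inf")
    return math.log2((c12 / n) / ((c1 / n) * (c2 / n)))


# Invented toy corpus, already tokenised.
SENTENCES = [
    ["wooden", "mask", "from", "village"],
    ["wooden", "bowl"],
    ["painted", "mask"],
    ["wooden", "mask"],
]

print(kwic(SENTENCES[0], "mask", window=1))
print(pmi(SENTENCES, "wooden", "mask"))
```

A positive PMI means the pair co-occurs more often than chance would predict, a negative one less often; on a corpus this small the values are not meaningful and only show the mechanics.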

April 02

Meeting with the bias B.Sc. project:

  • idea: identify extra-linguistic variables about object (region/culture/etc), then correlate them with for instance sentiment analysis of the description
    e.g.: group objects by culture, then do sentiment analysis on their titles/descriptions and correlate; typicality could help as a concept, examples could be extracted
  • BERTje (paper, code) & RobBERT (paper, code) are Dutch transformer LMs, also available on Hugging Face
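The 'group by culture, then compare sentiment' idea from this meeting can be sketched as below. Everything in it is a stand-in: the records, the field names `culture` and `description`, and the tiny English word lexicon are invented for illustration. A real run would use the Dutch titles/descriptions and presumably a proper sentiment model (e.g. built on BERTje or RobBERT) in place of the lexicon lookup.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mini-lexicon and records; not taken from any collection.
LEXICON = {"beautiful": 1.0, "fine": 0.5, "crude": -1.0, "primitive": -1.0}
OBJECTS = [
    {"culture": "A", "description": "beautiful fine carving"},
    {"culture": "A", "description": "fine bowl"},
    {"culture": "B", "description": "crude primitive figure"},
    {"culture": "B", "description": "beautiful mask"},
]


def sentiment(text):
    """Mean lexicon score of the words in `text`; 0.0 if none are known."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return mean(scores) if scores else 0.0


def mean_sentiment_by_group(objects, key="culture"):
    """Average the per-description sentiment within each group."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[key]].append(sentiment(obj["description"]))
    return {group: mean(values) for group, values in groups.items()}


print(mean_sentiment_by_group(OBJECTS))
```

A systematic gap between groups would then be a cue for closer reading rather than evidence of bias by itself, and the lowest-scoring descriptions per group could serve as the extracted examples.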

April 13

Seminar by TU Wien Digital Humanities, recording:

Hi Ryan, Valentin and Andrea,

The Meertens Institute will have a staff meeting next Monday between 10 am and 11 am. This is a quarterly (casual) meeting in which we catch up with each other. We always introduce new employees in this meeting, and Antal and I would very much like to invite you to come this Monday. It would be great if you could introduce yourself briefly during this gathering and tell something about your work in the HuC. Would you be willing to do this?

Best wishes,

Simone

April 20

Johan is organising a EuropeanaTech X CulturalAI lab event; the agenda contains many interesting resources on decolonial approaches, practices and problems in the museum, e.g.:

April 21

Chat with Senka:

  • this person (instagram) in the non-binary community has a strong social media presence
  • same for this person (instagram)
  • Senka also has a friend (instagram) who does workshops and art around language and gender and might be interested in collaborating

April 29

SABIO meeting:

April 30

Andrei shared this Medium post about 'visualising whose stories are missing'

May 06

Jesse pointed to:

a tool for inspecting word pairs, very basic

someone who participated in the questionnaire is part of the LGBTI Heritage Organisation (IHLIA)

May 09

Julia keeps mentioning:

  • standpoint theory (proposed a.o. by Sandra Harding, who has been affiliated with the UvA) as a formal philosophical basis for definitions of bias

May 11

Marieke forwarded (Jelle retweeted):

meeting with Marieke:

  • Black Archives
  • Nijmegen Afrika Museum?
  • Imagine IC
  • IHLIA
  • perhaps a focussed workshop for non-binary people/on hetero-normative biases in collections around gay pride in Aug?

Oskar mentioned:

Julia Noordegraaf sent an email about a conference with a speaker from MIT's Data + Feminism Lab

May 14

Martijn mentioned:

the National Archives have historical language in their collections and are aware that these might contain undesirable language (explanation page)

May 15

Marieke mentioned:

The Cultural Life of Machine Learning, An Incursion into Critical AI Studies

May 19

Saskia pointed to Rijksmuseum's new exhibition on slavery, co-curated by Valika Smeulders

WORKSHOP:

  • Wayne:

    • why 'bias' instead of e.g. 'racism'?
    • even the word 'human' is biased (indicates anthropocentric bias) -> probably means: bias is everywhere
  • Hodan:

    • bias navigation should be disruptive: disrupt the ways we search and find information in collections

May 20

Cindy forwarded Fantastic Futures 21 Call for Abstracts, due June 15th

RCMC's webinar's speaker Wendy Hui Kyong Chun has written interesting books

Nishant Shah

May 24

ARK - MU presentation:

  • Angelique (Director of MU Eindhoven) mentioned Documenting complexity (funded by NWO, carried out at RUG, in collab with B&G)
  • Roosje has the Mnemosyne Bilderatlas by Aby Warburg => created a system for visually remembering things together (=association) AND also a system for organising archives

May 26

Paul has mentioned crowdtruth.org, a source of papers on how to source ground truth and deal with inter-annotator disagreement

Saskia has mentioned a presentation about trust and utility in heritage LOD

Marieke has mentioned Google People + AI Research

May 31

Cindy shared:

June 12

Eyob (neighbour at 32B) has created a website to connect facial recognition to criminal records (uses facial recognition to categorise mugshots, then displays Dutch criminal stats and similar faces in the DB)

GitHub; Database of criminal records