cross-reader/templates/cross-readings.html


2019-07-10 21:19:51 +02:00
{% extends "base.html" %}
{% block view %}black{% endblock %}
{% block title %}- cross-readings{% endblock %}
{% block search %}
{% endblock %}
{% block content %}
<div id="howto">
<br>
<br>
<p class="note">[How to use this cross-reader?]</p>
<br>
<blockquote>
This tool allows for cross-readings through a collection <br>
of <em>cyber/technofeminist manifestos</em> and the <em>TF-IDF algorithm</em>.
</blockquote>
<div class="guides">
<ol>
<li>
SEARCH — You can search through these manifestos by typing a keyword in the search bar or by clicking on the list of suggested keywords.
</li>
<li>
READ — You can simply browse through the list of manifestos (see right-hand column) and read them in their entirety and in their original web environment.
</li>
<li>
CROSS-READ — If you click on the ◐ icon, you can read the manifesto through the prism of the TF-IDF algorithm. This algorithm identifies the most specific words within a document. It was developed in part by British computer scientist Karen Spärck Jones and has been a crucial algorithm for a large number of online search engines.
</li>
<li>
INSPECT — Next to this icon, you can click on the <small><a href="">TF</a></small>, <small><a href="">IDF</a></small> and <small><a href="">TF-IDF</a></small> buttons to inspect the values of the TF-IDF algorithm for each manifesto.
</li>
</ol>
</div>
</div>
<div id="about">
<div id="intro" class="cross">
<br>
<p class="note">[cross-readings]</p>
<br>
<p>This cyber/technofeminist cross-reader does not follow one but two axes, bridging the act of reading a collection of texts with the act of reading a tool.</p>
<p>These cross-readings connect ...</p>
<p class="tfidf">... the <em>Term Frequency Inverse Document Frequency</em> algorithm, or <em>TF-IDF</em> in short</p>
<p class="techfem">... a collection of <em>cyber- and technofeminist manifestos</em></p>
<p class="tfidf">TF-IDF is a commonly used algorithm for finding the most important words in a document. The algorithm was (partly) written by the female computer scientist Karen Spärck Jones in the 1970s and has become one of the important algorithms behind many online search tools, such as digital library systems or corporate search engines like Yandex or Google. The algorithm turns written documents into a sorted list of search results, using a specific relative and inverse way of counting that is sensitive to contrast in written documents. </p>
<p class="techfem">The cyber/technofeminist manifestos connect feminist thinking to technology, introducing feminist servers, cyborg figures, cyberwitches, or pleas for the glitch as a cultural digital artefact. This collection, which is obviously incomplete, brings together a diverse set of technofeminist documents published between 1912 and 2019. The manifestos speak about very different concerns and questions, but they connect in terms of energy level: urging to make a statement, ready to activate.
<br><br>
An interesting note to mention: Karen Spärck Jones was an advocate for the position of women in computing. <em>“I’ve been trying to think a little bit—but it’s very dispiriting!—about how to try to get more women into computer science. On the whole, everybody who thinks about this is depressed, because we’re going backwards rather than forwards.”</em> (https://ethw.org/Oral-History:Karen_Sp%C3%A4rck_Jones#On_Getting_More_Women_into_Computer_Science)</p>
<p>These two axes, the algorithm and the manifestos, interoperate. They support and strengthen each other as the X and Y of this cross-reading tool. </p>
<p>The TF-IDF algorithm, while responding to a search request, creates cross-readings through the manifestos. It outputs a list of search results around the subject of the search, creating a field of statements, questions and concerns around one single word. Meanwhile, the algorithm starts to interoperate with the manifesto as a format, sensitive as it is to bullet-pointed writing, repetition and unique words: elements that are used a lot in these statement-driven documents. The algorithm prioritizes high-contrast language over academic writing, repetition over very diverse vocabularies, and the use of unique words over the use of common ones.</p>
<p>See this cross-reading tool as an exercise in reading, across a field of technofeminist thinking and a tool for algorithmic sorting.</p>
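<p>As a rough illustration of how a search request becomes a sorted list of results, the sketch below scores three miniature texts for one query and ranks them. The function name <code>search</code>, the titles and the texts are hypothetical stand-ins, not the code or the content of this tool, and the logarithmic IDF is the common textbook form, assumed here.</p>

```python
from math import log


def tfidf(query, words, corpus):
    """Score `query` for one document (`words`) against every document in `corpus`."""
    tf = words.count(query) / len(words)
    containing = sum(1 for document in corpus if query in document)
    if containing == 0:
        return 0.0  # the query appears nowhere in the corpus
    return tf * log(len(corpus) / containing)


def search(query, manifestos):
    """Return (title, score) pairs sorted from highest to lowest score."""
    corpus = list(manifestos.values())
    scores = [(title, tfidf(query, words, corpus)) for title, words in manifestos.items()]
    # Drop documents that score zero, then sort the rest by descending score.
    return sorted([pair for pair in scores if pair[1]], key=lambda pair: pair[1], reverse=True)


# Hypothetical miniature texts, not quotes from the collection.
manifestos = {
    "text_a": "the glitch is a glitch in the machine".split(),
    "text_b": "the server is a feminist server".split(),
    "text_c": "the cyborg is a glitch of the body".split(),
}

for title, score in search("glitch", manifestos):
    print(title, round(score, 3))
```

<p>In this sketch <code>text_a</code> ranks above <code>text_c</code> because it repeats the query, <code>text_b</code> is dropped because it never uses the word, and a search for <em>the</em> returns nothing at all, since a word that appears in every text gets an IDF of zero.</p>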
</div>
<div class="cross">
<br><br>
<p class="note">[TF-IDF algorithm]</p>
<br>
<p class="tfidf" style="float: right;margin-left:1em;">
<code>
from math import log<br /><br>
def tfidf(query, words, corpus):<br /><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Term Frequency: the share of this document's words that match the query<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tf_count = 0<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for word in words:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if query == word:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tf_count += 1<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tf = tf_count / len(words)<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Inverse Document Frequency: count the documents that contain the query<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;idf_count = 0<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for document in corpus:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if query in document:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;idf_count += 1<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Common logarithmic form of IDF; 0 when no document contains the query<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;idf = log(len(corpus) / idf_count) if idf_count else 0<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tfidf_value = tf * idf<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return tf_count, idf_count, tfidf_value
</code>
<br><br>
The TF-IDF algorithm, shown above in the programming language Python, weaves a layer of contrast into the text. Not literally, but in the form of numbers. The most contrastful words are those that the algorithm considers the most specific words for that text.
<br><br>
The TF-IDF values are calculated in two steps. The algorithm first calculates the <em>Term Frequency</em> (TF) by counting the appearances of a word in the text, relative to the total number of words in the document. This relative way of counting makes it possible to compare word counts between documents of varying lengths, for example Donna Haraway's long essay <em><em>A Cyborg Manifesto</em></em> (1984) with the relatively short text of <em><em>The Call for Feminist Data</em></em> written by Caroline Sinders (2018).
<br><br>
In the second step, the algorithm counts relative to all the other documents in the same dataset, using the <em>Inverse Document Frequency</em> (IDF). This part of the algorithm, which is Karen Spärck Jones's addition, introduced a subtle form of inverse relative counting across all the documents in the dataset. Instead of just counting word frequency in one document, Karen proposed to count in a relative inter-document way.
<br><br>
This means that when a word appears in only one or a few documents, its value is greatly enlarged. As a consequence, words such as <em><em>the</em></em> or <em><em>it</em></em> are given a very low number, as they appear in all the documents, while specific words, such as <em>paranodal</em> in <em>A Feminist Server Manifesto</em>, get a very high value: this word is used only 4 times in the whole dataset, and all 4 occurrences are in this manifesto.
<br><br>
Another example is <em>SCUM</em>. Although the word <em>SCUM</em> is not the most commonly used word in the <em>S.C.U.M. Manifesto</em>, it is the word that gets the highest score: relative to all the other manifestos, <em>SCUM</em> is mostly used in this manifesto. This increases the score a lot.
</p>
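<p class="tfidf">The two steps can also be looked at separately. The sketch below is only an illustration: the helper names <code>term_frequency</code> and <code>inverse_document_frequency</code> and the three miniature texts are hypothetical, and the logarithmic IDF is the common textbook form, assumed here.</p>

```python
from math import log


def term_frequency(query, words):
    """Step 1: occurrences of `query` relative to the length of the document."""
    return words.count(query) / len(words)


def inverse_document_frequency(query, corpus):
    """Step 2: the fewer documents contain `query`, the higher the value."""
    containing = sum(1 for document in corpus if query in document)
    return log(len(corpus) / containing) if containing else 0.0


# Hypothetical texts: one long (100 words) and two short (5 words each).
long_text = ("the network is the body " * 20).split()
short_text = "the network of feminist servers".split()
rare_text = "the paranodal server is paranodal".split()
corpus = [long_text, short_text, rare_text]

# Relative counting: 20 of 100 words and 1 of 5 words give the same TF.
print(term_frequency("network", long_text), term_frequency("network", short_text))

# A word that appears in every text gets an IDF of zero, whatever its TF.
print(inverse_document_frequency("the", corpus))

# A word that appears in a single text gets the highest IDF this corpus allows.
print(inverse_document_frequency("paranodal", corpus))
```

<p class="tfidf">Multiplying the two values gives the TF-IDF score, so in this sketch <em>the</em> always scores zero, while <em>paranodal</em> scores high in the one text that uses it.</p>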
</div>
<div class="cross">
<br><br>
<p class="note">[cyber/technofeminist manifestos]</p>
<br>
<p class="techfem">
The collection of cyber/technofeminist manifestos includes the following documents:
<br><br>
<em>The Manifesto of Futurist Woman</em> [EN] <br>
written by Valentine de Saint-Point (1912)<br><br>
<em>S.C.U.M. Manifesto</em> [EN]<br>
written by Valerie Solanas (1967)<br><br>
<em>A Cyborg Manifesto</em> [EN] <br>
written by Donna Haraway (1984)<br><br>
<em>RIOT GRRRL MANIFESTO</em> [EN] <br>
published in Bikini Zine (1989)<br><br>
<em>Cyberfeminist manifesto for the 21st century</em> [EN] <br>
written by VNS Matrix (1991)<br><br>
<em>Bitch Mutant Manifesto</em> [EN] <br>
written by VNS Matrix (1996)<br><br>
<em>Cyberfeminism is not</em> [EN, DE, NL, FR] <br>
written by Old Boys Network (OBN) (1997)<br><br>
<em>Refugia</em> [EN] <br>
written by SubRosa (2002)<br><br>
<em>Glitch Manifesto </em>[EN] <br>
written by Rosa Menkman (2009)<br><br>
<em>Glitch Feminism Manifesto</em> [EN] <br>
written by Legacy Russell (2012)<br><br>
<em>The Mundane Afrofuturist Manifesto</em> [EN] <br>
written by Martine Syms (2013)<br><br>
<em>Wages for Facebook</em> [EN] <br>
written by Laurel Ptak (2013)<br><br>
<em>A Feminist Server Manifesto </em>[EN] <br>
published by Constant (2014)<br><br>
<em>Gynepunk Manifesto</em> [EN] <br>
written by Gynepunk (2014)<br><br>
<em>tRANShACKfEMINISta</em> [EN] <br>
written by Pechblenda Lab (2014)<br><br>
<em>Manifesto for the Gynecene</em> [EN] <br>
written by Alexandra Pirici and Raluca Voinea (2015)<br><br>
<em>The 3D Additivist Manifesto</em> [EN]<br>
written by Morehshin Allahyari and Daniel Rourke (2015)<br><br>
<em>Xenofeminist manifesto</em> [EN]<br>
written by Laboria Cuboniks (2015)<br><br>
<em>Feminist Principles of the Internet </em>[EN] <br>
collective authorship, organized by Association for Progressive Communications (APC) (2016)<br><br>
<em>Hackers of Resistance Manifesto</em> [EN] <br>
written by HORS (2018)<br><br>
<em>Purple Noise Manifesto</em> [EN] <br>
written by Cornelia Sollfrank (2018)<br><br>
<em>The Call for Feminist Data</em> [EN] <br>
written by Caroline Sinders (2018)<br><br>
<em>Cyberwitches Manifesto </em>[EN] <br>
written by Lucile Haute (2019)<br>
<br>
</p>
<p class="tfidf">The algorithm introduces the idea of a <em>context-specific way</em> of counting words.
<br />
<br />
Karen's IDF part of the TF-IDF algorithm creates an ecosystem where the resulting numbers heavily depend on the presence of the other words. The deletion or addition of a document would change all the interrelations in the dataset, as the calculations fully depend on each other. Although the practice of algorithmic text processing is inherently pretty brutal, as language is regarded as nothing but a <em>bag-of-words</em>, the TF-IDF algorithm and its algorithmic character give us a way of counting that creates situated datasets where values are determined by their self-created context.
</p>
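<p class="tfidf">This dependence on context can be made visible with a small experiment. The sketch below scores the same word in the same text three times, each time against a different dataset. The texts are hypothetical and the logarithmic IDF is the common textbook form, assumed here.</p>

```python
from math import log


def tfidf(query, words, corpus):
    """TF-IDF of `query` for one document, relative to the given corpus."""
    tf = words.count(query) / len(words)
    containing = sum(1 for document in corpus if query in document)
    return tf * log(len(corpus) / containing) if containing else 0.0


# Hypothetical miniature texts.
manifesto = "glitch the system glitch the code".split()
neighbour = "the system is the code".split()
newcomer = "a glitch is a feminist glitch".split()

# The text on its own: without other documents there is no contrast at all.
alone = tfidf("glitch", manifesto, [manifesto])

# With one neighbour that never uses the word, "glitch" becomes specific.
before = tfidf("glitch", manifesto, [manifesto, neighbour])

# Adding a document that also uses the word lowers the score again,
# although the manifesto itself has not changed.
after = tfidf("glitch", manifesto, [manifesto, neighbour, newcomer])

print(alone, before, after)
```

<p class="tfidf">The text stays the same in all three cases; only its surroundings change, and the value changes with them.</p>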
<br>
<br>
</div>
</div>
{% endblock %}