Title: Workshop: Say It Ain’t So
Subtitle: A simple Speech-To-Text experiment with serious implications
# Introduction
*“Say It Ain’t So” is a workshop organised by artist Amy Pickles and designer and researcher Cristina Cochior. Its topic is speech-to-text processing, covering technical aspects of speech recognition software, such as the open-source engine PocketSphinx, as well as issues of visibility and invisibility.*
![Photo taken by INC Amsterdam, By Simon Browne](01_SIAS.png)
It doesn’t go well for Amy; she is rejected due to data drawn from not just what she said, but also how she said it. Her fate is sealed by low percentages of the things that matter, such as confident delivery and use of predetermined key words.
Contrary to the perception that the discrete parts of language are mostly stable, speech recordings contain more dynamic, complex elements than we imagine. Speech to text uses a ‘bag of words’ model: utterances are sliced into basic units of language and indexed by frequency. The more frequent combinations are then matched with corresponding entries in sourced dictionaries, a mapping that works from speech to text and vice versa. A quick demonstration of PocketSphinx transcription illustrates this with mixed results: it either renders speech (relatively) faithfully or produces comical phrases that barely resemble natural language, especially when confronted with accents.
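The slicing and frequency indexing described above can be sketched in a few lines of Python. This is only an illustration of the ‘bag of words’ idea as the workshop frames it, working on already-transcribed text; it is not how PocketSphinx is implemented, and the function names `tokenize`, `bag_of_words` and `most_frequent_pairs`, as well as the sample sentence, are made up for this example.

```python
from collections import Counter
import re


def tokenize(text):
    """Slice text into lowercase word units (apostrophes are kept)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Index every unit by how often it occurs; word order is discarded."""
    return Counter(tokenize(text))


def most_frequent_pairs(text, n=3):
    """Count adjacent word pairs and return the n most frequent ones."""
    tokens = tokenize(text)
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(n)


# Hypothetical sample utterance, already transcribed.
sample = "say it ain't so, say it again, say it loud"

bag = bag_of_words(sample)
print(bag.most_common(2))               # the two most frequent units
print(most_frequent_pairs(sample, 1))   # the most frequent combination
```

Even in this toy version, anything that occurs only once ends up at the bottom of the index, which hints at how rarer pronunciations or phrasings can lose out to more frequent ones.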
Writer Ursula K. Le Guin’s “carrier bag theory of fiction” suggests that the first tool was a bag (rather than a weapon), with contents that allowed us to form narratives through powerful relational qualities. In this workshop, a collection of plastic bags filled with printed texts is spread out on a carpet. It’s easy to laugh at the mess made of what comes so naturally to us: language. But there are more serious implications, as we see in a screening of a video of academic Halcyon Lawrence, who maintains that homophony is engrained, and that confronting accent bias is a crucial part of ensuring access to technology. The hallmark of algorithmic natural language applications is invisibility: they rely on a participant’s lack of awareness of the process. However, invisibility is also a result of these applications; in discriminating between the contents of the bags of words they employ, they hide differences and discard what is considered to be indistinct.
Report by Barbara Dubbeldam, May 28, 2019 at 11:55 am. Remixed by the editors