# plain text workflow

Files for the plain text publication for Data Workers, an exhibition by Algolit at the Mundaneum in Mons from 28 March until 28 April 2019.

`             _            `
` _ __   ___ | |_ ___  ___ `
`| '_ \ / _ \| __/ _ \/ __|`
`| | | | (_) | ||  __/\__ \`
`|_| |_|\___/ \__\___||___/`

line width: 110 char
lines per page: 70

70, 140, 210, 280, 350, 420, 490, 560, 630, 700

---------

### --- txt to pdf ---

options ...

#### weasyprint

(stretched the page size, font size, etc., in order to place everything)

#### enscript

(using postscript to create pdf)

`$ enscript --word-wrap --margins=40:10:10:20 --fancy-header writers.intro.txt -o - | ps2pdf - test.pdf`

`$ cat writers.intro.txt | iconv -c -f utf-8 -t ISO-8859-1 | enscript --word-wrap --margins=40:10:10:20 --fancy-header -o - | ps2pdf - test.pdf`

#### txt2pdf

(uses reportlab)

`$ python3 txt2pdf/txt2pdf.py -T 1 -B 2 -L 2 -R 1 writers.intro.txt -o test.pdf`

`$ python3 txt2pdf/txt2pdf.py -m A4 -f fonts/fantasque/TTF/FantasqueSansMono-Regular.ttf -s 10 -v 0 -T 1 -B 1 -L 1.5 -R 1.5 data-workers.txt -o test.pdf`

currently using:

`$ python3 txt2pdf/txt2pdf.py -m A4 -f fonts/fantasque/TTF/FantasqueSansMono-Regular.ttf -s 9 -v 0.05 -T 1 -B 0.9 -L 1.5 -R 1.5 data-workers.txt -o test.pdf`

#### PDF2txt miner

The inverse of this process.

*"What's It? PDFMiner is a tool for extracting information from PDF documents. Unlike other PDF-related tools, it focuses entirely on getting and analyzing text data. PDFMiner allows one to obtain the exact location of text in a page, as well as other information such as fonts or lines."*

------

### --- hyphenation ---

#### Hyphenator

#### textwrap

------

### --- commands ---

Generate the publication as PDF:

`$ python3 create_all.py && python3 txt2pdf/txt2pdf.py -m A4 -f fonts/unifont-11.0.03.ttf -s 9 -v 0.05 -T 1 -B 0.9 -L 1.6 -R 1.4 data-workers.en.txt -o data-workers.en.pdf`

Add logos.pdf to the last page with PDFTK:

`$ pdftk data-workers.en.pdf A=data-workers.en.pdf cat A52 output data-workers.en.backcover.pdf`

`$ pdftk data-workers.en.backcover.pdf multistamp logos.pdf output data-workers.en.backcover.logos.pdf`

`$ pdftk A=data-workers.en.pdf B=data-workers.en.backcover.logos.pdf cat A1-51 B output data-workers.en.logos.pdf`

PDFTK in one command:

`$ pdftk data-workers.en.pdf A=data-workers.en.pdf cat A52 output data-workers.en.backcover.pdf && pdftk data-workers.en.backcover.pdf multistamp logos.pdf output data-workers.en.backcover.logos.pdf && pdftk A=data-workers.en.pdf B=data-workers.en.backcover.logos.pdf cat A1-51 B output data-workers.en.publication.pdf`

------

### --- ASCII/UNICODE fonts ---

Unicode art :)

------

### --- unifont ---

------

### --- DUMP ---

`[\/\]\<\?\'\)\(\[\\\"\w]`

`░`

work
many authors write
every human being who has access to the internet interacts
we chat, write, click, like and share
we leave our data
we find ourselves writing in Python
some neural networks write
human editors assist
poets, playwrights or novelists assist

---

Writers write ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒ Data workers ░░░░░░░░░░░░ need data to ▒▒▒▒ with. work The data that is used in the context of Algolit, is written language. Machine learning relies on many types Many authors of writing. ░░░░░░░░░░░░ ▒▒▒▒▒ in the write form of publications, like books or articles. These are part of organised archives and are sometimes digitized. But there are other kinds of writing every human too.
We could say that ░░░░░░░░░░░░ being who has access to the internet ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ is a writer each time they ▒▒▒▒▒▒▒▒▒ interact with algorithms. We ░░ ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒. chat, write, click, like and share we In return for free services, ░░ ▒▒▒▒▒ leave ▒▒▒▒▒▒▒▒ that is compiled into profiles our data and sold for advertisement and research. Machine learning algorithms are not critics: they take whatever they're given, no matter the writing style, no matter the CV of the author, no matter their spelling mistakes. In fact, mistakes make it better: the more variety, the better they learn to anticipate unexpected text. But often, human authors are not aware of what happens to their work. Most of the writing we use is in English, some is in French, some in Dutch. Most often we find ourselves writing in Python, the programming language we use. Algorithms can be writers too. Some neural networks write their own rules and generate their own texts. And for the models that are still wrestling with the ambiguities of natural language, there are human editors to assist them. Poets, playwrights or novelists start their new careers as assistants of AI. 

---

P r o g r a mm e r s are wr iting the datawork
P r o g r am m e rs are writing the dataworker
P r o g ra m mers are writing the dataworke
P r o g r ammers a re writing the datawor
P r o g r ammers are writing the dataw
P r o gram mers a re writing the data
P r ogra m mer s are writing the d
P r o gramm e r s are writing the d
P r ogram m ers a re writing the d
P r o g ram m e rs a re writing the
P r o g r a m m ers a r e writing the
P r o g r a m m e rs are writing the d
P r o g r a m m e r s a r e wr i ting the dat
P r o g r a mme r s ar e writ ing the dataw
P r o g r amm e r s are writing the datawo
P r o g r amm e r s are writing t he datawo
P r o g ra m m er s a r e writ ing the datawork
P r o g r a mm e r s are wr iting the datawork
P r o g r a m m e rs are writing the datawo
P r o gra m m e rs are w riting the datawork
P r og r a m mers a re writing the datawor
P r o g r a mmers a re writing the datawo
P r o g r ammers a r e writing the dataw
P r o g ra mmers a re writing the dat
P r o g ramm ers a re writing the da
P r ogramm e rs a r e writing the da
P r o gramm e r s a r e writing the d
P r o g ram m e rs a r e writing the d
P r o g r a m m e rs are writing the d
P r o g r a m m e r s a r e w r iting the da
P r o g r a mme r s ar e writ ing the dataw
P r o g ramme r s are writing the datawo
P r o g r ammer s a re w riting th e datawor
P r o g r a m mers a r e writ ing the datawork
P r o g r am me r s are wr iting the datawork
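
------

### --- layout sketch ---

The page geometry noted at the top (line width 110 characters, 70 lines per page, page boundaries at 70, 140, 210, ...) can be checked with a few lines of Python and the standard-library `textwrap` module. This is only a minimal sketch under those assumptions: the names `wrap_text`, `paginate` and `page_breaks` are hypothetical and are not taken from `create_all.py` or `txt2pdf.py`. Note that `textwrap.wrap` collapses runs of whitespace, so it suits running prose and would flatten ASCII art.

```python
import textwrap

LINE_WIDTH = 110      # assumed from "line width: 110 char"
LINES_PER_PAGE = 70   # assumed from "lines per page: 70"


def wrap_text(text, width=LINE_WIDTH):
    """Wrap each input line to at most `width` columns; keep blank lines."""
    lines = []
    for paragraph in text.split("\n"):
        if paragraph.strip() == "":
            lines.append("")
        else:
            lines.extend(textwrap.wrap(paragraph, width=width))
    return lines


def paginate(lines, lines_per_page=LINES_PER_PAGE):
    """Split lines into pages, padding the last page with empty lines."""
    pages = []
    for start in range(0, len(lines), lines_per_page):
        page = lines[start:start + lines_per_page]
        page += [""] * (lines_per_page - len(page))
        pages.append(page)
    return pages


def page_breaks(n_pages, lines_per_page=LINES_PER_PAGE):
    """Line numbers at which each of the first n_pages pages ends."""
    return [lines_per_page * (i + 1) for i in range(n_pages)]


if __name__ == "__main__":
    sample = "Data workers need data to work with. " * 40
    pages = paginate(wrap_text(sample))
    print(len(pages), "page(s);", "page ends at lines", page_breaks(len(pages)))
```

Padding the last page keeps every page at exactly 70 lines, so a line number in the plain text file maps directly to a page number.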