mb
2 years ago
commit
40b86ee7dc
34 changed files with 162294 additions and 0 deletions
@ -0,0 +1,115 @@ |
|||
[ Copyleft Attitude with a difference ] |
|||
|
|||
COLLECTIVE CONDITIONS FOR RE-USE (CC4r) |
|||
version 1.0 |
|||
|
|||
============ |
|||
REMINDER TO CURRENT AND FUTURE AUTHORS: |
|||
The authored work released under the CC4r was never yours to begin with. The CC4r considers authorship to be part of a collective cultural effort and rejects authorship as ownership derived from individual genius. This means to recognize that it is situated in social and historical conditions and that there may be reasons to refrain from release and re-use. |
|||
============= |
|||
|
|||
PREAMBLE |
|||
|
|||
The CC4r articulates conditions for re-using authored materials. This document is inspired by the principles of Free Culture – with a few differences. You are invited to copy, distribute, and transform the materials published under these conditions, and to take the implications of (re-)use into account. |
|||
|
|||
The CC4r understands authorship as inherently collaborative and already-collective. It applies to hybrid practices such as human-machine collaborations and other-than-human contributions. The legal framework of copyright ties authorship firmly to property and individual human creation, and prevents more fluid modes of authorial becoming from flourishing. Free Culture and intersectional, feminist, anti-colonial work remind us that there is no tabula rasa, no original or single author; that authorial practices exist within a web of references. |
|||
|
|||
The CC4r favours re-use and generous access conditions. It considers hands-on circulation as a necessary and generative activation of current, historical and future authored materials. While you are free to (re-)use them, you are not free from taking the implications from (re-)use into account. |
|||
|
|||
The CC4r troubles the binary approach that declares authored works either ‘open’ or ‘closed’. It tries to address how a universalist approach to openness, such as the one that Free licenses maintain, has historically meant the appropriation of marginalised knowledges. It is concerned with the way Free Culture, Free Licenses and Open Access do not account for the complexity and porosity of knowledge practices and their circulation, nor for the power structures active around it. This includes extractive use by software giants and commercial on-line platforms that increasingly invest into and absorb Free Culture. |
|||
|
|||
The CC4r asks CURRENT and FUTURE AUTHORS, as a collective, to care together for the implications of appropriation. To be attentive to the way re-use of materials might support or oppress others, even if this will never be easy to gauge. This implies to consider the collective conditions of authorship. |
|||
|
|||
The CC4r asks you to be courageous with the use of materials that are being licensed under the CC4r. To discuss them, to doubt, to let go, to change your mind, to experiment with them, to give back to them and to take responsibility when things might go wrong. |
|||
|
|||
Considering the Collective Conditions for (re-)use involves inclusive crediting and speculative practices for referencing and resourcing. To consider the circulation of materials on commercial platforms as participating in extractive data practices; platform capitalism appropriates and abuses collective authorial practice. To take into account that the defaults of openness and transparency have different consequences in different contexts. To consider the potential necessity for opacity when accessing and transmitting knowledge, especially when it involves materials that matter to marginalized communities. |
|||
|
|||
This document was written in response to the Free Art License (FAL) in a process of coming to terms with the colonial structuring of knowledge production. It emerged out of concerns with the way Open Access and Free Culture ideologies by foregrounding openness and freedom as universal principles might replicate some of the problems with conventional copyright. |
|||
|
|||
DEFINITIONS |
|||
----------- |
|||
« LEGAL AUTHOR » In the CC4r, LEGAL AUTHOR is used for the individual that is assigned as "author" by conventional copyright. Even if the authored work was never theirs to begin with, he or she is the only one that is legally permitted to license a work under a CC4r. This license is therefore not about liability, or legal implications. It cares about the ways copyright contributes to structural inequalities. |
|||
« CURRENT AUTHOR » can be used for individuals and collectives. It is the person, collective or other that was involved in generating the work created under a CC4r license. CURRENT and FUTURE AUTHOR are used to avoid designations that overly rely on concepts of 'originality' and insist on linear orders of creation. |
|||
« FUTURE AUTHOR » can be used for individuals and collectives. They want to use the work under CC4r license and are held to its conditions. All future authors are considered coauthors, or anauthors. They are anauthorized because this license provides them with an unauthorized authorization. |
|||
« LICENSE » due to its conditional character, this document might actually not qualify as a license. It is for sure not a Free Culture License. see also: UNIVERSALIST OPENNESS. |
|||
« (RE-)USE » the CC4r opted for bracketing "RE" out of necessity to mess up the time-space linearity of the original. |
|||
« OPEN <-> CLOSED » the CC4r operates like rotating doors... it is a swinging license, or a hinged license. |
|||
« UNIVERSALIST OPENNESS » the CC4r tries to propose an alternative to universalist openness. A coming to terms with the fact that universal openness is "safe" only for some. |
|||
|
|||
|
|||
0. CONDITIONS |
|||
|
|||
The invitation to (re-)use the work licensed under CC4r applies as long as the FUTURE AUTHOR is convinced that this does not contribute to oppressive arrangements of power, privilege and difference. These may be reasons to refrain from release and re-use. |
|||
If it feels paralyzing to decide whether or not these conditions apply, it might point at the need to find alternative ways to activate the work. In case of doubt, consult for example https://constantvzw.org/wefts/orientationspourcollaboration.en.html |
|||
|
|||
1. OBJECT |
|||
The aim of this license is to articulate collective conditions for re-use. |
|||
|
|||
2. SCOPE |
|||
The work licensed under the CC4r is reluctantly subject to copyright law. By applying CC4r, the legal author extends its rights and invites others to copy, distribute, and modify the work. |
|||
|
|||
2.1 INVITATION TO COPY (OR TO MAKE REPRODUCTIONS) |
|||
When the conditions under 0. apply, you are invited to copy this work, for whatever reason and with whatever technique. |
|||
|
|||
2.2 INVITATION TO DISTRIBUTE, TO PERFORM IN PUBLIC |
|||
As long as the conditions under 0. apply, you are invited to distribute copies of this work; modified or not, whatever the medium and the place, with or without any charge, provided that you: |
|||
- attach this license to each of the copies of this work or indicate where the license can be found. |
|||
- make an effort to account for the collective conditions of the work, for example what contributions were made to the modified work and by whom, or how the work could continue. |
|||
- specify where to access other versions of the work. |
|||
|
|||
2.3 INVITATION TO MODIFY |
|||
As long as the conditions under 0. apply, you are invited to make future works based on the current work, provided that you: |
|||
- observe all conditions in article 2.2 above, if you distribute future works; |
|||
- indicate that the work has been modified and, if possible, what kind of modifications have been made. |
|||
- distribute future works under the same license or any compatible license. |
|||
|
|||
3. INCORPORATION OF THE WORK |
|||
Incorporating this work into a larger work (i.e., database, anthology, compendium, etc.) is possible. If as a result of its incorporation, the work can no longer be accessed apart from its appearance within the larger work, incorporation can only happen under the condition that the larger work is as well subject to the CC4r or to a compatible license. |
|||
|
|||
4. COMPATIBILITY |
|||
A license is compatible with the CC4r provided that: |
|||
- it invites users to take the implications of their appropriation into account; |
|||
- it invites to copy, distribute, and modify copies of the work including for commercial purposes and without any other restrictions than those required by the other compatibility criteria; |
|||
- it ensures that the collective conditions under which the work was authored are attributed unless not desirable, and access to previous versions of the work is provided when possible; |
|||
- it recognizes the CC4r as compatible (reciprocity); |
|||
- it requires that changes made to the work will be subject to the same license or to a license which also meets these compatibility criteria. |
|||
|
|||
5. LEGAL FRAMEWORK |
|||
Because of the conditions mentioned under 0., this is not a Free License. It is reluctantly formulated within the framework of both the Belgian law and the Berne Convention for the Protection of Literary and Artistic Works. |
|||
“We recognize that private ownership over media, ideas, and technology is rooted in European conceptions of property and the history of colonialism from which they formed. These systems of privatization and monopolization, namely copyright and patent law, enforce the systems of punishment and reward which benefit a privileged minority at the cost of others’ creative expression, political discourse, and cultural survival. The private and public institutions, legal frameworks, and social values which uphold these systems are inseparable from broader forms of oppression. Indigenous people, people of color, queer people, trans people, and women are particularly exploited for their creative and cultural resources while hardly receiving any of the personal gains or legal protections for their work. We also recognize that the public domain has jointly functioned to compliment the private, as works in the public domain may be appropriated for use in proprietary works. Therefore, we use copyleft not only to circumvent the monopoly granted by copyright, but also to protect against that appropriation.” [Decolonial Media License https://freeculture.org/About/license] |
|||
|
|||
6. YOUR RESPONSIBILITIES |
|||
The invitation to use the work as defined by the CC4r (invitation to copy, distribute, modify) implies to take the implications of the appropriation of the materials into account. |
|||
|
|||
7. DURATION OF THE LICENSE |
|||
This license takes effect as of the moment that the FUTURE AUTHOR accepts the invitation of the CURRENT AUTHOR. The act of copying, distributing, or modifying the work constitutes a tacit agreement. This license will remain in effect for the duration of the copyright which is attached to the work. If you do not respect the terms of this license, the invitation that it confers is void. |
|||
If the legal status or legislation to which you are subject makes it impossible for you to respect the terms of this license, you may not make use of the rights which it confers. |
|||
|
|||
8. VARIOUS VERSIONS OF THE LICENSE |
|||
You are invited to reformulate this license by way of new, renamed versions. [link to license on gitlab]. You can of course make reproductions and distribute this license verbatim (without any changes). |
|||
|
|||
USER GUIDE |
|||
|
|||
– How to use the CC4r? |
|||
To apply the CC4r, you need to mention the following elements: |
|||
[Name of the legal author, title, date of the work. When applicable, names of authors of the common work and, if possible, where to find other versions of the work]. |
|||
Copyleft with a difference: This is a collective work, you are invited to copy, distribute, and modify it under the terms of the CC4r [link to license]. |
|||
Short version: Legal author=name, date of work (? ask SD). CC4r [link to license] |
|||
|
|||
– Why use the CC4r? |
|||
1. To remind yourself and others that you do not own authored works |
|||
2. To not allow copyright to hinder works to evolve, to be extended, to be transformed |
|||
3. To allow materials to circulate as much as they need to |
|||
4. Because the CC4r offers a legal framework to disallow mis-appropriation by insisting on inclusive attribution. Nobody can take hold of the work as one’s exclusive possession. |
|||
|
|||
– When to use the CC4r? |
|||
Any time you want to invite others to copy, distribute and transform authored works without exclusive appropriation but with considering the implications of (re-)use, you can use the CC4r. You can for example apply it to collective documentation, hybrid productions, artistic collaborations or educational projects. |
|||
|
|||
– What kinds of works can be subject to the CC4r? |
|||
The Collective Conditions for re-use can be applied to digital as well as physical works. |
|||
You can choose to apply the CC4r for any text, picture, sound, gesture, or whatever material as long as you have legal author’s rights. |
|||
|
|||
– Background of this license: |
|||
The CC4r was developed for the Constant worksession Unbound libraries (spring 2020) and followed from discussions during and contributions to the study day Authors of the future (Fall 2019). It is based on the Free Art License http://artlibre.org/licence/lal/en/ and inspired by other licensing projects such as The (Cooperative) Non-Violent Public License https://thufie.lain.haus/NPL.html and the Decolonial Media license https://freeculture.org/About/license. |
|||
|
|||
Copyleft Attitude with a difference, 6 October 2020. |
@ -0,0 +1,56 @@ |
|||
# wiki-to-print |
|||
|
|||
Slightly adapted version of <https://github.com/hackersanddesigners/wiki2print>, in continuation of <https://gitlab.constantvzw.org/titipi/wiki-to-pdf> and <https://git.vvvvvvaria.org/mb/volumetric-regimes-book>. |
|||
|
|||
Installed at: <https://cc.vvvvvvaria.org/wiki/Wiki2print>. |
|||
|
|||
The code of the wiki2print instance that is running on the *creative |
|||
crowd* server is published at [Varia's Gitea](https://git.vvvvvvaria.org/varia/wiki-to-print) under the |
|||
[CC4r](https://constantvzw.org/wefts/cc4r.en.html) license. |
|||
|
|||
## Continuations |
|||
|
|||
This project is inspired by and builds upon several previous iterations |
|||
of and experiments with mediawiki-to-pdf workflows: |
|||
|
|||
- [Hackers & Designers](https://hackersanddesigners.nl/)\'s work on |
|||
[Making |
|||
Matters](https://wiki2print.hackersanddesigners.nl/wiki/Publishing:Making_Matters_Lexicon) |
|||
- [TITiPI](http://titipi.org/)\'s work on [Infrastructural |
|||
Interactions](http://titipi.org/wiki-to-pdf/unfold/Infrastructural_Interactions) |
|||
- [Manetta](https://git.vvvvvvaria.org/mb)\'s work on [Volumetric |
|||
Regimes](https://volumetricregimes.xyz/index.php?title=Volumetric_Regimes) |
|||
- [Constant](https://constantvzw.org/site/)\'s and |
|||
[OSP](https://osp.kitchen/)\'s work on |
|||
[Diversions](https://diversions.constantvzw.org/wiki/index.php?title=Main_Page) |
|||
- [many |
|||
more\...](https://constantvzw.org/wefts/webpublications.en.html) |
|||
|
|||
## How does it work? |
|||
|
|||
When you create a page in the `Pdf` namespace on <https://cc.vvvvvvaria.org/wiki/>, it will load the wiki-to-print buttons in the navigation bar: |
|||
|
|||
- `CSS!` |
|||
- `View HTML` |
|||
- `View PDF` |
|||
- `Update text` |
|||
- `Update Media` |
|||
|
|||
You can transclude pages into this page, structure your publication and edit the CSS. |
|||
|
|||
- When you click `View HTML`: the Flask application returns an HTML version of the page. |
|||
- When you click `View PDF`: the Flask application returns an HTML version of the page, loaded with Paged.js. The HTML page is rendered into pages, giving you a preview of the PDF. You can use the inspector to work on the layout. |
|||
- When you click `Update text`: the Flask application makes a copy of all the text of the page and saves it to a file on the server (in the `static` folder). |
|||
- When you click `Update media`: the Flask application downloads all the images on the page and saves them to a folder on the server (in the `static` folder). |
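
The `Update media` step also has to point the image links in the HTML at the downloaded copies. The following is a minimal sketch of that idea, not the application's actual code: the helper name `localize_image_src` and the `/static` URL prefix are assumptions made for illustration, and the pattern is a slightly stricter variant of the one used in `update.py`.

```python
import re


def localize_image_src(html, filename, public_path='/static'):
    """Rewrite the src of one wiki image so it points at the local copy.

    Hypothetical helper; `public_path` is an assumed URL prefix
    for the `static` folder.
    """
    filename = filename.replace(' ', '_')  # same normalisation as in update.py
    local_src = f'{public_path}/images/{filename}'
    # [^"]*? keeps the match inside a single src attribute;
    # re.escape() stops dots in the filename from acting as wildcards.
    pattern = rf'src="/images/[^"]*?{re.escape(filename)}"'
    return re.sub(pattern, lambda m: f'src="{local_src}"', html)


# A thumbnail link as MediaWiki typically renders it (hand-written example).
thumb = '<img src="/images/thumb/a/ab/My_Image.png/300px-My_Image.png">'
print(localize_image_src(thumb, 'My Image.png'))
```

Both thumbnail links (with a `300px-` style prefix) and links to the original file are rewritten to the same local path, so the saved page no longer depends on the wiki for its images.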
|||
|
|||
## In this repository |
|||
|
|||
* **command-line**: Python script to work on a local copy of your publication |
|||
* **wiki-to-print**: Flask application that renders a wiki page into HTML |
|||
|
|||
## Links |
|||
|
|||
* <https://cc.vvvvvvaria.org/wiki/Wiki2print> |
|||
* <https://constantvzw.org/wefts/webpublications.en.html> |
|||
* <https://titipi.org/wiki/index.php/Wiki-to-pdf> |
|||
* <https://pad.vvvvvvaria.org/wiki-printing> |
@ -0,0 +1,11 @@ |
|||
all: run |
|||
|
|||
run: |
|||
python3 -m http.server |
|||
|
|||
wiki: |
|||
# --- |
|||
# update the materials from the wiki, save it as Unfolded.html |
|||
python3 update.py |
|||
@echo "Pulling updates from the wiki: Unfolded (wiki) --> Unfolded.html (file)" |
|||
|
@ -0,0 +1,34 @@ |
|||
|
|||
# CLI for wiki-to-print |
|||
|
|||
The script uses the MediaWiki API to download all content (text + images) from a specified wiki page. |
|||
|
|||
It saves it as an HTML page, which can be turned into a PDF with Paged.js. |
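
As a rough sketch of the download step (not the code in `update.py` itself), the request and the shape of the response can be illustrated with the standard library only. The helper names `build_parse_url` and `extract_page` are hypothetical, and the sample response is hand-written for illustration; it only mirrors the fields that `update.py` reads from an `action=parse` response.

```python
import json
from urllib.parse import urlencode


def build_parse_url(wiki, pagename):
    """Build a MediaWiki `action=parse` URL for one page (hypothetical helper)."""
    query = urlencode({
        'action': 'parse',
        'page': pagename,
        'pst': 'True',
        'format': 'json',
    })
    return f'{wiki}/api.php?{query}'


def extract_page(response_text):
    """Return (html, images), or (None, []) when the response has no 'parse' key."""
    data = json.loads(response_text)
    if 'parse' not in data:
        return None, []
    return data['parse']['text']['*'], data['parse']['images']


# Hand-written sample response, for illustration only.
sample = json.dumps({
    'parse': {
        'title': 'Unfolded',
        'text': {'*': '<p>Hello <img src="/images/a/ab/Cover.png"></p>'},
        'images': ['Cover.png'],
    }
})

url = build_parse_url('https://example.com/wiki', 'Unfolded')
html, images = extract_page(sample)
print(url)
print(images)
```

Keeping the URL building and the response parsing in separate functions makes it possible to try the parsing on a saved `.json` file without contacting the wiki.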
|||
|
|||
## Folder structure |
|||
|
|||
``` |
|||
. |
|||
├── css |
|||
│ ├── baseline.css |
|||
│ ├── pagedjs.css |
|||
│ └── print.css |
|||
├── fonts |
|||
├── images |
|||
├── js |
|||
│ ├── paged.js |
|||
│ └── paged.polyfill.js |
|||
├── Makefile |
|||
├── templates |
|||
│ ├── template.html |
|||
│ └── template.inspect.html |
|||
└── update.py |
|||
``` |
|||
|
|||
## How to use it? |
|||
|
|||
1. Change the `wiki` and `pagename` variables in `update.py` (lines 221 and 222). |
|||
2. Copy and paste your CSS into `css/print.css`. |
|||
3. Run `$ python3 update.py` |
|||
4. Run `$ make` |
|||
5. Open `localhost:8000` in your browser |
@ -0,0 +1,19 @@ |
|||
/* This baseline.css stylesheet is derived from: https://gist.github.com/julientaq/08d636a7a2b5f2824025256de0fca467 */ |
|||
/* Thanks a lot to julientaq for publishing it! */ |
|||
|
|||
:root { |
|||
--baseline: 18px; |
|||
--baseline-color: blue; |
|||
} |
|||
|
|||
/* grid baseline */ |
|||
.pagedjs_page { |
|||
/* background: |
|||
repeating-linear-gradient( |
|||
white 0, |
|||
white calc(var(--baseline) - 1px), var(--baseline-color) var(--baseline)); |
|||
background-size: cover; |
|||
background-repeat: repeat-y; */ |
|||
/* start of the first baseline: half of the line-height: 9px */ |
|||
/* background-position-y: 9px; */ |
|||
} |
@ -0,0 +1,180 @@ |
|||
/* CSS for Paged.js interface – v0.2 */ |
|||
|
|||
/* Change the look */ |
|||
:root { |
|||
--color-background: whitesmoke; |
|||
--color-pageSheet: #cfcfcf; |
|||
--color-pageBox: violet; |
|||
--color-paper: white; |
|||
--color-marginBox: transparent; |
|||
--pagedjs-crop-color: black; |
|||
--pagedjs-crop-shadow: white; |
|||
--pagedjs-crop-stroke: 1px; |
|||
} |
|||
|
|||
/* To define how the book looks on the screen: */ |
|||
@media screen { |
|||
body { |
|||
background-color: var(--color-background); |
|||
} |
|||
|
|||
.pagedjs_pages { |
|||
display: flex; |
|||
width: calc(var(--pagedjs-width) * 2); |
|||
flex: 0; |
|||
flex-wrap: wrap; |
|||
margin: 0 auto; |
|||
} |
|||
|
|||
.pagedjs_page { |
|||
background-color: var(--color-paper); |
|||
box-shadow: 0 0 0 1px var(--color-pageSheet); |
|||
margin: 0; |
|||
flex-shrink: 0; |
|||
flex-grow: 0; |
|||
margin-top: 10mm; |
|||
} |
|||
|
|||
.pagedjs_first_page { |
|||
margin-left: var(--pagedjs-width); |
|||
} |
|||
|
|||
.pagedjs_page:last-of-type { |
|||
margin-bottom: 10mm; |
|||
} |
|||
|
|||
.pagedjs_pagebox{ |
|||
box-shadow: 0 0 0 1px var(--color-pageBox); |
|||
} |
|||
|
|||
.pagedjs_left_page{ |
|||
z-index: 20; |
|||
width: calc(var(--pagedjs-bleed-left) + var(--pagedjs-pagebox-width))!important; |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-crop { |
|||
border-color: transparent; |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-middle{ |
|||
width: 0; |
|||
} |
|||
|
|||
.pagedjs_right_page{ |
|||
z-index: 10; |
|||
position: relative; |
|||
left: calc(var(--pagedjs-bleed-left)*-1); |
|||
} |
|||
|
|||
/* show the margin-box */ |
|||
|
|||
.pagedjs_margin-top-left-corner-holder, |
|||
.pagedjs_margin-top, |
|||
.pagedjs_margin-top-left, |
|||
.pagedjs_margin-top-center, |
|||
.pagedjs_margin-top-right, |
|||
.pagedjs_margin-top-right-corner-holder, |
|||
.pagedjs_margin-bottom-left-corner-holder, |
|||
.pagedjs_margin-bottom, |
|||
.pagedjs_margin-bottom-left, |
|||
.pagedjs_margin-bottom-center, |
|||
.pagedjs_margin-bottom-right, |
|||
.pagedjs_margin-bottom-right-corner-holder, |
|||
.pagedjs_margin-right, |
|||
.pagedjs_margin-right-top, |
|||
.pagedjs_margin-right-middle, |
|||
.pagedjs_margin-right-bottom, |
|||
.pagedjs_margin-left, |
|||
.pagedjs_margin-left-top, |
|||
.pagedjs_margin-left-middle, |
|||
.pagedjs_margin-left-bottom { |
|||
box-shadow: 0 0 0 1px inset var(--color-marginBox); |
|||
} |
|||
|
|||
/* uncomment this part for recto/verso book : ------------------------------------ */ |
|||
/* |
|||
|
|||
.pagedjs_pages { |
|||
flex-direction: column; |
|||
width: 100%; |
|||
} |
|||
|
|||
.pagedjs_first_page { |
|||
margin-left: 0; |
|||
} |
|||
|
|||
.pagedjs_page { |
|||
margin: 0 auto; |
|||
margin-top: 10mm; |
|||
} |
|||
|
|||
.pagedjs_left_page{ |
|||
width: calc(var(--pagedjs-bleed-left) + var(--pagedjs-pagebox-width) + var(--pagedjs-bleed-left))!important; |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-crop{ |
|||
border-color: var(--pagedjs-crop-color); |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-middle{ |
|||
width: var(--pagedjs-cross-size)!important; |
|||
} |
|||
|
|||
.pagedjs_right_page{ |
|||
left: 0; |
|||
} |
|||
*/ |
|||
|
|||
|
|||
|
|||
/*--------------------------------------------------------------------------------------*/ |
|||
|
|||
|
|||
|
|||
/* uncomment this part to see the baseline : -------------------------------------------*/ |
|||
|
|||
/* |
|||
.pagedjs_pagebox { |
|||
--pagedjs-baseline: 22px; |
|||
--pagedjs-baseline-position: 5px; |
|||
--pagedjs-baseline-color: cyan; |
|||
background: linear-gradient(transparent 0%, transparent calc(var(--pagedjs-baseline) - 1px), var(--pagedjs-baseline-color) calc(var(--pagedjs-baseline) - 1px), var(--pagedjs-baseline-color) var(--pagedjs-baseline)), transparent; |
|||
background-size: 100% var(--pagedjs-baseline); |
|||
background-repeat: repeat-y; |
|||
background-position-y: var(--pagedjs-baseline-position); |
|||
} */ |
|||
|
|||
|
|||
/*--------------------------------------------------------------------------------------*/ |
|||
} |
|||
|
|||
|
|||
|
|||
|
|||
|
|||
/* Marks (to delete when merge in paged.js) */ |
|||
|
|||
.pagedjs_marks-crop{ |
|||
z-index: 999999999999; |
|||
|
|||
} |
|||
|
|||
.pagedjs_bleed-top .pagedjs_marks-crop, |
|||
.pagedjs_bleed-bottom .pagedjs_marks-crop{ |
|||
box-shadow: 1px 0px 0px 0px var(--pagedjs-crop-shadow); |
|||
} |
|||
|
|||
.pagedjs_bleed-top .pagedjs_marks-crop:last-child, |
|||
.pagedjs_bleed-bottom .pagedjs_marks-crop:last-child{ |
|||
box-shadow: -1px 0px 0px 0px var(--pagedjs-crop-shadow); |
|||
} |
|||
|
|||
.pagedjs_bleed-left .pagedjs_marks-crop, |
|||
.pagedjs_bleed-right .pagedjs_marks-crop{ |
|||
box-shadow: 0px 1px 0px 0px var(--pagedjs-crop-shadow); |
|||
} |
|||
|
|||
.pagedjs_bleed-left .pagedjs_marks-crop:last-child, |
|||
.pagedjs_bleed-right .pagedjs_marks-crop:last-child{ |
|||
box-shadow: 0px -1px 0px 0px var(--pagedjs-crop-shadow); |
|||
} |
@ -0,0 +1,22 @@ |
|||
:root{ |
|||
--font-size: 12px; |
|||
--line-height: 18px; |
|||
} |
|||
|
|||
@page{ |
|||
size: A4 portrait; |
|||
bleed: 3mm; |
|||
marks: crop; |
|||
|
|||
@bottom-center{ |
|||
content: counter(page); |
|||
font-size: 8pt; |
|||
} |
|||
} |
|||
|
|||
html, body{ |
|||
font-family: serif; |
|||
font-size: var(--font-size); |
|||
line-height: var(--line-height); |
|||
hyphens: auto; |
|||
} |
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -0,0 +1,17 @@ |
|||
<!DOCTYPE html> |
|||
<html lang="en"> |
|||
<head> |
|||
<meta charset="utf-8"> |
|||
<script src="./js/paged.js" type="text/javascript"></script> |
|||
<script src="./js/paged.polyfill.js" type="text/javascript"></script> |
|||
<link href="./css/pagedjs.css" rel="stylesheet" type="text/css"> |
|||
<link href="./css/print.css" rel="stylesheet" type="text/css" media="print"> |
|||
<!-- <link href="./css/baseline.css" rel="stylesheet" type="text/css" media="print"> --> |
|||
</head> |
|||
<body> |
|||
<div id="wrapper"> |
|||
{{ publication_unfolded }} |
|||
</div> |
|||
</body> |
|||
|
|||
</html> |
@ -0,0 +1,12 @@ |
|||
<!DOCTYPE html> |
|||
<html lang="en"> |
|||
<head> |
|||
<meta charset="utf-8"> |
|||
<link href="./css/print.css" rel="stylesheet" type="text/css" media="print"> |
|||
</head> |
|||
<body> |
|||
<div id="wrapper"> |
|||
{{ publication_unfolded }} |
|||
</div> |
|||
</body> |
|||
</html> |
@ -0,0 +1,226 @@ |
|||
import urllib.request |
|||
import os |
|||
import re |
|||
import json |
|||
import jinja2 |
|||
|
|||
STATIC_FOLDER_PATH = '.' # without trailing slash |
|||
PUBLIC_STATIC_FOLDER_PATH = '.' # without trailing slash |
|||
TEMPLATES_DIR = './templates' |
|||
|
|||
# This uses a low quality copy of all the images |
|||
# (using a folder with the name "images-small", |
|||
# which stores a copy of all the images generated with: |
|||
# $ mogrify -quality 5% -adaptive-resize 25% -remap pattern:gray50 * ) |
|||
fast = False |
|||
|
|||
def API_request(url, pagename): |
|||
""" |
|||
url = API request url (string) |
|||
data = { 'query': |
|||
'pages' : |
|||
pageid : { |
|||
'links' : { |
|||
'?' : '?' |
|||
'title' : 'pagename' |
|||
} |
|||
} |
|||
} |
|||
} |
|||
""" |
|||
response = urllib.request.urlopen(url).read() |
|||
data = json.loads(response) |
|||
|
|||
# Save response as JSON to be able to inspect API call |
|||
json_file = f'{ STATIC_FOLDER_PATH }/{ pagename }.json' |
|||
print('Saving JSON:', json_file) |
|||
with open(json_file, 'w') as out: |
|||
out.write(json.dumps(data, indent=4)) |
|||
|
|||
return data |
|||
|
|||
def download_media(html, images, wiki): |
|||
""" |
|||
html = string (HTML) |
|||
images = list of filenames (str) |
|||
""" |
|||
# check if 'images/' already exists |
|||
if not os.path.exists(f'{ STATIC_FOLDER_PATH }/images'): |
|||
os.makedirs(f'{ STATIC_FOLDER_PATH }/images') |
|||
|
|||
# download media files |
|||
for filename in images: |
|||
filename = filename.replace(' ', '_') # safe filenames |
|||
|
|||
# check if the image is already downloaded |
|||
# if not, then download the file |
|||
if not os.path.isfile(f'{ STATIC_FOLDER_PATH }/images/{ filename }'): |
|||
|
|||
# first we search for the full filename of the image |
|||
url = f'{ wiki }/api.php?action=query&list=allimages&aifrom={ filename }&format=json' |
|||
response = urllib.request.urlopen(url).read() |
|||
data = json.loads(response) |
|||
|
|||
# we select the first search result |
|||
# (assuming that this is the image we are looking for) |
|||
image = data['query']['allimages'][0] |
|||
|
|||
# then we download the image |
|||
image_url = image['url'] |
|||
image_filename = image['name'] |
|||
print('Downloading:', image_filename) |
|||
image_response = urllib.request.urlopen(image_url).read() |
|||
|
|||
# and we save it as a file |
|||
image_path = f'{ STATIC_FOLDER_PATH }/images/{ image_filename }' |
|||
out = open(image_path, 'wb') |
|||
out.write(image_response) |
|||
out.close() |
|||
|
|||
import time |
|||
time.sleep(3) # do not overload the server |
|||
|
|||
# replace src link |
|||
image_path = f'{ PUBLIC_STATIC_FOLDER_PATH }/images/{ filename }' # here the images need to link to the / of the domain, for flask :/// confusing! this breaks the whole idea to still be able to make a local copy of the file |
|||
matches = re.findall(rf'src="/images/.*?px-{ filename }"', html) # for debugging |
|||
if matches: |
|||
html = re.sub(rf'src="/images/.*?px-{ filename }"', f'src="{ image_path }"', html) |
|||
else: |
|||
matches = re.findall(rf'src="/images/.*?{ filename }"', html) # for debugging |
|||
html = re.sub(rf'src="/images/.*?{ filename }"', f'src="{ image_path }"', html) |
|||
# print(f'{filename}: {matches}\n------') # for debugging: each image should have the correct match! |
|||
|
|||
return html |
|||
|
|||
def add_item_inventory_links(html): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
# Find all references in the text to the item index |
|||
pattern = r'Item \d\d\d' |
|||
matches = re.findall(pattern, html) |
|||
index = {} |
|||
new_html = '' |
|||
from nltk.tokenize import sent_tokenize |
|||
for line in sent_tokenize(html): |
|||
for match in matches: |
|||
if match in line: |
|||
number = match.replace('Item ', '').strip() |
|||
if number not in index: |
|||
index[number] = [] |
|||
count = 1 |
|||
else: |
|||
count = index[number][-1] + 1 |
|||
index[number].append(count) |
|||
item_id = f'ii-{ number }-{ index[number][-1] }' |
|||
line = line.replace(match, f'Item <a id="{ item_id }" href="#Item_Index">{ number }</a>') |
|||
|
|||
# the line is pushed back to the new_html |
|||
new_html += line + ' ' |
|||
|
|||
# Also add a <span> around the index nr to style it |
|||
new_html = re.sub(r'<li>(\d\d\d)', r'<li><span class="item_nr">\1</span>', new_html) # wrap only the number, keep a single <li> |
|||
|
|||
# import json |
|||
# print(json.dumps(index, indent=4)) |
|||
|
|||
return new_html |
|||
|
|||
def clean_up(html): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
html = re.sub(r'\[.*edit.*\]', '', html) # remove the [edit] |
|||
html = re.sub(r'href="/index.php\?title=', 'href="#', html) # remove the internal wiki links |
|||
html = re.sub(r'\[(?=\d)', '', html) # remove left footnote bracket [ |
|||
html = re.sub(r'(?<=\d)\]', '', html) # remove right footnote bracket ] |
|||
return html |
|||
|
|||
def fast_loader(html): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
if fast: |
|||
html = html.replace('/images/', '/images-small/') |
|||
print('--- rendered in FAST mode ---') |
|||
|
|||
return html |
|||
|
|||
def parse_page(pagename, wiki): |
|||
""" |
|||
pagename = string |
|||
html = string (HTML) |
|||
""" |
|||
parse = f'{ wiki }/api.php?action=parse&page={ pagename }&pst=True&format=json' |
|||
data = API_request(parse, pagename) |
|||
# print(json.dumps(data, indent=4)) |
|||
if 'parse' in data: |
|||
html = data['parse']['text']['*'] |
|||
images = data['parse']['images'] |
|||
html = download_media(html, images, wiki) |
|||
html = clean_up(html) |
|||
html = add_item_inventory_links(html) |
|||
html = fast_loader(html) |
|||
else: |
|||
html = None |
|||
|
|||
return html |
|||
|
|||
def save(html, pagename): |
|||
""" |
|||
html = string (HTML) |
|||
pagename = string |
|||
""" |
|||
if __name__ == "__main__": |
|||
# command-line |
|||
|
|||
# save final page that will be used with PagedJS |
|||
template_file = open(f'{ STATIC_FOLDER_PATH }/{ TEMPLATES_DIR }/template.html').read() |
|||
template = jinja2.Template(template_file) |
|||
doc = template.render(publication_unfolded=html, title=pagename) |
|||
|
|||
html_file = f'{ STATIC_FOLDER_PATH }/{ pagename }.html' |
|||
print('Saving HTML:', html_file) |
|||
with open(html_file, 'w') as out: |
|||
out.write(doc) |
|||
|
|||
# save extra html page for debugging (CLI only) |
|||
template_file = open(f'{ STATIC_FOLDER_PATH }/{ TEMPLATES_DIR }/template.inspect.html').read() |
|||
template = jinja2.Template(template_file) |
|||
doc = template.render(publication_unfolded=html, title=pagename) |
|||
|
|||
html_file = f'{ STATIC_FOLDER_PATH }/{ pagename }.inspect.html' |
|||
print('Saving HTML:', html_file) |
|||
with open(html_file, 'w') as out: |
|||
out.write(doc) |
|||
out.close() |
|||
|
|||
else: |
|||
# Flask application |
|||
|
|||
with open(f'{ STATIC_FOLDER_PATH }/Unfolded.html', 'w') as out: |
|||
out.write(html) # save the html to a file (without <head>) |
|||
|
|||
def update_material_now(pagename, wiki): |
|||
""" |
|||
pagename = string |
|||
publication_unfolded = string (HTML) |
|||
""" |
|||
publication_unfolded = parse_page(pagename, wiki) |
|||
|
|||
return publication_unfolded |
|||
|
|||
# --- |
|||
|
|||
if __name__ == "__main__": |
|||
|
|||
wiki = 'https://example.com/wiki' # no trailing slash '/' |
|||
pagename = 'Unfolded' |
|||
|
|||
publication_unfolded = update_material_now(pagename, wiki) # download the latest version of the page |
|||
save(publication_unfolded, pagename) # save the page to file |
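
The footnote-bracket step in `clean_up` can be tried in isolation. The following is a minimal sketch, not the project's own code: the function name `strip_footnote_brackets` and the sample string are assumptions made for illustration. Both brackets are escaped so the patterns compile, and the lookarounds restrict the removal to brackets that touch a digit.

```python
import re


def strip_footnote_brackets(html):
    """Remove the [ and ] around footnote numbers, e.g. [1] -> 1."""
    # "[" is only removed when a digit follows it
    html = re.sub(r'\[(?=\d)', '', html)
    # "]" is only removed when a digit precedes it
    html = re.sub(r'(?<=\d)\]', '', html)
    return html


# hypothetical sample: a footnote marker next to bracketed text without digits
sample = '<sup><a href="#cite_note-1">[1]</a></sup> and [edit] stay apart'
cleaned = strip_footnote_brackets(sample)
print(cleaned)
```

Note that this approach also strips brackets from ordinary text such as `[2023]`, since the patterns only look at the neighbouring digit.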
|||
|
@ -0,0 +1,53 @@ |
|||
/* CSS placed here will be applied to all skins */ |
|||
|
|||
.vector-body h1{ |
|||
display: inline; |
|||
} |
|||
|
|||
h1#firstHeading{ |
|||
color: white; |
|||
border-bottom: 0; |
|||
padding: 0.3em 0 0 0.5em; |
|||
background: black; |
|||
} |
|||
|
|||
/* wiki2print button */ |
|||
li.wiki2print{ |
|||
background-image: linear-gradient(to top,fuchsia 0,#ed8ce2 1px,#fff 100%) !important; |
|||
} |
|||
#ca-talk.wiki2print > a, |
|||
#p-views .wiki2print > a{ |
|||
color: magenta; |
|||
background-image: linear-gradient(to bottom,rgba(167,215,249,0) 0,fuchsia 100%); |
|||
background-size: 1px 100%; |
|||
background-repeat: no-repeat; |
|||
} |
|||
|
|||
/* captcha question on the Create Account page -- does not work*/ |
|||
#userloginForm .mw-input { |
|||
font-weight: bold; |
|||
margin-top: 1em; |
|||
} |
|||
|
|||
/* categorytree styling*/ |
|||
.CategoryTreeItem::before{ |
|||
content: "•"; |
|||
font-size: 17px; |
|||
margin-right: -0.6em; |
|||
margin-left: 0.6em; |
|||
vertical-align: middle; |
|||
} |
|||
|
|||
div.pad{ |
|||
display:block; |
|||
height: 1em; |
|||
} |
|||
div.pad iframe{ |
|||
float: right; |
|||
margin-left: 2em; |
|||
margin-bottom: 2em; |
|||
} |
|||
/* hide the pad in the Visual Editor edit view */ |
|||
.ve-init-target-visual div.pad{ |
|||
display: none; |
|||
} |
@ -0,0 +1,104 @@ |
|||
/* Any JavaScript here will be loaded for all users on every page load. */ |
|||
|
|||
// Any JavaScript here will be loaded for all |
|||
// users on every page load. |
|||
|
|||
console.log('hello from common.js') |
|||
|
|||
// rename 'Discussion' tab or context menu button |
|||
// to 'CSS' in the 'Pdf' namespace. |
|||
|
|||
const |
|||
url = window.location.href, |
|||
NS = 'Pdf', // content namespace |
|||
cssNS = NS + 'CSS', // css namespace |
|||
pageName = mw.config.get("wgPageName").split(":")[1] |
|||
|
|||
if (url.includes(NS + ':')) { |
|||
console.log('this page is in namespace', NS) |
|||
|
|||
// Change Discussion into CSS button |
|||
const talkAnchor = document.querySelector('#ca-talk a') |
|||
const talkLink = talkAnchor.href |
|||
talkAnchor.innerText = 'CSS!' |
|||
const talkButton = document.querySelector('#ca-talk') |
|||
talkButton.classList.add('wiki2print') |
|||
|
|||
// adding more buttons |
|||
const pageViews = document.querySelector('#p-views ul') |
|||
|
|||
// View HTML |
|||
const htmlButton = document.createElement('li') |
|||
htmlButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
htmlButton.id = 'ca-html' |
|||
htmlButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/html/' + pageName + '" target="_blank">View HTML</a>' |
|||
pageViews.appendChild(htmlButton) |
|||
|
|||
// View PDF |
|||
const pdfButton = document.createElement('li') |
|||
pdfButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
pdfButton.id = 'ca-pdf' |
|||
pdfButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/pdf/' + pageName + '" target="_blank">View PDF</a>' |
|||
pageViews.appendChild(pdfButton) |
|||
|
|||
// UPDATE |
|||
const updateButton = document.createElement('li') |
|||
updateButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
updateButton.id = 'ca-update' |
|||
updateButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/update/' + pageName + '" target="_blank">Update text</a>' |
|||
pageViews.appendChild(updateButton) |
|||
|
|||
// FULL UPDATE |
|||
const fullupdateButton = document.createElement('li') |
|||
fullupdateButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
fullupdateButton.id = 'ca-full-update' |
|||
fullupdateButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/update/' + pageName + '?full=true" target="_blank">Update media</a>' |
|||
pageViews.appendChild(fullupdateButton) |
|||
|
|||
} else if (url.includes(cssNS + ':')) { |
|||
console.log('this page is in namespace', cssNS) |
|||
|
|||
// Change "Page" button into "Content" button |
|||
const contentAnchor = document.querySelector('#ca-nstab-pdf a') |
|||
const contentLink = contentAnchor.href |
|||
contentAnchor.innerText = 'Content' |
|||
|
|||
// Change "Discussion" button into "CSS" button |
|||
const talkAnchor = document.querySelector('#ca-talk a') |
|||
const talkLink = talkAnchor.href |
|||
talkAnchor.innerText = 'CSS!' |
|||
const talkButton = document.querySelector('#ca-talk') |
|||
talkButton.classList.add('wiki2print') |
|||
|
|||
// adding more buttons |
|||
const pageViews = document.querySelector('#p-views ul') |
|||
|
|||
// View HTML |
|||
const htmlButton = document.createElement('li') |
|||
htmlButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
htmlButton.id = 'ca-html' |
|||
htmlButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/html/' + pageName + '" target="_blank">View HTML</a>' |
|||
pageViews.appendChild(htmlButton) |
|||
|
|||
// View PDF |
|||
const pdfButton = document.createElement('li') |
|||
pdfButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
pdfButton.id = 'ca-pdf' |
|||
pdfButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/pdf/' + pageName + '" target="_blank">View PDF</a>' |
|||
pageViews.appendChild(pdfButton) |
|||
|
|||
// UPDATE |
|||
const updateButton = document.createElement('li') |
|||
updateButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
updateButton.id = 'ca-update' |
|||
updateButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/update/' + pageName + '" target="_blank">Update text</a>' |
|||
pageViews.appendChild(updateButton) |
|||
|
|||
// FULL UPDATE |
|||
const fullupdateButton = document.createElement('li') |
|||
fullupdateButton.classList.add('collapsible', 'mw-list-item', 'wiki2print') |
|||
fullupdateButton.id = 'ca-full-update' |
|||
fullupdateButton.innerHTML = '<a href="https://cc.vvvvvvaria.org/wiki-to-print/update/' + pageName + '?full=true" target="_blank">Update media</a>' |
|||
pageViews.appendChild(fullupdateButton) |
|||
|
|||
} |
@ -0,0 +1,73 @@ |
|||
server { |
|||
listen 80 default_server; |
|||
listen [::]:80 default_server; |
|||
|
|||
return 301 https://cc.vvvvvvaria.org$request_uri; |
|||
} |
|||
server { |
|||
listen 443 ssl; |
|||
server_name cc.vvvvvvaria.org; |
|||
|
|||
root /var/www/html; |
|||
index index.html index.php index.htm index.nginx-debian.html; |
|||
|
|||
location / { |
|||
try_files $uri $uri/ =404; |
|||
autoindex on; |
|||
} |
|||
|
|||
ssl_certificate /etc/letsencrypt/live/cc.vvvvvvaria.org/fullchain.pem; |
|||
ssl_certificate_key /etc/letsencrypt/live/cc.vvvvvvaria.org/privkey.pem; |
|||
|
|||
location ~ \.php$ { |
|||
include snippets/fastcgi-php.conf; |
|||
fastcgi_buffers 16 16k; |
|||
fastcgi_buffer_size 32k; |
|||
fastcgi_pass unix:/run/php/php7.4-fpm.sock; |
|||
# tip from Michael |
|||
include fastcgi_params; |
|||
} |
|||
|
|||
# --------------------------------------------------- |
|||
# WIKI |
|||
|
|||
# Images |
|||
location /wiki/images { |
|||
# Separate location for images/ so .php execution won't apply |
|||
} |
|||
location /wiki/images/deleted { |
|||
# Deny access to deleted images folder |
|||
deny all; |
|||
} |
|||
# MediaWiki assets (usually images) |
|||
location ~ ^/wiki/resources/(assets|lib|src) { |
|||
try_files $uri =404; |
|||
add_header Cache-Control "public"; |
|||
expires 7d; |
|||
} |
|||
# Assets, scripts and styles from skins and extensions |
|||
location ~ ^/wiki/(skins|extensions)/.+\.(css|js|gif|jpg|jpeg|png|svg|wasm)$ { |
|||
try_files $uri =404; |
|||
add_header Cache-Control "public"; |
|||
expires 7d; |
|||
} |
|||
# License and credits files |
|||
location ~ ^/wiki/(COPYING|CREDITS)$ { |
|||
default_type text/plain; |
|||
} |
|||
# Handling for MediaWiki REST API, see [[mw:API:REST_API]] |
|||
location /wiki/rest.php/ { |
|||
try_files $uri $uri/ /wiki/rest.php?$query_string; |
|||
} |
|||
# Handling for the article path (pretty URLs) |
|||
location /wiki/ { |
|||
rewrite ^/wiki/(?<pagename>.*)$ /wiki/index.php; |
|||
} |
|||
|
|||
# ---------------------------------------------------- |
|||
# wiki-to-print |
|||
|
|||
location /wiki-to-print/ { |
|||
proxy_pass http://localhost:5522; |
|||
} |
|||
} |
@ -0,0 +1,9 @@ |
|||
all: local |
|||
|
|||
local: |
|||
./venv/bin/python3 web-interface.py |
|||
|
|||
server: |
|||
@SCRIPT_NAME=/wiki-to-print venv/bin/gunicorn -b localhost:5522 --reload web-interface:APP |
|||
|
|||
|
@ -0,0 +1,14 @@ |
|||
# wiki-to-print |
|||
|
|||
## How to use it? |
|||
|
|||
* `$ make local`: to work locally |
|||
* `$ make server`: to run wiki-to-print on a server with `gunicorn` (bound to `localhost:5522`) |
|||
|
|||
## How to install it? |
|||
|
|||
* install a wiki on a server |
|||
* install the Flask application on the same server |
|||
* edit the `MediaWiki:Common.js` page on the wiki (see `wiki-to-print.Common.js.example`) |
|||
* edit the `MediaWiki:Common.css` page on the wiki (see `wiki-to-print.Common.css.example`) |
|||
* configure nginx (see `wiki-to-print.nginx.example`) |
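
To check an installation, it can help to know which URLs the wiki buttons point at. The sketch below rebuilds the four links that `wiki-to-print.Common.js.example` adds to a page in the `Pdf` namespace. It is an illustration only: the function name `wiki_to_print_links` and the `https://example.com` base are assumptions, and the route names (`html`, `pdf`, `update`, `?full=true`) are copied from the example JavaScript.

```python
from urllib.parse import quote


def wiki_to_print_links(page_name, base="https://example.com/wiki-to-print"):
    """Return the links behind the View HTML, View PDF and Update buttons."""
    # MediaWiki page names use underscores instead of spaces
    slug = quote(page_name.replace(" ", "_"))
    return {
        "html": f"{base}/html/{slug}",
        "pdf": f"{base}/pdf/{slug}",
        "update": f"{base}/update/{slug}",
        "full_update": f"{base}/update/{slug}?full=true",
    }


links = wiki_to_print_links("Test page")
for name, url in links.items():
    print(name, url)
```

With the nginx example in place, every URL under `/wiki-to-print/` is proxied to the Flask application on port 5522.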
@ -0,0 +1,432 @@ |
|||
from pprint import pprint |
|||
import sys |
|||
import urllib.request |
|||
import urllib.error |
|||
import os |
|||
import re |
|||
import json |
|||
import jinja2 |
|||
import datetime |
|||
from bs4 import BeautifulSoup |
|||
|
|||
STATIC_FOLDER_PATH = './static' # without trailing slash |
|||
PUBLIC_STATIC_FOLDER_PATH = '/static' # without trailing slash |
|||
TEMPLATES_DIR = None |
|||
|
|||
# This uses a low quality copy of all the images |
|||
# (using a folder with the name "images-small", |
|||
# which stores a copy of all the images generated with: |
|||
# $ mogrify -quality 5% -adaptive-resize 25% -remap pattern:gray50 * ) |
|||
|
|||
fast = False |
|||
|
|||
|
|||
# gets or creates index of publications in namespace |
|||
|
|||
def get_index(wiki, subject_ns): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
""" |
|||
return load_file('index', 'json') or create_index( |
|||
wiki, |
|||
subject_ns |
|||
) |
|||
|
|||
|
|||
# gets publication's HTML and CSS |
|||
|
|||
def get_publication(wiki, subject_ns, styles_ns, pagename): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
styles_ns = object |
|||
pagename = string |
|||
""" |
|||
return { |
|||
'html': get_html(wiki, subject_ns, pagename), |
|||
'css': get_css(wiki, styles_ns, pagename) |
|||
} |
|||
|
|||
|
|||
# gets or creates HTML file for a publication |
|||
|
|||
def get_html(wiki, subject_ns, pagename): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
pagename = string |
|||
""" |
|||
return load_file(pagename, 'html') or create_html( |
|||
wiki, |
|||
subject_ns, |
|||
pagename, |
|||
None, # full_update: required by create_html(); None downloads only the missing media |
|||
) |
|||
|
|||
|
|||
# gets or creates CSS file for a publication |
|||
|
|||
def get_css(wiki, styles_ns, pagename): |
|||
""" |
|||
wiki = string |
|||
styles_ns = object |
|||
pagename = string |
|||
""" |
|||
return load_file(pagename, 'css') or create_css( |
|||
wiki, |
|||
styles_ns, |
|||
pagename |
|||
) |
|||
|
|||
|
|||
# makes API call to create/update index of publications |
|||
|
|||
def create_index(wiki, subject_ns): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
""" |
|||
url = f'{ wiki }/api.php?action=query&format=json&list=allpages&apnamespace={ subject_ns["id"] }' |
|||
data = do_API_request(url) |
|||
pages = data['query']['allpages'] |
|||
# exclude subpages |
|||
pages = [page for page in pages if '/' not in page['title']] |
|||
for page in pages: |
|||
# removing the namespace from title |
|||
page['title'] = page['title'].replace(subject_ns['name'] + ':', '') |
|||
page['slug'] = page['title'].replace(' ', '_') # slugifying title |
|||
pageJSON = load_file(page['slug'], 'json') |
|||
page['updated'] = pageJSON and pageJSON['updated'] or '--' |
|||
now = str(datetime.datetime.now()) |
|||
index = { |
|||
'pages': pages, |
|||
'updated': now |
|||
} |
|||
save_file('index', 'json', index) |
|||
return index |
|||
|
|||
|
|||
# Creates/updates a publication object |
|||
|
|||
def create_publication(wiki, subject_ns, styles_ns, pagename, full_update): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
styles_ns = object |
|||
pagename = string |
|||
""" |
|||
return { |
|||
'html': create_html(wiki, subject_ns, pagename, full_update), |
|||
'css': create_css(wiki, styles_ns, pagename) |
|||
} |
|||
|
|||
|
|||
# makes API call to create/update a publication's HTML |
|||
|
|||
def create_html(wiki, subject_ns, pagename, full_update): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
pagename = string |
|||
full_update = None or string. Full update when not None |
|||
""" |
|||
url = f'{ wiki }/api.php?action=parse&page={ subject_ns["name"] }:{ pagename }&pst=True&format=json' |
|||
data = do_API_request(url, subject_ns["name"]+":"+pagename, wiki) |
|||
# pprint(data) |
|||
now = str(datetime.datetime.now()) |
|||
data['updated'] = now |
|||
|
|||
save_file(pagename, 'json', data) |
|||
|
|||
update_publication_date( # we add the last updated of the publication to our index |
|||
wiki, |
|||
subject_ns, |
|||
pagename, |
|||
now |
|||
) |
|||
|
|||
if 'parse' in data: |
|||
html = data['parse']['text']['*'] |
|||
# pprint(html) |
|||
imgs = data['parse']['images'] |
|||
|
|||
html = remove_comments(html) |
|||
html = download_media(html, imgs, wiki, full_update) |
|||
html = clean_up(html) |
|||
# html = add_item_inventory_links(html) |
|||
|
|||
if fast: |
|||
html = fast_loader(html) |
|||
|
|||
soup = BeautifulSoup(html, 'html.parser') |
|||
soup = remove_edit(soup) |
|||
soup = inlineCiteRefs(soup) |
|||
html = str(soup) |
|||
# html = inlineCiteRefs(html) |
|||
# html = add_author_names_toc(html) |
|||
|
|||
else: |
|||
html = None |
|||
|
|||
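save_file(pagename, 'html', html or '') # html is None when the API response has no 'parse' key; write an empty file rather than crash in out.write() |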
|||
|
|||
return html |
|||
|
|||
|
|||
# makes API call to create/update a publication's CSS |
|||
|
|||
def create_css(wiki, styles_ns, pagename): |
|||
""" |
|||
wiki = string |
|||
styles_ns = object |
|||
pagename = string |
|||
""" |
|||
css_url = f'{ wiki }/api.php?action=parse&page={ styles_ns["name"] }:{ pagename }&prop=wikitext&pst=True&format=json' |
|||
css_data = do_API_request(css_url) |
|||
if css_data and 'parse' in css_data: |
|||
css = css_data['parse']['wikitext']['*'] |
|||
save_file(pagename, 'css', css) |
|||
return css |
|||
|
|||
|
|||
# Load file from disk |
|||
|
|||
def load_file(pagename, ext): |
|||
""" |
|||
pagename = string |
|||
ext = string |
|||
""" |
|||
path = f'{ STATIC_FOLDER_PATH }/{ pagename }.{ ext }' |
|||
if os.path.exists(path): |
|||
print(f'Loading { ext }:', path) |
|||
with open(path, 'r') as out: |
|||
if ext == 'json': |
|||
data = json.load(out) |
|||
else: |
|||
data = out.read() |
|||
out.close() |
|||
return data |
|||
|
|||
|
|||
# Save file to disk |
|||
|
|||
def save_file(pagename, ext, data): |
|||
""" |
|||
pagename = string |
|||
ext = string |
|||
data = object |
|||
""" |
|||
path = f'{ STATIC_FOLDER_PATH }/{ pagename }.{ ext }' |
|||
print(f'Saving { ext }:', path) |
|||
try: |
|||
out = open(path, 'w') |
|||
except OSError: |
|||
print("Could not open/write file:", path) |
|||
sys.exit() |
|||
|
|||
with out: #open(path, 'w') as out: |
|||
if ext == 'json': |
|||
out.write( json.dumps(data, indent = 2) ) |
|||
else: |
|||
out.write( data ) |
|||
out.close() |
|||
return data |
|||
|
|||
|
|||
# do API request and return JSON |
|||
|
|||
def do_API_request(url, filename="", wiki=""): |
|||
""" |
|||
url = API request url (string) |
|||
data = { 'query': |
|||
'pages' : |
|||
pageid : { |
|||
'links' : { |
|||
'?' : '?' |
|||
'title' : 'pagename' |
|||
} |
|||
} |
|||
} |
|||
} |
|||
""" |
|||
purge(filename, wiki) |
|||
print('Loading from wiki: ', url) |
|||
response = urllib.request.urlopen(url) |
|||
response_type = response.getheader('Content-Type') |
|||
|
|||
if response.status == 200 and "json" in response_type: |
|||
contents = response.read() |
|||
data = json.loads(contents) |
|||
return data |
|||
|
|||
# api calls seem to be cached even when called with maxage |
|||
# So call purge before doing the api call. |
|||
# https://www.mediawiki.org/wiki/API:Purge |
|||
def purge(filename, wiki): |
|||
if(filename=="" or wiki==""): return |
|||
print("purge " + filename ) |
|||
|
|||
import requests |
|||
S = requests.Session() |
|||
URL = f'{ wiki }/api.php' |
|||
# url = f'{ wiki }/api.php?action=query&list=allimages&aifrom={ filename }&format=json' |
|||
PARAMS = { |
|||
"action": "purge", |
|||
"titles": filename, |
|||
"format": "json", |
|||
"generator": "alltransclusions", |
|||
} |
|||
R = S.post(url=URL, params=PARAMS) |
|||
# DATA = R.text |
|||
|
|||
# updates a publication's last updated field in the index |
|||
|
|||
def update_publication_date(wiki, subject_ns, pagename, updated): |
|||
""" |
|||
wiki = string |
|||
subject_ns = object |
|||
pagename = string |
|||
updated = string |
|||
""" |
|||
index = get_index(wiki, subject_ns) |
|||
for page in index['pages']: |
|||
if page['slug'] == pagename: |
|||
page['updated'] = updated |
|||
save_file('index', 'json', index) |
|||
|
|||
def customTemplate(name): |
|||
path = "custom/%s.html" % name |
|||
if os.path.isfile(os.path.join(os.path.dirname(__file__), "templates/", path)): |
|||
return path |
|||
else: |
|||
return None |
|||
|
|||
|
|||
|
|||
|
|||
# Beautiful soup seems to have a problem with some comments, |
|||
# so let's remove them before parsing. |
|||
|
|||
def remove_comments( html ): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
pattern = r'(<!--.*?-->)|(<!--[\S\s]+?-->)|(<!--[\S\s]*?$)' |
|||
return re.sub(pattern, "", html) |
|||
|
|||
|
|||
# Downloading images referenced in the html |
|||
|
|||
def download_media(html, images, wiki, full_update): |
|||
""" |
|||
html = string (HTML) |
|||
images = list of filenames (str) |
|||
""" |
|||
# check if 'images/' already exists |
|||
if not os.path.exists(f'{ STATIC_FOLDER_PATH }/images'): |
|||
os.makedirs(f'{ STATIC_FOLDER_PATH }/images') |
|||
|
|||
# download media files |
|||
for filename in images: |
|||
filename = filename.replace(' ', '_') # safe filenames |
|||
# check if the image is already downloaded |
|||
# if not, then download the file |
|||
if (not os.path.isfile(f'{ STATIC_FOLDER_PATH }/images/{ filename }')) or full_update: |
|||
# first we search for the full filename of the image |
|||
url = f'{ wiki }/api.php?action=query&list=allimages&aifrom={ filename }&format=json' |
|||
# url = f'{ wiki }/api.php?action=query&titles=File:{ filename }&format=json' |
|||
data = do_API_request(url) |
|||
# timestamp = data.query.pages. |
|||
|
|||
# print(json.dumps(data, indent=2)) |
|||
|
|||
if data and data['query']['allimages']: |
|||
|
|||
# we select the first search result |
|||
# (assuming that this is the image we are looking for) |
|||
image = data['query']['allimages'][0] |
|||
|
|||
if image: |
|||
# then we download the image |
|||
image_url = image['url'] |
|||
image_filename = image['name'] |
|||
print('Downloading:', image_filename) |
|||
image_response = urllib.request.urlopen(image_url).read() |
|||
|
|||
# and we save it as a file |
|||
image_path = f'{ STATIC_FOLDER_PATH }/images/{ image_filename }' |
|||
out = open(image_path, 'wb') |
|||
out.write(image_response) |
|||
out.close() |
|||
print(image_path) |
|||
|
|||
import time |
|||
time.sleep(3) # do not overload the server |
|||
|
|||
# replace src links |
|||
e_filename = re.escape( filename ) # needed for filename with certain characters |
|||
image_path = f'{ PUBLIC_STATIC_FOLDER_PATH }/images/{ filename }' # Flask serves the images from the root of the domain, so this path is absolute; as a side effect, a locally saved copy of the HTML no longer finds its images |
|||
matches = re.findall(rf'src=\"/wiki/mediawiki/images/.*?px-{ e_filename }\"', html) # for debugging |
|||
# pprint(matches) |
|||
if matches: |
|||
html = re.sub(rf'src=\"/wiki/mediawiki/images/.*?px-{ e_filename }\"', f'src=\"{ image_path }\"', html) |
|||
else: |
|||
matches = re.findall(rf'src=\"/wiki/mediawiki/images/.*?{ e_filename }\"', html) # for debugging |
|||
# print(matches, e_filename, html) |
|||
html = re.sub(rf'src=\"/wiki/mediawiki/images/.*?{ e_filename }\"', f'src=\"{ image_path }\"', html) |
|||
print(f'{filename}: {matches}\n------') # for debugging: each image should have the correct match! |
|||
|
|||
return html |
|||
|
|||
|
|||
def clean_up(html): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
# html = re.sub(r'\[.*edit.*\]', '', html) # remove the [edit] # Heerko: this somehow caused problems. Removing it solves it, seeming without side effects... |
|||
html = re.sub(r'href="/index.php\?title=', 'href="#', html) # remove the internal wiki links |
|||
html = re.sub(r'\[(?=\d)', '', html) # remove left footnote bracket [ (escaped: a bare [ opens a character class and the pattern fails to compile) |
|||
html = re.sub(r'(?<=\d)\]', '', html) # remove right footnote bracket ] |
|||
return html |
|||
|
|||
def remove_edit(soup): |
|||
""" |
|||
soup = BeautifulSoup (HTML) |
|||
""" |
|||
es = soup.find_all(class_="mw-editsection") |
|||
for s in es: |
|||
s.decompose() |
|||
return soup |
|||
|
|||
|
|||
# inline citation references in the html for pagedjs |
|||
# Turns: <sup class="reference" id="cite_ref-1"><a href="#cite_note-1">[1]</a></sup> |
|||
# into: <span class="footnote">The cite text</span> |
|||
|
|||
def inlineCiteRefs(soup): |
|||
""" |
|||
soup = BeautifulSoup (HTML) |
|||
""" |
|||
refs = soup.find_all("sup", class_="reference") |
|||
for ref in refs: |
|||
href = ref.a['href'] |
|||
res = re.findall('[0-9]+', href) |
|||
if(res): |
|||
cite = soup.find_all(id="cite_note-"+res[0]) |
|||
text = cite[0].find(class_="reference-text") |
|||
text['class'] = 'footnote' |
|||
ref.replace_with(text) |
|||
# remove the reference from the bottom of the document |
|||
for item in soup.find_all(class_="references"): |
|||
item.decompose() |
|||
return soup |
|||
|
|||
|
|||
def fast_loader(html): |
|||
""" |
|||
html = string (HTML) |
|||
""" |
|||
html = html.replace('/images/', '/images-small/') |
|||
print('--- rendered in FAST mode ---') |
|||
|
|||
return html |
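
The `src` rewriting at the end of `download_media` can be sketched on its own. This is a hedged illustration rather than the module's code: the function name `rewrite_image_src`, the `local_prefix` parameter and the sample HTML are assumptions. The sketch uses `[^"]*?` where the module uses `.*?`, because `.*?` can run past a closing quote when several images sit on the same line, so that a match starts in one `<img>` tag and ends in the next.

```python
import re


def rewrite_image_src(html, filename, local_prefix="/static/images"):
    """Point the src attribute of one wiki image at its local copy."""
    e_filename = re.escape(filename)  # filenames may contain regex metacharacters
    # [^"]*? keeps the match inside a single src="..." attribute
    pattern = rf'src="/wiki/mediawiki/images/[^"]*?{e_filename}"'
    # a function replacement inserts the path literally (no backslash handling)
    return re.sub(pattern, lambda m: f'src="{local_prefix}/{filename}"', html)


# hypothetical sample: a full-size image followed by a thumbnail on one line
sample = (
    '<img src="/wiki/mediawiki/images/a/ab/First.png">'
    '<img src="/wiki/mediawiki/images/thumb/c/cd/Second.png/300px-Second.png">'
)
result = rewrite_image_src(sample, "Second.png")
print(result)
```

Only the second tag is rewritten here; the first image keeps its wiki path until the function is called with its own filename.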
@ -0,0 +1,19 @@ |
|||
{ |
|||
"project_name": "wiki-to-print", |
|||
"port": 5522, |
|||
"dir_path": ".", |
|||
"wiki": { |
|||
"base_url": "https://example.com/wiki/", |
|||
"subject_ns": { "name": "Pdf", "id": 3000 }, |
|||
"styles_ns": { "name": "PdfCSS", "id": 3001 } |
|||
}, |
|||
"pagename": "Test", |
|||
"stylesheet": "print.css", |
|||
"replacements": [ |
|||
{ |
|||
"type": "regex", |
|||
"search": "<h3><span class=\"mw-headline\" id=\"References.*?\">References</span><span class=\"mw-editsection\"><span class=\"mw-editsection-bracket\"></span></span></h3><ul>", |
|||
"replace": "<h3 class=\"references\"><span class=\"mw-headline\" id=\"References\">References</span><span class=\"mw-editsection\"><span class=\"mw-editsection-bracket\"></span></span></h3><ul class=\"references\">" |
|||
} |
|||
] |
|||
} |
@ -0,0 +1,5 @@ |
|||
import json |
|||
import pkg_resources |
|||
|
|||
data = pkg_resources.resource_string(__name__, "config.json") |
|||
config = json.loads(data) |
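
The `base_url` in the example `config.json` ends with a slash, while the API URLs in the module are built as `{ wiki }/api.php`. If the value is passed through unchanged, the result contains a double slash. A minimal sketch of one way to normalise it follows. The function name `build_parse_url` is an assumption, and how the Flask application actually passes the value on is not shown in this commit; the query parameters are the ones used by `create_html`.

```python
from urllib.parse import urlencode


def build_parse_url(base_url, namespace, pagename):
    """Build the action=parse API URL for a page in a namespace."""
    wiki = base_url.rstrip("/")  # tolerate a trailing slash in the config
    params = urlencode({
        "action": "parse",
        "page": f"{namespace}:{pagename}",  # the colon is percent-encoded
        "pst": "True",
        "format": "json",
    })
    return f"{wiki}/api.php?{params}"


url = build_parse_url("https://example.com/wiki/", "Pdf", "Test")
print(url)
```

Using `urlencode` also escapes spaces and other special characters in page names, which the plain f-string does not.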
@ -0,0 +1,20 @@ |
|||
attrs==22.2.0 |
|||
beautifulsoup4==4.11.2 |
|||
bs4==0.0.1 |
|||
certifi==2022.12.7 |
|||
charset-normalizer==3.0.1 |
|||
click==8.1.3 |
|||
Flask==2.2.2 |
|||
gunicorn==20.1.0 |
|||
idna==3.4 |
|||
importlib-metadata==6.0.0 |
|||
itsdangerous==2.1.2 |
|||
Jinja2==3.1.2 |
|||
jsonschema==4.17.3 |
|||
MarkupSafe==2.1.2 |
|||
pyrsistent==0.19.3 |
|||
requests==2.28.2 |
|||
soupsieve==2.3.2.post1 |
|||
urllib3==1.26.14 |
|||
Werkzeug==2.2.2 |
|||
zipp==3.12.0 |
@ -0,0 +1,19 @@ |
|||
/* This baseline.css stylesheet is derived from: https://gist.github.com/julientaq/08d636a7a2b5f2824025256de0fca467 */ |
|||
/* Thanks a lot to julientaq for publishing it! */ |
|||
|
|||
:root { |
|||
--baseline: 18px; |
|||
--baseline-color: blue; |
|||
} |
|||
|
|||
/* grid baseline */ |
|||
.pagedjs_page { |
|||
/* background: |
|||
repeating-linear-gradient( |
|||
white 0, |
|||
white calc(var(--baseline) - 1px), var(--baseline-color) var(--baseline)); |
|||
background-size: cover; |
|||
background-repeat: repeat-y; */ |
|||
/* start of the first baseline: half of the line-height: 9px */ |
|||
/* background-position-y: 9px; */ |
|||
} |
@ -0,0 +1,83 @@ |
|||
@media screen{ |
|||
|
|||
body{ |
|||
background-color: lavender; |
|||
margin: 1vh 5vw 2vh 5vw; |
|||
z-index: 1; |
|||
} |
|||
p { |
|||
max-width: 30em; |
|||
} |
|||
div#nav{ |
|||
position: fixed; |
|||
width: calc(100% - 2em); |
|||
margin: 1em; |
|||
left: 0; |
|||
top: 0; |
|||
z-index: 999; |
|||
} |
|||
div#nav a#home, |
|||
div#nav a#notes, |
|||
div#nav a#update { |
|||
float: left; |
|||
padding: 0.25em 0.125em; |
|||
} |
|||
div#nav div#loading{ |
|||
display: none; |
|||
margin: 0.35em 0; |
|||
color: black; |
|||
clear: both; |
|||
float: right; |
|||
background-color: white; |
|||
padding: 0.5em 1em; |
|||
border-radius: 5px; |
|||
opacity: 0; |
|||
animation: fade 2s infinite linear; |
|||
} |
|||
@keyframes fade { |
|||
0%,100% { opacity: 0 } |
|||
50% { opacity: 1 } |
|||
} |
|||
table { |
|||
border-collapse: collapse; |
|||
} |
|||
table tr th { |
|||
font-weight: normal; |
|||
} |
|||
table tr th, |
|||
table tr td { |
|||
border: 1px solid darkgreen; |
|||
padding: 0.4em 0.8em; |
|||
} |
|||
table tr th:first-of-type { |
|||
text-align: left; |
|||
} |
|||
table tr td:first-of-type { |
|||
min-width: 15em; |
|||
} |
|||
span.updated { |
|||
font-family: monospace; |
|||
font-size: 0.9em; |
|||
} |
|||
table tr td input[type="checkbox"] { |
|||
min-width: 0; |
|||
width: unset; |
|||
} |
|||
|
|||
div#index{ |
|||
/* line-height: 2; */ |
|||
} |
|||
div#index ul{ |
|||
padding: 0; |
|||
margin: 0 0 0 2.5em; |
|||
width: 750px; |
|||
} |
|||
div#index ul li{ |
|||
list-style: none; |
|||
} |
|||
div#index ul li::before{ |
|||
content: "-----"; |
|||
float: left; |
|||
margin-left: -2.5em; |
|||
} |
|||
} |
@ -0,0 +1,214 @@ |
|||
/* CSS for Paged.js interface – v0.2 */ |
|||
|
|||
/* Change the look */ |
|||
:root { |
|||
--color-background: whitesmoke; |
|||
--color-pageSheet: #cfcfcf; |
|||
--color-pageBox: violet; |
|||
--color-paper: white; |
|||
--color-marginBox: transparent; |
|||
--pagedjs-crop-color: black; |
|||
--pagedjs-crop-shadow: white; |
|||
--pagedjs-crop-stroke: 1px; |
|||
} |
|||
|
|||
/* To define how the book looks on the screen: */ |
|||
@media screen { |
|||
|
|||
/* adding this here from main.css to style the div#nav */ |
|||
div#nav{ |
|||
position: fixed; |
|||
width: calc(100% - 2em); |
|||
margin: 1em; |
|||
text-align: right; |
|||
left: 0; |
|||
top: 0; |
|||
z-index: 999; |
|||
} |
|||
div#nav a#home, |
|||
div#nav a#notes{ |
|||
float: left; |
|||
padding: 0.25em 0.125em; |
|||
} |
|||
div#nav div#loading{ |
|||
display: none; |
|||
margin: 0.35em 0; |
|||
color: black; |
|||
clear: both; |
|||
float: right; |
|||
background-color: white; |
|||
padding: 0.5em 1em; |
|||
border-radius: 5px; |
|||
opacity: 0; |
|||
animation: fade 2s infinite linear; |
|||
} |
|||
@keyframes fade { |
|||
0%,100% { opacity: 0 } |
|||
50% { opacity: 1 } |
|||
} |
|||
|
|||
|
|||
body { |
|||
background-color: var(--color-background); |
|||
} |
|||
|
|||
.pagedjs_pages { |
|||
display: flex; |
|||
width: calc(var(--pagedjs-width) * 2); |
|||
flex: 0; |
|||
flex-wrap: wrap; |
|||
margin: 0 auto; |
|||
} |
|||
|
|||
.pagedjs_page { |
|||
background-color: var(--color-paper); |
|||
box-shadow: 0 0 0 1px var(--color-pageSheet); |
|||
margin: 0; |
|||
flex-shrink: 0; |
|||
flex-grow: 0; |
|||
margin-top: 10mm; |
|||
} |
|||
|
|||
.pagedjs_first_page { |
|||
margin-left: var(--pagedjs-width); |
|||
} |
|||
|
|||
.pagedjs_page:last-of-type { |
|||
margin-bottom: 10mm; |
|||
} |
|||
|
|||
.pagedjs_pagebox{ |
|||
box-shadow: 0 0 0 1px var(--color-pageBox); |
|||
} |
|||
|
|||
.pagedjs_left_page{ |
|||
z-index: 20; |
|||
width: calc(var(--pagedjs-bleed-left) + var(--pagedjs-pagebox-width))!important; |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-crop { |
|||
border-color: transparent; |
|||
} |
|||
|
|||
.pagedjs_left_page .pagedjs_bleed-right .pagedjs_marks-middle{ |
|||
width: 0; |
|||
} |
|||
|
|||
.pagedjs_right_page{ |
|||
|