Compare commits

11 Commits

Author SHA1 Message Date
mb 5b52666d9f adding some markdown markup 5 years ago
mb 36da59aaf0 adding some markdown markup 5 years ago
mb b824daa420 Update 'README.md' 5 years ago
mb 3e492eefdb Update 'README.md' 5 years ago
mb 0e92ac740d Update 'README.md' 5 years ago
mb b628c6bb05 small change in the readme text 5 years ago
mb 1d2ded309f adding a , 5 years ago
manetta d0b8d337d5 removing the pump img 5 years ago
manetta cf03fafd0a renaming all files to etherpump + adding a etherpump readme 5 years ago
colm 95a021d405 adding python-dateutil to the requirements inside setup.py to enable pip install -e . when installing etherdump 5 years ago
Luke Murphy f9bb4444e2 Add `__PUBLISH__` logic 6 years ago
  1. .gitignore (2 changes)
  2. README.md (145 changes)
  3. bin/etherpump (6 changes)
  4. etherpump.egg-info/PKG-INFO (10 changes)
  5. etherpump.egg-info/SOURCES.txt (35 changes)
  6. etherpump.egg-info/dependency_links.txt (1 change)
  7. etherpump.egg-info/requires.txt (2 changes)
  8. etherpump.egg-info/top_level.txt (1 change)
  9. etherpump/__init__.py (0 changes)
  10. etherpump/commands/__init__.py (0 changes)
  11. etherpump/commands/appendmeta.py (0 changes)
  12. etherpump/commands/common.py (0 changes)
  13. etherpump/commands/creatediffhtml.py (2 changes)
  14. etherpump/commands/deletepad.py (2 changes)
  15. etherpump/commands/dumpcsv.py (2 changes)
  16. etherpump/commands/gethtml.py (2 changes)
  17. etherpump/commands/gettext.py (2 changes)
  18. etherpump/commands/html5tidy.py (0 changes)
  19. etherpump/commands/index.py (10 changes)
  20. etherpump/commands/init.py (4 changes)
  21. etherpump/commands/join.py (0 changes)
  22. etherpump/commands/list.py (4 changes)
  23. etherpump/commands/listauthors.py (2 changes)
  24. etherpump/commands/publication.py (324 changes)
  25. etherpump/commands/pull.py (15 changes)
  26. etherpump/commands/revisionscount.py (2 changes)
  27. etherpump/commands/sethtml.py (2 changes)
  28. etherpump/commands/settext.py (2 changes)
  29. etherpump/commands/showmeta.py (0 changes)
  30. etherpump/commands/status.py (2 changes)
  31. etherpump/data/templates/index.html (0 changes)
  32. etherpump/data/templates/pad.html (0 changes)
  33. etherpump/data/templates/pad_colors.html (2 changes)
  34. etherpump/data/templates/pad_index.html (0 changes)
  35. etherpump/data/templates/publication.html (42 changes)
  36. etherpump/data/templates/rss.xml (0 changes)
  37. setup.py (20 changes)

2
.gitignore

@@ -4,4 +4,4 @@ build/
 venv/
 testing/
 padinfo.json
-.etherdump
+.etherpump

145
README.md

@@ -1,41 +1,125 @@
-etherdump
+etherpump
 =========
-Tool to publish [etherpad](http://etherpad.org/) pages to files.
+*pumping text from etherpads into publications*
+A command-line utility that extends the multi writing and publishing functionalities of the [etherpad](http://etherpad.org/) by exporting the pads in multiple formats.
+Many pads, many networks
+------------------------
+*Etherpump* is a fork of [*etherdump*](https://gitlab.constantvzw.org/aa/etherdump), a command line tool written by [Michael Murtaugh](http://automatist.org/) that converts etherpad pages to files. This fork is made out of curiosities in the tool, a wish to study it and shared sparks of enthusiasm to use it in different situations within Varia.
+Etherpump is a stretched version of etherdump. It is a playground in which we would like to add features to the initial tool that diffuse actions of *dumping* into *pumping*. So most of all, etherpump is a work-in-progress, exploring potential uses of etherpads to edit, structure and publish various types of content.
+Added features are:
+* opt-in publishing with the `__PUBLISH__` magic word
+* the `publication` command, that listens to custom magic words such as `__RELEARN__`
+Etherdump is a tool that is used from the command line. It dumps all pads of one etherpad installation to a folder, saving them as different text files, such as plain text and HTML. It also creates an index file, that allows one to easily navigate through the list of pads. Etherdump follows a document-driven idea of publishing, which means that it converts pads as database entries into pads as files. This seems to be a redundant act of copying, but is actually an important in-between step that allows for many different publishing projects and experiments.
+We started to get to know etherdump through various editions of Relearn and/or the worksessions organized by Constant. Collaborative writing on an etherpad has been an important ingredient for these situations. The habit of using pads branched into the day-to-day practice of Varia, where we use etherpads for all sorts of things, ranging from organising remote-meetings with 10+ people, to writing and designing PDF documents collaboratively.
+After installing etherdump on the Varia server, we collectively decided to not want to publish pads by default. Discussions in the group around the use of etherpads, privacy and ideas of what publishing means, led to a need to have etherdump only start the indexing work after it recognizes a `__PUBLISH__` marker on a pad. We decided to work on a `__PUBLISH__ vs. __NOPUBLISH__` branch of etherdump, which we now fork into **etherpump**.
+Change log / notes
+==================
+**September 2019**
+Forking *etherdump* into *etherpump*. (Work in progress!)
+<https://git.vvvvvvaria.org/varia/etherpump>
+-----
+**May - September 2019**
+Etherdump is used to produce the *Ruminating Relearn* section of the Network Of One's Own 2 (NOOO2) publication.
+A new command is added to make a web publication, based on the custom magic word `__RELEARN__`.
+-----
+**June 2019**
+Multiple conversations around etherdump emerged during Relearn Curved in Varia, Rotterdam.
+Including the idea of executable pads (*etherhooks*), custom magic words, a federated snippet protocol (*etherstekje*) and more.
+<https://varia.zone/relearn-2019.html>
+-----
+**April 2019**
+Installation of etherdump on the Varia server.
+<https://etherdump.vvvvvvaria.org/>
+-----
+**March 2019**
+The `__PUBLISH__ vs. __NOPUBLISH__` was added to the etherdump repository by *decentral1se*.
+<https://gitlab.constantvzw.org/aa/etherdump/issues/3>
+-----
+Originally designed for use at: [Constant](http://etherdump.constantvzw.org/).
+More notes can be found in the [git repository of etherdump](https://gitlab.constantvzw.org/aa/etherdump).
+Install etherpump
+=================
 Requirements
 -------------
-* python3
-* html5lib
-* requests (settext)
-* python-dateutil, jinja2 (index subcommand)
+* python3
+* html5lib
+* requests (settext)
+* python-dateutil, jinja2 (used by the index subcommand)
 Installation
 -------------
-pip install python-dateutil jinja2 html5lib
-python setup.py install
+`$ pip install python-dateutil jinja2 html5lib`
+`$ python setup.py install`
 Example
 ---------------
-mkdir mydump
-cd myddump
-etherdump init
+```
+$ mkdir mydump
+$ cd myddump
+$ etherdump init
+```
 The program then interactively asks some questions:
+```
 Please type the URL of the etherpad:
-http://automatist.local:9001/
-The APIKEY is the contents of the file APIKEY.txt in the etherpad folder
-Please paste the APIKEY:
-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-The settings are placed in a file called .etherdump/settings.json and are used (by default) by future commands.
+https://pad.vvvvvvaria.org/
+```
+The APIKEY is the contents of the file APIKEY.txt in the etherpad folder.
+```
+Please paste the APIKEY:
+xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+```
-subcommands
+The settings are placed in a file called `.etherdump/settings.json` and are used (by default) by future commands.
+Subcommands
 ----------
 * init
@@ -49,33 +133,18 @@ subcommands
 * revisionscount
 * index
 * deletepad
+* publication (*etherpump*)
 To get help on a subcommand:
-etherdump revisionscount --help
+`$ etherdump revisionscount --help`
-Change log / notes
-=======================
-Originally designed for use at: [constant](http://etherdump.constantvzw.org/).
+License
+=======
+GNU AFFERO GENERAL PUBLIC LICENSE, Version 3
-17 Oct 2016
+See `License.txt`
------------------------------------------------
-Preparations for [Machine Research](https://machineresearch.wordpress.com/) [2](http://constantvzw.org/site/Machine-Research,2646.html)
-6 Oct 2017
-----------------------
-Feature request from PW: When deleting a previously public document, generate a page / pages with an explanation (along the lines of "This document was previously public but has been marked .... maybe give links to search").
-3 Nov 2017
----------------
-machineresearch seems to be __NOPUBLISH__ but still exists (also in recentchanges)
-Jan 2018
--------------
-Updated files to work with python3 (probably this has broken python2).
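
The README changes above describe opt-in publishing: a pad is only picked up once it carries the `__PUBLISH__` magic word. The rule can be pictured with a minimal Python sketch. The function name `should_publish` and the precedence given to `__NOPUBLISH__` are assumptions made for illustration; the check that etherpump actually performs is not shown in this excerpt.

```python
# Magic words named in the README. Treating them as plain substrings
# is an assumption of this sketch.
PUBLISH = "__PUBLISH__"
NOPUBLISH = "__NOPUBLISH__"


def should_publish(pad_text):
    """Return True only if the pad opts in and does not opt out."""
    if NOPUBLISH in pad_text:
        # Assumed precedence: an explicit opt-out always wins.
        return False
    return PUBLISH in pad_text


if __name__ == "__main__":
    print(should_publish("Meeting notes __PUBLISH__"))  # True
    print(should_publish("Private meeting notes"))      # False
```

With this rule a pad without any marker stays unpublished, which matches the "not published by default" decision described in the README.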

6
bin/etherdump → bin/etherpump

@@ -4,7 +4,7 @@ from __future__ import print_function
 import sys
 usage = """Usage:
-    etherdump CMD
+    etherpump CMD
 where CMD could be:
     pull
@@ -20,7 +20,7 @@ where CMD could be:
     html5tidy
 For more information on each command try:
-    etherdump CMD --help
+    etherpump CMD --help
 """
@@ -36,7 +36,7 @@ except IndexError:
     sys.exit(0)
 try:
     # http://stackoverflow.com/questions/301134/dynamic-module-import-in-python
-    cmdmod = __import__("etherdump.commands.%s" % cmd, fromlist=["etherdump.commands"])
+    cmdmod = __import__("etherpump.commands.%s" % cmd, fromlist=["etherdump.commands"])
     cmdmod.main(args)
 except ImportError as e:
     print ("Error performing command '{0}'\n(python said: {1})\n".format(cmd, e))

10
etherpump.egg-info/PKG-INFO

@@ -0,0 +1,10 @@
Metadata-Version: 1.0
Name: etherpump
Version: 0.0.1
Summary: Etherpump an etherpad publishing system
Home-page: https://git.vvvvvvaria.org/varia/etherpump
Author: Varia members
Author-email: info@varia.zone
License: LICENSE.txt
Description: UNKNOWN
Platform: UNKNOWN

35
etherpump.egg-info/SOURCES.txt

@@ -0,0 +1,35 @@
README.md
setup.py
bin/etherpump
etherpump/__init__.py
etherpump.egg-info/PKG-INFO
etherpump.egg-info/SOURCES.txt
etherpump.egg-info/dependency_links.txt
etherpump.egg-info/requires.txt
etherpump.egg-info/top_level.txt
etherpump/commands/__init__.py
etherpump/commands/appendmeta.py
etherpump/commands/common.py
etherpump/commands/creatediffhtml.py
etherpump/commands/deletepad.py
etherpump/commands/dumpcsv.py
etherpump/commands/gethtml.py
etherpump/commands/gettext.py
etherpump/commands/html5tidy.py
etherpump/commands/index.py
etherpump/commands/init.py
etherpump/commands/join.py
etherpump/commands/list.py
etherpump/commands/listauthors.py
etherpump/commands/publication.py
etherpump/commands/pull.py
etherpump/commands/revisionscount.py
etherpump/commands/sethtml.py
etherpump/commands/settext.py
etherpump/commands/showmeta.py
etherpump/commands/status.py
etherpump/data/templates/index.html
etherpump/data/templates/pad.html
etherpump/data/templates/pad_colors.html
etherpump/data/templates/pad_index.html
etherpump/data/templates/rss.xml

1
etherpump.egg-info/dependency_links.txt

@@ -0,0 +1 @@

2
etherpump.egg-info/requires.txt

@@ -0,0 +1,2 @@
html5lib
jinja2

1
etherpump.egg-info/top_level.txt

@@ -0,0 +1 @@
etherpump

0
etherdump/__init__.py → etherpump/__init__.py

0
etherdump/commands/__init__.py → etherpump/commands/__init__.py

0
etherdump/commands/appendmeta.py → etherpump/commands/appendmeta.py

0
etherdump/commands/common.py → etherpump/commands/common.py

2
etherdump/commands/creatediffhtml.py → etherpump/commands/creatediffhtml.py

@@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError
 def main(args):
     p = ArgumentParser("calls the createDiffHTML API function for the given padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     p.add_argument("--rev", type=int, default=None, help="revision, default: latest")

2
etherdump/commands/deletepad.py → etherpump/commands/deletepad.py

@@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError
 def main(args):
     p = ArgumentParser("calls the getText API function for the given padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     args = p.parse_args(args)

2
etherdump/commands/dumpcsv.py → etherpump/commands/dumpcsv.py

@@ -30,7 +30,7 @@ def jsonload (url):
 def main (args):
     p = ArgumentParser("outputs a CSV of information all all pads")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False")
     args = p.parse_args(args)

2
etherdump/commands/gethtml.py → etherpump/commands/gethtml.py

@@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError
 def main(args):
     p = ArgumentParser("calls the getHTML API function for the given padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     p.add_argument("--rev", type=int, default=None, help="revision, default: latest")

2
etherdump/commands/gettext.py → etherpump/commands/gettext.py

@@ -14,7 +14,7 @@ except ImportError:
 def main(args):
     p = ArgumentParser("calls the getText API function for the given padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     p.add_argument("--rev", type=int, default=None, help="revision, default: latest")

0
etherdump/commands/html5tidy.py → etherpump/commands/html5tidy.py

10
etherdump/commands/index.py → etherpump/commands/index.py

@@ -15,13 +15,13 @@ except ImportError:
     from urllib.request import urlopen, URLError, HTTPError
 from jinja2 import FileSystemLoader, Environment
-from etherdump.commands.common import *
+from etherpump.commands.common import *
 from time import sleep
 import dateutil.parser
 """
 index:
-    Generate pages from etherdumps using a template.
+    Generate pages from etherpumps using a template.
     Built-in templates: rss.xml, index.html
@@ -87,7 +87,7 @@ def main (args):
     p.add_argument("--templatepath", default=None, help="path to find templates, default: built-in")
     p.add_argument("--template", default="index.html", help="template name, built-ins include index.html, rss.xml; default: index.html")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: ./.etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: ./.etherdump/settings.json")
     # p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)")
     p.add_argument("--order", default="padid", help="order, possible values: padid, pad (no group name), lastedited, (number of) authors, revisions, default: padid")
@@ -105,12 +105,12 @@ def main (args):
     pg = p.add_argument_group('template variables')
     pg.add_argument("--feedurl", default="feed.xml", help="rss: to use as feeds own (self) link, default: feed.xml")
     pg.add_argument("--siteurl", default=None, help="rss: to use as channel's site link, default: the etherpad url")
-    pg.add_argument("--title", default="etherdump", help="title for document or rss feed channel title, default: etherdump")
+    pg.add_argument("--title", default="etherpump", help="title for document or rss feed channel title, default: etherdump")
     pg.add_argument("--description", default="", help="rss: channel description, default: empty")
     pg.add_argument("--language", default="en-US", help="rss: feed language, default: en-US")
     pg.add_argument("--updatePeriod", default="daily", help="rss: updatePeriod, possible values: hourly, daily, weekly, monthly, yearly; default: daily")
     pg.add_argument("--updateFrequency", default=1, type=int, help="rss: update frequency within the update period (where 2 would mean twice per period); default: 1")
-    pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherdump", help="generator, default: https://gitlab.com/activearchives/etherdump")
+    pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherpump", help="generator, default: https://gitlab.com/activearchives/etherdump")
     pg.add_argument("--timestamp", default=None, help="timestamp, default: now (e.g. 2015-12-01 12:30:00)")
     pg.add_argument("--next", default=None, help="next link, default: None)")
     pg.add_argument("--prev", default=None, help="prev link, default: None")

4
etherdump/commands/init.py → etherpump/commands/init.py

@@ -69,7 +69,7 @@ def tryapiurl (url, verbose=False):
         print ("URLError", e, file=sys.stderr)
 def main(args):
-    p = ArgumentParser("initialize an etherdump folder")
+    p = ArgumentParser("initialize an etherpump folder")
     p.add_argument("arg", nargs="*", default=[], help="optional positional args: path etherpadurl")
     p.add_argument("--path", default=None, help="path to initialize")
     p.add_argument("--padurl", default=None, help="")
@@ -85,7 +85,7 @@ def main(args):
     if not path:
         path = "."
-    edpath = os.path.join(path, ".etherdump")
+    edpath = os.path.join(path, ".etherpump")
     try:
         os.makedirs(edpath)
     except OSError:
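
Because `init` now creates `.etherpump` instead of `.etherdump`, folders initialised with the older tool keep their settings under the old name, while the other commands default to `.etherpump/settings.json`. One way to stay compatible is a lookup that prefers the new folder and falls back to the old one. This is a hypothetical helper (`find_settings` is not part of the commit), sketched under the assumption that both layouts store a `settings.json` file.

```python
import os


def find_settings(path=".", names=(".etherpump", ".etherdump")):
    """Return the first existing settings.json under `path`, or None.

    `names` lists candidate folder names in order of preference;
    both the helper and this ordering are illustrative assumptions.
    """
    for name in names:
        candidate = os.path.join(path, name, "settings.json")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Prints a path if the current folder was initialised, else None.
    print(find_settings("."))
```

A command could pass the result as its `--padinfo` default, so that existing etherdump folders keep working after the rename.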

0
etherdump/commands/join.py → etherpump/commands/join.py

4
etherdump/commands/list.py → etherpump/commands/list.py

@@ -2,7 +2,7 @@ from __future__ import print_function
 from argparse import ArgumentParser
 import json
 import sys
-from etherdump.commands.common import getjson
+from etherpump.commands.common import getjson
 try:
     # python2
     from urlparse import urlparse, urlunparse
@@ -16,7 +16,7 @@ except ImportError:
 def main (args):
     p = ArgumentParser("call listAllPads and print the results")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="lines", help="output format: lines, json; default lines")
     args = p.parse_args(args)

2
etherdump/commands/listauthors.py → etherpump/commands/listauthors.py

@@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError
 def main(args):
     p = ArgumentParser("call listAuthorsOfPad for the padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherdump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     p.add_argument("--format", default="lines", help="output format, can be: lines, json; default: lines")
     args = p.parse_args(args)

324
etherpump/commands/publication.py

@@ -0,0 +1,324 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re, os, time
from datetime import datetime
import dateutil.parser
import pypandoc
try:
    # python2
    from urllib2 import urlopen, URLError, HTTPError
    from urllib import urlencode
    from urlparse import urlparse, urlunparse
except ImportError:
    # python3
    from urllib.parse import urlparse, urlunparse, urlencode, quote
    from urllib.request import urlopen, URLError, HTTPError
from jinja2 import FileSystemLoader, Environment
from etherpump.commands.common import *
from time import sleep
import dateutil.parser

"""
publication:
    Generate a single document from etherpumps using a template.
    Built-in templates: publication.html
"""

def group (items, key=lambda x: x):
    """ returns a list of lists, of items grouped by a key function """
    ret = []
    keys = {}
    for item in items:
        k = key(item)
        if k not in keys:
            keys[k] = []
        keys[k].append(item)
    for k in sorted(keys):
        keys[k].sort()
        ret.append(keys[k])
    return ret

# def base (x):
#     return re.sub(r"(\.raw\.html)|(\.diff\.html)|(\.meta\.json)|(\.raw\.txt)$", "", x)

def splitextlong (x):
    """ split "long" extensions, i.e. foo.bar.baz => ('foo', '.bar.baz') """
    m = re.search(r"^(.*?)(\..*)$", x)
    if m:
        return m.groups()
    else:
        return x, ''

def base (x):
    return splitextlong(x)[0]

def excerpt (t, chars=25):
    if len(t) > chars:
        t = t[:chars] + "..."
    return t

def absurl (url, base=None):
    if not url.startswith("http"):
        return base + url
    return url

def url_base (url):
    (scheme, netloc, path, params, query, fragment) = urlparse(url)
    path, _ = os.path.split(path.lstrip("/"))
    ret = urlunparse((scheme, netloc, path, None, None, None))
    if ret:
        ret += "/"
    return ret

def datetimeformat (t, format='%Y-%m-%d %H:%M:%S'):
    if type(t) == str:
        dt = dateutil.parser.parse(t)
        return dt.strftime(format)
    else:
        return time.strftime(format, time.localtime(t))
def main (args):
p = ArgumentParser("Convert dumped files to a document via a template.")
p.add_argument("input", nargs="+", help="Files to list (.meta.json files)")
p.add_argument("--templatepath", default=None, help="path to find templates, default: built-in")
p.add_argument("--template", default="publication.html", help="template name, built-ins include publication.html; default: publication.html")
p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: ./.etherdump/settings.json")
# p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)")
p.add_argument("--order", default="padid", help="order, possible values: padid, pad (no group name), lastedited, (number of) authors, revisions, default: padid")
p.add_argument("--reverse", default=False, action="store_true", help="reverse order, default: False (reverse chrono)")
p.add_argument("--limit", type=int, default=0, help="limit to number of items, default: 0 (no limit)")
p.add_argument("--skip", default=None, type=int, help="skip this many items, default: None")
p.add_argument("--content", default=False, action="store_true", help="rss: include (full) content tag, default: False")
p.add_argument("--link", default="diffhtml,html,text", help="link variable will be to this version, can be comma-delim list, use first avail, default: diffhtml,html,text")
p.add_argument("--linkbase", default=None, help="base url to use for links, default: try to use the feedurl")
p.add_argument("--output", default=None, help="output, default: stdout")
p.add_argument("--files", default=False, action="store_true", help="include files (experimental)")
pg = p.add_argument_group('template variables')
pg.add_argument("--feedurl", default="feed.xml", help="rss: to use as feeds own (self) link, default: feed.xml")
pg.add_argument("--siteurl", default=None, help="rss: to use as channel's site link, default: the etherpad url")
pg.add_argument("--title", default="etherpump", help="title for document or rss feed channel title, default: etherdump")
pg.add_argument("--description", default="", help="rss: channel description, default: empty")
pg.add_argument("--language", default="en-US", help="rss: feed language, default: en-US")
pg.add_argument("--updatePeriod", default="daily", help="rss: updatePeriod, possible values: hourly, daily, weekly, monthly, yearly; default: daily")
pg.add_argument("--updateFrequency", default=1, type=int, help="rss: update frequency within the update period (where 2 would mean twice per period); default: 1")
pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherpump", help="generator, default: https://gitlab.com/activearchives/etherdump")
pg.add_argument("--timestamp", default=None, help="timestamp, default: now (e.g. 2015-12-01 12:30:00)")
pg.add_argument("--next", default=None, help="next link, default: None)")
pg.add_argument("--prev", default=None, help="prev link, default: None")
args = p.parse_args(args)
tmpath = args.templatepath
# Default path for template is the built-in data/templates
if tmpath == None:
tmpath = os.path.split(os.path.abspath(__file__))[0]
tmpath = os.path.split(tmpath)[0]
tmpath = os.path.join(tmpath, "data", "templates")
env = Environment(loader=FileSystemLoader(tmpath))
env.filters["excerpt"] = excerpt
env.filters["datetimeformat"] = datetimeformat
template = env.get_template(args.template)
info = loadpadinfo(args.padinfo)
inputs = args.input
inputs.sort()
# Use "base" to strip (longest) extensions
# inputs = group(inputs, base)
def wrappath (p):
path = "./{0}".format(p)
ext = os.path.splitext(p)[1][1:]
return {
"url": path,
"path": path,
"code": 200,
"type": ext
}
def metaforpaths (paths):
ret = {}
pid = base(paths[0])
ret['pad'] = ret['padid'] = pid
ret['versions'] = [wrappath(x) for x in paths]
lastedited = None
for p in paths:
mtime = os.stat(p).st_mtime
if lastedited == None or mtime > lastedited:
lastedited = mtime
ret["lastedited_iso"] = datetime.fromtimestamp(lastedited).strftime("%Y-%m-%dT%H:%M:%S")
ret["lastedited_raw"] = mtime
return ret
def loadmeta(p):
# Consider a set of grouped files
# Otherwise, create a "dummy" one that wraps all the files as versions
if p.endswith(".meta.json"):
with open(p) as f:
return json.load(f)
# # IF there is a .meta.json, load it & MERGE with other files
# if ret:
# # TODO: merge with other files
# for p in paths:
# if "./"+p not in ret['versions']:
# ret['versions'].append(wrappath(p))
# return ret
# else:
# return metaforpaths(paths)
def fixdates (padmeta):
d = dateutil.parser.parse(padmeta["lastedited_iso"])
padmeta["lastedited"] = d
padmeta["lastedited_822"] = d.strftime("%a, %d %b %Y %H:%M:%S +0000")
return padmeta
pads = map(loadmeta, inputs)
pads = [x for x in pads if x != None]
pads = map(fixdates, pads)
args.pads = list(pads)
    def could_have_base (x, y):
        return x == y or (x.startswith(y) and x[len(y):].startswith("."))

    def get_best_pad (x):
        for pb in padbases:
            p = pads_by_base[pb]
            if could_have_base(x, pb):
                return p

    def has_version (padinfo, path):
        return [x for x in padinfo['versions'] if 'path' in x and x['path'] == "./"+path]
    if args.files:
        inputs = args.input
        inputs.sort()
        removelist = []

        pads_by_base = {}
        for p in args.pads:
            # print ("Trying padid", p['padid'], file=sys.stderr)
            padbase = os.path.splitext(p['padid'])[0]
            pads_by_base[padbase] = p
        padbases = list(pads_by_base.keys())
        # Sort longest first so that the longest matching base wins
        padbases.sort(key=len, reverse=True)
        # print ("PADBASES", file=sys.stderr)
        # for pb in padbases:
        #     print ("  ", pb, file=sys.stderr)
        print ("pairing input files with pads", file=sys.stderr)
        for x in inputs:
            # pair input with a pad if possible
            xbasename = os.path.basename(x)
            p = get_best_pad(xbasename)
            if p:
                if not has_version(p, x):
                    print ("Grouping file {0} with pad {1}".format(x, p['padid']), file=sys.stderr)
                    p['versions'].append(wrappath(x))
                else:
                    print ("Skipping existing version {0} ({1})...".format(x, p['padid']), file=sys.stderr)
                removelist.append(x)
        # Remove matched files
        for x in removelist:
            inputs.remove(x)
        print ("Remaining files:", file=sys.stderr)
        for x in inputs:
            print (x, file=sys.stderr)
        print (file=sys.stderr)
        # Add "fake" pads for remaining files
        for x in inputs:
            args.pads.append(metaforpaths([x]))
    if args.timestamp is None:
        args.timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    # Escape the dots so only the literal API version string matches
    padurlbase = re.sub(r"api/1\.2\.9/$", "p/", info["apiurl"])
    # if type(padurlbase) == unicode:
    #     padurlbase = padurlbase.encode("utf-8")
    args.siteurl = args.siteurl or padurlbase
    args.utcnow = datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S +0000")
    # order items & apply limit
    if args.order == "lastedited":
        args.pads.sort(key=lambda x: x.get("lastedited_iso"), reverse=args.reverse)
    elif args.order == "pad":
        args.pads.sort(key=lambda x: x.get("pad"), reverse=args.reverse)
    elif args.order == "padid":
        args.pads.sort(key=lambda x: x.get("padid"), reverse=args.reverse)
    elif args.order == "revisions":
        # "fake" pads built from loose files carry no revision count; treat it as 0
        args.pads.sort(key=lambda x: x.get("revisions") or 0, reverse=args.reverse)
    elif args.order == "authors":
        # likewise, a missing author list counts as empty
        args.pads.sort(key=lambda x: len(x.get("authors") or []), reverse=args.reverse)
    elif args.order == "custom":
        # TODO: make this list non-static, but a variable that can be given from the CLI
        customorder = [
            'nooo.relearn.preamble',
            'nooo.relearn.activating.the.archive',
            'nooo.relearn.call.for.proposals',
            'nooo.relearn.call.for.proposals-proposal-footnote',
            'nooo.relearn.colophon']
        order = []
        for x in customorder:
            for pad in args.pads:
                if pad["padid"] == x:
                    order.append(pad)
        args.pads = order
    else:
        raise Exception("That ordering is not implemented!")

    if args.limit:
        args.pads = args.pads[:args.limit]
    # add versions_by_type, add in full text
    # add link (based on args.link)
    linkversions = args.link.split(",")
    linkbase = args.linkbase or url_base(args.feedurl)
    # print ("linkbase", linkbase, args.linkbase, args.feedurl)

    for p in args.pads:
        versions_by_type = {}
        p["versions_by_type"] = versions_by_type
        for v in p["versions"]:
            t = v["type"]
            versions_by_type[t] = v

        if "text" in versions_by_type:
            # try:
            with open (versions_by_type["text"]["path"]) as f:
                content = f.read()
                # print('content:', content)
                # [Relearn] Add pandoc command here?
                html = pypandoc.convert_text(content, 'html', format='md')
                # print('html:', html)
                p["text"] = html
            # except FileNotFoundError:
            #     p['text'] = 'ERROR'

        # ADD IN LINK TO PAD AS "link"
        for v in linkversions:
            if v in versions_by_type:
                vdata = versions_by_type[v]
                try:
                    if v == "pad" or os.path.exists(vdata["path"]):
                        p["link"] = absurl(vdata["url"], linkbase)
                        break
                except KeyError:
                    # version entry has no "path" or "url"; try the next link type
                    pass
    if args.output:
        with open(args.output, "w") as f:
            print (template.render(vars(args)), file=f)
    else:
        print (template.render(vars(args)))
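
Note on the `--files` pairing in publication.py: it works because pad bases are tried longest first, so a file such as `a.b.diff.html` is attached to pad `a.b` rather than to pad `a`. A minimal standalone sketch of that idea follows. It assumes pad IDs are plain strings and skips the `os.path.splitext` step applied to `padid` in the real code; `pair_files` and the sample names are hypothetical and not part of etherpump.

```python
import os


def could_have_base(name, base):
    """True if name equals base, or is base followed by a dotted extension."""
    return name == base or (name.startswith(base) and name[len(base):].startswith("."))


def pair_files(padids, filenames):
    """Map each filename to the pad ID with the longest matching base, or None."""
    # Longest bases first, so "a.b" wins over "a" for "a.b.diff.html".
    bases = sorted(padids, key=len, reverse=True)
    pairs = {}
    for path in filenames:
        name = os.path.basename(path)
        pairs[path] = next((b for b in bases if could_have_base(name, b)), None)
    return pairs


if __name__ == "__main__":
    print(pair_files(["a", "a.b"], ["p/a.raw.txt", "p/a.b.diff.html", "p/notes.txt"]))
```

Files that map to `None` correspond to the "Remaining files" that the command wraps in "fake" pads.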

15
etherdump/commands/pull.py → etherpump/commands/pull.py

@@ -12,9 +12,9 @@ except ImportError:
     from urllib.parse import urlencode, quote
     from urllib.request import urlopen, URLError, HTTPError
-from etherdump.commands.common import *
+from etherpump.commands.common import *
 from time import sleep
-from etherdump.commands.html5tidy import html5tidy
+from etherpump.commands.html5tidy import html5tidy
 import html5lib
 from xml.etree import ElementTree as ET
 from fnmatch import fnmatch
@@ -47,7 +47,7 @@ def main (args):
     p.add_argument("padid", nargs="*", default=[])
     p.add_argument("--glob", default=False, help="download pads matching a glob pattern")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json")
     p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)")
     p.add_argument("--pub", default="p", help="folder to store files for public pads, default: p")
     p.add_argument("--group", default="g", help="folder to store files for group pads, default: g")
@@ -69,6 +69,8 @@ def main (args):
     p.add_argument("--script", default="/versions.js", help="add script url to output pages, default: /versions.js")
     p.add_argument("--nopublish", default="__NOPUBLISH__", help="no publish magic word, default: __NOPUBLISH__")
+    p.add_argument("--publish", default="__PUBLISH__", help="the publish magic word, default: __PUBLISH__")
+    p.add_argument("--publish-opt-in", default=False, action="store_true", help="ensure `--publish` is honoured instead of `--nopublish`")
     args = p.parse_args(args)
@@ -187,6 +189,13 @@ def main (args):
                     try_deleting((p+raw_ext,p+".raw.html",p+".diff.html",p+".meta.json"))
                     continue
+                ##########################################
+                ## ENFORCE __PUBLISH__ MAGIC WORD
+                ##########################################
+                if args.publish_opt_in and args.publish not in text:
+                    try_deleting((p+raw_ext,p+".raw.html",p+".diff.html",p+".meta.json"))
+                    continue
                 ver["path"] = p+raw_ext
                 ver["url"] = quote(ver["path"])
                 with open(ver["path"], "w") as f:
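
Note on the `__PUBLISH__` logic: together with the existing `__NOPUBLISH__` check, the new flag gives an opt-out mode (the default) and an opt-in mode. A sketch of the combined decision as a pure function follows. It assumes the `__NOPUBLISH__` check that precedes this hunk still runs first in both modes; `should_publish` is a hypothetical helper for illustration, not a function in etherpump.

```python
def should_publish(text, publish="__PUBLISH__", nopublish="__NOPUBLISH__", opt_in=False):
    """Decide whether a pad's text may be written to disk."""
    # Opt-out: a pad carrying the no-publish word is always skipped.
    if nopublish and nopublish in text:
        return False
    # Opt-in: the pad must additionally carry the publish word.
    if opt_in and publish not in text:
        return False
    return True


if __name__ == "__main__":
    for sample in ("plain pad", "pad with __PUBLISH__", "pad with __NOPUBLISH__"):
        print(sample, "->", should_publish(sample, opt_in=True))
```

Because `__PUBLISH__` is not a substring of `__NOPUBLISH__`, a pad marked only with the no-publish word cannot satisfy the opt-in check by accident.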

2
etherdump/commands/revisionscount.py → etherpump/commands/revisionscount.py

@@ -7,7 +7,7 @@ from urllib2 import urlopen, HTTPError, URLError
 def main(args):
     p = ArgumentParser("call getRevisionsCount for the given padid")
     p.add_argument("padid", help="the padid")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     args = p.parse_args(args)

2
etherdump/commands/sethtml.py → etherpump/commands/sethtml.py

@@ -12,7 +12,7 @@ def main(args):
     p = ArgumentParser("calls the setHTML API function for the given padid")
     p.add_argument("padid", help="the padid")
     p.add_argument("--html", default=None, help="html, default: read from stdin")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     # p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     p.add_argument("--create", default=False, action="store_true", help="flag to create pad if necessary")

2
etherdump/commands/settext.py → etherpump/commands/settext.py

@@ -20,7 +20,7 @@ def main(args):
     p = ArgumentParser("calls the getText API function for the given padid")
     p.add_argument("padid", help="the padid")
     p.add_argument("--text", default=None, help="text, default: read from stdin")
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json")
     p.add_argument("--showurl", default=False, action="store_true")
     # p.add_argument("--format", default="text", help="output format, can be: text, json; default: text")
     p.add_argument("--create", default=False, action="store_true", help="flag to create pad if necessary")

0
etherdump/commands/showmeta.py → etherpump/commands/showmeta.py

2
etherdump/commands/status.py → etherpump/commands/status.py

@@ -61,7 +61,7 @@ def ignore_p (path, settings=None):
 def main (args):
     p = ArgumentParser("Check for pads that have changed since last sync (according to .meta.json)")
     # p.add_argument("padid", nargs="*", default=[])
-    p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json")
+    p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json")
     p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)")
     p.add_argument("--pub", default=".", help="folder to store files for public pads, default: pub")
     p.add_argument("--group", default="g", help="folder to store files for group pads, default: g")

0
etherdump/data/templates/index.html → etherpump/data/templates/index.html

0
etherdump/data/templates/pad.html → etherpump/data/templates/pad.html

2
etherdump/data/templates/pad_colors.html → etherpump/data/templates/pad_colors.html

@@ -10,7 +10,7 @@
 <body>
 {{ html }}
-<div class="etherdump_version_links">
+<div class="etherpump_version_links">
 Pad last edited {{lastedited}}; other versions: <a href="{{raw_url}}">text-only</a> <a href="{{meta_url}}">metadata</a>
 </div>
 </body>

0
etherdump/data/templates/pad_index.html → etherpump/data/templates/pad_index.html

42
etherpump/data/templates/publication.html

@@ -0,0 +1,42 @@
<!DOCTYPE html>
<html lang="{{language}}">
<!-- __RELEARN__ -->
<head>
  <meta charset="utf-8" />
  <title>{{title}}</title>
  <link rel="stylesheet" type="text/css" href="{%block css %}publication.assets/publication.css{%endblock%}">
  <link rel="stylesheet" type="text/css" href="publication.assets/normalise.css">
  <link rel="alternate" type="application/rss+xml" href="recentchanges.rss">
</head>
<body>
  <h1>{{ title }}</h1>
  <div id="toc">
    <p>Table of Contents</p>
    <ol>
      {% for pad in pads %}
      <li class="name">
        <a href="#{{ pad.padid }}">{{ pad.padid }}</a>
      </li>
      {% endfor %}
    </ol>
  </div>
  <img id="coverimg" src="publication.assets/rumination.svg">
  {% for pad in pads %}
  <hr>
  <div id="{{ pad.padid }}" data-padid="{{ pad.padid }}" class="pad">
    <small class="lastedited">Last edited: {{ pad.lastedited_iso|datetimeformat }}</small><br>
    <small class="revisions">Revisions: {{ pad.revisions }}</small><br>
    <small class="padname"><a href="{{ pad.link }}">{{ pad.pathbase }}</a></small><br>
    <!-- <small class="authors">Authors: {{ pad.author_ids|length }}</small><br> -->
    <div class="padcontent">{{ pad.text }}</div>
  </div>
  {% endfor %}
  {% block info %}<hr><small class="info">Last update {{timestamp}}.</small>{% endblock %}
  <div id="footer"></div>
</body>
</html>

0
etherdump/data/templates/rss.xml → etherpump/data/templates/rss.xml

20
setup.py

@@ -17,18 +17,18 @@ def find (p, d):
     return ret
 setup(
-    name='etherdump',
-    version='0.3.0',
-    author='Active Archives Contributors',
-    author_email='mm@automatist.org',
-    packages=['etherdump', 'etherdump.commands'],
-    package_dir={'etherdump': 'etherdump'},
+    name='etherpump',
+    version='0.0.1',
+    author='Varia members',
+    author_email='info@varia.zone',
+    packages=['etherpump', 'etherpump.commands'],
+    package_dir={'etherpump': 'etherpump'},
     #package_data={'activearchives': find("activearchives", "templates/") + find("activearchives", "data/")},
-    package_data={'etherdump': find("etherdump", "data/")},
-    scripts=['bin/etherdump'],
-    url='http://activearchives.org/wiki/Etherdump',
+    package_data={'etherpump': find("etherpump", "data/")},
+    scripts=['bin/etherpump'],
+    url='https://git.vvvvvvaria.org/varia/etherpump',
     license='LICENSE.txt',
-    description='Etherdump an etherpad publishing & archiving system',
+    description='Etherpump an etherpad publishing system',
     # long_description=open('README.md').read(),
     install_requires=[
         "html5lib", "jinja2"
