diff --git a/.gitignore b/.gitignore index 4b37ad7..7410b00 100644 --- a/.gitignore +++ b/.gitignore @@ -4,4 +4,4 @@ build/ venv/ testing/ padinfo.json -.etherdump +.etherpump diff --git a/README.md b/README.md index 6480885..a995289 100644 --- a/README.md +++ b/README.md @@ -1,41 +1,119 @@ -etherdump +etherpump ========= -Tool to publish [etherpad](http://etherpad.org/) pages to files. +![etherpump - pumping text from the etherpad into publications](etherpump.png) +A command-line utility that extends the multi writing and publishing functionalities of the [etherpad](http://etherpad.org/) by exporting the pads in multiple formats. + +Many pads, many networks +------------------------ + +*Etherpump* is a fork of [*etherdump*](https://gitlab.constantvzw.org/aa/etherdump) a command line tool written by [Michael Murtaugh](http://automatist.org/) that converts etherpad pages to files. This fork is made out of curiosities in the tool, a wish to study it and shared sparks of enthusiasm to use it in different situations within Varia. + +Etherpump is a stretched version of etherdump. It introduces features to the initial tool that diffuse actions of *dumping* into *pumping*. Instead of dumping all pads by default, etherpump will be the place for us to write tools, that allow one to use etherpads to edit and curate content for publishing. + +Added features are: + +* opt-in publishing with the `__PUBLISH__` magic word +* the `publication` command, that listens to custom magic words such as `__RELEARN__` + +Etherdump is a tool that is used from the command line. It dumps all pads of one etherpad installation to a folder, saving them as different text files, such as plain text and HTML. It also creates an index file, that allows one to easily navigate through the list of pads. Etherdump follows a document-driven idea of publishing, which means that it converts pads as database entries into pads as files. 
This seems to be a redundant act of copying, but is actually an important in-between step that allows for many different publishing projects and experiments. + +We started to get to know etherdump through various editions of Relearn and/or the worksessions organized by Constant. Collaborative writing on an etherpad has been an important ingredient for these situations. The habit of using pads branched into the day-to-day practice of Varia, where we use etherpads for all sorts of things, ranging from organising remote-meetings with 10+ people, to writing and designing PDF documents collaboratively. + +After installing etherdump on the Varia server, we collectively decided to not want to publish pads by default. Discussions in the group around the use of etherpads, privacy and ideas of what publishing means, led to a need to have etherdump only start the indexing work after it recognizes a `__PUBLISH__` marker on a pad. We decided to work on a `__PUBLISH__ vs. __NOPUBLISH__` branch of etherdump, which we now fork into **etherpump**. + + +Change log / notes +================== + +**September 2019** + +Forking *etherdump* into *etherpump*. (Work in progress!) + + + +----- + +**May - September 2019** + +Etherdump is used to produce the *Ruminating Relearn* section of the Network Of One's Own 2 (NOOO2) publication. + +A new command is added to make a web publication, based on the custom magic word `__RELEARN__`. + +----- + +**June 2019** + +Multiple conversations around etherdump emerged during Relearn Curved in Varia, Rotterdam. + +Including the idea of executable pads (*etherhooks*), custom magic words, a federated snippet protocol (*etherstekje*) and more. + + + +----- + +**April 2019** + +Installation of etherdump on the Varia server. + + + +----- + +**March 2019** + +The `__PUBLISH__ vs. __NOPUBLISH__` was added to the etherdump repository by *decentral1se*. + + + +----- + +Originally designed for use at: [Constant](http://etherdump.constantvzw.org/). 
+ +More notes can be found in the [git repository of etherdump](https://gitlab.constantvzw.org/aa/etherdump). + + +Install etherpump +================= Requirements ------------- - * python3 - * html5lib - * requests (settext) - * python-dateutil, jinja2 (index subcommand) + +* python3 +* html5lib +* requests (settext) +* python-dateutil, jinja2 (used by the index subcommand) Installation ------------- - pip install python-dateutil jinja2 html5lib - python setup.py install + $ pip install python-dateutil jinja2 html5lib + $ python setup.py install Example --------------- - mkdir mydump - cd myddump - etherdump init -The program then interactively asks some questions: + $ mkdir mydump + $ cd mydump + $ etherpump init - Please type the URL of the etherpad: - http://automatist.local:9001/ - The APIKEY is the contents of the file APIKEY.txt in the etherpad folder - Please paste the APIKEY: - xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +The program then interactively asks some questions: -The settings are placed in a file called .etherdump/settings.json and are used (by default) by future commands. +``` + Please type the URL of the etherpad: + + https://pad.vvvvvvaria.org/ + The APIKEY is the contents of the file APIKEY.txt in the etherpad folder + + Please paste the APIKEY: + xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +``` +The settings are placed in a file called .etherpump/settings.json and are used (by default) by future commands. -subcommands +Subcommands ---------- * init @@ -49,32 +127,17 @@ subcommands * revisionscount * index * deletepad +* publication (*etherpump*) To get help on a subcommand: - etherdump revisionscount --help - - -Change log / notes -======================= - -Originally designed for use at: [constant](http://etherdump.constantvzw.org/). 
- + etherpump revisionscount --help -17 Oct 2016 ------------------------------------------------ -Preparations for [Machine Research](https://machineresearch.wordpress.com/) [2](http://constantvzw.org/site/Machine-Research,2646.html) +License +======= -6 Oct 2017 ----------------------- -Feature request from PW: When deleting a previously public document, generate a page / pages with an explanation (along the lines of "This document was previously public but has been marked .... maybe give links to search"). +GNU AFFERO GENERAL PUBLIC LICENSE, Version 3 -3 Nov 2017 ---------------- -machineresearch seems to be __NOPUBLISH__ but still exists (also in recentchanges) - -Jan 2018 -------------- -Updated files to work with python3 (probably this has broken python2). +See License.txt diff --git a/bin/etherdump b/bin/etherpump similarity index 87% rename from bin/etherdump rename to bin/etherpump index 9bdefee..aee8c00 100755 --- a/bin/etherdump +++ b/bin/etherpump @@ -4,7 +4,7 @@ from __future__ import print_function import sys usage = """Usage: - etherdump CMD + etherpump CMD where CMD could be: pull @@ -20,7 +20,7 @@ where CMD could be: html5tidy For more information on each command try: - etherdump CMD --help + etherpump CMD --help """ @@ -36,7 +36,7 @@ except IndexError: sys.exit(0) try: # http://stackoverflow.com/questions/301134/dynamic-module-import-in-python - cmdmod = __import__("etherdump.commands.%s" % cmd, fromlist=["etherdump.commands"]) + cmdmod = __import__("etherpump.commands.%s" % cmd, fromlist=["etherpump.commands"]) cmdmod.main(args) except ImportError as e: print ("Error performing command '{0}'\n(python said: {1})\n".format(cmd, e)) diff --git a/etherpump.egg-info/PKG-INFO b/etherpump.egg-info/PKG-INFO new file mode 100644 index 0000000..1091436 --- /dev/null +++ b/etherpump.egg-info/PKG-INFO @@ -0,0 +1,10 @@ +Metadata-Version: 1.0 +Name: etherpump +Version: 0.0.1 +Summary: Etherpump an etherpad publishing system +Home-page: 
https://git.vvvvvvaria.org/varia/etherpump +Author: Varia members +Author-email: info@varia.zone +License: LICENSE.txt +Description: UNKNOWN +Platform: UNKNOWN diff --git a/etherpump.egg-info/SOURCES.txt b/etherpump.egg-info/SOURCES.txt new file mode 100644 index 0000000..5bb57c3 --- /dev/null +++ b/etherpump.egg-info/SOURCES.txt @@ -0,0 +1,35 @@ +README.md +setup.py +bin/etherpump +etherpump/__init__.py +etherpump.egg-info/PKG-INFO +etherpump.egg-info/SOURCES.txt +etherpump.egg-info/dependency_links.txt +etherpump.egg-info/requires.txt +etherpump.egg-info/top_level.txt +etherpump/commands/__init__.py +etherpump/commands/appendmeta.py +etherpump/commands/common.py +etherpump/commands/creatediffhtml.py +etherpump/commands/deletepad.py +etherpump/commands/dumpcsv.py +etherpump/commands/gethtml.py +etherpump/commands/gettext.py +etherpump/commands/html5tidy.py +etherpump/commands/index.py +etherpump/commands/init.py +etherpump/commands/join.py +etherpump/commands/list.py +etherpump/commands/listauthors.py +etherpump/commands/publication.py +etherpump/commands/pull.py +etherpump/commands/revisionscount.py +etherpump/commands/sethtml.py +etherpump/commands/settext.py +etherpump/commands/showmeta.py +etherpump/commands/status.py +etherpump/data/templates/index.html +etherpump/data/templates/pad.html +etherpump/data/templates/pad_colors.html +etherpump/data/templates/pad_index.html +etherpump/data/templates/rss.xml \ No newline at end of file diff --git a/etherpump.egg-info/dependency_links.txt b/etherpump.egg-info/dependency_links.txt new file mode 100644 index 0000000..8b13789 --- /dev/null +++ b/etherpump.egg-info/dependency_links.txt @@ -0,0 +1 @@ + diff --git a/etherpump.egg-info/requires.txt b/etherpump.egg-info/requires.txt new file mode 100644 index 0000000..da75e56 --- /dev/null +++ b/etherpump.egg-info/requires.txt @@ -0,0 +1,2 @@ +html5lib +jinja2 diff --git a/etherpump.egg-info/top_level.txt b/etherpump.egg-info/top_level.txt new file mode 100644 index 
0000000..a9e7107 --- /dev/null +++ b/etherpump.egg-info/top_level.txt @@ -0,0 +1 @@ +etherpump diff --git a/etherpump.png b/etherpump.png new file mode 100644 index 0000000..154eca9 Binary files /dev/null and b/etherpump.png differ diff --git a/etherdump/__init__.py b/etherpump/__init__.py similarity index 100% rename from etherdump/__init__.py rename to etherpump/__init__.py diff --git a/etherdump/commands/__init__.py b/etherpump/commands/__init__.py similarity index 100% rename from etherdump/commands/__init__.py rename to etherpump/commands/__init__.py diff --git a/etherdump/commands/appendmeta.py b/etherpump/commands/appendmeta.py similarity index 100% rename from etherdump/commands/appendmeta.py rename to etherpump/commands/appendmeta.py diff --git a/etherdump/commands/common.py b/etherpump/commands/common.py similarity index 100% rename from etherdump/commands/common.py rename to etherpump/commands/common.py diff --git a/etherdump/commands/creatediffhtml.py b/etherpump/commands/creatediffhtml.py similarity index 94% rename from etherdump/commands/creatediffhtml.py rename to etherpump/commands/creatediffhtml.py index ea3f7fa..ea709a1 100644 --- a/etherdump/commands/creatediffhtml.py +++ b/etherpump/commands/creatediffhtml.py @@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError def main(args): p = ArgumentParser("calls the createDiffHTML API function for the given padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") p.add_argument("--rev", type=int, default=None, help="revision, default: latest") diff --git a/etherdump/commands/deletepad.py 
b/etherpump/commands/deletepad.py similarity index 94% rename from etherdump/commands/deletepad.py rename to etherpump/commands/deletepad.py index 7c97584..3461a86 100644 --- a/etherdump/commands/deletepad.py +++ b/etherpump/commands/deletepad.py @@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError def main(args): p = ArgumentParser("calls the getText API function for the given padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") args = p.parse_args(args) diff --git a/etherdump/commands/dumpcsv.py b/etherpump/commands/dumpcsv.py similarity index 97% rename from etherdump/commands/dumpcsv.py rename to etherpump/commands/dumpcsv.py index aa6e971..19efb33 100644 --- a/etherdump/commands/dumpcsv.py +++ b/etherpump/commands/dumpcsv.py @@ -30,7 +30,7 @@ def jsonload (url): def main (args): p = ArgumentParser("outputs a CSV of information all all pads") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False") args = p.parse_args(args) diff --git a/etherdump/commands/gethtml.py b/etherpump/commands/gethtml.py similarity index 95% rename from etherdump/commands/gethtml.py rename to etherpump/commands/gethtml.py index ca0c79a..b55a091 100644 --- a/etherdump/commands/gethtml.py +++ b/etherpump/commands/gethtml.py @@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError 
def main(args): p = ArgumentParser("calls the getHTML API function for the given padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") p.add_argument("--rev", type=int, default=None, help="revision, default: latest") diff --git a/etherdump/commands/gettext.py b/etherpump/commands/gettext.py similarity index 96% rename from etherdump/commands/gettext.py rename to etherpump/commands/gettext.py index 45285e7..8108223 100644 --- a/etherdump/commands/gettext.py +++ b/etherpump/commands/gettext.py @@ -14,7 +14,7 @@ except ImportError: def main(args): p = ArgumentParser("calls the getText API function for the given padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") p.add_argument("--rev", type=int, default=None, help="revision, default: latest") diff --git a/etherdump/commands/html5tidy.py b/etherpump/commands/html5tidy.py similarity index 100% rename from etherdump/commands/html5tidy.py rename to etherpump/commands/html5tidy.py diff --git a/etherdump/commands/index.py b/etherpump/commands/index.py similarity index 97% rename from etherdump/commands/index.py rename to etherpump/commands/index.py index 0efe7cd..0f506d3 100644 --- a/etherdump/commands/index.py +++ b/etherpump/commands/index.py 
@@ -15,13 +15,13 @@ except ImportError: from urllib.request import urlopen, URLError, HTTPError from jinja2 import FileSystemLoader, Environment -from etherdump.commands.common import * +from etherpump.commands.common import * from time import sleep import dateutil.parser """ index: - Generate pages from etherdumps using a template. + Generate pages from etherpumps using a template. Built-in templates: rss.xml, index.html @@ -87,7 +87,7 @@ def main (args): p.add_argument("--templatepath", default=None, help="path to find templates, default: built-in") p.add_argument("--template", default="index.html", help="template name, built-ins include index.html, rss.xml; default: index.html") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: ./.etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: ./.etherpump/settings.json") # p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. 
pads with no revisions are skipped)") p.add_argument("--order", default="padid", help="order, possible values: padid, pad (no group name), lastedited, (number of) authors, revisions, default: padid") @@ -105,12 +105,12 @@ def main (args): pg = p.add_argument_group('template variables') pg.add_argument("--feedurl", default="feed.xml", help="rss: to use as feeds own (self) link, default: feed.xml") pg.add_argument("--siteurl", default=None, help="rss: to use as channel's site link, default: the etherpad url") - pg.add_argument("--title", default="etherdump", help="title for document or rss feed channel title, default: etherdump") + pg.add_argument("--title", default="etherpump", help="title for document or rss feed channel title, default: etherpump") pg.add_argument("--description", default="", help="rss: channel description, default: empty") pg.add_argument("--language", default="en-US", help="rss: feed language, default: en-US") pg.add_argument("--updatePeriod", default="daily", help="rss: updatePeriod, possible values: hourly, daily, weekly, monthly, yearly; default: daily") pg.add_argument("--updateFrequency", default=1, type=int, help="rss: update frequency within the update period (where 2 would mean twice per period); default: 1") - pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherdump", help="generator, default: https://gitlab.com/activearchives/etherdump") + pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherpump", help="generator, default: https://gitlab.com/activearchives/etherpump") pg.add_argument("--timestamp", default=None, help="timestamp, default: now (e.g. 
2015-12-01 12:30:00)") pg.add_argument("--next", default=None, help="next link, default: None)") pg.add_argument("--prev", default=None, help="prev link, default: None") diff --git a/etherdump/commands/init.py b/etherpump/commands/init.py similarity index 97% rename from etherdump/commands/init.py rename to etherpump/commands/init.py index 1f229a5..a443d04 100644 --- a/etherdump/commands/init.py +++ b/etherpump/commands/init.py @@ -69,7 +69,7 @@ def tryapiurl (url, verbose=False): print ("URLError", e, file=sys.stderr) def main(args): - p = ArgumentParser("initialize an etherdump folder") + p = ArgumentParser("initialize an etherpump folder") p.add_argument("arg", nargs="*", default=[], help="optional positional args: path etherpadurl") p.add_argument("--path", default=None, help="path to initialize") p.add_argument("--padurl", default=None, help="") @@ -85,7 +85,7 @@ def main(args): if not path: path = "." - edpath = os.path.join(path, ".etherdump") + edpath = os.path.join(path, ".etherpump") try: os.makedirs(edpath) except OSError: diff --git a/etherdump/commands/join.py b/etherpump/commands/join.py similarity index 100% rename from etherdump/commands/join.py rename to etherpump/commands/join.py diff --git a/etherdump/commands/list.py b/etherpump/commands/list.py similarity index 92% rename from etherdump/commands/list.py rename to etherpump/commands/list.py index 036699a..8cc9947 100644 --- a/etherdump/commands/list.py +++ b/etherpump/commands/list.py @@ -2,7 +2,7 @@ from __future__ import print_function from argparse import ArgumentParser import json import sys -from etherdump.commands.common import getjson +from etherpump.commands.common import getjson try: # python2 from urlparse import urlparse, urlunparse @@ -16,7 +16,7 @@ except ImportError: def main (args): p = ArgumentParser("call listAllPads and print the results") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + 
p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="lines", help="output format: lines, json; default lines") args = p.parse_args(args) diff --git a/etherdump/commands/listauthors.py b/etherpump/commands/listauthors.py similarity index 94% rename from etherdump/commands/listauthors.py rename to etherpump/commands/listauthors.py index c9bbf4a..90e3cf1 100644 --- a/etherdump/commands/listauthors.py +++ b/etherpump/commands/listauthors.py @@ -8,7 +8,7 @@ from urllib2 import urlopen, HTTPError, URLError def main(args): p = ArgumentParser("call listAuthorsOfPad for the padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") p.add_argument("--format", default="lines", help="output format, can be: lines, json; default: lines") args = p.parse_args(args) diff --git a/etherpump/commands/publication.py b/etherpump/commands/publication.py new file mode 100644 index 0000000..c450ab3 --- /dev/null +++ b/etherpump/commands/publication.py @@ -0,0 +1,324 @@ +from __future__ import print_function +from argparse import ArgumentParser +import sys, json, re, os, time +from datetime import datetime +import dateutil.parser +import pypandoc + +try: + # python2 + from urllib2 import urlopen, URLError, HTTPError + from urllib import urlencode + from urlparse import urlparse, urlunparse +except ImportError: + # python3 + from urllib.parse import urlparse, urlunparse, urlencode, quote + from urllib.request import urlopen, URLError, HTTPError + +from jinja2 import FileSystemLoader, Environment +from etherpump.commands.common import * +from time 
import sleep +import dateutil.parser + +""" +publication: + Generate a single document from etherpumps using a template. + + Built-in templates: publication.html + +""" + +def group (items, key=lambda x: x): + """ returns a list of lists, of items grouped by a key function """ + ret = [] + keys = {} + for item in items: + k = key(item) + if k not in keys: + keys[k] = [] + keys[k].append(item) + for k in sorted(keys): + keys[k].sort() + ret.append(keys[k]) + return ret + +# def base (x): +# return re.sub(r"(\.raw\.html)|(\.diff\.html)|(\.meta\.json)|(\.raw\.txt)$", "", x) + +def splitextlong (x): + """ split "long" extensions, i.e. foo.bar.baz => ('foo', '.bar.baz') """ + m = re.search(r"^(.*?)(\..*)$", x) + if m: + return m.groups() + else: + return x, '' + +def base (x): + return splitextlong(x)[0] + +def excerpt (t, chars=25): + if len(t) > chars: + t = t[:chars] + "..." + return t + +def absurl (url, base=None): + if not url.startswith("http"): + return base + url + return url + +def url_base (url): + (scheme, netloc, path, params, query, fragment) = urlparse(url) + path, _ = os.path.split(path.lstrip("/")) + ret = urlunparse((scheme, netloc, path, None, None, None)) + if ret: + ret += "/" + return ret + +def datetimeformat (t, format='%Y-%m-%d %H:%M:%S'): + if type(t) == str: + dt = dateutil.parser.parse(t) + return dt.strftime(format) + else: + return time.strftime(format, time.localtime(t)) + +def main (args): + p = ArgumentParser("Convert dumped files to a document via a template.") + + p.add_argument("input", nargs="+", help="Files to list (.meta.json files)") + + p.add_argument("--templatepath", default=None, help="path to find templates, default: built-in") + p.add_argument("--template", default="publication.html", help="template name, built-ins include publication.html; default: publication.html") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: ./.etherpump/settings.json") + # p.add_argument("--zerorevs", 
default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)") + + p.add_argument("--order", default="padid", help="order, possible values: padid, pad (no group name), lastedited, (number of) authors, revisions, custom (hard-coded list); default: padid") + p.add_argument("--reverse", default=False, action="store_true", help="reverse order, default: False (reverse chrono)") + p.add_argument("--limit", type=int, default=0, help="limit to number of items, default: 0 (no limit)") + p.add_argument("--skip", default=None, type=int, help="skip this many items, default: None") + + p.add_argument("--content", default=False, action="store_true", help="rss: include (full) content tag, default: False") + p.add_argument("--link", default="diffhtml,html,text", help="link variable will be to this version, can be comma-delim list, use first avail, default: diffhtml,html,text") + p.add_argument("--linkbase", default=None, help="base url to use for links, default: try to use the feedurl") + p.add_argument("--output", default=None, help="output, default: stdout") + + p.add_argument("--files", default=False, action="store_true", help="include files (experimental)") + + pg = p.add_argument_group('template variables') + pg.add_argument("--feedurl", default="feed.xml", help="rss: to use as feeds own (self) link, default: feed.xml") + pg.add_argument("--siteurl", default=None, help="rss: to use as channel's site link, default: the etherpad url") + pg.add_argument("--title", default="etherpump", help="title for document or rss feed channel title, default: etherpump") + pg.add_argument("--description", default="", help="rss: channel description, default: empty") + pg.add_argument("--language", default="en-US", help="rss: feed language, default: en-US") + pg.add_argument("--updatePeriod", default="daily", help="rss: updatePeriod, possible values: hourly, daily, weekly, monthly, yearly; default: daily") + pg.add_argument("--updateFrequency", 
default=1, type=int, help="rss: update frequency within the update period (where 2 would mean twice per period); default: 1") + pg.add_argument("--generator", default="https://gitlab.com/activearchives/etherpump", help="generator, default: https://gitlab.com/activearchives/etherpump") + pg.add_argument("--timestamp", default=None, help="timestamp, default: now (e.g. 2015-12-01 12:30:00)") + pg.add_argument("--next", default=None, help="next link, default: None") + pg.add_argument("--prev", default=None, help="prev link, default: None") + + args = p.parse_args(args) + + tmpath = args.templatepath + # Default path for template is the built-in data/templates + if tmpath is None: + tmpath = os.path.split(os.path.abspath(__file__))[0] + tmpath = os.path.split(tmpath)[0] + tmpath = os.path.join(tmpath, "data", "templates") + + env = Environment(loader=FileSystemLoader(tmpath)) + env.filters["excerpt"] = excerpt + env.filters["datetimeformat"] = datetimeformat + template = env.get_template(args.template) + + info = loadpadinfo(args.padinfo) + + inputs = args.input + inputs.sort() + # Use "base" to strip (longest) extensions + # inputs = group(inputs, base) + + def wrappath (p): + path = "./{0}".format(p) + ext = os.path.splitext(p)[1][1:] + return { + "url": path, + "path": path, + "code": 200, + "type": ext + } + + def metaforpaths (paths): + ret = {} + pid = base(paths[0]) + ret['pad'] = ret['padid'] = pid + ret['versions'] = [wrappath(x) for x in paths] + lastedited = None + for p in paths: + mtime = os.stat(p).st_mtime + if lastedited is None or mtime > lastedited: + lastedited = mtime + ret["lastedited_iso"] = datetime.fromtimestamp(lastedited).strftime("%Y-%m-%dT%H:%M:%S") + ret["lastedited_raw"] = lastedited + return ret + + def loadmeta(p): + # Consider a set of grouped files + # Otherwise, create a "dummy" one that wraps all the files as versions + if p.endswith(".meta.json"): + with open(p) as f: + return json.load(f) + # # IF there is a .meta.json, load it & MERGE 
with other files + # if ret: + # # TODO: merge with other files + # for p in paths: + # if "./"+p not in ret['versions']: + # ret['versions'].append(wrappath(p)) + # return ret + # else: + # return metaforpaths(paths) + + def fixdates (padmeta): + d = dateutil.parser.parse(padmeta["lastedited_iso"]) + padmeta["lastedited"] = d + padmeta["lastedited_822"] = d.strftime("%a, %d %b %Y %H:%M:%S +0000") + return padmeta + + pads = map(loadmeta, inputs) + pads = [x for x in pads if x != None] + pads = map(fixdates, pads) + args.pads = list(pads) + + def could_have_base (x, y): + return x == y or (x.startswith(y) and x[len(y):].startswith(".")) + + def get_best_pad (x): + for pb in padbases: + p = pads_by_base[pb] + if could_have_base(x, pb): + return p + + def has_version (padinfo, path): + return [x for x in padinfo['versions'] if 'path' in x and x['path'] == "./"+path] + + if args.files: + inputs = args.input + inputs.sort() + removelist = [] + + pads_by_base = {} + for p in args.pads: + # print ("Trying padid", p['padid'], file=sys.stderr) + padbase = os.path.splitext(p['padid'])[0] + pads_by_base[padbase] = p + padbases = list(pads_by_base.keys()) + # SORT THEM LONGEST FIRST TO ensure that LONGEST MATCHES MATCH + padbases.sort(key=lambda x: len(x), reverse=True) + # print ("PADBASES", file=sys.stderr) + # for pb in padbases: + # print (" ", pb, file=sys.stderr) + print ("pairing input files with pads", file=sys.stderr) + for x in inputs: + # pair input with a pad if possible + xbasename = os.path.basename(x) + p = get_best_pad(xbasename) + if p: + if not has_version(p, x): + print ("Grouping file {0} with pad {1}".format(x, p['padid']), file=sys.stderr) + p['versions'].append(wrappath(x)) + else: + print ("Skipping existing version {0} ({1})...".format(x, p['padid']), file=sys.stderr) + removelist.append(x) + # Removed Matches files + for x in removelist: + inputs.remove(x) + print ("Remaining files:", file=sys.stderr) + for x in inputs: + print (x, file=sys.stderr) + 
print (file=sys.stderr) + # Add "fake" pads for remaining files + for x in inputs: + args.pads.append(metaforpaths([x])) + + if args.timestamp == None: + args.timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + + padurlbase = re.sub(r"api/1.2.9/$", "p/", info["apiurl"]) + # if type(padurlbase) == unicode: + # padurlbase = padurlbase.encode("utf-8") + args.siteurl = args.siteurl or padurlbase + args.utcnow = datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S +0000") + + # order items & apply limit + if args.order == "lastedited": + args.pads.sort(key=lambda x: x.get("lastedited_iso"), reverse=args.reverse) + elif args.order == "pad": + args.pads.sort(key=lambda x: x.get("pad"), reverse=args.reverse) + elif args.order == "padid": + args.pads.sort(key=lambda x: x.get("padid"), reverse=args.reverse) + elif args.order == "revisions": + args.pads.sort(key=lambda x: x.get("revisions"), reverse=args.reverse) + elif args.order == "authors": + args.pads.sort(key=lambda x: len(x.get("authors")), reverse=args.reverse) + elif args.order == "custom": + + # TODO: make this list non-static, but a variable that can be given from the CLI + + customorder = [ + 'nooo.relearn.preamble', + 'nooo.relearn.activating.the.archive', + 'nooo.relearn.call.for.proposals', + 'nooo.relearn.call.for.proposals-proposal-footnote', + 'nooo.relearn.colophon'] + order = [] + for x in customorder: + for pad in args.pads: + if pad["padid"] == x: + order.append(pad) + args.pads = order + else: + raise Exception("That ordering is not implemented!") + + if args.limit: + args.pads = args.pads[:args.limit] + + # add versions_by_type, add in full text + # add link (based on args.link) + linkversions = args.link.split(",") + linkbase = args.linkbase or url_base(args.feedurl) + # print ("linkbase", linkbase, args.linkbase, args.feedurl) + + for p in args.pads: + versions_by_type = {} + p["versions_by_type"] = versions_by_type + for v in p["versions"]: + t = v["type"] + versions_by_type[t] = v + + if "text" 
in versions_by_type: + # try: + with open (versions_by_type["text"]["path"]) as f: + content = f.read() + # print('content:', content) + # [Relearn] Add pandoc command here? + html = pypandoc.convert_text(content, 'html', format='md') + # print('html:', html) + p["text"] = html + # except FileNotFoundError: + # p['text'] = 'ERROR' + + # ADD IN LINK TO PAD AS "link" + for v in linkversions: + if v in versions_by_type: + vdata = versions_by_type[v] + try: + if v == "pad" or os.path.exists(vdata["path"]): + p["link"] = absurl(vdata["url"], linkbase) + break + except KeyError as e: + pass + + if args.output: + with open(args.output, "w") as f: + print (template.render(vars(args)), file=f) + else: + print (template.render(vars(args))) diff --git a/etherdump/commands/pull.py b/etherpump/commands/pull.py similarity index 98% rename from etherdump/commands/pull.py rename to etherpump/commands/pull.py index 8b54460..91ae2bd 100644 --- a/etherdump/commands/pull.py +++ b/etherpump/commands/pull.py @@ -12,9 +12,9 @@ except ImportError: from urllib.parse import urlencode, quote from urllib.request import urlopen, URLError, HTTPError -from etherdump.commands.common import * +from etherpump.commands.common import * from time import sleep -from etherdump.commands.html5tidy import html5tidy +from etherpump.commands.html5tidy import html5tidy import html5lib from xml.etree import ElementTree as ET from fnmatch import fnmatch @@ -47,7 +47,7 @@ def main (args): p.add_argument("padid", nargs="*", default=[]) p.add_argument("--glob", default=False, help="download pads matching a glob pattern") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. 
pads with no revisions are skipped)") p.add_argument("--pub", default="p", help="folder to store files for public pads, default: p") p.add_argument("--group", default="g", help="folder to store files for group pads, default: g") diff --git a/etherdump/commands/revisionscount.py b/etherpump/commands/revisionscount.py similarity index 93% rename from etherdump/commands/revisionscount.py rename to etherpump/commands/revisionscount.py index 6612894..15ec72f 100644 --- a/etherdump/commands/revisionscount.py +++ b/etherpump/commands/revisionscount.py @@ -7,7 +7,7 @@ from urllib2 import urlopen, HTTPError, URLError def main(args): p = ArgumentParser("call getRevisionsCount for the given padid") p.add_argument("padid", help="the padid") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") args = p.parse_args(args) diff --git a/etherdump/commands/sethtml.py b/etherpump/commands/sethtml.py similarity index 97% rename from etherdump/commands/sethtml.py rename to etherpump/commands/sethtml.py index 7b6a0cf..6180d48 100644 --- a/etherdump/commands/sethtml.py +++ b/etherpump/commands/sethtml.py @@ -12,7 +12,7 @@ def main(args): p = ArgumentParser("calls the setHTML API function for the given padid") p.add_argument("padid", help="the padid") p.add_argument("--html", default=None, help="html, default: read from stdin") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") # p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") 
p.add_argument("--create", default=False, action="store_true", help="flag to create pad if necessary") diff --git a/etherdump/commands/settext.py b/etherpump/commands/settext.py similarity index 97% rename from etherdump/commands/settext.py rename to etherpump/commands/settext.py index b96cf1f..97f0555 100644 --- a/etherdump/commands/settext.py +++ b/etherpump/commands/settext.py @@ -20,7 +20,7 @@ def main(args): p = ArgumentParser("calls the getText API function for the given padid") p.add_argument("padid", help="the padid") p.add_argument("--text", default=None, help="text, default: read from stdin") - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: .etherpump/settings.json") p.add_argument("--showurl", default=False, action="store_true") # p.add_argument("--format", default="text", help="output format, can be: text, json; default: text") p.add_argument("--create", default=False, action="store_true", help="flag to create pad if necessary") diff --git a/etherdump/commands/showmeta.py b/etherpump/commands/showmeta.py similarity index 100% rename from etherdump/commands/showmeta.py rename to etherpump/commands/showmeta.py diff --git a/etherdump/commands/status.py b/etherpump/commands/status.py similarity index 98% rename from etherdump/commands/status.py rename to etherpump/commands/status.py index e2961f0..605769a 100644 --- a/etherdump/commands/status.py +++ b/etherpump/commands/status.py @@ -61,7 +61,7 @@ def ignore_p (path, settings=None): def main (args): p = ArgumentParser("Check for pads that have changed since last sync (according to .meta.json)") # p.add_argument("padid", nargs="*", default=[]) - p.add_argument("--padinfo", default=".etherdump/settings.json", help="settings, default: .etherdump/settings.json") + p.add_argument("--padinfo", default=".etherpump/settings.json", help="settings, default: 
.etherpump/settings.json") p.add_argument("--zerorevs", default=False, action="store_true", help="include pads with zero revisions, default: False (i.e. pads with no revisions are skipped)") p.add_argument("--pub", default=".", help="folder to store files for public pads, default: pub") p.add_argument("--group", default="g", help="folder to store files for group pads, default: g") diff --git a/etherdump/data/templates/index.html b/etherpump/data/templates/index.html similarity index 100% rename from etherdump/data/templates/index.html rename to etherpump/data/templates/index.html diff --git a/etherdump/data/templates/pad.html b/etherpump/data/templates/pad.html similarity index 100% rename from etherdump/data/templates/pad.html rename to etherpump/data/templates/pad.html diff --git a/etherdump/data/templates/pad_colors.html b/etherpump/data/templates/pad_colors.html similarity index 88% rename from etherdump/data/templates/pad_colors.html rename to etherpump/data/templates/pad_colors.html index 39cdf25..44b7e36 100644 --- a/etherdump/data/templates/pad_colors.html +++ b/etherpump/data/templates/pad_colors.html @@ -10,7 +10,7 @@ {{ html }} -