Merge branch 'migrate-to-python3' of decentral1se/etherpump into master

decentral1se 2019-09-25 19:00:42 +02:00 committed by Gitea
commit 64b16ba5a3
27 changed files with 128 additions and 186 deletions

.gitignore

@ -5,3 +5,4 @@ venv/
testing/
padinfo.json
.etherpump
*egg-info*


@ -8,10 +8,9 @@ A command-line utility that extends the multi writing and publishing functionali
Many pads, many networks
------------------------
*Etherpump* is a fork of [*etherdump*](https://gitlab.constantvzw.org/aa/etherdump), a command line tool written by [Michael Murtaugh](http://automatist.org/) that converts etherpad pages to files. This fork is made out of curiosities in the tool, a wish to study it and shared sparks of enthusiasm to use it in different situations within Varia.
Etherpump is a stretched version of etherdump. It is a playground in which we would like to add features to the initial tool that diffuse actions of *dumping* into *pumping*. So most of all, etherpump is a work-in-progress, exploring potential uses of etherpads to edit, structure and publish various types of content.
*Etherpump* is a fork of [*etherpump*](https://gitlab.constantvzw.org/aa/etherpump), a command line tool written by [Michael Murtaugh](http://automatist.org/) that converts etherpad pages to files. This fork is made out of curiosities in the tool, a wish to study it and shared sparks of enthusiasm to use it in different situations within Varia.
Etherpump is a stretched version of etherpump. It is a playground in which we would like to add features to the initial tool that diffuse actions of *dumping* into *pumping*. So most of all, etherpump is a work-in-progress, exploring potential uses of etherpads to edit, structure and publish various types of content.
Added features are:
* opt-in publishing with the `__PUBLISH__` magic word
@ -19,27 +18,28 @@ Added features are:
See the [Change log / notes ](#change-log--notes) section for further changes.
Etherdump is a tool that is used from the command line. It dumps all pads of one etherpad installation to a folder, saving them as different text files, such as plain text and HTML. It also creates an index file, that allows one to easily navigate through the list of pads. Etherdump follows a document-driven idea of publishing, which means that it converts pads as database entries into pads as files. This seems to be a redundant act of copying, but is actually an important in-between step that allows for many different publishing projects and experiments.
etherpump is a tool that is used from the command line. It dumps all pads of one etherpad installation to a folder, saving them as different text files, such as plain text and HTML. It also creates an index file, that allows one to easily navigate through the list of pads. etherpump follows a document-driven idea of publishing, which means that it converts pads as database entries into pads as files. This seems to be a redundant act of copying, but is actually an important in-between step that allows for many different publishing projects and experiments.
We started to get to know etherdump through various editions of Relearn and/or the worksessions organized by Constant. Collaborative writing on an etherpad has been an important ingredient for these situations. The habit of using pads branched into the day-to-day practice of Varia, where we use etherpads for all sorts of things, ranging from organising remote-meetings with 10+ people, to writing and designing PDF documents collaboratively.
After installing etherdump on the Varia server, we collectively decided to not want to publish pads by default. Discussions in the group around the use of etherpads, privacy and ideas of what publishing means, led to a need to have etherdump only start the indexing work after it recognizes a `__PUBLISH__` marker on a pad. We decided to work on a `__PUBLISH__ vs. __NOPUBLISH__` branch of etherdump, which we now fork into **etherpump**.
We started to get to know etherpump through various editions of Relearn and/or the worksessions organized by Constant. Collaborative writing on an etherpad has been an important ingredient for these situations. The habit of using pads branched into the day-to-day practice of Varia, where we use etherpads for all sorts of things, ranging from organising remote-meetings with 10+ people, to writing and designing PDF documents collaboratively.
After installing etherpump on the Varia server, we collectively decided to not want to publish pads by default. Discussions in the group around the use of etherpads, privacy and ideas of what publishing means, led to a need to have etherpump only start the indexing work after it recognizes a `__PUBLISH__` marker on a pad. We decided to work on a `__PUBLISH__ vs. __NOPUBLISH__` branch of etherpump, which we now fork into **etherpump**.
Change log / notes
==================
**September 2019**
Forking *etherdump* into *etherpump*. (Work in progress!)
Forking *etherpump* into *etherpump*. (Work in progress!)
<https://git.vvvvvvaria.org/varia/etherpump>
Migrating the source code to Python 3.
-----
**May - September 2019**
Etherdump is used to produce the *Ruminating Relearn* section of the Network Of One's Own 2 (NOOO2) publication.
etherpump is used to produce the *Ruminating Relearn* section of the Network Of One's Own 2 (NOOO2) publication.
A new command is added to make a web publication, based on the custom magic word `__RELEARN__`.
@ -47,7 +47,7 @@ A new command is added to make a web publication, based on the custom magic word
**June 2019**
Multiple conversations around etherdump emerged during Relearn Curved in Varia, Rotterdam.
Multiple conversations around etherpump emerged during Relearn Curved in Varia, Rotterdam.
Including the idea of executable pads (*etherhooks*), custom magic words, a federated snippet protocol (*etherstekje*) and more.
@ -57,94 +57,67 @@ Including the idea of executable pads (*etherhooks*), custom magic words, a fede
**April 2019**
Installation of etherdump on the Varia server.
Installation of etherpump on the Varia server.
<https://etherdump.vvvvvvaria.org/>
<https://etherpump.vvvvvvaria.org/>
-----
**March 2019**
The `__PUBLISH__ vs. __NOPUBLISH__` was added to the etherdump repository by *decentral1se*.
The `__PUBLISH__ vs. __NOPUBLISH__` was added to the etherpump repository by *decentral1se*.
<https://gitlab.constantvzw.org/aa/etherdump/issues/3>
<https://gitlab.constantvzw.org/aa/etherpump/issues/3>
-----
Originally designed for use at: [Constant](http://etherdump.constantvzw.org/).
Originally designed for use at: [Constant](http://etherpump.constantvzw.org/).
More notes can be found in the [git repository of etherdump](https://gitlab.constantvzw.org/aa/etherdump).
More notes can be found in the [git repository of etherpump](https://gitlab.constantvzw.org/aa/etherpump).
Install etherpump
=================
Requirements
-------------
`$ pip install etherpump`
* python3
* html5lib
* requests (settext)
* python-dateutil, jinja2 (used by the index subcommand)
Installation
-------------
`$ pip install python-dateutil jinja2 html5lib`
`$ python setup.py install`
Etherpump only supports Python 3.
Example
---------------
-------
```
$ mkdir mydump
$ cd mydump
$ etherdump init
$ etherpump init
```
The program then interactively asks some questions:
```
Please type the URL of the etherpad:
https://pad.vvvvvvaria.org/
```
> Please type the URL of the etherpad:
> https://pad.vvvvvvaria.org/
The APIKEY is the contents of the file APIKEY.txt in the etherpad folder.
```
Please paste the APIKEY:
> Please paste the APIKEY:
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
The settings are placed in a file called `.etherdump/settings.json` and are used (by default) by future commands.
The settings are placed in a file called `.etherpump/settings.json` and are used (by default) by future commands.
Subcommands
----------
* init
* pull
* list
* listauthors
* gettext
* settext
* gethtml
* creatediffhtml
* revisionscount
* index
* deletepad
* publication (*etherpump*)
To see all available subcommands, run:
To get help on a subcommand:
`$ etherpump --help`
`$ etherdump revisionscount --help`
For help on each individual subcommand, run:
`$ etherpump revisionscount --help`
License
=======
GNU AFFERO GENERAL PUBLIC LICENSE, Version 3
GNU AFFERO GENERAL PUBLIC LICENSE, Version 3.
See `License.txt`
See [LICENSE.txt](./LICENSE.txt).


@ -1,10 +0,0 @@
Metadata-Version: 1.0
Name: etherpump
Version: 0.0.1
Summary: Etherpump an etherpad publishing system
Home-page: https://git.vvvvvvaria.org/varia/etherpump
Author: Varia members
Author-email: info@varia.zone
License: LICENSE.txt
Description: UNKNOWN
Platform: UNKNOWN


@ -1,35 +0,0 @@
README.md
setup.py
bin/etherpump
etherpump/__init__.py
etherpump.egg-info/PKG-INFO
etherpump.egg-info/SOURCES.txt
etherpump.egg-info/dependency_links.txt
etherpump.egg-info/requires.txt
etherpump.egg-info/top_level.txt
etherpump/commands/__init__.py
etherpump/commands/appendmeta.py
etherpump/commands/common.py
etherpump/commands/creatediffhtml.py
etherpump/commands/deletepad.py
etherpump/commands/dumpcsv.py
etherpump/commands/gethtml.py
etherpump/commands/gettext.py
etherpump/commands/html5tidy.py
etherpump/commands/index.py
etherpump/commands/init.py
etherpump/commands/join.py
etherpump/commands/list.py
etherpump/commands/listauthors.py
etherpump/commands/publication.py
etherpump/commands/pull.py
etherpump/commands/revisionscount.py
etherpump/commands/sethtml.py
etherpump/commands/settext.py
etherpump/commands/showmeta.py
etherpump/commands/status.py
etherpump/data/templates/index.html
etherpump/data/templates/pad.html
etherpump/data/templates/pad_colors.html
etherpump/data/templates/pad_index.html
etherpump/data/templates/rss.xml


@ -1 +0,0 @@


@ -1,2 +0,0 @@
html5lib
jinja2


@ -1 +0,0 @@
etherpump


@ -1,6 +1,6 @@
#!/usr/bin/env python
from __future__ import print_function
from argparse import ArgumentParser
import json, os


@ -1,15 +1,16 @@
from __future__ import print_function
import re, os, json, sys
from math import ceil, floor
from time import sleep
try:
# python2
from urlparse import urlparse, urlunparse
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib import quote_plus, unquote_plus
from htmlentitydefs import name2codepoint
from urllib.parse import urlparse, urlunparse
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
from urllib.parse import quote_plus, unquote_plus
from html.entities import name2codepoint
input = raw_input
except ImportError:
@ -24,12 +25,12 @@ def splitpadname (padid):
if m:
return(m.group(1), padid[m.end():])
else:
return (u"", padid)
return ("", padid)
def padurl (padid, ):
return padid
def padpath (padid, pub_path=u"", group_path=u"", normalize=False):
def padpath (padid, pub_path="", group_path="", normalize=False):
g, p = splitpadname(padid)
# if type(g) == unicode:
# g = g.encode("utf-8")
@ -48,7 +49,7 @@ def padpath (padid, pub_path=u"", group_path=u"", normalize=False):
return os.path.join(pub_path, p)
def padpath2id (path):
if type(path) == unicode:
if type(path) == str:
path = path.encode("utf-8")
dd, p = os.path.split(path)
gname = dd.split("/")[-1]
@ -95,7 +96,7 @@ def progressbar (i, num, label="", file=sys.stderr):
percentage = int(floor(p*100))
bars = int(ceil(p*20))
bar = ("*"*bars) + ("-"*(20-bars))
msg = u"\r{0} {1}/{2} {3}... ".format(bar, (i+1), num, label)
msg = "\r{0} {1}/{2} {3}... ".format(bar, (i+1), num, label)
sys.stderr.write(msg)
sys.stderr.flush()
@ -114,15 +115,15 @@ def unescape(text):
# character reference
try:
if text[:3] == "&#x":
return unichr(int(text[3:-1], 16))
return chr(int(text[3:-1], 16))
else:
return unichr(int(text[2:-1]))
return chr(int(text[2:-1]))
except ValueError:
pass
else:
# named entity
try:
text = unichr(name2codepoint[text[1:-1]])
text = chr(name2codepoint[text[1:-1]])
except KeyError:
pass
return text # leave as is
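
The `unichr` to `chr` change above can be checked in isolation. The following is a minimal sketch of a Python 3 `unescape`, assuming the function wraps this branch logic in a `re.sub` callback; the regular expression and the inner `fixup` name are assumptions, since the hunk only shows the body.

```python
import re
from html.entities import name2codepoint


def unescape(text):
    """Replace HTML character references and named entities in text."""

    def fixup(match):
        ref = match.group(0)
        if ref[:2] == "&#":
            # numeric character reference, hexadecimal or decimal
            try:
                if ref[:3] == "&#x":
                    return chr(int(ref[3:-1], 16))
                return chr(int(ref[2:-1]))
            except ValueError:
                pass
        else:
            # named entity such as &amp;
            try:
                return chr(name2codepoint[ref[1:-1]])
            except KeyError:
                pass
        return ref  # leave unknown references as they are

    # assumed pattern: an ampersand, optional '#', word characters, a semicolon
    return re.sub(r"&#?\w+;", fixup, text)


print(unescape("Caf&eacute; &amp; &#x41;&#66; &nosuch;"))
```

On Python 3.4 and later the standard library also offers `html.unescape`, which may be a simpler option if the exact behaviour of this helper is not required.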


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def main(args):


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def main(args):


@ -1,9 +1,10 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re
from datetime import datetime
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from csv import writer
from math import ceil, floor
@ -52,7 +53,7 @@ def main (args):
percentage = int(floor(p*100))
bars = int(ceil(p*20))
bar = ("*"*bars) + ("-"*(20-bars))
msg = u"\r{0} {1}/{2} {3}... ".format(bar, (i+1), numpads, padid)
msg = "\r{0} {1}/{2} {3}... ".format(bar, (i+1), numpads, padid)
if len(msg) > maxmsglen:
maxmsglen = len(msg)
sys.stderr.write("\r{0}".format(" "*maxmsglen))
@ -63,7 +64,7 @@ def main (args):
groupname = m.group(1)
padidnogroup = padid[m.end():]
else:
groupname = u""
groupname = ""
padidnogroup = padid
data['padID'] = padid.encode("utf-8")
@ -75,7 +76,7 @@ def main (args):
lastedited_raw = jsonload(apiurl+'getLastEdited?'+urlencode(data))['data']['lastEdited']
lastedited_iso = datetime.fromtimestamp(int(lastedited_raw)/1000).isoformat()
author_ids = jsonload(apiurl+'listAuthorsOfPad?'+urlencode(data))['data']['authorIDs']
author_ids = u" ".join(author_ids).encode("utf-8")
author_ids = " ".join(author_ids).encode("utf-8")
out.writerow((padidnogroup.encode("utf-8"), groupname.encode("utf-8"), revisions, lastedited_iso, author_ids))
count += 1
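
One point that may deserve a follow-up: the row values above are still passed through `.encode("utf-8")`. In Python 3, `csv.writer` works on text and converts any non-string value with `str()`, so a `bytes` value would be written as its `b'...'` representation. A minimal sketch of writing the same row with plain `str` values follows; `write_pad_row` and the sample values are hypothetical and not part of this commit.

```python
import csv
import io


def write_pad_row(out, padid, groupname, revisions, lastedited_iso, author_ids):
    # author_ids is a list of str; join it without encoding to bytes
    out.writerow((padid, groupname, revisions, lastedited_iso, " ".join(author_ids)))


# hypothetical sample data, written to an in-memory buffer instead of stdout
buffer = io.StringIO()
write_pad_row(csv.writer(buffer), "café", "", 12, "2019-09-25T19:00:42", ["a.1", "a.2"])
print(buffer.getvalue())
```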


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def main(args):


@ -1,10 +1,11 @@
from __future__ import print_function
from argparse import ArgumentParser
import json, sys
try:
# python2
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
except ImportError:
# python3
from urllib.parse import urlencode


@ -1,6 +1,6 @@
#!/usr/bin/env python3
from __future__ import print_function
from html5lib import parse
import os, sys
from argparse import ArgumentParser


@ -1,4 +1,4 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re, os, time
from datetime import datetime
@ -6,9 +6,10 @@ import dateutil.parser
try:
# python2
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urlparse import urlparse, urlunparse
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
from urllib.parse import urlparse, urlunparse
except ImportError:
# python3
from urllib.parse import urlparse, urlunparse, urlencode, quote
@ -182,9 +183,9 @@ def main (args):
padmeta["lastedited_822"] = d.strftime("%a, %d %b %Y %H:%M:%S +0000")
return padmeta
pads = map(loadmeta, inputs)
pads = list(map(loadmeta, inputs))
pads = [x for x in pads if x != None]
pads = map(fixdates, pads)
pads = list(map(fixdates, pads))
args.pads = list(pads)
def could_have_base (x, y):
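
The `list(map(...))` wrapping in this hunk matters because `map` returns a one-shot iterator in Python 3 rather than a list. A minimal sketch of the difference, assuming a stand-in `loadmeta` that returns `None` for unusable inputs (the function body and the file names are hypothetical):

```python
def loadmeta(path):
    # hypothetical stand-in: return None for inputs that cannot be loaded
    if not path.endswith(".meta.json"):
        return None
    return {"padid": path[: -len(".meta.json")]}


inputs = ["a.meta.json", "notes.txt", "b.meta.json"]

# In Python 3, map() is lazy and can only be consumed once.
lazy = map(loadmeta, inputs)
first_pass = [x for x in lazy if x is not None]
second_pass = [x for x in lazy if x is not None]  # iterator is already exhausted

# Materialising the result keeps it reusable, as the hunk above does.
pads = [meta for meta in map(loadmeta, inputs) if meta is not None]

print(len(first_pass), len(second_pass), pads)
```

A single list comprehension, as in the last statement, would be one way to combine the mapping and the `None` filter in one pass.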


@ -1,11 +1,12 @@
from __future__ import print_function
from argparse import ArgumentParser
try:
# python2
from urlparse import urlparse, urlunparse
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib.parse import urlparse, urlunparse
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
input = raw_input
except ImportError:
# python3
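
A caveat on this hunk, assuming `input = raw_input` is still inside the `try` block after the change: on Python 3 the name `raw_input` does not exist, so that line raises `NameError`, which `except ImportError` does not catch. Since the README now states that only Python 3 is supported, the alias could simply be dropped. If a fallback is still wanted, a minimal sketch looks like this (`resolve_input` and `read_line` are hypothetical names):

```python
import builtins


def resolve_input():
    """Return raw_input where it exists (Python 2), else the built-in input."""
    try:
        return raw_input  # noqa: F821 -- only defined on Python 2
    except NameError:
        # Python 3: raw_input is gone and input() already returns a str
        return builtins.input


read_line = resolve_input()
print(read_line is builtins.input)
```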


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json, os, re
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def group (items, key=lambda x: x):
ret = []


@ -1,13 +1,14 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
import sys
from etherpump.commands.common import getjson
try:
# python2
from urlparse import urlparse, urlunparse
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib.parse import urlparse, urlunparse
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
input = raw_input
except ImportError:
# python3


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def main(args):


@ -1,4 +1,4 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re, os, time
from datetime import datetime
@ -7,9 +7,10 @@ import pypandoc
try:
# python2
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urlparse import urlparse, urlunparse
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
from urllib.parse import urlparse, urlunparse
except ImportError:
# python3
from urllib.parse import urlparse, urlunparse, urlencode, quote
@ -183,9 +184,9 @@ def main (args):
padmeta["lastedited_822"] = d.strftime("%a, %d %b %Y %H:%M:%S +0000")
return padmeta
pads = map(loadmeta, inputs)
pads = list(map(loadmeta, inputs))
pads = [x for x in pads if x != None]
pads = map(fixdates, pads)
pads = list(map(fixdates, pads))
args.pads = list(pads)
def could_have_base (x, y):


@ -1,12 +1,13 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re, os
from datetime import datetime
try:
# python2
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
except ImportError:
# python3
from urllib.parse import urlencode, quote


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
def main(args):
p = ArgumentParser("call getRevisionsCount for the given padid")
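
For reference, the relocated `urlencode` import behaves the same from `urllib.parse` as it did from `urllib` in Python 2. Below is a minimal sketch of how a `getRevisionsCount` request URL could be assembled with it. The helper name, the example API base URL and the `apikey` parameter are assumptions modelled on the `apiurl+'...?'+urlencode(data)` pattern used elsewhere in this commit.

```python
from urllib.parse import urlencode


def revisions_count_url(apiurl, apikey, padid):
    # hypothetical helper: apiurl is assumed to end with a slash
    data = {"apikey": apikey, "padID": padid}
    return apiurl + "getRevisionsCount?" + urlencode(data)


# placeholder host and key, for illustration only
url = revisions_count_url("https://pad.example.org/api/1.2.9/", "secret", "my pad")
print(url)
```

`urlencode` quotes each value with `quote_plus`, so the space in the pad name becomes `+` and no manual escaping is needed.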


@ -1,8 +1,9 @@
from __future__ import print_function
from argparse import ArgumentParser
import json, sys
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
import requests


@ -1,11 +1,12 @@
from __future__ import print_function
from argparse import ArgumentParser
import json, sys
try:
# python2
from urllib2 import urlopen, URLError, HTTPError
from urllib import urlencode
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlencode
except ImportError:
# python3
from urllib.parse import urlencode, quote


@ -1,7 +1,7 @@
from __future__ import print_function
from argparse import ArgumentParser
import json, sys, re
from common import *
from .common import *
"""
Extract and output selected fields of metadata


@ -1,11 +1,12 @@
from __future__ import print_function
from argparse import ArgumentParser
import sys, json, re, os
from datetime import datetime
from urllib import urlencode
from urllib2 import urlopen, HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from math import ceil, floor
from common import *
from .common import *
"""
status (meta):
@ -95,7 +96,7 @@ def main (args):
pad = PadItem(path=p)
padsbypath[pad.path] = pad
pads = padsbypath.values()
pads = list(padsbypath.values())
pads.sort(key=lambda x: (x.status, x.padid))
curstat = None
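
The `list(padsbypath.values())` change is needed because `dict.values()` returns a view object in Python 3, and views have no `sort` method. A minimal sketch of the behaviour, using a hypothetical `PadItem` namedtuple in place of the real class:

```python
from collections import namedtuple

# hypothetical stand-in for the PadItem class used in this module
PadItem = namedtuple("PadItem", ["path", "padid", "status"])

padsbypath = {
    "b.txt": PadItem("b.txt", "b", "local"),
    "a.txt": PadItem("a.txt", "a", "remote"),
    "c.txt": PadItem("c.txt", "c", "local"),
}

view = padsbypath.values()
print(hasattr(view, "sort"))  # a dict view cannot be sorted in place

# Either copy into a list and sort it, or build a sorted list directly.
pads = sorted(padsbypath.values(), key=lambda x: (x.status, x.padid))
print([p.padid for p in pads])
```

`sorted(...)` is an alternative to `list(...)` followed by `.sort(...)`; both produce a new list and leave the dictionary untouched.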