
adding RECbot in the mix

master
manetta 3 years ago
parent
commit
c8e34955e1
  1. 87
      RECbot/README.md
  2. 310
      RECbot/RECbot.py
  3. 16
      RECbot/generate_handles.py
  4. 47
      RECbot/templates/index.html
  5. 68
      RECbot/templates/stylesheet.css

87
RECbot/README.md

@ -0,0 +1,87 @@
# RECbot
A small XMPP bot written in Python that logs XMPP conversations into an HTML page, allowing collaborative log writing over time.
The bot is used in group chats, where it logs all images that are sent to the group and all messages that are marked with an action word such as `__ADD__`.
*work-in-progress*
## Situated tails
* Archive bot, Relearn 2017, <https://gitlab.com/relearn/relearn2017/-/tree/master/xmpp-bots/archive-bot>
* Streambot, Varia website extension 2017-2018, <https://git.vvvvvvaria.org/varia/xmpp.streambot>
* Logbot, Varia XMPP extension 2017-2020, <https://git.vvvvvvaria.org/varia/bots/src/branch/master/logbot>
## Use RECbot
* check if `RECbot` is one of the participants in the groupchat!
* send an image to the groupchat **OR** use one of the `__ACTION WORDS__` below
* the bot replies and thanks you kindly
* check the output of RECbot (locally or online, for example: <https://vvvvvvaria.org/logs>)
RECbot works with `__ACTION WORDS__` and unique `:HANDLES`.
* `__ADD__` RECbot entries with `__ADD__ <message>`, for example: `__ADD__ Logging as a form of stretching time.` or `__ADD__ https://nicelink.org`
* `__DELETE__` RECbot entries with `__DELETE__ :HANDLE`, for example: `__DELETE__ :~+*/+-` (\*spark)
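* `__ANNOTATE__` RECbot entries with `__ANNOTATE__ <handle> <tag>`, where `<handle>` is the handle shown next to the entry in the log
* `__BOOK__` (\*sparks)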
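The action words could be recognised with a small parser. The following is a minimal sketch, not the way RECbot currently handles messages: `parse_action`, `ACTION_WORDS` and `ACTION_PATTERN` are hypothetical names, and the sketch assumes that everything after the first action word is its argument.

```python
import re

# Hypothetical list of the action words that appear in this commit.
ACTION_WORDS = ("ADD", "DELETE", "ANNOTATE", "BOOK")

# Matches e.g. "__ADD__ some text" anywhere in the message body.
ACTION_PATTERN = re.compile(r"__(%s)__\s*(.*)" % "|".join(ACTION_WORDS), re.DOTALL)


def parse_action(body):
    """Return (action, argument) for the first action word in body, or None."""
    match = ACTION_PATTERN.search(body)
    if match is None:
        return None
    action, argument = match.groups()
    return action, argument.strip()


if __name__ == "__main__":
    print(parse_action("__ADD__ Logging as a form of stretching time."))
    print(parse_action("__DELETE__ :~+*/+-"))
    print(parse_action("just chatting"))
```

A message without an action word returns `None`, so the bot could simply ignore it.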
## Install RECbot
RECbot uses the `slixmpp` library to connect to XMPP and `beautifulsoup` to parse the HTML pages.
`$ sudo pip3 install slixmpp beautifulsoup4`
## Run RECbot!
`$ python3 RECbot.py`
The bot will ask you to provide the following details:
* XMPP address of a (bot)account
* password
* groupchat address
* nickname for the bot
* output folder path
You can also run it as a one-liner, for example by writing:
`$ python3 RECbot.py -u bot@vvvvvvaria.org -p CHANGEME -g roomname@muc.vvvvvvaria.org -n RECbot -o /var/www/logs/`
* `-u` / `--use` = user / use this XMPP address
* `-p` / `--password` = password
* `-g` / `--groupchat` = groupchat
* `-n` / `--nick` = nickname
* `-o` / `--output` = output
## \*sparks
-----------
It would be so nice to have different RECbot *modes*: `--log`, `--stream`, `--distribusi`
* `--log`: RECbot writes a growing HTML page with images and text, that can be marked up and styled in HTML/CSS.
* `--stream`: RECbot stores all images that are sent to the group, and displays them as an image stream.
* `--distribusi`: RECbot saves files (images, messages as markdown, files, links as HTML pages) and generates a distribusi page of all collected material.
Under the hood the process can be cut up into two procedures:
* saving text/image/audio/video based messages as files (.txt, .png/.jpg, .ogg, .ogv/.mp4)
* recbot.py
* generating different outputs, depending on the selected *mode*
* distribusi.py[\*]
* log.py[\*]
* stream.py[\*]
These modes can be changed at any moment.
[\*] These are standalone scripts. They can be used on any set of files in a folder and generate HTML pages with customizable styling.
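As an illustration of the second procedure, here is a minimal sketch of what a standalone `stream.py` could look like. None of this exists in the commit: the function name `build_stream_page`, the list of image extensions and the HTML layout are assumptions made for the example.

```python
import html
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the bot saves.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def build_stream_page(folder):
    """Write index.html in folder, showing every image in it, and return its path."""
    folder = Path(folder)
    images = sorted(
        p.name for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # Escape file names so that unusual characters cannot break the markup.
    tags = "\n".join(
        f'<div class="entry image"><img src="{html.escape(name, quote=True)}"></div>'
        for name in images
    )
    page = (
        "<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n"
        "<title>Stream</title>\n</head>\n<body>\n"
        f"{tags}\n</body>\n</html>\n"
    )
    output_path = folder / "index.html"
    output_path.write_text(page, encoding="utf-8")
    return output_path


if __name__ == "__main__":
    import sys
    print(build_stream_page(sys.argv[1] if len(sys.argv) > 1 else "."))
```

Because the script only reads a folder, it could be run on any set of files, independent of the bot itself.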
------------
How can `__ACTION WORDS__` become `__MAGIC WORDS__` ???
------------

310
RECbot/RECbot.py

@ -0,0 +1,310 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# To run this bot:
# $ python3 RECbot.py
# The output folder of this bot currently is: /var/www/logs/digital-autonomy

import logging
import os
import random
import re
import urllib.request  # urlopen() lives in the urllib.request submodule
from argparse import ArgumentParser
from datetime import datetime
from getpass import getpass

import requests
import slixmpp
from bs4 import BeautifulSoup

def check_handle(handle, used_handles):
    # Return True when the handle has already been given out
    return handle in used_handles

def request_handle(used_handles_path):
    # Read both lists without trailing newlines, so that handles can be compared
    with open(used_handles_path, 'r') as f:
        used_handles = f.read().splitlines()
    with open('handles.txt', 'r') as f:
        handles = f.read().splitlines()
    handle = random.choice(handles)
    # check if handle is not used yet!
    while check_handle(handle, used_handles):
        handle = random.choice(handles)
    # add handle to .handles.txt, one handle per line
    with open(used_handles_path, 'a+') as h:
        h.write(handle + '\n')
    return handle


def write_to_log(self, entry):
    output = self.output
    # print(f'Output: { output }')
    log = 'index.html'
    css = 'stylesheet.css'
    used_handles = '.handles.txt'
    log_path = os.path.join(output, log)
    css_path = os.path.join(output, css)
    used_handles_path = os.path.join(output, used_handles)

    # check if file exists, if not: write it!
    if not os.path.isfile(log_path):
        # the HTML template in this repository is templates/index.html
        with open('templates/index.html', 'r') as t:
            html_template = t.read()
        with open('templates/stylesheet.css', 'r') as t:
            css_template = t.read()
        with open(log_path, 'w') as l:
            l.write(html_template)
            l.write(f'<h1>{ self.groupchat }</h1>')
        with open(css_path, 'w') as c:
            c.write(css_template)
        # the first line is a separator; as a side effect
        # the handle '-----' is never given out
        with open(used_handles_path, 'w') as h:
            h.write('-----\n')

    # add entry to log
    handle = request_handle(used_handles_path)
    print(f'Picked a handle: { handle }')
    now = datetime.now().strftime('%A %d %B (%Y)')
    print(f'Now is: { now }')
    post = f'''<div id="{ handle }" class="post">
<small class="postid">{ handle }</small>
{ entry }
<small class="date">Added on { now }</small>
<small class="tags">Tags:<span class="tagcontainer"></span></small>
</div>'''
    print(f'Post: { post }')
    with open(log_path, 'a+') as l:
        l.write(post)
    print('added to the log!')
    # request_handle() has already stored the handle in the .handles file

def find_in_soup(self, handle, tag):
    print('--------ADD TAG ---------')
    print(f'handle: { handle }')
    log = 'index.html'
    log_path = os.path.join(self.output, log)
    with open(log_path, 'r') as l:
        html = l.read()
    soup = BeautifulSoup(html, 'html.parser')
    # print(soup.prettify())
    post = soup.find(id=handle)
    print(f'post: { post }')
    if post:
        # wrap the tag in a <span> and append it to the tagcontainer of the post
        new_tag = soup.new_tag("span")
        new_tag.append(tag)
        post.find(class_="tagcontainer").append(new_tag)
        print(f'new soup: { str(soup) } ')
        # write soup to file
        with open(log_path, 'w') as l:
            l.write(str(soup))

class MUCBot(slixmpp.ClientXMPP):
    """
    A simple Slixmpp bot that will save images
    and messages that are marked with an action word
    (such as __ADD__) to a folder.
    """

    def __init__(self, use, password, groupchat, nickname, output):
        slixmpp.ClientXMPP.__init__(self, use, password)

        self.groupchat = groupchat
        self.nick = nickname
        self.output = output

        # The session_start event will be triggered when
        # the bot establishes its connection with the server
        # and the XML streams are ready for use. We want to
        # listen for this event so that we can initialize
        # our roster.
        self.add_event_handler("session_start", self.start)

        # The groupchat_message event is triggered whenever a message
        # stanza is received from any chat room. If you also
        # register a handler for the 'message' event, MUC messages
        # will be processed by both handlers.
        self.add_event_handler("groupchat_message", self.muc_message)

    def start(self, event):
        self.get_roster()
        self.send_presence()

        # https://xmpp.org/extensions/xep-0045.html
        self.plugin['xep_0045'].join_muc(self.groupchat,
                                         self.nick,
                                         # If a room password is needed, use:
                                         # password=the_room_password,
                                         wait=True)

        # NOTE(luke): disabled for now. We'll make it possible to speak to logbot privately later
        # Send a message to the room
        # self.send_message(mto=self.groupchat, mbody='Hello! RECbot here. I\'m new :). You can log text/image/sound/video messages, by including @bot in your message. Happy logging! PS. you can access the logs at https://vvvvvvaria.org/logs/', mtype='groupchat')

    def muc_message(self, msg):
        # Some inspection commands
        # print('Message: {}'.format(msg))

        # Always check that a message is not the bot itself, otherwise you will
        # create an infinite loop responding to your own messages.
        if msg['mucnick'] != self.nick:

            # Check if output folder exists
            if not os.path.exists(self.output):
                os.mkdir(self.output)

            # Check if an OOB URL is included in the stanza (which is how an image is sent)
            # (OOB object - https://xmpp.org/extensions/xep-0066.html#x-oob)
            if len(msg['oob']['url']) > 0:

                # Send a reply
                self.send_message(mto=self.groupchat,
                                  mbody="Super, our log is growing. Your image is added!",
                                  mtype='groupchat')

                # Save the image to the output folder
                url = msg['oob']['url']  # get the url in the message
                filename = os.path.basename(url)  # get the filename in the url
                output_path = os.path.join(self.output, filename)
                with urllib.request.urlopen(url) as u:  # read the image data
                    with open(output_path, 'wb') as f:  # open the output file
                        f.write(u.read())  # write image to file

                # Add the image to the log
                img = f'<div class="entry image"><img src="{ filename }"></div>'
                write_to_log(self, img)

            # Include a new post in the log (only when '__ADD__' is used in the message)
            if '__ADD__' in msg['body']:

                # reply from the bot
                self.send_message(mto=self.groupchat,
                                  mbody=f'Noted! And added to the log. Thanks { msg["mucnick"] }!',
                                  mtype='groupchat')

                # Add the message to the log!
                message = msg['body'].replace('__ADD__', '')
                message = f'<div class="entry text">{ message }</div>'
                write_to_log(self, message)

            # Add a tag to an existing post (only when '__ANNOTATE__' is used in the message)
            if '__ANNOTATE__' in msg['body']:
                handle = msg['body'].split()[1]
                annotation = msg['body'].replace('__ANNOTATE__', '').replace(handle, '').strip()
                find_in_soup(self, handle, annotation)

                # reply from the bot
                self.send_message(mto=self.groupchat,
                                  mbody="Thanks!",
                                  mtype='groupchat')

            # Check if this is a book ...
            if '__BOOK__' in msg['body']:
                self.send_message(mto=self.groupchat,
                                  mbody="Oh a book, that's cool! Thanks {}!".format(msg['mucnick']),
                                  mtype='groupchat')

                # Start of book feature
                book = msg['body'].replace('__BOOK__', '')  # remove the action word
                book = re.sub(' +', ' ', book)  # remove double spaces
                book = book.strip()  # remove spaces at the beginning and at the end
                book = book.replace(' ', '+').lower()  # turn space into + and lowercase

                page_link = 'https://www.worldcat.org/search?q={}&qt=results_page'.format(book)
                page_response = requests.get(page_link, timeout=5)
                page_content = BeautifulSoup(page_response.content, "html.parser")

                try:
                    book_title = page_content.findAll("div", {"class": "name"})[0].text
                    book_author = page_content.findAll("div", {"class": "author"})[0].text
                    book_publisher = page_content.findAll("div", {"class": "publisher"})[0].text
                    book_found = True
                except IndexError:
                    book_found = False

                if book_found:
                    book_info = book_title + ' ' + book_author + ' ' + book_publisher

                    # Add message to log
                    message = f'<div class="entry book"><b>BOOK</b>: { book_info }</div>'
                    write_to_log(self, message)
                    self.send_message(mto=self.groupchat,
                                      mbody='Hope this was the book you were looking for: ' + book_info,
                                      mtype='groupchat')
                else:
                    self.send_message(mto=self.groupchat,
                                      mbody='Sorry, no book found!',
                                      mtype='groupchat')

if __name__ == '__main__':
    # Setup the command line arguments.
    parser = ArgumentParser()

    # output verbosity options.
    parser.add_argument("-q", "--quiet", help="set logging to ERROR",
                        action="store_const", dest="loglevel",
                        const=logging.ERROR, default=logging.INFO)
    parser.add_argument("-d", "--debug", help="set logging to DEBUG",
                        action="store_const", dest="loglevel",
                        const=logging.DEBUG, default=logging.INFO)

    # Different options.
    parser.add_argument("-u", "--use", dest="use",
                        help="XMPP address to use")
    parser.add_argument("-p", "--password", dest="password",
                        help="password to use")
    parser.add_argument("-g", "--groupchat", dest="groupchat",
                        help="groupchat to join")
    parser.add_argument("-n", "--nick", dest="nickname",
                        help="nickname for the bot")
    parser.add_argument("-o", "--output", dest="output",
                        help="output folder, this is where the files are stored",
                        type=str)

    args = parser.parse_args()

    # Setup logging.
    logging.basicConfig(level=args.loglevel,
                        format='%(levelname)-8s %(message)s')

    # Ask for every detail that was not given on the command line.
    if args.use is None:
        args.use = input("Use this XMPP address for the bot: ")
    if args.password is None:
        args.password = getpass("Password: ")
    if args.groupchat is None:
        args.groupchat = input("Groupchat XMPP address: ")
    if args.nickname is None:
        args.nickname = input("Nickname for the bot: ")
    if args.output is None:
        args.output = input("Output folder path of the log: ")

    # Setup the MUCBot and register plugins. Note that while plugins may
    # have interdependencies, the order in which you register them does
    # not matter.
    xmpp = MUCBot(args.use, args.password, args.groupchat, args.nickname, args.output)
    xmpp.register_plugin('xep_0030')  # Service Discovery
    xmpp.register_plugin('xep_0045')  # Multi-User Chat
    xmpp.register_plugin('xep_0199')  # XMPP Ping
    xmpp.register_plugin('xep_0066')  # Out of Band Data (URLs of files, images)

    # Connect to the XMPP server and start processing XMPP stanzas.
    xmpp.connect()
    xmpp.process()

16
RECbot/generate_handles.py

@ -0,0 +1,16 @@
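import random

# '-' is listed three times, so it is picked more often than the other characters
characters = ['*', '+', '-', '/', '-', '-']

# generate handles
# (4 distinct characters in 5 positions give 4**5 = 1024 possible handles,
# so 1000 unique handles is close to the maximum)
handles = set()
while len(handles) < 1000:
    handle = ''
    for h in range(5):
        handle += random.choice(characters)
    handles.add(handle)

# write handles to file, one handle per line
with open('handles.txt', 'w') as out:
    for handle in handles:
        out.write(handle + '\n')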

47
RECbot/templates/index.html

@ -0,0 +1,47 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Log</title>
<link rel="stylesheet" type="text/css" href="stylesheet.css">
</head>
<body>
<div id="welcome">
<p>Welcome to this Log!</p>
<p>This Log file is written through <em>RECbot</em> and chat messages exchanged in an <em>XMPP groupchat</em>.</p>
<hr>
<p>For the writers of this log, you can:
<br>
<br>
send an image,
<br>
<br>
<code>__ADD__</code> a message,
<br>
<br>
<code>__DELETE__</code> it by using the <code>~HANDLE</code> on the left (*spark),
<br>
<br>
<code>__ANNOTATE__</code> something using the <code>~HANDLE</code>,
<br>
<br>
<!-- <code>__ECHO__</code> material using the <code>~HANDLE</code> or a <code>#TAG</code> (*spark), -->
<!-- <br> -->
<!-- <br> -->
<code>__BOOK__</code> (*spark, almost there),
<br>
<br>
or, ... (*spark)
</p>
<!-- <hr> -->
</div>
<!-- <div id="echo">
<label for="echo" style="display: none;">__ECHO__</label>
<select name="echo" id="echo">
<option value="~HANDLE">~HANDLE</option>
<option value="#TAG">#TAG</option>
</select>
<input type="text" name="echo">
<button><code>__ECHO__</code></button>
</div> -->
<!-- Hmm ... We don't close the body anymore ... -->

68
RECbot/templates/stylesheet.css

@ -0,0 +1,68 @@
body{
background-color: lightgrey;
min-width: 1080px;
margin: 40px;
font-size: 20px;
line-height: 24px;
}
div#welcome{
float: right;
top:40px;
right:40px;
width: 200px;
font-size: 16px;
}
div#welcome p{
margin:0 0 1em 0;
padding:0;
}
div#welcome hr{
border:1px dotted blue;
margin:2em 0;
}
div#echo{
position: fixed;
bottom: 0;
left: 0;
width: 100%;
padding: 0.5em;
background-color: pink;
}
div.post{
margin: 2em 5em 2em 9em;
width: 800px;
}
div.post span.tagcontainer span{
padding-left: 0.5em;
color: blue;
}
p{
margin: 1em 0;
}
code{
color: blue;
}
small{
font-size: 12px;
line-height:1.2;
}
small.postid{
float: left;
font-family: monospace;
margin: 0 0 0 -180px;
padding: 1em 1.5em;
/*border-radius: 50px;*/
/*border: 1px dotted blue;*/
color: blue;
background-color: white;
font-size: 20px;
}
small.date{
display:block;
color:magenta;
margin:1em 0;
}
img{
max-width: 100%;
}