searrrrrrrrrrch (prototype)
A small Flask exercise that combines the TF-IDF algorithm, written in Python, with a web interface.
Grrrrrrrrrrls is a project in progress for the Computer Grrrls exhibition at the HMKV & La Gaîté Lyrique.
Install
$ pip3 install flask
$ pip3 install nltk
Start
Start the flask/python local server ...
$ python3 start.py
Browse to your localhost on port 5000 ...
> 127.0.0.1:5000
Txt documents
The search machine uses the index.json file to produce its results. The function 'create_index' can be called to generate this file. It reads a set of plain text files and stores each word together with its TF-IDF value. The plain text files are not included in this repo, as I don't think I can publish them as they are.
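As an illustration of what such an index could contain, here is a minimal TF-IDF sketch. It is not the code from tfidf.py: the layout of the index (word, then file name, then score), the simple regex tokenizer standing in for nltk, and the helper names `tokenize` and `build_index` are all assumptions made for this example.

```python
import json
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text):
    # Assumption: a plain regex tokenizer instead of nltk, to keep the sketch stdlib-only.
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Map each word to {document name: TF-IDF score}.

    `documents` is a dict of document name -> text (hypothetical layout).
    """
    tokens = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(tokens)
    # Document frequency: the number of documents each word appears in.
    df = Counter(word for words in tokens.values() for word in set(words))
    index = {}
    for name, words in tokens.items():
        if not words:
            continue  # skip empty documents to avoid dividing by zero
        for word, count in Counter(words).items():
            tf = count / len(words)            # term frequency in this document
            idf = math.log(n_docs / df[word])  # rarer words get a higher weight
            index.setdefault(word, {})[name] = tf * idf
    return index


def create_index(txt_dir="txt", out_file="index.json"):
    # Assumption: all documents are UTF-8 .txt files directly inside txt_dir.
    documents = {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(txt_dir).glob("*.txt"))
    }
    index = build_index(documents)
    Path(out_file).write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index
```

With this weighting, a word that occurs in every document gets a score of 0, so it never helps to tell documents apart.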
Changing txt documents
If you want to work with another set of texts, make a 'txt/' folder, add a few txt files to it, and remove the index.json file (or rename it if you want to keep it).
To generate a new index.json file:
Remove the index.json file
$ rm index.json
Stop and restart the Python server ...
ctrl + c
$ python3 start.py
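These steps suggest that the server rebuilds index.json at start-up when the file is missing. A minimal sketch of such a check, assuming that behaviour; `load_or_create_index` and its `create` callback are hypothetical names and not taken from start.py:

```python
import json
import os


def load_or_create_index(index_path, create):
    # `create` is a hypothetical callback that writes the index file,
    # for example a wrapper around create_index.
    if not os.path.exists(index_path):
        create()
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)
```

An existing index.json is loaded as-is, which is why the file has to be removed or renamed before a new set of texts is picked up.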
Notes
This Grrrrrrrrrrls search machine cannot handle too much at once: it can only search for one word at a time.
This is a prototype :)
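The one-word limit follows naturally if a query is treated as a single key into the index. A minimal sketch of such a lookup, assuming the index maps each word to a dictionary of file names and TF-IDF scores (that layout and the `search` helper are assumptions for illustration):

```python
def search(index, query):
    # The whole query is treated as one word; a multi-word query would
    # simply not match any key in the index.
    word = query.strip().lower()
    matches = index.get(word, {})
    # Highest TF-IDF score first.
    return sorted(matches.items(), key=lambda item: item[1], reverse=True)


example_index = {
    "cat": {"a.txt": 0.46, "c.txt": 0.12},
    "bird": {"b.txt": 0.35},
}
results = search(example_index, " Cat ")
```

Here `results` lists a.txt before c.txt, and a query such as "cat bird" returns an empty list.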
![Screenshot from 2018-08-31 10-49-07.png](<Screenshot from 2018-08-31 10-49-07.png>)