various cross-reading prototypes
Latest commit: manetta 1fda8c7925, 6 years ago
short round of debugging this prototype, keeping the option in to use multiple languages ... (add [EN] or [NL] or [FR] or any other tag you want in the filename of a document to let the tool search only within that language)

Files:
static/css, templates, txt, README.md, index.json: first commit with 2 cross-reading prototypes (6 years ago)
.gitignore: adding a slash to the pycache ignore (6 years ago)
readings.py, start.py, tfidf.py, words.txt: last changed in the latest commit above (6 years ago)

README.md

█▀▀ █▀▀█ █▀▀█ █▀▀ █▀▀ ░░ █▀▀█ █▀▀ █▀▀█ █▀▀▄ ░▀░ █▀▀▄ █▀▀▀ █▀▀ 
█░░ █▄▄▀ █░░█ ▀▀█ ▀▀█ ▀▀ █▄▄▀ █▀▀ █▄▄█ █░░█ ▀█▀ █░░█ █░▀█ ▀▀█ 
▀▀▀ ▀░▀▀ ▀▀▀▀ ▀▀▀ ▀▀▀ ░░ ▀░▀▀ ▀▀▀ ▀░░▀ ▀▀▀░ ▀▀▀ ▀░░▀ ▀▀▀▀ ▀▀▀ 

cross-reader (TF-IDF)

(a few notes)

Install

$ pip3 install flask 

$ pip3 install nltk

Start

Start the local Flask/Python server:

$ python3 start.py

Browse to localhost on port 5000:

> 127.0.0.1:5000

Txt documents

The search engine uses the index.json file to process results.
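
As a rough illustration of how such an index can answer a query, the sketch below ranks documents for a single word. The index layout ({document: {word: score}}), the function name rank_documents, and the sample scores are assumptions made for this example; the actual structure of index.json may differ.

```python
def rank_documents(index, word):
    """Return (document, score) pairs for `word`, highest score first."""
    word = word.lower()
    hits = [(doc, scores[word]) for doc, scores in index.items() if word in scores]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical index contents, for illustration only.
sample_index = {
    "a.txt": {"reading": 0.12, "cross": 0.40},
    "b.txt": {"reading": 0.31},
    "c.txt": {"index": 0.25},
}

results = rank_documents(sample_index, "reading")
print(results)
```

Documents that do not contain the word are left out of the result, so a query for an unknown word returns an empty list.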

The function 'create_index' can be called to generate this file. It reads a set of plain text files and indexes each word together with its TF-IDF value.
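
To make the TF-IDF idea concrete, here is a minimal sketch that computes one common variant of the score (relative term frequency times the natural log of inverse document frequency). It is not the code in tfidf.py: the names tokenize and build_index are hypothetical, the regex tokenizer is a stand-in for an nltk tokenizer, and the output layout is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words (stand-in for nltk)."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Map each document name to {word: tf-idf score}.

    tf  = count of the word in the document / total words in the document
    idf = log(number of documents / number of documents containing the word)
    """
    counts = {name: Counter(tokenize(text)) for name, text in documents.items()}

    # Count in how many documents each word appears.
    doc_freq = Counter()
    for counter in counts.values():
        doc_freq.update(counter.keys())

    index = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        index[name] = {
            word: (count / total) * math.log(len(documents) / doc_freq[word])
            for word, count in counter.items()
        }
    return index


# Hypothetical documents, for illustration only.
docs = {
    "a.txt": "the reader reads the text",
    "b.txt": "the text crosses another text",
}
index = build_index(docs)
```

With this variant, a word that appears in every document (here "the" and "text") scores 0, while a word that is specific to one document gets a positive score.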

Changing txt documents

If you want to work with another set of texts, make a 'txt/' folder, add a few txt files to it, and remove the index.json file (or rename it if you want to keep it).
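
The latest commit note mentions that a tag such as [EN], [NL] or [FR] in a filename lets the tool search within one language only. A minimal sketch of how such tags could be read from filenames follows; the helper names language_of and files_for_language and the sample filenames are hypothetical, and the tool itself may handle tags differently.

```python
import re

# A tag is assumed to be letters between square brackets, e.g. [EN].
TAG_PATTERN = re.compile(r"\[([A-Za-z]+)\]")


def language_of(filename):
    """Return the upper-cased tag found in the filename, or None if there is none."""
    match = TAG_PATTERN.search(filename)
    return match.group(1).upper() if match else None


def files_for_language(filenames, language):
    """Keep only the filenames tagged with the given language."""
    return [name for name in filenames if language_of(name) == language.upper()]


# Hypothetical filenames, for illustration only.
names = ["essay-[EN].txt", "essai-[FR].txt", "notes.txt", "manual-[EN].txt"]
english = files_for_language(names, "EN")
```

In this sketch, untagged files such as notes.txt are excluded from every language-specific selection.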

To generate a new index.json file:

Remove the index.json file

$ rm index.json

Stop and restart the Python server:

Ctrl + C

$ python3 start.py
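
The steps above suggest that the index is rebuilt at startup when index.json is missing. A minimal sketch of that load-or-create pattern is shown below; the function load_or_create_index, the way the builder is passed in as a callable, and the placeholder builder are assumptions for this example, not the code in start.py.

```python
import json
import os
import tempfile


def load_or_create_index(path, create_index):
    """Load the JSON index at `path`, building and saving it first if the file is missing."""
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(create_index(), f)
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def fake_create_index():
    # Placeholder builder; the real tool would compute TF-IDF values here.
    return {"a.txt": {"reading": 0.5}}


with tempfile.TemporaryDirectory() as folder:
    index_path = os.path.join(folder, "index.json")
    # File is missing: the index is built and saved.
    first = load_or_create_index(index_path, fake_create_index)
    # File is present: it is loaded as-is and the builder is not called.
    second = load_or_create_index(index_path, lambda: {})
```

This is why removing (or renaming) index.json before restarting matters in such a setup: as long as the file exists, the old index keeps being loaded.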