implement whoosh as a simple text search #11

Closed
opened 1 year ago by crunk · 3 comments
crunk commented 1 year ago
Owner

https://whoosh.readthedocs.io/en/latest/intro.html
Whoosh is a fast, pure Python search engine library.

Shouldn't be too heavy, since the books db is a single CSV file.

Poster
Owner

Not to be confused with whoosh, the subdomain for Zulip :P

Poster
Owner

Alternative: pip install fuzzysearch. It's just a single CSV as content, and whoosh really wants an entire collection of documents.

We could artificially split the CSV into lines and treat each line as a document, but whoosh is a big project and this doesn't really need it.
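
For reference, a minimal sketch of the "split the CSV into lines and treat them as documents" idea, using only the standard library. The sample data and the column names (`title`, `author`, `year`) are assumptions for illustration; the real books CSV is not shown in this issue.

```python
import csv
import io

# Hypothetical sample data: the real books CSV and its columns are not shown here.
SAMPLE_CSV = """title,author,year
Dune,Frank Herbert,1965
Neuromancer,William Gibson,1984
"""


def rows_as_documents(csv_text):
    """Return one dict per CSV row, keyed by the header names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


documents = rows_as_documents(SAMPLE_CSV)
for doc in documents:
    print(doc)
```

Each dict keeps the fields separate, so a search backend can index or match them per field instead of against one concatenated row string.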

Poster
Owner

Whoosh implemented in commits d306b61b2d, ff7189af66, 52b513bc2a.

The reason fuzzy searching doesn't work is that you want to search on multiple fields in the CSV.
If you feed it the whole CSV row, fuzzy search sees "false" positives in almost every string.

Whoosh was the only pip package that yielded good results.

crunk closed this issue 10 months ago