Commit Graph

27 Commits

Author SHA1 Message Date
rra 27a5c9b1d7 minor tweaks 2020-05-06 09:52:38 +02:00
rra 2278cd9d3d about_crawler now properly uses functions from fedicrawler 2020-05-05 16:55:59 +02:00
rra f62bff244c repairs 2020-05-05 16:49:24 +02:00
rra 65ddc49057 working on making the script importable by about_collector 2020-05-05 16:28:37 +02:00
rra 27a6fb1a0a now continues where it left off last time 2020-05-05 16:28:04 +02:00
rra 1b1e5b1e52 add info about new script 2020-05-05 15:25:18 +02:00
rra 3f5d2bbad0 minor tweaks to fedicrawler & scrape 05-05-2020 2020-05-05 15:25:05 +02:00
rra c003a6ae96 added new script to document mastodon about pages 2020-05-05 15:24:34 +02:00
rra 0429659306 update readme 2020-04-30 12:53:54 +02:00
rra 15d19e0435 update readme 2020-04-30 12:53:04 +02:00
rra 30fe54a03e april 30 2020 2020-04-30 12:38:14 +02:00
rra 1512278240 now based primarily on nodeinfo2, add socksproxy, filtering out weird stuff 2020-04-30 12:37:57 +02:00
rra 2ab9879118 29/4 2020 2020-04-29 13:16:30 +02:00
rra 26b4e4a868 april 2020 2020-04-28 14:30:31 +02:00
rra b4c5d50a77 updated with a way to get around gab.best enumeration and better error logging 2020-04-28 14:30:18 +02:00
rra ce613426d0 rerun scrape may 2019 2019-05-08 17:52:46 +02:00
rra e9c8e8341c rerun scrape march 2019 2019-03-27 11:59:45 +01:00
rra 59fd15dd7d small fixes 2018-06-07 23:57:33 +02:00
rra 381e44b2d9 scraper now uses parallelism 2018-06-07 23:51:07 +02:00
rra c2c1bbccaf scrape results of v2 on 07-06-2018 2018-06-07 23:47:06 +02:00
rra 09d76040eb crawler now scrapes in parallel threads 2018-06-07 23:46:36 +02:00
rra 776ac11b52 more info on methodology and where it is lacking 2018-05-30 13:15:26 +02:00
rra da97f04832 scrape with metadata on 30/5/2018 2018-05-30 13:05:16 +02:00
rra cb646ab40e made changes in file saving 2018-05-30 13:04:56 +02:00
rra 4cac59e445 crawler now looks for instance metadata, started to abstract collection into functions 2018-05-30 10:27:02 +02:00
rra d90366d4c6 gitignore and readme 2018-05-30 09:05:29 +02:00
rra abbb8a6dd7 first version, crawls only the announced peers 2018-05-30 08:20:46 +02:00