
[tor-dev] Atlas is not that friendly to Web Archive



Hello!

I've recently found out that the new Atlas redesign is not that friendly to
web archives. http://archive.li/ can't properly detect the "page loaded"
event, so it ends up capturing the "loading" page[%]. Moreover,
https://web.archive.org/ can't capture #-based links at all, as far as I can see.

[%] https://archive.li/https://atlas.torproject.org/%23details/5C3B8FB35A13C508CF65E8499E35755DA098DC93
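To illustrate the [%] link above: the '#' in the Atlas URL is escaped as %23
so that the fragment is passed on to the archive instead of being dropped.
A minimal sketch of that escaping follows. The helper name is made up, and
the "prefix + URL" scheme is only inferred from the link above, not from any
documented archive.li API. It does nothing about the "loading" capture itself.

```python
# Hypothetical helper: builds a link of the same shape as the [%] link above.
# The "https://archive.li/<url>" scheme is an assumption taken from that link.
ARCHIVE_PREFIX = "https://archive.li/"


def archive_link(page_url: str) -> str:
    """Return an archive link for page_url with every '#' escaped as %23."""
    # Only '#' is escaped; the rest of the URL is passed through unchanged.
    return ARCHIVE_PREFIX + page_url.replace("#", "%23")


if __name__ == "__main__":
    fingerprint = "5C3B8FB35A13C508CF65E8499E35755DA098DC93"
    print(archive_link("https://atlas.torproject.org/#details/" + fingerprint))
```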

The ability to archive Atlas pages is kinda nice, as it lets one "cite" a
relay's status on a specific date: Atlas has no time machine of its own,
and information about a relay is purged a few days after the relay goes down.
https://archive.li/RzGpJ is better than https://archive.li/JGQRW :-)

I'm not a skilled frontend developer, but maybe trading away some Time-to-DOM
(T2DOM) by making the JS loading and the onionoo.tpo request synchronous would
be enough to make the website friendly to that sort of crawler... But it's
unclear to me whether T2DOM is a valuable KPI for Atlas or not :)

What do you think?

-- 
WBRBW, Leonid Evdokimov, xmpp:leon@xxxxxxxxxxxx http://darkk.net.ru tel:+79816800702
PGP: 6691 DE6B 4CCD C1C1 76A0  0D4A E1F2 A980 7F50 FAB2


_______________________________________________
tor-dev mailing list
tor-dev@xxxxxxxxxxxxxxxxxxxx
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev