004: 🧪 Testing ArchiveBox release candidate 0.8.5rc50 -- with uv

I wanted to test the v0.8 release candidate for ArchiveBox. Because setting a custom port is very easy, running multiple instances of ArchiveBox side by side is trivial:

uv tool run archivebox@0.8.5rc50 init
uv tool run archivebox@0.8.5rc50 manage createsuperuser
uv tool run archivebox@0.8.5rc50 server 6789

🎊 We are up and running!

localhost 6789

Let’s set up the rest of the dependencies:

uv tool run archivebox@0.8.5rc50 install

🧪 RSS Parsing

The improvements to RSS parsing are of particular interest to me.

🐘 Let’s attempt to backup my toots:

uv tool run archivebox@0.8.5rc50 add --parser=rss --depth=1 https://infosec.exchange/@brie.rss

No dice: AttributeError: object has no attribute 'title'.

uv tool run archivebox@0.8.5rc50 add --parser=rss --depth=1 https://brie.dev/rss.xml

Errors were reported. The command completed successfully after several minutes but nothing appears in the UI.

AttributeError: object has no attribute 'updated_parsed'
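
Both errors name attributes that look like fields an RSS parser fills in from optional feed elements. A possible explanation, not confirmed against the ArchiveBox source, is that some items in these feeds lack a title or the date element the parser expects. Here is a minimal sketch, using only the Python standard library, for checking which items lack which elements. The sample feed is made up, and the choice of `title`, `link`, and `pubDate` as the elements to check is an assumption:

```python
import xml.etree.ElementTree as ET

# Made-up sample feed: the first item has no <title>, the second has no date.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <link>https://example.com/1</link>
      <pubDate>Wed, 22 Jan 2025 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/2</link>
    </item>
  </channel>
</rss>
"""


def missing_fields(rss_text, wanted=("title", "link", "pubDate")):
    """Return a list of (link, [missing element names]), one per <item>."""
    root = ET.fromstring(rss_text)
    report = []
    for item in root.iter("item"):
        missing = [name for name in wanted if item.find(name) is None]
        link = item.findtext("link", default="(no link)")
        report.append((link, missing))
    return report


for link, missing in missing_fields(SAMPLE_RSS):
    print(link, "is missing:", ", ".join(missing) or "nothing")
```

Saving a real feed to disk and passing its text to `missing_fields` would show whether the items that fail to import are the ones without a title or date.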

🏆 I was successful in parsing my private and public Pinboard feeds.

brie.dev/troubleshooting parsed via public Pinboard feed

It is interesting to observe some of the flags that are passed to Google Chrome:

--virtual-time-budget=15000 
--disable-features=DarkMode
--run-all-compositor-stages-before-draw 
--hide-scrollbars 
--autoplay-policy=no-user-gesture-required 
--no-first-run
--use-fake-ui-for-media-stream 
--use-fake-device-for-media-stream
"--simulate-outdated-no-au='Tue, 31 Dec 2099 23:59:59 GMT'"
--screenshot "https://brie.dev/troubleshooting/"

🧪 Webhooks

Let’s try adding one via the UI! It works. However, the version placeholder does not appear to be substituted in the user-agent string that is printed in the webhook payload:

"--user-agent",
      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36 ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)"

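The literal `{VERSION}` looks like a template placeholder that was never filled in. That is a guess rather than something confirmed in the ArchiveBox source. A minimal Python sketch of the idea, where `unrendered_placeholders` is a hypothetical helper and the version value is simply the release candidate tested here:

```python
import re

# User-agent string as it appeared in the webhook payload.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36 "
    "ArchiveBox/{VERSION} (+https://github.com/ArchiveBox/ArchiveBox/)"
)


def unrendered_placeholders(text):
    """Return the names of any {UPPER_CASE} placeholders left in text."""
    return re.findall(r"\{([A-Z_]+)\}", text)


print(unrendered_placeholders(USER_AGENT))  # ['VERSION']

# Filling the placeholder in gives the string one would expect to see.
rendered = USER_AGENT.format(VERSION="0.8.5rc50")
print(unrendered_placeholders(rendered))  # []
```

A check like this could run against incoming webhook payloads to flag any other values that arrive unrendered.
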
🐾 Next Steps#

For my production ArchiveBox instances, I would want to parse the webhook payload and filter events so that…

  • I am notified of some failures
  • I get a digest of successful ingestions and updates
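
The filtering described above could be sketched roughly as follows. The event shape is hypothetical: the field names `event`, `status`, and `url`, and the values `failed` and `succeeded`, are assumptions for illustration and would need to be adjusted to whatever the real webhook payload contains.

```python
from collections import Counter

# Hypothetical events; the real webhook payload may use different fields.
SAMPLE_EVENTS = [
    {"event": "snapshot.created", "status": "succeeded", "url": "https://example.com/a"},
    {"event": "snapshot.updated", "status": "succeeded", "url": "https://example.com/b"},
    {"event": "snapshot.created", "status": "failed", "url": "https://example.com/c"},
]


def triage(events):
    """Split events into failures to alert on and a count of successes by type."""
    failures = [e for e in events if e.get("status") == "failed"]
    digest = Counter(
        e.get("event", "unknown") for e in events if e.get("status") == "succeeded"
    )
    return failures, digest


def render_digest(digest):
    """Format the success counts as one line per event type, sorted by name."""
    return "\n".join(f"{count} x {name}" for name, count in sorted(digest.items()))


failures, digest = triage(SAMPLE_EVENTS)
for failure in failures:
    print("ALERT:", failure.get("url", "(no url)"))
print(render_digest(digest))
```

To be notified of only some failures, an extra condition on the URL or event type could be added to the `failures` filter.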

🔭 Observations

  • ➕ The UI is a bit prettier and that’s nice!
  • ➕ Things go better when all dependencies have been met.

🕸️ ✋ 💯 HTTP 500

I can induce an HTTP 500 by…

  • attempting to delete a snapshot with any data saved
  • looking at any /change URL
  • leaving Referenced model blank when adding a webhook

📚 READmore

Congrats on being an enthusiastic internet archiver! 👌
https://brie.ninja/posts/004/
Author
Brie Carranza
Published at
2025-01-22