reader 2.11 released – metadata is tags

Hi there!

I'm happy to announce version 2.11 of reader, a Python feed reader library.

What's new? #

Quite a lot happened since reader 2.5!

Unified tag API + entry and global tags #

Tags and metadata are now the same thing, generic resource tags:

>>> reader.get_tag(feed, 'one', 'default')
'default'
>>> reader.set_tag(feed, 'one', 'value')
>>> reader.get_tag(feed, 'one')
'value'
>>> reader.set_tag(feed, 'two')
>>> dict(reader.get_tags(feed))
{'one': 'value', 'two': None}

This means you can filter by metadata keys, and attach values to tags.

Even better, tags aren't just for feeds¹ anymore – you can add tags to entries, and to a global namespace.

Memory usage improvements #

reader update uses about 22% less memory, thanks to two changes.

The first one is not in reader itself, but was contributed to feedparser: instead of reading the whole feed in memory to detect encoding, use a prefix of the feed, and decode the rest on the fly.²

The result is a ~20% decrease in update_feeds() maximum resident set size (35%, when compared to baseline!).

reader will vendor the patched feedparser until the change is released upstream, so you can reap the benefits now.
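
To make the idea concrete, here is a minimal sketch of prefix-based encoding detection followed by incremental decoding, using only the standard library. It is not feedparser's actual implementation: the names detect_encoding and iter_decoded, the prefix and chunk sizes, and the simplified detection rules (UTF-8 BOM, then the XML declaration, then a UTF-8 fallback) are all assumptions made for illustration.

```python
import codecs
import io
import re

# Simplified on purpose: real feeds need more rules (HTTP headers, UTF-16, ...).
_XML_DECL = re.compile(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""")


def detect_encoding(prefix: bytes, default: str = "utf-8") -> str:
    """Guess the encoding from the first bytes of the document only."""
    if prefix.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    match = _XML_DECL.search(prefix)
    if match:
        name = match.group(1).decode("ascii")
        try:
            codecs.lookup(name)  # ignore encodings Python does not know
        except LookupError:
            return default
        return name
    return default


def iter_decoded(stream, prefix_size=1024, chunk_size=8192):
    """Yield decoded text chunks without reading the whole stream at once."""
    prefix = stream.read(prefix_size)
    encoding = detect_encoding(prefix)
    # An incremental decoder copes with characters split across chunks.
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    chunk = prefix
    while chunk:
        text = decoder.decode(chunk)
        if text:
            yield text
        chunk = stream.read(chunk_size)
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Hypothetical usage: a tiny ISO-8859-1 document, decoded in small pieces.
raw = '<?xml version="1.0" encoding="iso-8859-1"?><title>café</title>'
feed = io.BytesIO(raw.encode("iso-8859-1"))
for piece in iter_decoded(feed, prefix_size=64, chunk_size=16):
    print(repr(piece))
```

Only the prefix and one chunk are held as bytes at any time, which is where the memory saving would come from; a real parser would consume the chunks as they arrive instead of joining them back together.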

The second one is parsing feeds serially, using workers only to retrieve them. Since parsing time is mostly spent in pure Python code, there's no speed-up from doing it in parallel – but each thread takes up extra memory.

This decreased update_feeds() memory usage by another ~20% when using more than one worker (mainly on Linux; on macOS the improvement is less notable).
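
The split between parallel retrieval and serial parsing could look roughly like the sketch below. It is not reader's actual code: update_all, retrieve, and parse are hypothetical names, and the two callables stand in for whatever does the network I/O and the parsing.

```python
from concurrent.futures import ThreadPoolExecutor


def update_all(urls, retrieve, parse, workers=4):
    """Retrieve feeds in worker threads, but parse them one at a time.

    retrieve(url) is I/O-bound, so threads help; parse(raw) is assumed to be
    CPU-bound pure Python, so it runs serially in the calling thread.
    """
    urls = list(urls)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() runs retrieve() in the pool and yields results in input order.
        for url, raw in zip(urls, executor.map(retrieve, urls)):
            # Only the calling thread ever parses, so at most one
            # parser's working memory is alive at any time.
            results[url] = parse(raw)
    return results


# Hypothetical usage with stand-ins for the real retrieve and parse steps.
titles = update_all(
    ["https://example.com/a.xml", "https://example.com/b.xml"],
    retrieve=lambda url: url.encode("ascii"),
    parse=lambda raw: raw.decode("ascii").rsplit("/", 1)[-1],
    workers=2,
)
print(titles)
```

Note that executor.map() submits all retrievals up front, so a real implementation would probably also bound how many retrieved-but-unparsed feeds are kept around.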

Bug fixes #

The way reader checked whether SQLite has JSON support was somewhat brittle, causing it to fail on SQLite 3.38; reader 2.11 fixes this.
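
One less brittle way to check for a feature like this is to try using it and handle the failure, instead of inferring support from version numbers or build options. The sketch below does that with the standard sqlite3 module; has_json_support is a hypothetical helper, and this is not necessarily how reader 2.11 implements the check.

```python
import sqlite3


def has_json_support(db: sqlite3.Connection) -> bool:
    """Return True if the SQLite library behind db can run JSON functions."""
    try:
        # Call a JSON function directly instead of inspecting build options.
        db.execute("SELECT json_array_length('[1, 2]');").fetchone()
    except sqlite3.OperationalError:
        # Raised as "no such function: ..." when JSON support is missing.
        return False
    return True


db = sqlite3.connect(":memory:")
print("JSON support:", has_json_support(db))
```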

Usability improvements #

Among a number of smaller API improvements, now you can:

filter feeds in the same way both when getting and when updating feeds – including by tags
run arbitrary actions before updating a feed
add an existing feed without getting an exception
delete a missing feed or entry without getting an exception

For more details, see the full changelog.

That's it for now.


What is reader? #

reader takes care of the core functionality required by a feed reader, so you can focus on what makes yours different.

reader allows you to:

retrieve, store, and manage Atom, RSS, and JSON feeds
mark articles as read or important
add arbitrary metadata to feeds and articles
filter feeds and articles
full-text search articles
get statistics on feed and user activity
write plugins to extend its functionality

...all these with:

a stable, clearly documented API
excellent test coverage
fully typed Python

To find out more, check out the GitHub repo and the docs, or give the tutorial a try.

Why use a feed reader library? #

Have you been unhappy with existing feed readers and wanted to make your own, but:

never knew where to start?
it seemed like too much work?
you don't like writing backend code?

Are you already working with feedparser, but:

want an easier way to store, filter, sort and search feeds and entries?
want to get back type-annotated objects instead of dicts?
want to restrict or deny file-system access?
want to change the way feeds are retrieved by using Requests?
want to also support JSON Feed?

... while still supporting all the feed types feedparser does?

If you answered yes to any of the above, reader can help.

Why make your own feed reader? #

So you can:

have full control over your data
control what features it has or doesn't have
decide how much you pay for it
make sure it doesn't get closed while you're still using it
really, it's easier than you think

Obviously, this may not be your cup of tea, but if it is, reader can help.

¹ The old feed-specific tag and metadata methods are still available, but are deprecated and will be removed in version 3.0.
² If you'd like to read more about the whole thing, drop me a line.