9 Ways to Save Bandwidth on your RSS Feed
By Pete Freitag
One of the things you will notice after you publish an RSS feed is that it consumes a lot of bandwidth. For example, on my site Spendfish, 18% of all requests are for RSS feeds. This is no wonder, since feed readers may download your feed several times a day even if nothing has changed.
I've put together a list of ways you can save bandwidth and reduce the number of requests to your RSS feed (which also saves server resources).
1 - Use the ttl tag
The ttl tag goes directly inside the channel tag in your RSS feed. It stands for time to live, and should hold the number of minutes an RSS reader should wait before requesting your feed again. Most blog software defaults this setting to 60, which means that your feed will be downloaded every hour by clients that obey this setting. We can increase this to 3 hours by adding the following:
<ttl>180</ttl>
Read more about ttl here.
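If your feed is produced by code rather than a template, the ttl value can be set while the channel is built. Here is a minimal Python sketch using only the standard library. The function name set_ttl and the sample feed are made up for illustration and are not part of any blog software:

```python
import xml.etree.ElementTree as ET


def set_ttl(feed_xml: str, minutes: int) -> str:
    """Return the feed with <ttl> inside <channel> set to the given minutes."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("feed has no <channel> element")
    ttl = channel.find("ttl")
    if ttl is None:
        # Add the tag if the feed does not have one yet.
        ttl = ET.SubElement(channel, "ttl")
    ttl.text = str(minutes)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample feed, just enough structure to show the change.
sample = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
print(set_ttl(sample, 180))
```

Calling it again with a different value overwrites the existing ttl instead of adding a second tag.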
2 - Add a skipDays tag
The skipDays tag also goes inside the channel tag and should contain one or more child day tags. As you might guess, this tells RSS readers to skip downloading your feed on the specified days. For example, if you don't publish content on the weekends:
<skipDays> <day>Saturday</day> <day>Sunday</day> </skipDays>
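The same fragment can be generated from a list of day names, which helps if your publishing schedule lives in configuration. A rough Python sketch follows. The helper name build_skip_days is an assumption for illustration, and the day names are the full English names used in the example above:

```python
import xml.etree.ElementTree as ET

VALID_DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
              "Friday", "Saturday", "Sunday")


def build_skip_days(days):
    """Build a <skipDays> element with one <day> child per day name."""
    skip = ET.Element("skipDays")
    for day in days:
        if day not in VALID_DAYS:
            raise ValueError(f"not a day name: {day!r}")
        ET.SubElement(skip, "day").text = day
    return skip


print(ET.tostring(build_skip_days(["Saturday", "Sunday"]), encoding="unicode"))
```

The returned element can be appended to the channel element of a feed you are building with ElementTree.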
3 - Add a skipHours tag
Same idea as the skipDays tag, but it lets you specify which hours of the day your feed should not be downloaded. Hours are given as numbers from 0 to 23 in GMT. For example, since I'm in New York and typically don't post to my blog very early in the morning, I could add:
<skipHours> <hour>7</hour> <hour>8</hour> </skipHours>
More info about skipHours and skipDays here.
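Because the hours are in GMT, it is easy to be off by a few hours when translating from local time. The sketch below does the conversion with a fixed UTC offset. It assumes that the 7 and 8 in the example above correspond to 3 AM and 4 AM in New York during daylight saving time (UTC-4). The function names are hypothetical, and a fixed offset does not follow daylight saving changes:

```python
def local_hours_to_gmt(local_hours, utc_offset_hours):
    """Convert local clock hours (0-23) to GMT hours for <skipHours>."""
    return sorted({(hour - utc_offset_hours) % 24 for hour in local_hours})


def skip_hours_xml(gmt_hours):
    """Render the GMT hours as a <skipHours> fragment."""
    hours = " ".join(f"<hour>{hour}</hour>" for hour in gmt_hours)
    return f"<skipHours> {hours} </skipHours>"


# New York in summer is 4 hours behind GMT (offset -4), so 3 AM and 4 AM
# local time become hours 7 and 8.
print(skip_hours_xml(local_hours_to_gmt([3, 4], -4)))
# prints: <skipHours> <hour>7</hour> <hour>8</hour> </skipHours>
```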
4 - Support the If-Modified-Since header
Many RSS readers and clients send an If-Modified-Since header in their request to your RSS feed. This is one of the ways clients make what's called a conditional HTTP GET. You can return a 304 Not Modified HTTP response code (and omit the response body) if the RSS feed has not changed since the date specified in the If-Modified-Since header. If the content has changed, you simply return the normal 200 status code along with the feed.
The header sent by the client might look something like this:
If-Modified-Since: Tue, 10 Jul 2007 21:19:55 GMT
Most clients will pass back the value you specify in the Last-Modified header, so you should make sure that header is being populated. More info here. ColdFusion If-Modified-Since example here.
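The decision between 304 and 200 comes down to one date comparison. The following Python sketch shows it with standard library date helpers. The function names http_date and status_for are assumptions for illustration, and last_change stands in for wherever your application records the feed's last modification time:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(dt: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date for Last-Modified."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def status_for(if_modified_since, last_modified: datetime) -> int:
    """Return 304 if the feed is unchanged since the client's date, else 200."""
    if not if_modified_since:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: fall back to sending the full feed
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200


# Hypothetical last change time, matching the header shown above.
last_change = datetime(2007, 7, 10, 21, 19, 55, tzinfo=timezone.utc)
print(http_date(last_change))
print(status_for("Tue, 10 Jul 2007 21:19:55 GMT", last_change))
```

When the status is 304, send the headers and no body. When it is 200, send the feed together with a Last-Modified header built by http_date.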
5 - Support ETag and If-None-Match HTTP headers
The ETag
header is a HTTP response header that you can send back in your RSS feed response. It stands for entity tag and should be a unique value representing the content, you could do a MD5 hash of your RSS feed content, or simply use a date time of the last change. Clients will send this back in a If-None-Match
header, if this header contains your current ETag then you can return a 304
status code. More info about If-None-Match
here.
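As a rough illustration of the MD5 approach, the Python sketch below builds a quoted ETag from the feed bytes and checks it against an If-None-Match value. The names make_etag and etag_matches are hypothetical. Splitting on commas is a simplification that works for hex digests like these, but not for arbitrary ETags that contain commas:

```python
import hashlib


def make_etag(feed_body: bytes) -> str:
    """Build a quoted ETag from an MD5 hash of the feed content."""
    return '"' + hashlib.md5(feed_body).hexdigest() + '"'


def etag_matches(if_none_match, current_etag: str) -> bool:
    """True if the If-None-Match header lists the current ETag (or is '*')."""
    if not if_none_match:
        return False
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        if candidate == "*":
            return True
        if candidate.startswith("W/"):
            candidate = candidate[2:]  # ignore the weak-validator prefix
        if candidate == current_etag:
            return True
    return False


# Hypothetical feed content for the demonstration.
feed = b"<rss version='2.0'><channel><title>Example</title></channel></rss>"
etag = make_etag(feed)
status = 304 if etag_matches(etag, etag) else 200
print(etag, status)
```

Hashing requires generating the feed first, so this saves bandwidth but not the work of building the feed. Using the time of the last change as the ETag avoids that cost.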
6 - Don't send the response body on HTTP HEAD requests
Here's what HTTP 1.1 has to say about the HEAD method:
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response.
The HEAD method should only return HTTP headers, and no response body (so you don't need to return your entire RSS feed for these types of requests). You will find that several aggregators and readers make HEAD requests for your feed (including MXNA), so this is a simple way to save some bandwidth.
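To make the difference concrete, here is a minimal WSGI sketch in Python that sends the same headers for GET and HEAD but returns the feed bytes only for GET. FEED, feed_app, and the call helper are illustrative names, and a real application would generate the feed rather than hold it in a constant:

```python
# Hypothetical feed content for the demonstration.
FEED = b"<rss version='2.0'><channel><title>Example</title></channel></rss>"


def feed_app(environ, start_response):
    """Minimal WSGI app: same headers for GET and HEAD, body only for GET."""
    headers = [
        ("Content-Type", "application/rss+xml"),
        ("Content-Length", str(len(FEED))),
    ]
    start_response("200 OK", headers)
    if environ.get("REQUEST_METHOD") == "HEAD":
        return []  # headers only, no feed bytes on the wire
    return [FEED]


def call(method):
    """Invoke the app directly with a stub start_response, for demonstration."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(feed_app({"REQUEST_METHOD": method}, start_response))
    return captured["status"], captured["headers"], body


print(len(call("HEAD")[2]), len(call("GET")[2]))
```

Some servers and frameworks already strip the body from HEAD responses for you. Checking the method yourself also lets you skip generating the feed in the first place.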
7 - Limit the number of items you publish
This one is kind of obvious, but it is a very easy way to save bandwidth on your RSS feed. If you are publishing 15 items or articles in your feed, lowering that to 10 cuts the feed size by roughly a third, assuming the items are of similar length.
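Usually you would limit the query or loop that produces the items. If you only have the finished feed to work with, it can be trimmed afterwards. The sketch below assumes the newest items come first in the channel. limit_items is a hypothetical helper, and the sample feed is made up:

```python
import xml.etree.ElementTree as ET


def limit_items(feed_xml: str, max_items: int) -> str:
    """Keep only the first max_items <item> elements in the channel."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("feed has no <channel> element")
    for extra in channel.findall("item")[max_items:]:
        channel.remove(extra)
    return ET.tostring(root, encoding="unicode")


# Hypothetical 15-item feed for the demonstration.
items = "".join(f"<item><title>Post {n}</title></item>" for n in range(15))
feed = f'<rss version="2.0"><channel><title>Example</title>{items}</channel></rss>'
trimmed = limit_items(feed, 10)
print(len(feed), len(trimmed))
```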
8 - Publish Partial Content feeds
Another obvious way to save bandwidth: don't publish full articles in your feed. By publishing just the first few sentences of each article you can also save a good amount of bandwidth. Your readers may not be too happy about this one, however.
9 - Use Gzip Compression
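Configure your web server to return the feed using gzip compression for clients that send an Accept-Encoding header listing gzip. RSS is XML text, which generally compresses well.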
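This is normally a web server setting rather than application code, but the Python sketch below shows roughly what the server does on your behalf. maybe_gzip is a hypothetical helper. For brevity it ignores quality values such as gzip;q=0, which a real server would honor:

```python
import gzip


def maybe_gzip(body: bytes, accept_encoding):
    """Return (extra_headers, payload), gzipped only if the client lists gzip."""
    offered = [part.split(";")[0].strip().lower()
               for part in (accept_encoding or "").split(",")]
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        return headers, gzip.compress(body)
    return headers, body


# Hypothetical feed with repetitive markup, as real feeds tend to have.
feed = (b"<rss version='2.0'><channel>"
        + b"<item><title>Example post</title></item>" * 15
        + b"</channel></rss>")
headers, payload = maybe_gzip(feed, "gzip, deflate")
print(len(feed), len(payload), headers)
```

The Vary header tells caches that the response depends on the client's Accept-Encoding, so a compressed copy is not served to a client that cannot decode it.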
Do you have any other techniques that work? I considered adding Cache-Control, but I wonder if that would actually take away the savings you would get from a conditional GET?
I have to thank Charlie Arehart for giving me the idea for this blog entry. After my Working with RSS in ColdFusion presentation at CFUnited, he suggested that I write a blog entry on ways to reduce the bandwidth consumption of your RSS feed. After doing the research for this entry I realize that this topic could be the subject of an entire presentation!
9 Ways to Save Bandwidth on your RSS Feed was first published on July 12, 2007.
Comments
And yes, a talk on the topic would be a good idea. Perhaps no single CF user group would care to hear it, but the world of CF (and indeed all) bloggers would benefit. How about recording one with Connect (or Captivate/Camtasia/CamStudio) and then sharing it with the world? :-)
Support for RFC3229+feed instance manipulation can shave tons of bandwidth with really huge feeds... Planet-type aggregated feeds, primarily.
For those who don't know, RFC3229 provides a way to deliver deltas via HTTP. In the case of Atom/RSS, it means the client sends along an "A-IM: feed" header with its GET request. The server sees this, then checks If-None-Match and uses it to derive a subset of entries that are then returned to the client.
The idea is that the client only receives new entries that it hasn't seen before. Issues to bear in mind:
(1) Increased CPU utilization... instead of sending out one (presumably cached) feed to all clients, you're cooking up individual feeds (and database hits) for each reader.
(2) If your feed items contain lots of constantly changing meta info (comment counts and so on), then you're either going to have to keep resending updated entries (thus voiding the benefit of RFC3229), or drop the metadata.
(3) There's a temptation to just send feed deltas to everyone, requested or not... bad idea. There are still plenty of widget-style aggregators out there that expect to receive a full feed on every request.