Custom Feeds.

Feed43 is a site where you can create RSS feeds for websites that don’t offer ready-made feeds.

I’ve used it a few times, and it works quite well once you get the logic.

Why am I writing this? Well, one particular site got a redesign, which of course broke my feed, and the new site provides no feeds whatsoever. I complained about it, and the reply asked whether I could please fix the feed myself! (The RSS feature would cost that particular nonprofit org some 30 euros/month or so…)

Oh well, this might be a good moment to write down what has to be done to get the feed running again!

OK, first go to Feed43 (and open the site you want a feed for in another window).

Either log in to access your old feeds or just create a new one. (I’ll use my own account here.)

Type in the source page’s address (this should be the page you want to turn into a feed). Don’t forget the encoding (here in Finland it is mostly ISO-8859-1)!
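
Why the encoding matters is easy to show outside Feed43. Here is a minimal Python sketch, assuming a made-up Finnish headline as the page content: the same bytes come out fine or garbled depending on which encoding you declare.

```python
# Hypothetical headline, encoded the way an ISO-8859-1 site would send it.
raw = "Päivän uutiset".encode("iso-8859-1")

# Decoding with the right encoding gives the original text back.
good = raw.decode("iso-8859-1")

# Decoding the same bytes as UTF-8 fails on "ä"; errors="replace" swaps
# each undecodable byte for the U+FFFD replacement character.
bad = raw.decode("utf-8", errors="replace")

print(good)
print(bad)
```

If your feed titles show question marks or odd boxes where ä and ö should be, the encoding setting is the first thing to check.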

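Define the extraction rules.

  • {%} marks the part you want to extract. Used alone as the global search pattern, it selects the whole HTML document for processing.
  • {*} is the “whatever” operator: it matches anything and throws it away.
  • The item rule is the pattern that picks out the individual feed items.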

Global search pattern for this case:

{*}<table class="NewsSummaryContent SummaryContent">{%}</table>{*}

The pattern states that:

  • first, we don’t care what comes before <table class="NewsSummaryContent …, since all the news is inside that table, hence the {*}
  • <table class="NewsSummaryContent SummaryContent">{%}</table> tells the parser that we are interested in the contents of that particular table, which we mark with {%}
  • and last but not least, we don’t care about the rest: {*}

If everything works, you should see all the news in the “Clipped data” section.
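
If you think in regular expressions, the global pattern above behaves roughly like the Python sketch below. This is my own approximation and not how Feed43 is actually implemented; the surrounding page markup is made up, and only the table tag comes from the real site.

```python
import re

# Hypothetical page source; only the <table ...> tag is from the real page.
page = (
    '<html><body><div id="nav">menu</div>'
    '<table class="NewsSummaryContent SummaryContent">'
    '<tr><td>news rows go here</td></tr>'
    '</table><div id="footer">footer</div></body></html>'
)

# {*} becomes "skip anything", {%} becomes a capture group.
# re.DOTALL lets "." match line breaks as well.
global_rule = re.compile(
    r'.*?<table class="NewsSummaryContent SummaryContent">(.*?)</table>.*',
    re.DOTALL,
)

match = global_rule.match(page)
clipped = match.group(1) if match else ""
print(clipped)
```

Note that the non-greedy capture stops at the first </table>, so in this sketch a table nested inside the news table would cut the clipped data short.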

Item rules are a bit tricky, and how tricky depends on the source HTML. In this case the HTML is shitty at best, so the work is not nice at all (tables are used for layout and so on).

Anyway…

The main idea is to find the pattern that repeats for every news item: for example title, date, excerpt and URL.

In this case the HTML looks like this:

<tr>
<td>
<p class="odd"><a href="http://www.example.fi/kaikki_uutiset/?x137352=147144">Just some title</a>
<span class="pvm">(3.6.2008)</span></p><div class="ingressi">This is the excerpt.</div></td>
</tr>

and the item rule like this:

<tr>{*}<a href="{%}">{%}</a>{*}<span class="pvm">{%}</span>{*}<div class="ingressi">{%}</div>{*}</tr>

Could be better and more optimized, but who cares? I certainly know I don’t…

Anyway, when writing the item rules you can always check the result in the “Clipped data” view, which is nice.
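
If you want to try an item rule offline before pasting it into Feed43, here is a rough Python sketch. The helper rule_to_regex is my own invention: it assumes {*} means “skip anything” and {%} means “capture this”, and Feed43’s real matcher may treat whitespace and edge cases differently. The second news row is made up to show the repetition.

```python
import re

def rule_to_regex(rule):
    """Translate a Feed43-style rule into a compiled regex (approximation)."""
    pieces = []
    # Splitting on a capturing group keeps the {*} and {%} tokens in the list.
    for part in re.split(r"(\{\*\}|\{%\})", rule):
        if part == "{*}":
            pieces.append(r".*?")            # skip anything, keep nothing
        elif part == "{%}":
            pieces.append(r"(.*?)")          # capture this piece
        else:
            pieces.append(re.escape(part))   # literal markup
    return re.compile("".join(pieces), re.DOTALL)

item_rule = ('<tr>{*}<a href="{%}">{%}</a>{*}<span class="pvm">{%}</span>'
             '{*}<div class="ingressi">{%}</div>{*}</tr>')

# First row is the example from this post; the second row is hypothetical.
clipped = '''<tr>
<td>
<p class="odd"><a href="http://www.example.fi/kaikki_uutiset/?x137352=147144">Just some title</a>
<span class="pvm">(3.6.2008)</span></p><div class="ingressi">This is the excerpt.</div></td>
</tr>
<tr>
<td>
<p class="even"><a href="http://www.example.fi/kaikki_uutiset/?x=2">Another title</a>
<span class="pvm">(4.6.2008)</span></p><div class="ingressi">Another excerpt.</div></td>
</tr>'''

# One tuple per news item: (url, title, date, excerpt)
items = rule_to_regex(item_rule).findall(clipped)
for url, title, date, excerpt in items:
    print(title, date, url, excerpt)
```

Each tuple position corresponds to one {%} in the rule, counted from left to right.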

The next step is to define the output format, the feed title and so on. After setting those, the interesting part begins: the RSS item properties.
Select the correct template tag (marked with {%1}, {%2} and so on, one for each {%} in the item rule) and type it into the right input field. In my case the title is made of tags two and three, as I want to include the date too, so I type {%2}{%3}.
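
To make the numbering concrete, here is a small Python sketch of how I picture the template tags being filled in. The function fill_template is hypothetical, and returning an empty string for a tag with no matching capture is my own choice, not documented Feed43 behaviour.

```python
import re

def fill_template(template, captures):
    """Replace {%1}, {%2}, ... with the capture at that 1-based position."""
    def lookup(m):
        index = int(m.group(1)) - 1
        # Assumption: a tag without a matching capture becomes empty.
        return captures[index] if 0 <= index < len(captures) else ""
    return re.sub(r"\{%(\d+)\}", lookup, template)

# Captures in the order the {%} placeholders appear in the item rule.
captures = (
    "http://www.example.fi/kaikki_uutiset/?x137352=147144",  # {%1} url
    "Just some title",                                        # {%2} title
    "(3.6.2008)",                                             # {%3} date
    "This is the excerpt.",                                   # {%4} excerpt
)

title = fill_template("{%2}{%3}", captures)
link = fill_template("{%1}", captures)
description = fill_template("{%4}", captures)
print(title)
```

As the sketch shows, {%2}{%3} glues the title and the date together without a space, so type {%2} {%3} instead if you want one in between.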

And voilà: there it is, the feed, fixed.

Now just figure out a good URL and send it to everyone you know…

Tips:

  • Be patient, the item rules are sometimes tricky
  • Set up an account if you want to control your feeds – otherwise anyone can edit them.

Good luck with feeding!