webstemmer

A web crawler and HTML layout analyzer

Webstemmer is a web crawler and HTML layout analyzer. It extracts articles from
news sites as plain text and automatically removes banners, ads, and navigation
links. You only need to supply the URL of a site's top page; the rest runs
almost fully automatically.
