I've read that you need to be running faster than about 1 query/second to really piss them off; 20 queries a day looks like a normal user. You can use curl to grab the page, and you can get the number of backlinks within one query.
Otherwise, if you want to get the sites themselves, I'd use LWP, set the prefs to 100 results per page, set the user agent to lynx or something like that, and not run more than one query every 5 seconds. The delay will keep you on their good side.
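Here's a rough sketch of that kind of delay, in case it helps. It's in Python with urllib rather than LWP, so treat it as an illustration of the idea and not the LWP way of doing it. The names Throttle and fetch are made up for this example, the "Lynx/2.8" user agent string is only a placeholder, and the 100-results preference is left out because the exact setting depends on the search engine.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum gap between successive calls to wait()."""

    def __init__(self, min_interval=5.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (5 per the advice above)
        self._clock = clock               # injectable so the timing can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, None before the first

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch(url, throttle, user_agent="Lynx/2.8"):
    """Fetch one page, waiting first so requests stay spaced out.

    The user_agent default is a placeholder, not a real Lynx string.
    """
    throttle.wait()
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()
```

To use it, create one Throttle() and pass the same object to every fetch() call, so the 5-second gap applies across the whole run and not per URL.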
I once sent a spider at Yahoo that pissed them off, and they temporarily blocked my access. It was something like 950 pages, one right after the other. I was just trying out a scraping program I'd grabbed. I'm sure others have seen that kind of behavior with some of the old Yahoo scraping programs out there. Most new programs have anti-bombardment timing built in.