scraperWiki API: The Daily REST library entry for Excel and GAScript

ScraperWiki API

I’ve been taking a look at scraperWiki lately. In case you haven’t come across it, it’s a framework that lets you scrape structured data from web sites using various data manipulation tools and code. One of the great things about it is that many of the scraped data sets have been made public by their authors. Another good thing is that scraperWiki has an API, which means that the REST library can load all this data directly into either Google Apps Script or Excel. I’ll implement a general connector for scraperWiki shortly that will get the data associated with a chosen scraperWiki short_name, but first, here’s an API entry to get a table of the scraperWiki definitions that are already available.

In addition to the datasets that others have published, you can of course use scraperWiki to create your own. This means that with scraperWiki, you can get data from any web site, even if it doesn’t actually have an API.

The scraperWiki REST API is a single-query API: one query populates multiple rows in a spreadsheet. You just name the columns to match the data you want to retrieve and go. Here are the results of a query for the first 1000 scraperWiki entries. This example can be found in cDataSet.xlsm, which can be downloaded from here
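To make the "name the columns to match the data" idea concrete, here is a minimal sketch of how a list of JSON records could be mapped onto spreadsheet rows by column heading. It is not the REST library's own code: `rowsFromRecords` is a hypothetical helper, and apart from short_name the field names and sample values are made up for illustration.

```javascript
// Hypothetical helper, not part of the REST library.
// Maps an array of JSON objects onto rows, keeping only the
// fields whose names match the column headings.
function rowsFromRecords(headings, records) {
  return records.map(function (record) {
    return headings.map(function (heading) {
      // A field missing from a record becomes an empty cell.
      return Object.prototype.hasOwnProperty.call(record, heading)
        ? record[heading]
        : "";
    });
  });
}

// Made-up sample records (assumed shape: one object per scraper).
var sampleRecords = [
  { short_name: "example_one", title: "Example one", extra: "ignored" },
  { short_name: "example_two" }
];

// Only the named columns are picked up; anything else is skipped.
var sampleRows = rowsFromRecords(["short_name", "title"], sampleRecords);
```

Fields that are not named as columns are simply dropped, which is why you can choose to retrieve as much or as little of each record as you like.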

Library entry

        With .add("scraperWiki")
            .add "restType", erRestType.erSingleQuery
            .add "url", "https://api.scraperwiki.com/api/1.0/scraper/search?format=jsondict&maxrows="
            .add "results", ""
            .add "treeSearch", False
            .add "ignore", ""
        End With


And the execution code looks like this

Public Sub testScraperWiki()
    generalQuery "scraperWiki", "scraperWiki", "1000"
End Sub

and for Google Apps Script

function testScraperWiki() {
    mcpher.generalQuery("scraperWiki", "scraperWiki", "1000");
}
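
As a rough illustration of what the "1000" argument is for: the url in the library entry ends with maxrows=, so the query value presumably gets appended to it to form the final request. The sketch below assumes exactly that; `buildQueryUrl` and `scraperWikiEntry` are hypothetical names, and the library's real internals may differ.

```javascript
// Hypothetical sketch: assumes the query is URL-encoded and
// appended to the url held in the library entry.
function buildQueryUrl(entry, query) {
  return entry.url + encodeURIComponent(query);
}

// Stand-in for the library entry shown earlier (url copied from it).
var scraperWikiEntry = {
  url: "https://api.scraperwiki.com/api/1.0/scraper/search?format=jsondict&maxrows="
};

// The "1000" passed to generalQuery would then become the maxrows value.
var requestUrl = buildQueryUrl(scraperWikiEntry, "1000");
```

Under that assumption, asking for a different number of entries is just a matter of passing a different final argument.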

Next Step
In a future post, I’ll show how to get the data associated with a scraperWiki short_name into Excel and GAS.

The REST library is itself implemented as a REST API and can be queried like this.
For more stuff like this, visit the ramblings site or the associated blog. If you have a suggestion for a particular API, vote for it on Google Moderator or contact me on our forum

Author: bm082975
