OutWit Hub is a cool Firefox add-on that lets you extract information from any web page and export it to our favorite tool, Excel, for easier management and organization.
When launched, the tool shows you different kinds of data that can be extracted from the current webpage:
- all the images on the page,
- all page links,
- email addresses,
- page text,
- RSS feeds found,
- page tables, etc.
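To give a feel for what this kind of extraction involves, here is a minimal Python sketch that pulls links and email addresses out of a page's HTML. It is not how OutWit Hub works internally; the names `LinkCollector`, `extract_links_and_emails` and the sample HTML are made up for illustration, and the email pattern is a deliberately simple approximation.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Simplified pattern: good enough for a demo, not a full address validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_links_and_emails(html):
    """Return (links in document order, sorted unique email addresses)."""
    collector = LinkCollector()
    collector.feed(html)
    emails = sorted(set(EMAIL_RE.findall(html)))
    return collector.links, emails


# Made-up sample page fragment.
SAMPLE = (
    '<p>Write to <a href="mailto:info@example.com">info@example.com</a> '
    'or see <a href="/faq.html">the FAQ</a>.</p>'
)
links, emails = extract_links_and_emails(SAMPLE)
```

Here `links` holds the two `href` values in page order and `emails` holds the single address found in the sample.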
Let me demonstrate its power using just a few examples:
1. Extract page lists
Let’s try to extract the tool FAQ. Navigate to that page and click the tool icon in the navigation bar, then use one of two possible methods:
- Choose "Lists" and export the result to Excel;
- Let the tool "guess" what to extract: click "guess" and see all possible data compiled in the form of a handy table.
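For comparison, the "list to spreadsheet" idea can be sketched in a few lines of Python: gather the text of each list item and write it out as CSV, which Excel opens directly. This is only an illustration under simple assumptions (well-formed `<li>` items, a made-up sample), and `ListItemCollector` and `list_items_to_csv` are hypothetical names, not part of the add-on.

```python
import csv
import io
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collects the text of each <li> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._buffer = None  # None while we are outside an <li>

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._flush()  # close a previous item that had no </li>
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("li", "ul", "ol"):
            self._flush()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def _flush(self):
        if self._buffer is not None:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.items.append(text)
        self._buffer = None


def list_items_to_csv(html):
    """Return a one-column CSV document with one row per list item."""
    collector = ListItemCollector()
    collector.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["item"])
    for item in collector.items:
        writer.writerow([item])
    return out.getvalue()


# Made-up sample list.
SAMPLE = "<ul><li>What is it?</li><li>How do I <b>install</b> it?</li></ul>"
csv_text = list_items_to_csv(SAMPLE)
```

Saving `csv_text` to a `.csv` file gives something Excel can open as a single-column table.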
2. Extract all page images
- Navigate to any page containing a lot of images;
- Click the tool icon and then choose "images";
- See the detailed table containing:
  - each image thumbnail,
  - image source URL,
  - image dimensions,
  - image alt text,
  - image file name.
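A rough equivalent of that image table can be sketched in Python. The sketch below is an assumption-laden illustration, not the add-on's method: it reads dimensions only from the `width` and `height` attributes (it does not download the images), and `ImageCollector`, the base URL and the sample HTML are all hypothetical.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImageCollector(HTMLParser):
    """Builds one row per <img> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if not src:
            return
        # Resolve relative paths against the page URL.
        url = urljoin(self.base_url, src)
        self.rows.append({
            "url": url,
            "file_name": posixpath.basename(urlparse(url).path),
            # Taken from the markup only; None if the attribute is absent.
            "width": attributes.get("width"),
            "height": attributes.get("height"),
            "alt": attributes.get("alt", ""),
        })


# Made-up page URL and markup.
BASE_URL = "http://example.com/gallery/index.html"
SAMPLE = (
    '<img src="/img/logo.png" width="120" height="40" alt="Logo">'
    '<img src="photos/cat.jpg" alt="A cat" />'
)
collector = ImageCollector(BASE_URL)
collector.feed(SAMPLE)
rows = collector.rows
```

Each entry in `rows` corresponds to one line of the table described above, minus the thumbnail.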
3. Scrape Google Results
This one is a bit more complicated, but it demonstrates how flexible and customizable the tool can be (the example was kindly shared by Dale Stokdyk)!
First, you will need to create your own scraper. Here’s a screenshot that pretty much says it all; just do what is shown there:
- Set Google to show 100 results per page (to have more data to export and analyze);
- Search for any phrase;
- Click the tool icon and click "scraper".