Page harvesting

Page harvesting enables you to manually select the elements you want to harvest on a web page, then send the selection to an ETA collection. It is particularly useful when you are conducting an investigation by browsing the internet, as you can harvest as much or as little as you like: for example, the title and abstract of an article, a social media profile, or a few paragraphs.

ETA Harvester applies the most relevant rule set to the page and highlights the elements selected for harvesting in green. If you are satisfied with the selection, simply choose the collection you want to send the text to, then click ‘Harvest’. A collection is a container for storing and organising ingested files and documents; only the textual content is stored in collections, not the original files and documents. If you want to modify the selection, you can manually select and deselect elements or apply another rule set. For more about rule sets, see Rule sets.

To view the page harvesting workflow click here.

For detailed steps see Harvesting a page.

Figure 1: Using the page harvest feature

Incognito windows

If you plan to harvest pages in incognito windows, you need to enable the incognito setting. Go to the Google Chrome/Chromium Extensions tab, then, under ETA Harvester, tick ‘Allow in incognito’.

Note: Incognito mode does not work in batch harvesting.

Figure 2: The incognito setting on the Google Chrome Extensions tab