When you add a document to a collection, default information is automatically extracted from it and marked up with coloured labels that represent entity and non-entity classes (for example, email addresses, locations, organisations and people). A collection is a container for storing and organising ingested files and documents; only the textual content is stored in collections, not the original files and documents. Each unbroken annotation of marked-up text is referred to as a text reference.
ETA offers several tools to help you explore, refine and analyse the information extracted from your documents:
The following topics describe ways you can begin to explore the information in your collections:
To view information extracted from documents:
A highly productive method for creating text references in documents and links on text graphs is to create dictionaries.
Dictionaries are lists of words and phrases of particular interest to you that you want to identify in documents, such as types of weapons, names of illicit drugs, names of specific organisations and war crime indicators. Each word or phrase is referred to as an entry. As ETA processes text, it looks for entries and creates a link on the text graph whenever it finds one.
For example, this word list ...
... creates these text references in a document.
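The idea of matching dictionary entries against document text can be illustrated with a minimal Python sketch. This is not ETA's implementation: the function name find_text_references, the whole-word matching and the longest-entry-first ordering are all assumptions made for illustration. It only shows how a word list could yield one text reference per match, ignoring case by default.

```python
import re

def find_text_references(text, entries):
    """Return (start, end, matched_text) for each dictionary entry found in text.

    Hypothetical helper: matching is case-insensitive and whole-word only.
    """
    if not entries:
        return []
    # Longest entries first, so a phrase wins over a shorter entry it contains.
    ordered = sorted(entries, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(e) for e in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]

# Example word list and document text (both invented for this sketch).
refs = find_text_references(
    "The rifle and the Grenade were seized.",
    ["rifle", "grenade"],
)
```

Each tuple in refs marks one unbroken span of matched text, which corresponds loosely to a text reference in the description above.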
To extract specific words and phrases using a dictionary:
Note: By default, ETA does not distinguish between upper-case and lower-case spelling in a word list. To specify the case of the words and phrases you want to extract, use the ‘case’ condition. Enter the exact text #cond:case followed by the case you want to specify. For example, to specify that the words and phrases must begin with a capital letter (known as title case), enter #cond:case title. The options are lower case, upper case, title and exact.
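The effect of a case condition can be sketched in Python as follows. This is an illustration under stated assumptions, not ETA's actual behaviour: the function matches_case is hypothetical, the option keywords "lower", "upper", "title" and "exact" are assumed spellings, "title" is taken to mean that the matched text begins with a capital letter, and "exact" is taken to mean that the text must match the entry exactly as typed.

```python
def matches_case(found, entry, case=None):
    """Decide whether text found in a document satisfies an entry's case condition.

    Hypothetical helper; `case` is None (default, case ignored) or one of the
    assumed keywords "lower", "upper", "title", "exact".
    """
    # The letters must always agree once case is ignored.
    if found.lower() != entry.lower():
        return False
    if case is None:
        return True                      # default: case is not distinguished
    if case == "lower":
        return found == found.lower()
    if case == "upper":
        return found == found.upper()
    if case == "title":
        return found[:1].isupper()       # assumption: first letter is a capital
    if case == "exact":
        return found == entry            # assumption: spelled exactly as entered
    raise ValueError(f"unknown case condition: {case!r}")
```

For instance, with the entry "grenade", the text "GRENADE" would match by default but not under the assumed "title" or "exact" conditions.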
The Search screen enables you to create and run search queries against one or more collections in a project. This topic shows you how to run a simple keyword search.
To search for keywords in documents:
The search results are displayed.
Networks enable you to explore and analyse the relationships between the entities in a collection or set of collections. A network is a visual summary, generated by ETA, of the information in one or more documents. You can view the information in networks in a table and as a graph.
To create a network from a collection:
The network is displayed.
Neighbours (entities with a link to the node) are displayed on the graph.
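The notion of a neighbour, an entity with a link to a given node, can be made concrete with a small Python sketch. It does not describe how ETA stores or draws networks; the function names, the undirected treatment of links and the sample entities are all assumptions chosen for illustration.

```python
from collections import defaultdict

def build_adjacency(links):
    """Build an undirected adjacency map from (entity, entity) link pairs."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def neighbours(adjacency, node):
    """Return the entities that share a link with `node`, in sorted order."""
    return sorted(adjacency.get(node, set()))

# Invented sample links between entities extracted from documents.
links = [
    ("Alice Smith", "Acme Ltd"),
    ("Alice Smith", "London"),
    ("Bob Jones", "Acme Ltd"),
]
adjacency = build_adjacency(links)
acme_neighbours = neighbours(adjacency, "Acme Ltd")
```

Here acme_neighbours holds the two people linked to the organisation, which is the set a graph view would show around that node.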