Search Guide
This guide will help you understand how to use the query string syntax when searching documents. Canopy recommends that you familiarize yourself with how we index data before using search capabilities.
There are several ways to search the document list: using the search bar, recalling a search from Search History or Saved Searches, or via Bulk Search and Tagging.
From the Review tab at the top of the screen, click on Documents in the dropdown to access the Document List page and find the search bar:
Search results are shown for the search highlighted in green text:
You can recall a search from the search history by clicking on Advanced Search and then selecting Search History:
Click on the search text or magnifying glass icon to recall and run the historical search:
Once the search is recalled, the search bar is populated with the historical search query:
You can save searches using the Save Search button:
You can recall a saved search by clicking on Advanced Search and then selecting Saved Searches:
By clicking on the pencil icon, you can rename the search. Click on the search text or magnifying glass icon to recall and run the saved search:
Once the search is recalled, the search bar is populated with the saved search query:
Given that you are using the new Standard Search Syntax, your query string will use Fields, Terms, and Operators.
When running a query, each search term is matched against a Field. You may select a Field from the Field Names list outlined below. If no field is specified, the term is searched in the default content.text field, which contains extracted or OCRed text.
The content.text field currently supports a maximum of 10K characters. Files that have text exceeding the character limit will be marked partially indexed.
A Term can either be a single word or a phrase. Double quotes are used to indicate a phrase. For example, you can search on tree or work; "tree work" will search for all words in the phrase, in order.
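As a rough illustration of how Fields and Terms combine into a query string, the sketch below formats a single clause. The helper name clause is hypothetical and not part of Canopy; it only assumes the field:term layout and the double-quoted phrase convention described above.

```python
def clause(text, field=None):
    """Format one search term or phrase, optionally scoped to a field.

    Hypothetical helper for illustration only; not a Canopy API.
    """
    # A phrase (any text containing whitespace) is wrapped in straight double quotes.
    value = f'"{text}"' if any(ch.isspace() for ch in text) else text
    # With no field given, the search falls back to the default field.
    return f"{field}:{value}" if field else value


print(clause("tree"))
print(clause("tree work"))
print(clause("tree work", field="content.text"))
```

Leaving out the field produces a clause that is searched against the default field.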
An Operator allows you to customize your search. Operator options are explained after the Field Names section.
You can run Wildcard searches on individual terms. Use ? to replace a single character and * to replace zero or more characters:
Tr?e wo*
Click here for more information on Regular Expression Syntax
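To make the meaning of ? and * concrete, the sketch below translates a wildcard term into a Python regular expression and tests it against single words. It is an approximation for illustration: the function names are hypothetical, and case-insensitive matching is an assumption rather than documented Canopy behavior.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a ?/* wildcard term into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")        # exactly one character
        elif ch == "*":
            parts.append(".*")       # zero or more characters
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption made for this sketch.
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_match(pattern, word):
    """True when the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


for word in ("tree", "true", "tre"):
    print(word, wildcard_match("tr?e", word))
for word in ("work", "wo", "two"):
    print(word, wildcard_match("wo*", word))
```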
Terms that are similar, but not an exact match, can be searched using the fuzzy operator:
tre~ wrk~
Although the default edit distance is set to 2 characters, an edit distance of 1 will catch 80% of misspellings:
Wrk~1
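The idea of edit distance can be sketched with the classic Levenshtein algorithm, which counts single-character insertions, deletions and substitutions. This is an illustrative assumption: the search engine may measure distance slightly differently (for example, by counting a swap of two adjacent characters as one edit), and the function names below are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, word, max_edits=2):
    """True when word is within max_edits edits of term."""
    return edit_distance(term, word) <= max_edits


print(edit_distance("tre", "tree"))
print(edit_distance("wrk", "work"))
print(fuzzy_match("wrk", "walk", max_edits=1))
```

Under this measure, tre and wrk are each one edit away from tree and work, so an edit distance of 1 is enough to find both.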
You can use a proximity search to allow text to be further apart or in a different order than when searching on a quoted phrase. You may specify a maximum edit distance of words in a phrase, as in the example below:
"tree work"~4
Documents with text that more closely matches the original specified order will be considered more relevant to your search.
You can use different brackets to denote specific ranges for date, numeric and string fields. Use square brackets to specify inclusive ranges [min-max]:
All days in 2018:
date:["2018-01-01 00:00:00.000" TO "2018-12-31 23:59:59.999"]
Numbers from 100 upwards:
count:[100 TO *]
Use curly brackets to specify exclusive ranges {min-max}:
Tags between delta and sigma, excluding delta and sigma:
tag:{delta TO sigma}
Dates before 2018
date:{* TO "2018-01-01 00:00:00.000"}
Curly and square brackets can also be combined:
Numbers from 1 up to but not including 8:
count:[1 TO 8}
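The bracket rules above can be captured in a small formatting helper. This is a sketch, not a Canopy API: range_clause and its parameters are hypothetical names, and date values would need to be passed in already wrapped in double quotes.

```python
def range_clause(field, low="*", high="*", include_low=True, include_high=True):
    """Build a range clause; * leaves that side of the range unbounded."""
    opening = "[" if include_low else "{"    # square = inclusive, curly = exclusive
    closing = "]" if include_high else "}"
    return f"{field}:{opening}{low} TO {high}{closing}"


print(range_clause("count", 100))
print(range_clause("tag", "delta", "sigma", include_low=False, include_high=False))
print(range_clause("count", 1, 8, include_high=False))
```

The three calls reproduce the count:[100 TO *], tag:{delta TO sigma} and count:[1 TO 8} examples from this section.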
The following syntax can be used for ranges with one side unbounded:
age:>30
age:>=30
age:<30
age:<=30
meta.pii_density.socialsecuritynumber:>100
"meta.pii_density.eu phone":>10
To combine an upper and a lower bound, join the two clauses by using the AND operator:
age:(>=10 AND <30)
age:(+>=10 +<30)
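To see which values a combined range such as age:(>=10 AND <30) selects, the comparison can be mimicked in plain Python. The names COMPARATORS and satisfies are hypothetical and exist only for this sketch; it models the comparison logic, not how the search engine evaluates the query.

```python
import operator

# Map each one-sided range operator to the matching Python comparison.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def satisfies(value, bounds):
    """True when value passes every (comparator, limit) pair, as with AND."""
    return all(COMPARATORS[op](value, limit) for op, limit in bounds)


bounds = [(">=", 10), ("<", 30)]   # mirrors age:(>=10 AND <30)
print([age for age in (9, 10, 29, 30) if satisfies(age, bounds)])  # prints [10, 29]
```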
You can utilize the boost operator to make one word more relevant than another. For example, if you want to find all documents about trees, but are especially interested in sugar maples, search the following:
sugar^2 maple
Although the default boost value is 1, it can be any positive floating point number. Boosts between 0 and 1 reduce relevance.
The boost operator can also be applied to phrases or groups:
"tree work"^2
(sugar maple)^4
A search using the terms sugar maple tree will find any document that contains one or more of sugar, maple or tree. Boolean operators can be used within your query to control which terms are required and which are excluded.
Boolean operators include + (this term must be included) and - (this term must not be included), while all other terms are optional. For example, sugar maple +tree -work states that:
tree must be included
work must not be included
sugar and maple are optional; their inclusion increases relevance
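The + and - rules can be sketched as a toy matcher over the words of a document. This is a simplified model under stated assumptions: it splits on whitespace, ignores phrases, fields and real relevance scoring, and the function names parse and evaluate are hypothetical. The count of optional terms found stands in for relevance.

```python
def parse(query):
    """Split a query into required (+), prohibited (-) and optional terms."""
    required, prohibited, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+"):
            required.add(token[1:])
        elif token.startswith("-"):
            prohibited.add(token[1:])
        else:
            optional.add(token)
    return required, prohibited, optional


def evaluate(query, text):
    """Return (matches, optional_hits) for a plain-text document."""
    required, prohibited, optional = parse(query)
    words = set(text.lower().split())
    # Every required term must be present and no prohibited term may appear.
    if not required <= words or prohibited & words:
        return False, 0
    hits = len(optional & words)
    # With no required terms, at least one optional term must be present.
    if not required and optional and hits == 0:
        return False, 0
    return True, hits


query = "sugar maple +tree -work"
for text in ("a sugar maple tree", "an oak tree", "tree work order", "sugar maple syrup"):
    print(text, evaluate(query, text))
```

In this model, "an oak tree" still matches because only tree is required, while "a sugar maple tree" matches with two optional hits and would rank higher.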
The operators AND, OR and NOT (also written &&, || and !) can also be used, but their specific functionality should be considered carefully. NOT takes precedence over AND, which takes precedence over OR. While + and - only affect the term to the right of the operator, AND and OR affect the terms to the left and right. For example, consider rewriting the query sugar maple +tree -work with these operators:
sugar OR maple AND tree AND NOT work
This example will yield an inaccurate result, because maple is now a required term.
(sugar OR maple) AND tree AND NOT work
This example will yield an inaccurate result because at least one of sugar or maple is now required, and the search for those terms would now be scored differently from the original query.
((sugar AND tree) OR (maple AND tree) OR tree) AND NOT work
This example replicates the logic from the original query, but the relevance scoring will not match that of the original query.
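A brute-force truth table can confirm that the fully parenthesized query selects the same documents as sugar maple +tree -work. The sketch below treats each term as a simple present/absent flag and says nothing about relevance scoring; the function names are hypothetical.

```python
from itertools import product


def explicit_form(sugar, maple, tree, work):
    # ((sugar AND tree) OR (maple AND tree) OR tree) AND NOT work
    return ((sugar and tree) or (maple and tree) or tree) and not work


def plus_minus_form(sugar, maple, tree, work):
    # sugar maple +tree -work: tree is required, work is prohibited
    return tree and not work


# Compare both forms for every combination of term presence.
same = all(
    explicit_form(*flags) == plus_minus_form(*flags)
    for flags in product([False, True], repeat=4)
)
print(same)  # prints True
```

Both forms reduce to "tree is present and work is absent"; sugar and maple affect only the ranking.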
The operators AND, NOT and OR must be in upper case.
Groups of terms or clauses can be created by using parentheses to form sub-queries:
(sugar OR maple) AND tree
Groups can be used to focus on a particular field or boost results of a sub-query:
piitag:(name OR phone) title:(full text search)^3
Groups can be used to find a list of values in a field:
id:(2FG2G55FGF OR 2FG2G55CGF OR 3FG2G55FGF)
Alternately:
id:(2FG2G55FGF 2FG2G55CGF 3FG2G55FGF)
Reserved characters include the following: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
When a reserved character is part of the text you are searching for, rather than a search operator, escape it with a leading backslash. For example, to search for (2+2)-4, you would need to write \(2\+2\)\-4.
Please note that < and > can’t be escaped. The only way to prevent them from acting as range operators is to remove them from the query string entirely.
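The handling of reserved characters can be sketched as a small pre-processing step applied to literal text before it is placed in a query. The helper escape_query_text is hypothetical, and escaping each & and | individually is an assumption made for this sketch.

```python
# Reserved characters that take a leading backslash. The characters < and >
# are handled separately because they have to be removed from the query.
ESCAPABLE = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_text(text):
    """Escape reserved characters in literal text and drop < and >."""
    out = []
    for ch in text:
        if ch in "<>":
            continue              # removed entirely rather than escaped
        if ch in ESCAPABLE:
            out.append("\\")      # leading backslash before the reserved character
        out.append(ch)
    return "".join(out)


print(escape_query_text("(2+2)-4"))
print(escape_query_text("a<b>c"))
```

Apply the helper only to literal search text, never to a whole query, since it would also escape the operators you intend to use.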