Product Documentation

Search Guide

Search Interface

This guide will help you understand how to use the query string syntax when searching documents. Canopy recommends that you familiarize yourself with how we index data before using search capabilities.

There are several ways to search the document list: using the search bar, recalling a search from Search History or Saved Searches, or via Bulk Search and Tagging.

From the Review tab at the top of the screen, click on Documents in the dropdown to access the Document List page and find the search bar:

img_10.png

Results are shown for the current search, which is highlighted in green text:

img_15.png

Search History

You can recall a search from the search history by clicking on Advanced Search and then selecting Search History:

img_2.png

Click on the search text or magnifying glass icon to recall and run the historical search:

img_4.png

Once the search is recalled, the search bar is populated with the historical search query:

img_12.png

Saved Searches

You can save searches using the Save Search button:

img_17.png

You can recall a saved search by clicking on Advanced Search and then selecting Saved Searches:

img_3.png

By clicking on the pencil icon, you can rename the search. Click on the search text or magnifying glass icon to recall and run the saved search:

img_7.png

Once the search is recalled, the search bar is populated with the saved search query:

img_16.png

Search Syntax

Overview

Given that you are using the new Standard Search Syntax, your query string will use Fields, Terms, and Operators.

When running a query, each search term is matched against a Field. You may select a Field from the Field Names list outlined below. If no field is specified, the term is searched in the default content.text field, which contains extracted or OCRed text.

The content.text field currently supports a maximum of 10K characters. Files whose text exceeds this limit are marked as partially indexed.

A Term can be either a single word or a phrase. Double quotes indicate a phrase. For example, you can search on the single words tree or work, while "tree work" searches for all the words in the phrase, in the same order.

An Operator allows you to customize your search. Operator options are explained after the Field Names section.

You can run Wildcard searches on individual terms. Use ? to replace a single character and * to replace zero or more characters: Tr?e wo*
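To make the two wildcard characters concrete, here is a minimal Python sketch that translates a wildcard term into a regular expression. It only illustrates what ? and * mean; it is not how Canopy evaluates wildcards, and the function names and the case-insensitive matching are assumptions made for the example.

```python
import re

def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a ?/* wildcard term into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption made for this sketch.
    return re.compile("".join(parts), re.IGNORECASE)

def wildcard_match(pattern: str, word: str) -> bool:
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None

print(wildcard_match("Tr?e", "tree"))   # True: ? stands for the first "e"
print(wildcard_match("Tr?e", "three"))  # False: ? replaces one character only
print(wildcard_match("wo*", "work"))    # True: * stands for "rk"
print(wildcard_match("wo*", "wo"))      # True: * may match nothing
```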

Regular Expressions (regex)

For more information, see the Regular Expression Syntax documentation.

Fuzziness

Terms that are similar, but not an exact match, can be searched using the fuzzy operator:

tre~ wrk~

The default edit distance is 2, but an edit distance of 1 should be enough to catch 80% of misspellings. To set the distance, append it to the fuzzy operator:

Wrk~1
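An edit distance counts the single-character changes needed to turn one word into another. The sketch below assumes the measure is the Damerau-Levenshtein style distance common in Lucene-based engines (insert, delete, substitute, or swap two adjacent characters); this guide does not state which measure Canopy uses, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between two words."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                     # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j                     # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + cost  # substitution (or match)
            )
            # Swap of two adjacent characters counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]

def fuzzy_match(term: str, word: str, max_edits: int = 2) -> bool:
    """Return True if word is within max_edits of term (case-insensitive)."""
    return edit_distance(term.lower(), word.lower()) <= max_edits

print(edit_distance("wrk", "work"))             # 1: insert "o"
print(fuzzy_match("Wrk", "walk", max_edits=1))  # False: needs two edits
print(fuzzy_match("Wrk", "walk", max_edits=2))  # True
```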

Proximity Searches

You can use a proximity search to allow the words of a phrase to be further apart, or in a different order, than an exact quoted phrase would allow. The number after the tilde specifies the maximum edit distance between the words in the phrase, as in the example below:

"tree work"~4

Documents with text that more closely matches the original specified order will be considered more relevant to your search.

Ranges

You can use different brackets to denote ranges for date, numeric and string fields. Use square brackets to specify inclusive ranges, [min TO max].

All days in 2018:

date:["2018-01-01 00:00:00.000" TO "2018-12-31 00:00:00.000"]

Numbers from 100 upwards:

count:[100 TO *]

Use curly brackets to specify exclusive ranges, {min TO max}.

Tags between delta and sigma, excluding delta and sigma:

tag:{delta TO sigma}

Dates before 2018:

date:{* TO "2018-01-01 00:00:00.000"}

Curly and square brackets can also be combined.

Numbers from 1 up to but not including 8:

count:[1 TO 8}

The following syntax can be used for ranges with one side unbounded:

age:>30

age:>=30

age:<30

age:<=30

meta.pii_density.socialsecuritynumber:>100

"meta.pii_density.eu phone":>10

To combine an upper and a lower bound using this simplified syntax, join the two clauses with the AND operator:

age:(>=10 AND <30)

age:(+>=10 +<30)
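If you assemble range clauses programmatically, a small helper can keep the bracket types straight. The following Python sketch is only an illustration: the range_clause function and its parameters are hypothetical and not part of Canopy, and the rule of quoting any value that contains a space is an assumption based on the date examples above.

```python
def _format_value(value) -> str:
    """Quote values that contain spaces, such as timestamps (an assumption)."""
    text = str(value)
    return f'"{text}"' if " " in text else text

def range_clause(field, low="*", high="*", include_low=True, include_high=True) -> str:
    """Build field:[low TO high], using { or } for an exclusive bound."""
    open_bracket = "[" if include_low else "{"
    close_bracket = "]" if include_high else "}"
    return (
        f"{field}:{open_bracket}{_format_value(low)} TO "
        f"{_format_value(high)}{close_bracket}"
    )

print(range_clause("count", 100))                        # count:[100 TO *]
print(range_clause("count", 1, 8, include_high=False))   # count:[1 TO 8}
print(range_clause("tag", "delta", "sigma", False, False))  # tag:{delta TO sigma}
```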

Boosting

You can use the boost operator ^ to make one term more relevant than another. For example, if you want to find all documents about trees, but are especially interested in sugar maples, search the following:

sugar^2 maple

Although the default boost value is 1, it can be any positive floating point number. Boosts between 0 and 1 reduce relevance.

The boost operator can also be applied to phrases or groups:

"tree work"^2 (sugar maple)^4

Boolean Operators

A search using the terms sugar maple tree will find any document that contains one or more of sugar, maple, or tree. Boolean operators can be used within your query to give you more control over which terms are required.

Boolean operators include + (this term must be included) and - (this term must not be included), while all other terms are optional. For example, sugar maple +tree -work states that:

  • tree must be included
  • work must not be included
  • sugar and maple are optional; their inclusion increases relevance
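The rules in the list above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Canopy's implementation: documents are treated as sets of lower-cased words, the score is simply the number of matched terms (real relevance scoring is more involved), and the parse and evaluate names are hypothetical.

```python
def parse(query: str):
    """Split a query into required (+), prohibited (-), and optional terms."""
    must, must_not, optional = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+"):
            must.add(token[1:])
        elif token.startswith("-"):
            must_not.add(token[1:])
        else:
            optional.add(token)
    return must, must_not, optional

def evaluate(query: str, text: str):
    """Return (matches, toy_score) for a document's text."""
    words = set(text.lower().split())
    must, must_not, optional = parse(query)
    if not must <= words or must_not & words:
        return False, 0     # a required term is missing or a prohibited one is present
    if not must and optional and not (optional & words):
        return False, 0     # with no required terms, at least one optional term must match
    return True, len((must | optional) & words)

query = "sugar maple +tree -work"
print(evaluate(query, "the sugar maple tree"))  # (True, 3)
print(evaluate(query, "a tall tree"))           # (True, 1)
print(evaluate(query, "tree work schedule"))    # (False, 0)
```

The first two documents both match, but the one that also contains the optional terms scores higher.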

The operators AND, OR and NOT (also written &&, || and !) can also be used, but their specific functionality should be considered carefully. NOT takes precedence over AND, which takes precedence over OR. While + and - only affect the term to the right of the operator, AND and OR affect the terms to both the left and the right. For example, each of the following queries attempts to rewrite sugar maple +tree -work using these operators:

sugar OR maple AND tree AND NOT work

This query does not replicate sugar maple +tree -work, because maple is now a required term.

(sugar OR maple) AND tree AND NOT work

This query is also inaccurate, because at least one of sugar or maple is now required, and those terms are scored differently than in the original query.

((sugar AND tree) OR (maple AND tree) OR tree) AND NOT work

This example replicates the logic from the original query, but the relevance scoring will not match that of the original query.

The operators AND, NOT and OR must be in upper case.
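One way to compare the three rewrites with the original query is to enumerate every combination of terms a document could contain. The Python sketch below treats each query as a plain Boolean expression, relying on the fact that Python's not, and, and or follow the same precedence order described above. That pure Boolean reading is an assumption: a real query parser may decide which terms are required differently (as the note about maple suggests), and relevance scoring is not modelled at all.

```python
from itertools import product

def original(sugar, maple, tree, work):
    # sugar maple +tree -work: only tree and work decide whether a document matches.
    return tree and not work

def rewrite_1(sugar, maple, tree, work):
    return sugar or maple and tree and not work

def rewrite_2(sugar, maple, tree, work):
    return (sugar or maple) and tree and not work

def rewrite_3(sugar, maple, tree, work):
    return ((sugar and tree) or (maple and tree) or tree) and not work

def mismatches(candidate):
    """List the (sugar, maple, tree, work) combinations where candidate disagrees."""
    return [
        combo
        for combo in product([False, True], repeat=4)
        if candidate(*combo) != original(*combo)
    ]

for name, fn in [("rewrite 1", rewrite_1), ("rewrite 2", rewrite_2), ("rewrite 3", rewrite_3)]:
    print(name, "differs on", len(mismatches(fn)), "of 16 combinations")
```

Under these assumptions only the third rewrite selects exactly the same documents as the original query, which agrees with the explanation above.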

Grouping

Groups of terms or clauses can be created by using parentheses to form sub-queries:

(sugar OR maple) AND tree

Groups can be used to focus on a particular field or boost results of a sub-query:

piitag:(name OR phone) title:(full text search)^3

Groups can be used to find a list of values in a field:

id:(2FG2G55FGF OR 2FG2G55CGF OR 3FG2G55FGF)

Alternately: id:(2FG2G55FGF 2FG2G55CGF 3FG2G55FGF)
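When the list of values comes from another system, the grouped clause can be generated rather than typed. This is a minimal Python sketch; field_group is a hypothetical helper, not a Canopy function, and it assumes the values contain no reserved characters.

```python
def field_group(field: str, values, operator: str = "OR") -> str:
    """Build field:(v1 OR v2 ...); pass operator="" for the space-separated form."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    joiner = f" {operator} " if operator else " "
    return f"{field}:({joiner.join(values)})"

ids = ["2FG2G55FGF", "2FG2G55CGF", "3FG2G55FGF"]
print(field_group("id", ids))               # id:(2FG2G55FGF OR 2FG2G55CGF OR 3FG2G55FGF)
print(field_group("id", ids, operator=""))  # id:(2FG2G55FGF 2FG2G55CGF 3FG2G55FGF)
```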

Reserved Characters

Reserved characters include the following: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /

When you need to use any reserved character as literal text in your query, rather than as an operator, escape it with a leading backslash. For example, to search for (2+2)-4, you would need to write \(2\+2\)\-4.

Please note that < and > can't be escaped at all. The only way to prevent them from acting as a range operator is to remove them from the query string entirely.
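The escaping rules can be applied automatically before user-supplied text is placed in a query. The Python sketch below is an illustration only: escape_query_text is a hypothetical helper, and escaping every single & and | (rather than only the && and || pairs) is a simplifying assumption.

```python
# Characters escaped with a backslash; < and > are handled separately below.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')

def escape_query_text(text: str) -> str:
    """Escape reserved characters and drop < and >, which cannot be escaped."""
    out = []
    for ch in text:
        if ch in "<>":
            continue                # remove: these cannot be escaped
        if ch in RESERVED:
            out.append("\\" + ch)   # prefix with a backslash
        else:
            out.append(ch)
    return "".join(out)

print(escape_query_text("(2+2)-4"))  # \(2\+2\)\-4
print(escape_query_text("a<b>c"))    # abc
```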