1.41
Release: May 3, 2023
During processing, the extracted file information was sometimes not completely updated in the search index or backend database. This has been fixed.
Release: April 28, 2023
The “Canadian Province” field can now be mapped to more values (Québec, Nouvelle-Écosse, etc.) from spreadsheets. This change is also reflected when adding the value manually from the dropdown. These values are normalized to their English versions and written without diacritics in the entity list.
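The normalization described above can be sketched as follows. This is only an illustration: the lookup table `FRENCH_TO_ENGLISH` and the helper names are hypothetical, and the actual mapping Canopy uses is not documented in these notes.

```python
import unicodedata

# Hypothetical lookup: French province names (diacritics removed, lowercased)
# mapped to their English equivalents. The real list is an assumption.
FRENCH_TO_ENGLISH = {
    "quebec": "Quebec",
    "nouvelle-ecosse": "Nova Scotia",
    "colombie-britannique": "British Columbia",
}


def strip_diacritics(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_province(raw: str) -> str:
    """Return the English, diacritic-free name for a province value."""
    plain = strip_diacritics(raw).strip()
    # Fall back to the diacritic-free input when no translation is known.
    return FRENCH_TO_ENGLISH.get(plain.lower(), plain)
```

With this sketch, both “Québec” and “Quebec” resolve to the same entity list value.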
Release: April 24, 2023
When downloaded, eml and msg files were not showing the file extension. This bug has been fixed.
Release: April 15, 2023
Several processing bugs were fixed:
- OCR Processing Settings: If “Skip image OCR by image size (Eg : 200X200)” was selected as a processing option, OCR was skipped for all images. This has been fixed.
- When converting multi-page TIFFs to PNG images, only the first page was converted. This has been fixed.
- The number of parameters passed in a function call differed from the function definition. The parameters at the function call have been corrected.
- Updated_datetime in Elasticsearch (ES) used the format “yyyy-MM-dd HH:mm:ss.SSS”, but the pipeline was passing “yyyy-MM-dd HH:mm:ss”. The datetime format in the pipeline has been changed to “yyyy-MM-dd HH:mm:ss.SSS”.
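For the datetime fix in the last item, a minimal sketch of producing the millisecond-precision format is shown below. The pipeline's implementation language is not stated in these notes, so Python and the helper name `format_updated_datetime` are assumptions used only for illustration.

```python
from datetime import datetime


def format_updated_datetime(value: datetime) -> str:
    """Format a timestamp as yyyy-MM-dd HH:mm:ss.SSS (millisecond precision)."""
    # isoformat() with timespec="milliseconds" truncates microseconds to
    # three digits, which lines up with the .SSS part of the index format.
    return value.isoformat(sep=" ", timespec="milliseconds")
```

A timestamp with zero microseconds still gets the `.000` suffix, so every value has the same shape.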
Release: April 14, 2023
When viewing a large document or Excel spreadsheet and paging to the next document, the document view was not switching because the previous document’s data was still streaming.
Now, when paging to the next document, the document view will be shown correctly, regardless of the size of the previous document.
Release: April 07, 2023
Invalid characters were found in file paths during mapping. The file path is now encoded to resolve this issue.
Converting empty rows in Excel caused date conversion issues in columns containing date fields. We now check for empty rows to avoid the conversion issue.
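The empty-row check can be pictured with the sketch below. It assumes rows arrive as lists of cell values and that dates are stored as text in a known format. Both are assumptions for illustration, and the helper names are hypothetical.

```python
from datetime import date, datetime


def is_empty_row(row) -> bool:
    """True when every cell is None or whitespace-only."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def convert_date_column(rows, date_index: int, fmt: str = "%Y-%m-%d") -> list[date]:
    """Parse the date column, skipping fully empty rows before conversion."""
    converted = []
    for row in rows:
        if is_empty_row(row):
            continue  # an empty row has no date to convert
        converted.append(datetime.strptime(str(row[date_index]), fmt).date())
    return converted
```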
Release: April 07, 2023
When the search query cannot be parsed, an error message will appear under the search bar: “Invalid search syntax detected. Please check our Search Guide for help.” Syntax issues that could cause a parser error include the use of reserved characters and invalid regular expressions. To remove the error, clear the search bar or chip and re-enter the search.
DBR-5113 Support Regex Search on Chains of Numeric Characters Separated by Non-Alphabetic Delimiters
Currently, when pattern searching for chains of numeric characters separated by non-alphabetic delimiters, the filter generates separate tokens. For example, 01-02-03 → [01],[02],[03]. This is the default behavior whether or not a search field is specified. This means the following applies to a pattern search:
/[0-9]{2}-[0-9]{2}-[0-9]{2}/ → would not match on text content 01-02-03
name:/[0-9]{2}-[0-9]{2}-[0-9]{2}/ → would not match on file name 01-02-03
However, a pattern search for a single group of numeric characters would match. For example:
/[0-9]{2}/ → would match on each of the strings 01, 02, and 03
name:/[0-9]{2}/ → would match on each of the strings 01, 02, and 03
Where an Elastic field is not specified, we are changing the default behavior to leave chains of numeric characters separated by non-alphabetic delimiters intact. Thus, the following applies when pattern searching without a field:
/[0-9]{2}-[0-9]{2}-[0-9]{2}/ → would match on 01-02-03
/[0-9]{2}/ → would not match on any of the strings 01, 02, and 03
The behavior for field pattern matching remains unchanged.
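The difference between the two tokenization behaviors can be illustrated with a small simulation. This is not the Elastic analyzer itself. The tokenizer functions are hypothetical stand-ins, and whole-token matching via `re.fullmatch` is an assumption about how a pattern is applied to each token.

```python
import re


def tokenize_split(text: str) -> list[str]:
    """Old default: every non-alphanumeric delimiter starts a new token."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def tokenize_keep_numeric_chains(text: str) -> list[str]:
    """New default: digit groups joined by non-alphabetic delimiters stay intact."""
    chain = r"[0-9]+(?:[^0-9A-Za-z\s]+[0-9]+)+"
    return re.findall(chain + r"|[0-9A-Za-z]+", text)


def pattern_hits(pattern: str, tokens: list[str]) -> list[str]:
    """Return the tokens that the pattern matches in full."""
    return [tok for tok in tokens if re.fullmatch(pattern, tok)]


text = "invoice 01-02-03 paid"
old_tokens = tokenize_split(text)                # 01, 02, 03 are separate tokens
new_tokens = tokenize_keep_numeric_chains(text)  # 01-02-03 is a single token
```

Under the old tokens, a pattern such as `[0-9]{2}-[0-9]{2}-[0-9]{2}` finds nothing while `[0-9]{2}` hits three times. Under the new tokens the results are reversed.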
The “Other Personal Info” field can be up to 64 characters long. When the field was defined with a long name, the user lost the ability to delete the field on the Manage Entity Fields screen and to add it to or remove it from a layout on the Manage Entity Layout screen. Now, the long field name display is truncated so that you can delete and add the field. Hovering over the truncated field will show the full field name in a tooltip.
The date fields for .ics and .vcf files were not being populated in the document list and document view. We now populate the metadata fields “Send Date” and “Received Date” for .ics and .vcf files.
Sometimes .emf files were stalling during processing. We made changes to the architecture to support large .emf and .wmf files.
Release: April 01, 2023
Entities will not be propagated to Excel and CSV files to avoid false positives and improve propagation speed.
The “Other Personal Info” field can be up to 64 characters long.
When the field was defined with a long name, the user lost the ability to delete the field on the Manage Entity Fields screen and to add it to or remove it from a layout on the Manage Entity Layout screen.
Now, the long field name display is truncated so that you can delete and add the field. Hovering over the truncated field will show the full field name in a tooltip.
Release: March 30, 2023
Enhanced Metafile (EMF) files are now processed by converting them to Portable Network Graphics (PNG) files. The PNG is OCR’d and will display during review. During processing, if the file cannot be recognized as an EMF or cannot be converted to a PNG, it will fail processing with the error message “Unsupported format or corrupt file.”
Release: March 29, 2023
PII checkbox highlighting on large text files was hanging the browser view. We have now disabled highlights for text files with over 100,000 characters (content.length > 100,000).
Release: March 29, 2023
When Canopy’s app identifies an entity with one or more related elements, it will automatically add it as a raw entity to a document. In the raw entity list, users will be able to filter on the Entry Method field for “Automated” to find where the Canopy app has added raw entities. Users will now be able to filter on three types of entity entry methods: “Manual,” “Automated,” and “Propagated.”
Files with the extension .out are typically UNIX-based executables, object code, or libraries, and should be skipped during processing. These files are now skipped by extension. Skipped .out files will appear in the list of skipped files on the Upload and Processing dashboard.
MS Excel files with empty sheets failed during PII detection while processing. We implemented additional checks to handle empty Excel sheets while detecting PII. This problem has been fixed.
For old projects, custodian information is not present in the project metadata, which is checked for Bulk Mapping. An additional check was added for old projects that do not have custodians. This problem has been fixed.
Release: March 24, 2023
If more than one modified version of an XLS file was uploaded, previous versions were being remembered during smart mapping. Now, when a modified version of a file is uploaded, the previous modified versions are deleted.
After updating a global template, the tenant template flag was being changed to false. Now, if the template is changed or updated, the flag will not be changed.
The Manage Review Settings->Allow Users to Download Documents permission restricts or allows both “file downloading” and “image viewing.” When a “Lite Reviewer” or a “Review User” was restricted from downloading files, users in these roles were also prohibited from viewing images.
Now, image viewing is no longer restricted for users who are prohibited from downloading files.
Release: March 20, 2023
If the master had a blank value and the children had values, the child values were not being propagated to the master. Now, if the master has a blank value and the first child encountered has a value, that value will be assigned to the master. Conflict checks will then proceed as normal, and the user can manually resolve any conflicts.
Release: March 16, 2023
When initiating a file upload process, an error may be received that could mean either there is a network error, or that the file being uploaded is locked. If the upload process receives an error while initiating the upload, and the network is up and running, then the upload will fail with the message “Error reading file.” If the upload process receives an error while initiating the upload and there is no network connectivity, the process will wait until the network comes back online. Once back online, if the error still persists, then the upload will fail with a message “Error reading file.” As long as the browser session is not refreshed, and the access token has not expired (72 hours), then the upload will continue to retry until the network has reconnected.
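The decision flow above can be sketched as a small retry loop. The callables `start_upload` and `network_is_up` are hypothetical placeholders, and the sketch leaves out the browser session and 72-hour access token limits. It illustrates the described behavior under those assumptions and is not Canopy's client code.

```python
import time


class UploadError(Exception):
    """Raised when the upload cannot be initiated."""


def initiate_upload(start_upload, network_is_up, poll_seconds=1.0, sleep=time.sleep):
    """Try to start an upload, waiting out network outages before retrying."""
    while True:
        try:
            return start_upload()
        except OSError as exc:
            if network_is_up():
                # Network is fine, so the file itself could not be read.
                raise UploadError("Error reading file.") from exc
            # No connectivity: wait until the network returns, then retry.
            while not network_is_up():
                sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise in a test without real delays.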
Release: March 16, 2023
When headers and columns don’t match, running PII detection on CSV files fails. Now, PII detection will ignore rows that do not match the headers. This change does not affect the viewing or mapping of CSV files.
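One way to picture the row filtering is the sketch below. It assumes that “match” means a row has the same number of columns as the header row. That interpretation and the function name `rows_matching_header` are assumptions for illustration.

```python
import csv
import io


def rows_matching_header(csv_text: str):
    """Return the header and only the rows whose column count equals it."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return [], []  # empty file: nothing to scan
    kept = [row for row in reader if len(row) == len(header)]
    return header, kept
```

Only the rows returned here would be passed on for PII detection, while the original file stays untouched for viewing and mapping.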
Release: March 10, 2023
DBR-4579 Merging Based on Elements with Non-Null Empty Values or Merging Elements when Entities are in Conflict
This patch fixes two distinct issues found when consolidating:
1. Downstream processes were adding empty arrays (a non-null, but empty value) to element values used in consolidation (e.g., PUIDs). Consolidation then merged on the empty array, using it as the key value. The solution was to fix the downstream process to make sure that empty arrays are not stored as element values. This has been fixed.
2. Master entities are merging some elements of an entity which have a conflict (i.e., a grouped entity). This has been fixed.
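For the first issue, the idea of keeping empty values out of the element data can be sketched as below. The dictionary shape and the helper name `drop_empty_values` are hypothetical, and the real fix lives in the downstream process rather than in a helper like this.

```python
def drop_empty_values(elements: dict) -> dict:
    """Remove element values that are None, empty strings, or empty collections."""
    empty_markers = (None, "", [], (), {})
    # A value survives only if it is not equal to any of the empty markers,
    # so an empty array can never become a merge key later on.
    return {name: value for name, value in elements.items() if value not in empty_markers}
```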
The most recent version of the XLS library was found to have issues. We have downgraded the library to the previous version while waiting for a new release of the XLS library. This fixes the current problem.
Release: March 6, 2023
Bulk edits were applied to raw entities instead of the entities filtered by document IDs. This problem has been fixed and released to all regions.
Release: March 2, 2023
If the .zip, .rar, or .7z containers preserve the last modified and creation dates of the archived files, Canopy will use these dates for the files extracted from the archive. In a future release, these date fields will be accessible through the front end. Currently, users will not see any change in functionality.
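As an illustration of where such dates come from, the sketch below reads the last-modified timestamp that a .zip container stores for each member, using Python's standard `zipfile` module. It covers .zip only (.rar and .7z need third-party libraries), and the function name is hypothetical. How Canopy reads these dates is not described in these notes.

```python
import io
import zipfile
from datetime import datetime


def archived_modified_dates(zip_bytes: bytes) -> dict[str, datetime]:
    """Map each archived file name to the last-modified time stored in the zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: datetime(*info.date_time)  # (Y, M, D, h, m, s) tuple
            for info in archive.infolist()
            if not info.is_dir()
        }


# Build a small archive in memory with a known timestamp to show the round trip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    member = zipfile.ZipInfo("report.txt", date_time=(2023, 3, 2, 10, 30, 0))
    archive.writestr(member, "sample content")
dates = archived_modified_dates(buffer.getvalue())
```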
Customers can load Canopy-supplied over-inclusive rules to use in sampling to find detections that were not found by the standard detection methods. As the name implies, these detection rules should return many more false positives than the standard detection methods. Low-confidence rules include: US_SSN, US_Passport, CA_Passport, CA_SIN, UK_Passport, AU_Passport, AU_TFN, NZ_Passport, and NZ_IRD. Read more about this and download the rules from the custom rules documentation.
This feature supports the incremental loading of rows for tabular data (CSV/Excel). Once opened for the first time, files will be converted into a more lightweight format, ensuring that subsequent access is faster. All previous functionality will remain.
This feature will load rows from tabular data formats (e.g. CSV, Excel) as the user scrolls down, instead of loading all the rows at one time. This allows for a smoother and faster user experience when dealing with large tabular datasets. All previous functionality, like name and address splitting, as well as mapping entities, will remain.
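The incremental loading idea can be sketched with a generator that hands out rows one batch at a time. The batch size and the function name `iter_row_batches` are assumptions, and the conversion to a lightweight format mentioned above is not modeled here.

```python
import csv
import io
from itertools import islice


def iter_row_batches(csv_file, batch_size: int = 100):
    """Yield rows in batches so a viewer can request more as the user scrolls."""
    reader = csv.reader(csv_file)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return  # no more rows to load
        yield batch


# Example: five rows read two at a time.
sample = io.StringIO("a,1\nb,2\nc,3\nd,4\ne,5\n")
batch_sizes = [len(batch) for batch in iter_row_batches(sample, batch_size=2)]
```

Because the reader is consumed lazily, only the rows requested so far are held in memory.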
Canopy admin users can now designate custom fields to be Personally Unique Identifiers (PUIDs). This is helpful when instructing review teams which values are most important for consolidation logic. These custom fields may also be used for Entity Propagation in future releases.
Admin-level users can now toggle Entity Propagation on or off.
This feature allows users to see which raw entities have been manually and automatically added.
Users will be able to use the filter panel to filter on which documents are associated with propagated raw entities.
Users will be able to use a column Entry Method filter in the raw entity view to filter on how an entity was created, either manually or propagated.
Database Sampling now supports SQL Server MDF files.
Now, after running consolidation, our system will take all the Elements in the Consolidated Master Entities and automatically add (“propagate”) these Elements to unreviewed documents when the detected element is either a PUID or a non-PUID element associated with a detected PUID. Read more about this in the Entity Propagation Workflow documentation.
Now, the user can filter on all documents that contain Personally Unique Identifiers (PUIDs). Utilizing this filter in combination with documents “not batched,” the user can batch all files potentially containing PUIDs that have not been reviewed.
Now users can use range search on the count of elements by type found on each document. Read more about this in the search documentation.
Processed archive files of type .gz were displaying as viewable documents and showing up as “Ready for Review” in the Impact Assessment Report. These archive files will now show up in the archive file count on the Impact Assessment Report and will not be listed on the Document List.
For some .bak files, the hash was incorrect because the file was appearing empty before the hash was calculated. This has been corrected.
When setting the permission to create tags for individual users, if you use the filter to find a user and deselect the individual users, all users are deselected. If you save the change before you remove the filter, it removes the permission for all the users instead of just the one you selected. This has been fixed.
When a user reruns PII elements, the search on the Processing Template PII Elements screen is not expanding the categories and showing the search hits. This has been fixed and the search functionality is now working properly.
In the Advanced Search section, under Bulk Search and Tagging, clicking on “Add 1 Tag” from the Bulk Tag dropdown causes the screen to suddenly reset back to the Document List screen. This has been fixed. Now, “Add 1 Tag” will function properly when bulk tagging.
The use of an apostrophe created a bug when entering a project name on the Project Details page in Project Settings. Now, when a user tries to type an unsupported character into a project name, an alert will appear to inform them what characters can be used.
A green back arrow icon has been added to the Update Project page in the Project Details section of Project Settings to make navigating away from this page easier for the user.
Multi-select field (system as well as custom) values were getting saved as string instead of array and causing formatting issues. This has been fixed.
OCR processing notes have been adjusted for accuracy in both the Processing Default Settings under Templates and Layouts and from the Upload Processing Settings screens.
- Updated Skype Message document view to include from, to, and subject.
- Fixed a processing issue related to handling long file names in zip files.
- Fixed a skipped OCR issue related to invalid characters in file names.