Product Documentation

Structured Data

Canopy’s advanced processing prepares structured data to be easily added to the entity database.

Database Files

During processing, databases are restored and their tables are presented to the user in a mappable form. Reviewers can select a header row, choose the row from which to start importing, and view the total number of rows in each document. Clicking the “Map” button initiates mapping to entities. The following database file types are supported:

Extensions      File type
.accdb, .mdb    Access DB
.bak, .mdf      MS SQL
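
To illustrate how the header row and starting row choices interact, here is a minimal Python sketch. It is not Canopy's implementation: the function name `apply_row_selection`, the 0-based row indices, and the sample rows are all assumptions made for this example.

```python
def apply_row_selection(rows, header_row, import_from_row):
    """Hypothetical helper, not a Canopy API.

    rows            -- the table as a list of lists, one inner list per row
    header_row      -- 0-based index of the row holding the column names
    import_from_row -- 0-based index of the first row to import as data

    Returns (headers, records, total_rows).
    """
    if not 0 <= header_row < len(rows):
        raise IndexError("header_row is outside the table")
    headers = rows[header_row]
    # Pair each data row with the chosen headers, starting at the selected row.
    records = [dict(zip(headers, row)) for row in rows[import_from_row:]]
    return headers, records, len(rows)


# Made-up sample table: a title line sits above the real header row.
sample = [
    ["Customer export"],
    ["name", "email"],
    ["Ann", "ann@example.com"],
    ["Bob", "bob@example.com"],
]
headers, records, total_rows = apply_row_selection(sample, header_row=1, import_from_row=2)
```

Here the title line is skipped by choosing row 1 as the header and row 2 as the first imported row, while the total row count still reflects the whole table.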

Binary Data

Binary file type formats containing structured data include the following:

Extensions      File type
.sas7bdat       A database storage file created by Statistical Analysis System (SAS) software. It contains a binary encoded dataset used for advanced analytics, business intelligence, data management, predictive analytics, and more.
.dcm            Digital Imaging and Communications in Medicine (DICOM). See more information on DICOM files here.

Text Files

During processing, Canopy attempts to detect delimited structured data and automatically creates a modified, mappable version of the file. Specifically, processing searches for comma (,), tab (\t), pipe (|), colon (:), and semicolon (;) delimiters in any file whose MIME (media) type is “text.”

This conversion is helpful in speeding up the review process because the user can simply click on the modified version of the file to map it.
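
The detection step described above can be pictured with a small Python sketch. This is an assumption about one plausible heuristic, not Canopy's actual algorithm: it picks the candidate delimiter that splits every sampled line into the same number of fields. The function name `detect_delimiter` and the 50-line sample size are hypothetical, and quoted fields are deliberately ignored to keep the example short.

```python
# Candidate delimiters named in the text: comma, tab, pipe, colon, semicolon.
CANDIDATES = [",", "\t", "|", ":", ";"]


def detect_delimiter(text, min_rows=2, sample_size=50):
    """Hypothetical heuristic, not Canopy's implementation.

    Returns the candidate that gives every sampled line the same field
    count (greater than one), preferring the candidate with the most
    fields. Returns None when no candidate is consistent.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_size]
    if len(lines) < min_rows:
        return None
    best, best_fields = None, 1
    for delim in CANDIDATES:
        # Field count per line if this candidate were the delimiter.
        counts = {ln.count(delim) + 1 for ln in lines}
        if len(counts) == 1:
            fields = counts.pop()
            if fields > best_fields:
                best, best_fields = delim, fields
    return best
```

For example, `detect_delimiter("a|b|c\n1|2|3\n")` selects the pipe character, while ordinary prose with no consistent separator yields `None`.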

Delimited text can appear in many file types. We test against the following:

Extensions      File type
.csv            CSV (Comma-Separated Values): CSV files use commas as delimiters to separate data fields. These files are widely used for storing and exchanging structured data.
.tsv, .tab      TSV / TAB (Tab-Separated Values): TSV files are similar to CSV files, but use tabs instead of commas to separate values.
.psv            PSV (Pipe-Separated Values): PSV files use the pipe character (|) as the delimiter to separate data fields. These files are less common than CSV and TSV, but are still used in some applications.
.txt            TXT (Text files): Plain .txt files may use any delimiter to separate data fields. Canopy can detect delimiters included in these files.
.dat            DAT (DAT files): DAT files can hold delimited data in a form similar to CSV.
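
The formats in this table differ mainly in their delimiter, so one parser can read all of them. The Python sketch below uses the standard `csv` module to show this. The `DEFAULT_DELIMITER` lookup and the `parse_delimited` function are hypothetical names for illustration and do not describe how Canopy parses files.

```python
import csv
import io

# Hypothetical lookup: the conventional delimiter for each extension above.
# .txt and .dat are omitted because their delimiter has to be detected.
DEFAULT_DELIMITER = {".csv": ",", ".tsv": "\t", ".tab": "\t", ".psv": "|"}


def parse_delimited(text, extension):
    """Parse delimited text into a list of rows (each a list of strings)."""
    delimiter = DEFAULT_DELIMITER[extension.lower()]
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# A quoted field may contain the delimiter itself without being split.
rows = parse_delimited('id|name\n1|"Smith | Sons"\n', ".psv")
```

Using a real CSV parser rather than a plain string split matters once fields are quoted, as in the second row of the sample.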