Input File Formats
Sample Tracker accepts input from laboratory result files in various formats, some of which are based on the standard interchange format (SIF) concept. You may implement variations on the SIF layout, provided that the basic structure is adhered to. Currently, these implementations include fixed format, Comma Separated Values (CSV) and Extensible Markup Language (XML) files.
File Structure
A typical lab results file consists of two sections: a header section and a data section.
Header Section
The header section consists of a number of lines (records), containing general information relating to the sample receipt (such as the despatch identifier, lab job number, date and comments) as well as details of the analyses performed (i.e. analytes, method codes, units of measure and detection limits).
The requirements for this section are as follows:
- Despatch number, lab job number, receipt date, comments, etc. (if present) must appear in the header section of the file
- Element codes must appear in consecutive fields on a single record
- Lab method codes (one for each element) must appear in consecutive fields on a single record
- Units of measure must appear in consecutive fields on a single record
- Lower detection limits must appear in consecutive fields on a single record
- Upper detection limits must appear in consecutive fields on a single record
The rows containing element, method, units and lower detection limit are compulsory.
In Sample Tracker, analytical details are grouped together in collections known as "combos" (element/method combinations). Each combo consists of an element code, a lab method code, a units of measure code, a lower detection value and (optionally) an upper detection value. Two combos are considered to be identical if the element, lab method, units and lower detection limit are the same.
Data Section
This part of the file comprises a number of records containing the actual analytical results. Each row must contain at least the sample identifier, followed by the results (in the same order as the corresponding combos defined in the element header row). The data section can also contain additional fields, such as sample tag qualifiers.
Format Layouts
No further configuration is required if you are planning to use standard, fixed format SIF files to import analytical results. However, if necessary you may implement variations on the standard SIF layout by defining a new format.
The idea behind a format definition is to tell Sample Tracker where to locate important data in the input file. In other words, a format definition provides various points of reference to be used by Sample Tracker when it decodes the input records.
Customised format definitions are stored in the Sample Tracker database. Two database tables are used for this purpose:
ST_XS_FORMAT
This table contains a list of all your customised format definitions.
Column | Description |
---|---|
FORMAT_ID | An abbreviation used for this format. |
FORMAT_SEQ | Display order (e.g. in format lists). |
FORMAT_DESCRIPTION | A short description of the format. |
FORMAT_TYPE | Either SIF, CSV or XML. The CSV option is used for files that contain the same basic data as a SIF file, but with the field values separated by commas (i.e. not in a fixed format). The XML option is used for XML files. 1 |
FILE_MASK | A filter used in file dialog boxes (e.g. *.SIF for SIF files). |
Although it is possible to define XML content in other ways, the XML data definition is included in the Sample Tracker format tables for the sake of consistency. The XML implementation is described in more detail later. Note that the XML option is not available in version 7.1.
ST_XS_FORMAT_LAYOUT
This table stores column definitions for the fields that could be present in a customised SIF file format.
Column | Description |
---|---|
FORMAT_ID | Format abbreviation, as stored in ST_XS_FORMAT |
FIELD_ID | The system name by which this field is known to Sample Tracker. |
DESCR | Description of the field contents (used for display purposes in the receipt and receipt report). |
SHEET_ID | In Version 7 and upwards, this field is used to store a default value for fields that do not appear in the input file. (See discussion regarding tag qualifiers). |
FIELD_SEQ | Display order (e.g. in field lists). |
FIELD_ROW | The row number, identifying the data record on which this field appears (where 1 is the first record in the file). A value of zero indicates that the field is not present in the input file. (See note under FIELD_COL.) |
FIELD_COL |
In a fixed format SIF, this value refers to the actual starting position
for the field on the input record (with 1 indicating the first character).
In a CSV format, this value refers to the field index in the record, where
1 is the first field, 2 is the second field, etc. (separated by commas).
A value of zero indicates that the field is not present in the input file. However, if the value is zero and the value of FIELD_ROW is greater than zero, then a default value will be used for this field (as stored in SHEET_ID). |
FIELD_LEN | In a fixed format SIF, this refers to the actual width of this field in the input record. (This value is not used for CSV files.) |
The value in the FIELD_ID column of the format layout refers to the system name of a field. This is a predefined name by which the field is known to Sample Tracker. For example, Sample Tracker looks for the field "DESPATCH" when it needs to retrieve the value of the despatch identifier. Similarly, System fields usually have a specific meaning or function, therefore it is important that they be correctly identified.
The value of FIELD_ROW has particular importance in the case of the SAMPLEID field. This row is assumed to be the start of the data section (in other words, the first record in the file containing analytical results.) Consequently, the row values for the SAMPLEID and RESULTV fields should be the same. All rows preceding the data row are assumed to fall in the header section.
The following system field names are recognised by Sample Tracker:
Name | Description | Section |
---|---|---|
DESPATCH | Despatch identifier | Header |
LABJOBNO | Lab job number | Header |
DATERECV | Date of report (string in the format ddmmyy) | Header |
PERSON | Person processing the receipt* | Header |
DELIVERY | Delivery method* | Header |
INVOICE | Invoice number* | Header |
AMOUNT | Invoice amount* | Header |
CURRENCY | Currency of Invoice | Header |
COSTCODE | Cost code* | Header |
ELEMENT | First element code | Header |
UNITS | First units code | Header |
DETECT | First lower detection value | Header |
UDETECT | First upper detection value* | Header |
METHOD | First method code | Header |
COMMENTS | First comment | Header |
SAMPLEID | First sample tag | Data |
RESULTV | First result value | Data |
These fields do not appear in a standard SIF file, but can be configured for customised SIF formats. It is possible to define addition fields to cater for special cases, such as sample tag qualifiers.
You do not need to define a format layout if you intend to use only standard SIF files. For Sample Tracker purposes, a Standard SIF File is one for which the format layout is as follows:
System Name | FIELD_ROW | FIELD_COLUMN | FIELD_WIDTH |
---|---|---|---|
DESPATCH | 2 | 1 | 6 |
LABJOBNO | 1 | 1 | 4 |
DATERECV | 2 | 21 | 6 |
ELEMENT | 2 | 27 | 8 |
UNITS | 3 | 27 | 8 |
DETECT | 4 | 27 | 8 |
METHOD | 5 | 27 | 8 |
COMMENTS | 6 | 3 | 80 |
SAMPLEID | 8 | 1 | 16 |
RESULTV | 8 | 27 | 8 |
Here is an example showing the top section of a standard SIF file, as well as the corresponding layout as it would appear in the ST_XS_FORMAT_LAYOUTtable:
Here is an example of a CSV implementation of the standard SIF format:
Extended format definitions
The above examples implement only the standard system fields. It is possible to extend the format layout for the header section to include some of the other fields recognised by Sample Tracker, e.g. upper detection limits.
It is also possible to extend the field set for the data section. However, the important difference here is that any fields defined in the data section are assumed to be related to the analytical results and will therefore be stored in the results table in the Sample Tracker database.
The data section is assumed to start at the row on which the first sample tag appears, therefore any field that has the same FIELD_ROW value as the SAMPLE_ID field will be treated as data.
One of the most popular extensions in the data section is the use of Sample Tag Qualifiers.