Problem:

OCR text files do not load into iBlaze or Enterprise. The import log shows the error "Text File Not Found" or "Not a Text File", even though the paths in the load file (e.g. dii or control list) appear to be correct.

Solution:

1) Summation iBlaze and Enterprise are not compatible with Unicode. Convert any Unicode text files to ANSI ASCII, then attempt to import the OCR again.
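The conversion in step 1 can be done with any text editor or conversion utility. As one possible approach, here is a minimal Python sketch. It is not a Summation tool, and the function name convert_to_ansi is hypothetical. It assumes that "ANSI" means the Windows-1252 code page and that the source file is UTF-16 unless another encoding is passed in. Characters that have no ANSI equivalent are replaced with "?".

```python
from pathlib import Path


def convert_to_ansi(src, dst, source_encoding="utf-16", target_encoding="cp1252"):
    """Re-encode a Unicode text file as ANSI (assumed here to be Windows-1252).

    source_encoding is an assumption about the input file; pass "utf-8-sig"
    for UTF-8 files (this also drops a leading byte order mark if present).
    """
    # Work on raw bytes so that the existing line endings are preserved.
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    # Characters that cannot be represented in the target code page become "?".
    Path(dst).write_bytes(text.encode(target_encoding, errors="replace"))
```

Writing the result to a new file rather than overwriting the source keeps the original available in case the replaced characters matter.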

2) If the OCR text file is already an ANSI ASCII text file and the load still fails with "Not a Text File", check the file against the two tests described below.

 
Summation iBlaze performs two tests to determine if an ocrBase candidate file is actually a text file:

2a) In the first 256 characters, there can be no control characters (00h through 1Fh) except for the four allowed in a text document:


09h - Tab
0Ah - Line feed
0Ch - Form feed (page break)
0Dh - Carriage return


2b) In the first 256 characters, there must appear at least one Line feed character (0Ah).

If the file fails either test, Summation rejects it as a non-text file and will not load it into the ocrBase.
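To check a file before loading it, the two tests above can be approximated with a short script. The following Python sketch is based only on the description in this article and is not Summation's actual code. The function name find_text_test_failures is hypothetical, and the sketch assumes that the "first 256 characters" are the first 256 bytes of the file, which holds for ANSI ASCII files.

```python
# Control characters permitted in a text document: tab, line feed,
# form feed (page break), carriage return.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0C, 0x0D}


def find_text_test_failures(data):
    """Return a list of reasons why the data would fail the two tests.

    data is the file content as bytes. An empty list means both tests pass.
    """
    head = data[:256]  # assumption: one byte per character (ANSI)
    failures = []

    # Test 2a: no control characters (00h through 1Fh) other than the four allowed.
    for offset, byte in enumerate(head):
        if byte < 0x20 and byte not in ALLOWED_CONTROLS:
            failures.append(
                "control character %02Xh at offset %d" % (byte, offset)
            )
            break  # the first offending byte is enough to fail the test

    # Test 2b: at least one line feed (0Ah) in the first 256 characters.
    if 0x0A not in head:
        failures.append("no line feed (0Ah) in the first 256 characters")

    return failures
```

To use it, read the candidate file in binary mode, for example with open(path, "rb"), and pass the bytes to the function. A file whose first line runs longer than 256 characters with no line feed is reported under test 2b, even if the rest of the file is ordinary text.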

 

Overview:

In general, control characters in a file suggest that it is a binary file, such as an executable, rather than a text file. The tests above protect the ocrBase from ingesting invalid characters.

 

Applies to:

Summation iBlaze

Summation Enterprise