> There was a company that scanned and converted legal cases and put them on
> line. They had an interesting strategy. They did not correct the material.
> However, they did produce a very sophisticated search engine that would find
> stuff even with lots of errors and the person reading could generally sort out
> the meaning.
This is very typical in digital libraries. If you are scanning documents for
presentation on the Web, you can feed the raw, uncorrected OCR text to the
search engine while presenting the reader with a JPEG or GIF derivative of the
original scan (presumably a TIFF). Search engines that handle "fuzzy logic"
matching, and so tolerate OCR errors, are readily available.
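To make the idea concrete, here is a minimal sketch in Python of searching raw
OCR text approximately and returning the page image to show the reader. It is
only an illustration: the page names, the sample OCR text, the function name
fuzzy_search, and the 0.75 similarity cutoff are all assumptions, and a real
search engine would use an index rather than comparing every word.

```python
import difflib
import re

# Hypothetical data: raw, uncorrected OCR text keyed by the page image
# (the JPEG or GIF derivative) that the reader actually sees.
PAGES = {
    "page_001.jpg": "The plaintifi appealed the judgrnent of the lower c0urt.",
    "page_002.jpg": "An act respecting the fisheries of the Dominion.",
}

def fuzzy_search(query, pages, cutoff=0.75):
    """Return image names whose OCR text has a near match for every query word."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = []
    for image, text in pages.items():
        # Split the OCR text into lowercase word tokens.
        words = re.findall(r"[a-z0-9]+", text.lower())
        # A page matches when each query word is similar enough to some OCR word.
        if all(difflib.get_close_matches(t, words, n=1, cutoff=cutoff)
               for t in terms):
            hits.append(image)
    return hits

if __name__ == "__main__":
    print(fuzzy_search("plaintiff judgment court", PAGES))
```

The query is spelled correctly, the OCR text is not, and the page is still
found. The cutoff is the trade-off to tune: lower values tolerate worse OCR but
return more false matches.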
For more information on this, please feel free to contact me.
Stephanie James
Early Canadiana Online
http://www.nlc-bnc.ca/cihm/ecol/