Tech Tip: Ensuring All of Your Documents Are Full-Text Searchable
August 13th, 2008
For a document to be full-text searchable, it must contain text and the text must be indexed. For a document to contain text, it must be OCRed or, with respect to electronic documents, have its text extracted.
By default, Laserfiche automatically OCRs or extracts text from all new documents. In addition, documents are automatically indexed when they are created. Nonetheless, it is recommended that you periodically double-check that all your documents both contain text and have been indexed, so that no document slips through the cracks. You can do this quickly and easily with two separate Laserfiche advanced searches.
Tip: For best results, do not copy and paste the Laserfiche advanced search syntax from this Tech Tip into the Laserfiche Client. Some non-text characters (such as quotes) may not copy/paste correctly. Manually type each character into the Laserfiche Client Advanced Search box.
Tip: Some of the Laserfiche advanced search syntax used in this Tech Tip will only work on Laserfiche 8.01.
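If you do end up with pasted search syntax, one way to check it is to replace any typographic ("curly") quotes with plain straight quotes before using it. The Python sketch below only illustrates that idea: normalize_quotes is a hypothetical helper, not part of Laserfiche, and it assumes the Advanced Search box expects plain straight quotes.

```python
# Hypothetical helper, not part of Laserfiche. It replaces typographic quotes
# with the plain ASCII quotes you would get by typing the syntax by hand.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def normalize_quotes(search_text: str) -> str:
    """Return search_text with curly quotes replaced by straight quotes."""
    return search_text.translate(str.maketrans(SMART_QUOTES))


# Example: a search string pasted with curly quotes around N.
pasted = "{LF:Indexed=\u201dN\u201d}"
print(normalize_quotes(pasted))  # {LF:Indexed="N"}
```

Typing the syntax manually, as recommended above, remains the safest approach. A check like this only catches quote characters, not other characters that may have been altered.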
Finding Documents without Text
This search finds documents in your repository that:
- Contain pages but no OCRed text.
- Do not contain pages.
- Only contain OCRed text on some (not all) pages.
{LF:AssociatedPages="N"}|{LF:OCR=none}|{LF:OCR=some}
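
The search above is three separate conditions joined by the | character, so a document matching any one of them is returned. As a rough illustration of how the pieces fit together, the Python sketch below assembles the same search text from its parts. The join_any helper is hypothetical, and the comments pairing each condition with a bullet in the list above are inferred from that list rather than taken from Laserfiche documentation.

```python
# Illustration only: this builds the search text as a plain string.
# It does not call Laserfiche in any way.
def join_any(conditions):
    """Join search conditions with '|' so that any one match returns the document."""
    return "|".join(conditions)


no_text_conditions = [
    '{LF:AssociatedPages="N"}',  # assumed: documents that do not contain pages
    "{LF:OCR=none}",             # assumed: pages but no OCRed text
    "{LF:OCR=some}",             # assumed: OCRed text on some pages only
]

print(join_any(no_text_conditions))
# {LF:AssociatedPages="N"}|{LF:OCR=none}|{LF:OCR=some}
```

Splitting the search this way also makes it easy to run each condition on its own if you want to see which case a given document falls under.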

After running the search:
- Select all returned documents and right-click them.
- Select Generate Searchable Text.
- In the Generate Searchable Text dialog box, select OCR/Extract Text.
- Click OK.
Finding Documents that Have Not Been Indexed
This search finds all documents that have not been indexed:
{LF:Indexed="N"}

After running the search:
- Select all returned documents and right-click them.
- Select Generate Searchable Text.
- In the Generate Searchable Text dialog box, select Index entire document.
- Click OK.
Tip: If either search finds a large number of documents, we recommend that you run the batch OCR/text extraction or indexing during non-peak hours.
Tip: Electronic documents that do not contain text in their native format, such as audio or video files, may be returned by one or more of these searches. It is not necessary to extract text from or index these files.


