Laserfiche Quick Fields offers a convenient way to correct inaccurate OCR readings: the easy-to-use “find and replace” Substitution process. This process lets you find and modify words in tokens or page text using regular expressions in Pattern Matching. A common use case is to find and replace specific terms or phrases in text that was extracted from a poor-quality image. For example, TGC scans old invoices whose text has faded over the years, and the OCR process reads the word “invoice” as “inv01ce.” To fix this, the Substitution process can be configured to replace all instances of “inv01ce” with “invoice.”
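
Outside Quick Fields, the same find-and-replace idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how the Substitution process is implemented; the function name fix_ocr_term is hypothetical.

```python
import re

def fix_ocr_term(text: str) -> str:
    """Replace every occurrence of the misread term with the intended word."""
    # Hypothetical helper: the pattern is the literal OCR misreading from the example.
    return re.sub(r"inv01ce", "invoice", text)

print(fix_ocr_term("inv01ce 4821: see inv01ce terms"))
```

A literal pattern like this one only catches the exact misreading; a real configuration would likely need one rule per known OCR error.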

The Substitution process also has a feature called Match Groups. These are groups defined in a pattern match that can be rearranged to reformat data. For example, match groups can change a “Student Name” token formatted as First Name Last Name to Last Name, First Name. This helps organizations standardize their information when populating fields, naming entries, etc.
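
To make the name example concrete, here is a minimal Python sketch of the same reordering. It assumes a name made of exactly two words separated by a single space, and it uses Python's \1 and \2 backreferences rather than Quick Fields syntax; the function name last_first is hypothetical.

```python
import re

def last_first(name: str) -> str:
    """Reorder 'First Last' into 'Last, First' using two capture groups."""
    # Group 1 captures the first name, group 2 the last name.
    # Names that do not fit the two-word shape are returned unchanged.
    return re.sub(r"^(\w+) (\w+)$", r"\2, \1", name)

print(last_first("Jane Doe"))
```

Middle names, hyphens, and suffixes would need a more careful pattern than this sketch provides.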

Example: Central Florida University uses a Lookup process to retrieve each student’s five-digit student ID number from an external database. The ID numbers stored in the database are not formatted according to University standards. Each ID number should consist of two numbers, a dash, and three more numbers (11-111), but the database stores them as five digits with no dash (11111). The Lookup process retrieves the incorrectly formatted student ID and saves the value in a “Student ID” token, which is then used to name each student record. To format the retrieved student IDs correctly, the University uses pattern matching to separate the five-digit number into two groups using parentheses: (\p\p)(\p\p\p). Each set of parentheses is a match group, which is named after its position and denoted as ${position}.

(\p\p) = ${1} since it is listed first, and (\p\p\p) = ${2} since it is listed second.

To add a dash between the groups, add the first match group ${1}, then a dash, then the second match group ${2}.

${1}-${2}
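
For readers who want to try the grouping logic outside Quick Fields, the following Python sketch mirrors the student ID example. It is an approximation under stated assumptions: Python's re module uses \d for a digit and \1, \2 for the groups that Quick Fields writes as ${1} and ${2}, and the function name format_student_id is hypothetical.

```python
import re

def format_student_id(raw_id: str) -> str:
    """Insert a dash after the second digit of a five-digit ID."""
    # Group 1 = first two digits, group 2 = last three digits.
    # The replacement joins them with a dash, like ${1}-${2} in the article.
    # Values that are not exactly five digits are returned unchanged.
    return re.sub(r"^(\d\d)(\d\d\d)$", r"\1-\2", raw_id)

print(format_student_id("11111"))
```

Anchoring the pattern with ^ and $ is a design choice in this sketch so that longer or shorter values are left alone rather than partially reformatted.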


Select “Replace the input token’s value with the result” to overwrite the input token with the substituted value. For example, if this option is selected for the “Student ID” token above, the retrieved value 11111 will be replaced with 11-111.

Note: You can enter the match group syntax manually, or click Insert Group and select a group from the list. The groups are labeled Match Group 1, Match Group 2, etc.


For more information, see the Substitution topic in the Laserfiche Quick Fields help files.

By: Misty Kalousek
