|
 |
Frequently Asked Questions |
 |
|
|
 |
| 1. What is a production imaging system? |
You may have seen or used a scanner, or you may even have one at home or on your desk. If so, your experience may be that scanners are slow and cumbersome, but you have almost certainly only seen a consumer-grade scanner designed for the odd scan here and there.
Production scanning is different. The technology is approximately the same, but the implementation is different. Unlike consumer scanners, production scanners are fast (up to 100 pages per minute), have automatic document feeders with capacities up to 500 or more sheets, and are interfaced with special high-speed drivers.
There is a wide variety of scanners on the market from several reputable manufacturers (like Canon, Fujitsu, and Kodak), offered through a number of vendors (like www.scanstore.nl). For a production imaging system, you will want a production scanner, and one that will suit your requirements for some time to come. Changing the scanner later is not a problem -- it won't affect the system overall -- but you will have wasted money on the wrong scanner. |
|
 |
| 2. How do I determine what speed my production scanner should have? |
Speed (measured in scans-per-minute) is probably the one thing that will cause the most frustration with scanning users, and which could cause you to have a scanning backlog. In determining how fast a scanner you need, consider how many sheets you need to scan in whatever time period, also keeping in mind that many businesses receive a greater number of documents during certain times of the month or year. |
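As a rough illustration of the sizing reasoning above, the following Python sketch estimates the rated speed you might need. The function name, the peak factor, and the efficiency figure (the share of scanning time during which paper is actually feeding) are assumptions made for this example, not vendor figures; substitute your own volumes.

```python
import math

def required_sheets_per_minute(sheets_per_month, peak_factor,
                               scan_hours_per_month, efficiency=0.5):
    """Estimate the rated scanner speed needed to avoid a backlog.

    peak_factor: how much busier the busiest period is than average (assumed).
    efficiency: fraction of scanning time the scanner is really feeding paper,
                allowing for loading, jams, and document preparation (assumed).
    """
    peak_sheets = sheets_per_month * peak_factor
    effective_minutes = scan_hours_per_month * 60 * efficiency
    return math.ceil(peak_sheets / effective_minutes)

# 40,000 sheets a month, a peak 50% above average, 40 scanning hours a month
print(required_sheets_per_minute(40_000, 1.5, 40))  # 50
```

If the result is close to the rated speed of the scanner you are considering, it is probably safer to move up a class than to plan on running at the limit.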
|
 |
| 3. What driver should my scanner support? |
| Most high-speed scanners will use a scanning driver interface called ISIS, which results in high-speed scanning. The lower-speed TWAIN interface, which is used by most consumer scanners, is also supported on some production scanners. But for high-speed scanning, you want to use an interface other than TWAIN. |
|
 |
| 4. What kind of interface should I choose? |
There are four types of interfaces available for scanners: USB, Firewire, Video and SCSI.
USB can be connected without any special requirements on almost any PC, but this interface is relatively slow when it comes to a professional imaging solution. Use USB for solutions where you need a maximum speed of 30 scans per minute.
Firewire is a much faster interface, and production scanners often come with a Firewire interface preinstalled, so Firewire can be used in a high-speed production environment. The disadvantage of Firewire is that if you require image enhancement, it must be done on the PC.
The Video interface is not used any more.
The SCSI interface is speedy, and there are special hardware cards on the market that allow for image enhancement on the card itself. In a high-volume environment, choose a SCSI interface. |
|
 |
| 5. What is the maximum paper width supported by production scanners? |
| Most scanners support a maximum paper width of A4 (or Letter, 8.5 x 11 in, in the USA). They will support longer sheets of paper but not wider. There are scanners available that support larger paper widths, like A3 and even larger, which cost a bit more. If you need to scan documents wider than A4, make sure you get a scanner that can support them. |
|
 |
| 6. Simplex or Duplex? |
Simplex scanners can scan only one side of a document, while duplex scanners can scan both sides at the same time. If you have only a low percentage of two-sided documents, you can run each side separately through a simplex scanner; but if you have a moderate amount of duplex documents you'll want a duplex scanner. |
|
 |
| 7. ADF or not? |
| You will almost certainly want an automatic document feeder (ADF), and the volume of documents you are scanning will determine what capacity of the ADF (number of sheets) you will need. If your ADF cannot hold enough pages, you will constantly be feeding the scanner, which can be especially inconvenient if the scanning user is also indexing and has to get up and down every few minutes to load the scanner. |
|
 |
| 8. Flatbed or not? |
Most modern scanners do not have flatbeds, and most corporate scanning applications do not require one; however, if you have many small documents you want to scan (like check stubs, expense receipts, and pay slips) you will want to consider a scanner with a flatbed. Not having a flatbed scanner does not mean you cannot scan small documents, but you would likely need to tape them to a larger sheet of paper or insert them into a clear plastic carrier. A flatbed adds cost to a scanner, and also can greatly increase the size of the footprint of space you would need to provide. |
|
 |
| 9. Color versus Black & White? |
Normally, a color scanner is not required for corporate scanning but if you have documents where color is significant you would want to get a color-capable scanner. Also note that when scanning in line mode (not grayscale) certain colors like red read as black. So if you have something like a red rubber stamp on top of black text you may otherwise have difficulty in reading the text. Note that color and grayscale scanning results in larger image file sizes than line scanning. If you are contemplating scanning using multi-function devices (e.g., those that also print and/or fax), and especially if you have a number of those devices in your company, you may find that you can spread scan volumes across them for an effective solution. |
|
 |
| 10. Will every scan be top-of-the-line? |
Just because you bought a top-of-the-line scanner does not mean every scan will be top-of-the-line. While scanners are quite accurate, things can go wrong due to scanner problems, paper problems, or human problems, such as:
- Papers sticking together are fed at once
- Documents are fed wrong side up (so there is no image)
- Paper jam, due to one or more tears on the leading edge, staples, tape, etc. This is complicated for multi-page documents, where multiple pages need to be combined to form the document.
|
|
 |
| 11. How can I separate documents? |
Documents can be separated by inserting a so-called patch page between documents, by placing a barcode sticker on the first page of each document, or by having the scan operator indicate where one document ends and the next begins. Mistakes can occur with any of these methods, the result being that multiple documents, or pages from some documents, are combined with other documents, resulting in electronic misfiles. Human care must be taken when separating multi-page documents. |
|
 |
| 12. How do I know if the scanner has scanned all pages? |
Modern automatic document feeders are quite good (as good as those for photocopiers) and almost never miss a page; however, multifeeds do occur, especially with NCR (carbonless) documents, which are chemically treated (and for which special scanner rollers are recommended). If multifeeding is a problem, it is necessary to batch the documents, count the number of pages being scanned, and compare that to the count recorded by the scanner or scanning software. |
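The batch-count comparison described above can be sketched in a few lines of Python. The batch IDs and the two dictionaries are hypothetical; in practice the scanned counts would come from whatever report your scanner or scanning software provides.

```python
def reconcile_batches(expected_counts, scanned_counts):
    """Return batches whose scanned page count differs from the hand count.

    Both arguments map a batch ID to a page count. The result maps each
    problem batch ID to a (expected, scanned) pair.
    """
    problems = {}
    for batch_id, expected in expected_counts.items():
        scanned = scanned_counts.get(batch_id, 0)  # a batch never scanned counts as 0
        if scanned != expected:
            problems[batch_id] = (expected, scanned)
    return problems

hand_counts = {"B1": 50, "B2": 48}
scanner_counts = {"B1": 50, "B2": 47}
print(reconcile_batches(hand_counts, scanner_counts))  # {'B2': (48, 47)}
```

A scanned count lower than the hand count suggests a multifeed, so only that batch needs to be checked and rescanned.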
|
 |
| 13. Contrast and clarity of my scans are not always good, how can I improve this? |
If you have problem documents like those with dark colored backgrounds or light print, you may need to adjust scan settings (contrast and density) for those or use special software that will automatically rescan those documents. If you have several problem documents, you can scan them more efficiently by batching like documents together and scanning them at once with adjusted scan settings. |
|
 |
| 14. What is Document Imaging? |
"Imaging" has a number of meanings, but for our purposes we are talking about document imaging, which is converting paper documents to electronic files through scanning and storing them in a repository/database. |
|
 |
| 15. What is a Document Imaging Solution? |
Document imaging solutions are conceptually simple:
- A scanner converts paper documents into electronic images
- Each image is placed in a folder, database, or other repository on disk (hard disk or otherwise), optionally with one or more indexes that act as lookup keys
- Users can retrieve images based on their access rights and print, email, or fax them
- Stored images can be backed up like other computer data
Electronic document images are more efficient to manage than paper documents and, most important, they can be easily safeguarded, so should disaster strike you can easily have an offsite backup (otherwise, how in the world can you back up your paper?)
|
|
 |
| 16. Why would I use a Document Imaging Solution? |
There are many reasons to use document imaging; the three primary ones are:
- Save costs in paper handling, filing, storage, and retrieval
- Save time in filing, retrieving, and duplicating documents, and searching for misfiled documents, and recreating lost ones
- Safeguard documents from destruction and secure them from theft and unauthorized access
|
|
 |
| 17. How to select a Document Imaging Solution? |
What do we mean by a great document imaging solution versus a good one? Let's use an example. ABC Corporation (ABC) produces 1,000 Proofs of Delivery (PODs) per month and receives almost all of them back with signatures. ABC wants first of all to have an electronic archive of all returned PODs so that it has a durable record of all customer acceptances of goods shipped. For the frequent cases in which a customer does not pay an invoice because they cannot substantiate that the goods were delivered, ABC wants an easy way to send the customer a copy of the related POD.
One approach is to set up a folder for each customer and scan PODs for each folder into their respective folders. When a POD for a customer is needed, an ABC user can browse through the images in the folder, find the one for the POD that reflects the order in question, and print and fax it. We would consider this a bad solution.
But separate folders are cumbersome, and with hundreds or thousands of customers they would become unwieldy. Having to manually target each POD to its respective customer folder is also time-consuming and error prone: should the user put the scanned image in the wrong customer folder, it would be electronically misfiled. A refinement to the above solution, which would raise its rating from bad to poor, is to scan all PODs to a single folder and type index values for the customer name, POD number, and date.
A refinement to the poor solution, which would raise its rating from poor to ok, is to allow users to electronically fax the PODs from their desktop without having to print them and walk over to a fax machine to fax them.
The solution does not become good until there are controls for indexing, such as providing a listbox (drop-down lists) for the customer name index, which would be dynamically loaded off ABC's customer database. This would reduce the chance of electronic misfiles. (True, the user could select the wrong customer from the listbox, but those problems are less likely to occur than mistyping a customer name.) The solution goes from good to quite good by using OCR technology (explained later) to read the index values off the POD, thereby greatly reducing the need for indexing; but because OCR is not 100% accurate, there is still a need to inspect the OCR'd values and correct those that are wrong.
The solution rating gets raised from quite good to very good by validating the OCR'd values against the database that generated the PODs to detect OCR errors that have occurred (if no match is found, or if the match is not valid for the current period). This solution is considerably more efficient, because the user can be prompted to deal with just the exceptions.
The solution achieves our rating of great by, instead of using OCR to read text data on the POD, modifying the POD to include a barcode containing the index values, or a value that can be used to look up the index values in the database from which the POD was generated. The barcode is recognized at scanning time, and the index values are retrieved and populated automatically. This not only eliminates the need to key index values entirely (unless the printed barcode is damaged and cannot be read), it also eliminates the possibility of electronic misfiles because there is neither a need to key index values nor to select them from a list: the POD will be electronically filed exactly according to the information printed on the POD. Compare the amount of time, effort, and opportunities for mistakes that the bad solution and the great solution involve, and you can see that it is worthwhile to put some thought (and get some helpful expertise) into implementing a document imaging solution. |
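To make the barcode approach more concrete, here is a minimal Python sketch of the lookup step. It assumes the barcode carries only a POD number and that the index values live in a table keyed by that number; the dictionary, field names, and sample values are hypothetical stand-ins for ABC's real database.

```python
# Hypothetical stand-in for the database that generated the PODs
POD_DATABASE = {
    "POD-000123": {"customer": "Acme Ltd", "pod_date": "2024-03-01"},
}

def index_from_barcode(barcode_value, pod_database):
    """Return (index_values, needs_review) for one scanned POD.

    barcode_value is None when the barcode was damaged or unreadable.
    needs_review is True whenever a person has to index the POD manually.
    """
    if barcode_value is None:
        return None, True            # unreadable barcode: fall back to keying
    record = pod_database.get(barcode_value)
    if record is None:
        return None, True            # readable, but unknown to the database
    return {"pod_number": barcode_value, **record}, False

print(index_from_barcode("POD-000123", POD_DATABASE))
print(index_from_barcode(None, POD_DATABASE))
```

Only the exceptions (unreadable or unknown barcodes) reach a user, which is the point of the great solution.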
|
 |
| 18. Could a Document Imaging Solution improve a company's business processes? |
In the case of ABC Corporation, missing PODs (those that go out but do not come back), those that come back without signatures, and those that come back with notations on them, are all exceptions that must be detected and actioned. Those business process tasks can be assisted by document imaging in conjunction with some minor programming and/or workflow technology in the following ways:
Missing PODs can be detected by comparing the database that generated the PODs against the contents of the document imaging system: POD entries in the database that have no match in the document imaging system are missing.
Unsigned PODs can be flagged by the user via a special index field which routes those PODs to a workflow process that sends them to the appropriate users to action, or even directly to the customers with a letter requesting them to be signed and returned.
PODs with notations can likewise be flagged via a special index field that would similarly route those PODs to workflow, and the actioning user would be able to see the electronic images on their screens and take the appropriate actions. Other business processes can similarly be made more efficient and accurate using the same technologies. |
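Detecting missing PODs as described above amounts to a set difference between what was issued and what has been imaged. The Python sketch below assumes both systems can export a list of POD numbers; the function name and sample numbers are made up for illustration.

```python
def find_missing_pods(issued_pod_numbers, imaged_pod_numbers):
    """Return POD numbers that were issued but have no scanned image yet."""
    return sorted(set(issued_pod_numbers) - set(imaged_pod_numbers))

issued = ["P1", "P2", "P3"]   # from the database that generated the PODs
imaged = ["P3", "P1"]         # from the document imaging system
print(find_missing_pods(issued, imaged))  # ['P2']
```

The resulting list could feed a report or a workflow step that chases the outstanding PODs.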
|
 |
| 19. What type of documents should/can I scan? |
Generally, you will want to scan most of the documents in your company. Anything that you cannot afford to lose in a fire should be scanned. Almost without exception, anything with a signature (like contracts) should be scanned. Anything for which you are carrying liability (and therefore can get sued over) should be scanned.
Also consider that documents that need to be accessed by more than one user at the same time should be scanned, since you can easily provide concurrent access to multiple users electronically. Documents that need to be accessed by remote users can also benefit from scanning, if you provide remote access to them rather than copying and faxing or mailing them.
After implementing a document imaging solution, there is often a tendency to scan only new documents that come in after the solution is implemented, but consider the value of scanning older documents as well. After all, they may be just as valuable, and they are no less vulnerable to fire, flood, theft, and other disasters.
Also, you would rather not have customer files, HR files, etc. separated into paper and electronic ones as it makes document retrieval more cumbersome. Such scanning of existing documents is referred to as a backfile conversion and can be accomplished by scanning off-hours and/or using temporary staff to scan. There are also scan bureaus (a service that we offer) that can perform backfile conversions, either off-site or on-premises. In contemplating a backfile conversion using your scanner, consider its duty cycle as it may not have the capacity to handle such additional volumes.
Many companies make the mistake of wasting time and resources scanning documents that are already in electronic format. Some even print out certain reports and documents just so they can be captured in the document imaging system. If you have even a moderate number of electronic documents that you want to archive and access the same way as scanned document images, you would be well advised to get a document management system that can also accommodate computer-generated reports, email messages, etc. |
|
 |
| 20. How to index files so they are easy to look up? |
A more important consideration than getting documents into a document imaging system is getting them out. If users cannot find the documents they need, they will rue the day that imaging came into their lives. So you want to be sure that you not only take steps to prevent electronic misfiles (as discussed previously and hereafter), but also index documents to make them easy to look up by one or more values.
That being said, there may be documents that will be infrequently accessed and which you are scanning just for retention purposes, and spending the time to index them in detail may not be necessary. For those documents, you may want to trade off the time for indexing against a longer time for lookup. In such cases, you can create appropriate folders and just scan documents into their respective folders without indexing. For example, if you do not have many expense reports and receipts, you may want to combine them into weekly folders. Users would need to browse through the various documents to find what they are looking for, but if this is an infrequent occurrence, and especially if it does not impact many users, it may be perfectly fine.
Typically, however, you would want to associate multiple indexes with each document. For example, if you are scanning customer contracts you might want to index them by customer number and perhaps also customer name. For supplier invoices, you might want to index them by supplier name and invoice number.
Assigning multiple indexes lets you look up documents by more than one value, which not only provides more flexibility in accessing documents but also helps prevent electronic misfiles due to miskeying indexes: if you cannot find a document by one index, you can try another. |
|
 |
| 21. How to avoid mistakes when indexing? |
| You should strive to avoid keying some or all index values, using methods such as database lookup, listbox selections, encoding index values or database handles into barcodes, and other available methods. Not only does manual indexing take time, but mistakes can be made, which can lead to electronic misfiles. |
|
 |
| 22. How can I check indexes on their accuracy? |
| If you cannot avoid keying index values, those which are critical for lookup success can be double-keyed. Ideally, such double-keying is done by two separate people, since the same person may have a tendency to make the same mistakes with certain character combinations when keying. Alternatively, indexes can be checked automatically against a database. And if you have different document types with the same kinds of values, like customer number, make them consistent (same length and data type) across all those document types. |
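Double-keying can be checked mechanically by comparing the two operators' entries field by field. The following Python sketch shows one way to do that; the field names and values are hypothetical examples, and a real system would route the mismatched fields back to a user for review.

```python
def compare_double_keyed(first_pass, second_pass):
    """Return the names of index fields where the two keyings disagree."""
    fields = set(first_pass) | set(second_pass)
    return sorted(f for f in fields if first_pass.get(f) != second_pass.get(f))

operator_a = {"customer_number": "100245", "invoice_number": "INV-7781"}
operator_b = {"customer_number": "100254", "invoice_number": "INV-7781"}
print(compare_double_keyed(operator_a, operator_b))  # ['customer_number']
```

An empty result means both keyings agree, so the document can be filed without further checking.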
|
 |
| 23. How can I create a virtual/digital dossier? |
Certain indexes will be used in different document types, such as customer number (on order forms, invoices, and contracts). When consistent indexes are enforced across all those document types, a good Document Imaging Solution will automatically combine those document types into a digital dossier. |
|
 |
| 24. Should I make indexes mandatory? |
Often, there are certain index fields that may not be relevant for all documents of a particular type. For those, you would make the index values optional; but you would not want to make important index values optional because users may inadvertently or purposefully skip them. |
|
 |
| 25. Can I use dynamic listboxes when indexing? |
If you are using listboxes for index value selection, your document imaging product may allow you to load those listboxes dynamically rather than having to use static values. Dynamic listboxes, such as those populated from an external table, are generally preferred because new values that come into the table are automatically available for indexing. Users can get frustrated when they are forced to select a value from a listbox and the value they need does not exist. |
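As an illustration of a dynamically loaded listbox, the Python sketch below reads the choices from a database table each time the indexing screen opens. The `customers` table, its `name` column, and the in-memory SQLite database are assumptions made for the example; your imaging product will have its own way of binding a listbox to a table.

```python
import sqlite3

def load_customer_choices(conn):
    """Read the current customer names to show in the indexing listbox."""
    rows = conn.execute("SELECT name FROM customers ORDER BY name").fetchall()
    return [row[0] for row in rows]

# Hypothetical customer table, held in memory for the example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?)", [("Zeta BV",), ("Acme Ltd",)])
print(load_customer_choices(conn))  # ['Acme Ltd', 'Zeta BV']

# A customer added later appears the next time the list is loaded
conn.execute("INSERT INTO customers VALUES (?)", ("Birch & Co",))
print(load_customer_choices(conn))  # ['Acme Ltd', 'Birch & Co', 'Zeta BV']
```

Because the list is read at indexing time rather than stored in the imaging configuration, nobody has to maintain it by hand.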
|
 |
| 26. How powerful is OCR? |
OCR is a method that can potentially be used for extracting index values from documents. OCR stands for Optical Character Recognition, a technology that reads a scanned document and converts bit-mapped characters into individual machine-readable characters that can be intelligently processed, just as if they were keyed in. OCR recognizes spaces between words and therefore can form the individual words that comprise a document.
There are two methods of OCR that can be useful in document imaging applications: full text or zoned, where:
- Full-text reads an entire document and attempts to convert all text to a machine-readable format.
- Zoned OCR reads only certain parts of a document and converts only what is in that zone.
Full-text OCR is problematic in several ways, like being thrown off by grids, columns, changing typefaces, and such. It is also not appropriate for structured documents like invoices that have repetitive values, since, for example, the word "invoice" would occur a huge number of times across the indexed documents and therefore not be useful for searching.
Zoned OCR is more appropriate for most document imaging applications and for structured documents, since zones can be defined for the locations in which index values occur and only those values are available for indexing. Zoned OCR is, of course, only useful for documents that have the same format, with index values occurring in the same place on each document. So zoned OCR would not be appropriate for supplier invoices, where formats vary, but would be appropriate for PODs that you create and which come back with signatures. (There are, however, technologies that can locate index values even on inconsistent document formats, using full-text OCR combined with intelligent text processing.)
While OCR can seem a bit magical, it is not a magic bullet. Besides not being able to handle unstructured documents well, OCR is also prone to errors. Depending on text quality, typeface, size, and other factors, OCR can achieve 90% accuracy or better.
OCR can be trained to attain higher levels of accuracy, and can filter its word determinations through dictionaries. There are even applications where multiple OCR engines are used for the same documents and their results compared and internally voted on to determine which is most accurate.
One fortunate feature of OCR is that it generally can determine and advise that it was not able to correctly process certain text, so it is possible to inspect and correct those values only; however, this is not always the case. Therefore, critical values should not rely on OCR alone.
Combining OCR with database lookup or other forms of validation can yield higher levels of accuracy and is an appropriate solution in certain situations where OCR is the best alternative for indexing.
OCR alone is also not a solution for everything because values you may want for indexing may not occur in the document at all. For example, your own supplier number, by which you would want to index a supplier invoice, may not appear on the invoice. This can be addressed by database lookup or another technique.
So while OCR is a useful technology for indexing, it does have limitations and should be treated with suspicion and made to prove itself in every application. |
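The voting idea mentioned above (several OCR engines reading the same value) can be sketched as a simple majority vote. This Python example does not call any real OCR engine; the candidate strings stand in for engine output, and the `min_votes` threshold is an assumption you would tune.

```python
from collections import Counter

def vote_on_value(candidates, min_votes=2):
    """Pick the most common reading and report whether enough engines agree.

    candidates: one reading per OCR engine (None or '' if an engine gave up).
    Returns (value, agreed); when agreed is False, send the value for review.
    """
    counts = Counter(c for c in candidates if c)
    if not counts:
        return None, False
    value, votes = counts.most_common(1)[0]
    return value, votes >= min_votes

print(vote_on_value(["INV-1001", "INV-1001", "INV-I001"]))  # ('INV-1001', True)
```

Values on which the engines disagree are exactly the ones worth showing to a person, which keeps manual inspection to the exceptions.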
|
 |
| 27. What backup strategy should I implement? |
One of the biggest benefits of document imaging is the ability to back up paper documents as electronic images. Imagine, how could you possibly back them up otherwise? Photocopy every document that comes in the door and send the copies offsite every day? Once documents are converted to electronic images, and once those images and their indexes are backed up to nonvolatile media, and once copies of those backups are stored in a safe offsite location, you can destroy the documents, or file them in a much simpler scheme. But scanning alone does not provide the necessary safeguards. Note the three backup caveats above:
- It is not enough to back up the images: their indexes need to be backed up as well (or else you won't be able to find anything easily, and you may not be able to ever properly recreate the indexing)
- Backups must be made to nonvolatile media. If you back up to media that is unreliable or that can crash easily, the backups may be useless
- Backup copies should be stored offsite at a location that is not likely to be wiped out in the same disaster that wipes out your company offices
You should devise a sensible backup strategy for images, which is a bit different from that for your normal files. Unlike typical data files, document image files do not change, so once they are backed up they do not need to be backed up again. A backup scheme with periodic full backups and frequent incremental or differential backups is advised.
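In addition to backup considerations, you should store document images on a RAIDed disk, optical media, or some other fault-tolerant media. While you can always rescan and reindex images, that is only possible for as long as the paper originals still exist, and it repeats work already done. |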
|
 |
| 28. Should our company change the way we file paper documents? |
Once you have a live document imaging system in production, you can and should change the way you file paper documents. After all, if you continue to file documents the same as you did before, you are not getting the full cost savings from the solution. There are two possible extremes for changing the way you file:
- One extreme is to keep filing as usual
- The other is to destroy the documents
Most companies choose a strategy somewhere in the middle: destroying some documents and filing others, but more efficiently. Once documents are in a document imaging system where they can be readily located, there is no need to provide easy access or perhaps any access to the paper documents.
For documents you decide to not destroy, you can file them, for example, in large folders by scanning date. The document imaging system shows the scan date of each document, so if you need to track down the paper document you know which folder to go to.
An important consideration in determining how to file and what to destroy is the set of legal requirements for document retention. In most western countries, electronic documents that are true representations of their originals are admissible under evidentiary laws. In fact, they are often preferred to paper documents because, unlike paper documents, they cannot be manipulated or tampered with once the documents have been scanned. Electronic documents can provide a better defense and are more compliant than paper documents. |
|
 |
| 29. Should I "sell" the solution? |
We know, you are not a salesperson, but like it or not, we all do a bit of selling from time to time in our jobs and in our lives. Whether you are selling your boss on getting a new company car, your staff on working late to finish a project, or your wife on taking the next vacation to a place that you want to go, you sell.
Resistance to change is a part of human nature, and people often need to be sold on changing the way they do things especially if the better mousetrap is something new or unfamiliar, or carries a negative preconception.
Your users may have seen scanning solutions, or may currently use a scanner. They will know such scanning to be slow, cumbersome, and something that they would rather not have in their jobs. But the scanner they have in mind is probably an A4 flatbed scanner, on which they have to place one sheet at a time, and interfaced with a slow TWAIN driver.
Also, consider that users who do not embrace the document imaging system may see scanning as extra work that must be done, and may have the tendency to continue to work off paper rather than electronically. They may get into the bad habit of printing everything they are working on rather than working off screen images. You do not want that, since if the system is not properly used you will not get the full benefit.
As with any system you will implement, you will want to have the buy-in from the people who will use it. You will want to tell your users about the benefits to your company and their own jobs. We recommend that you start by identifying one or more "champions" or at least supporters who show an interest in the system, and have them do the initial scanning to get them emotionally invested in the system and help you "sell" it to others.
Ideally, you will want to pilot the solution in a small department that would have good benefit and make sure it is successful before expanding to other departments. Better to get the kinks out of it in a low-impact environment than embarking upon a new solution in a high-stakes environment.
We also feel it is important that you strive to give access to electronic documents to everyone who might benefit from having it, even if they are not likely to use it. After all, you probably do not keep all your filing cabinets locked and you do not want to alienate users who may feel that something is being taken away from them. Then again, you likely do lock some filing cabinets or the offices in which they reside, so you will want to appropriately restrict access to sensitive documents (like HR records) only to users who are permitted to access them. |
|
 |
| 30. How to implement a Document Imaging Solution? |
Within the same theme as selling the solution, you are better off implementing a solution over time that works smoothly and reliably rather than having one that is prone to problems or has users scratching their heads wondering where their documents have disappeared to.
So, as suggested above, better to start small and expand into other areas in a reliable way than to get too exotic and do everything at once especially if you are doing it on your own.
When designing an imaging implementation, there are a number of configuration decisions that need to be made. It is best to make certain decisions up-front while leaving others a bit flexible. It is often too difficult to foresee all the details and to gauge user reaction until you put something into the users' hands. Prototyping and then piloting with user acceptance is a tried-and-true method.
You should choose an imaging solution that is flexible enough to easily accommodate changes in case you make mistakes, and let you add functionality over time. |
|
 |
| 31. Did I select the right Solution for my company? |
There are a variety of imaging systems on the market. Some are complete systems (like MagSoft's Orion EDMS system), which offer a scanning facility and a repository in which scanned documents are housed. Some are partial systems that offer just scanning with no repository (like Kofax Ascent Capture) or a repository with no scanning (like Microsoft SharePoint). Some have in-built viewing clients with features like electronic annotations, while others use generic clients like web browsers. In selecting an imaging system for your company, you should make sure it has all the functionality you require. There are good reasons for buying either a complete system or separate components, and various issues to consider, like functionality, scalability, extensibility, openness, and compatibility with various platforms, databases, and applications. And since implementing an imaging system is as much an art as a science and something that typically grows over time, you should be sure that your vendor of choice offers not only good service and support but also a growth path into other areas of electronic document management like electronic report management, workflow, and business process integration. Whatever solution you choose, and whomever you choose to do business with, you will be helping your company save time and money (and trees), and in the event of a fire, flood, theft, or other unfortunate circumstance, maybe your own job. |
|
 |
|
|