What you need to know before you buy!
Purchasing a document image management (DIM) system has many potential pitfalls that you can fall into without realizing it until it's too late. Hopefully this will help you avoid these costly pitfalls. Here are some of the areas we will briefly cover:
How to determine the size of the system needed and what to be concerned with in the area of expandability.
What to look for in the software and what things about software to avoid. Not only the DIM software, but the other necessary software running in the background, such as the database, backup, storage management, CD authoring, etc.
A gauge as to your requirements in the capacity of the various computers and peripheral items, such as scanners.
Hidden costs that show up after the initial investment.
General things to know about document image management (DIM) before you buy.
How to determine the size for both hardware and software
In order to determine size you will need to know how many pages you plan to store; what it will take in equipment to keep current on a daily basis; and how to estimate your future needs.
Here is how to calculate the number of pages to be stored.
Count the number of file drawers you are using. A full file drawer holds about 3,500 pages. So if you have 10 four-drawer filing cabinets, that is 40 file drawers, which at 3,500 pages each equals 140,000 pages.
Select 100 pages at random and count how many are double sided. Since you will be storing images, remember to increase your counts by this percentage of double-sided pages when calculating storage requirements. Let's say that 15 of the 100 pages were double sided; that is 15%. So the 140,000 pages in the above example would be increased to 161,000 pages.
Next, count the number of storage boxes transferred to remote storage. If you know how many are transferred each year, this will help you determine how many pages will be added each year. Say each year you send 22 boxes to storage; at 3,500 pages per box that would be 77,000 pages. Then increase this count by the 15% for double-sided pages, giving 88,550. Dividing by the 240 working days in a year gives about 369 pages per day to scan.
A double check might be to have your mailroom count all the incoming mail for a week, then multiply that count by 52 weeks.
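The counting steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example; the function and constant names are made up for this sketch, and the 3,500 pages per storage box is the figure implied by the 22-box example rather than a stated rule.

```python
PAGES_PER_DRAWER = 3_500    # the guide's figure for a full file drawer
PAGES_PER_BOX = 3_500       # assumption: implied by "22 boxes ... 77,000 pages"
WORKDAYS_PER_YEAR = 240


def with_double_sided(pages: int, double_sided_pct: int) -> int:
    """Add one extra image for every double-sided page."""
    return pages * (100 + double_sided_pct) // 100


# 10 four-drawer cabinets, 15% of sampled pages double sided.
stored_images = with_double_sided(10 * 4 * PAGES_PER_DRAWER, 15)

# 22 boxes sent to remote storage each year, same 15% adjustment.
yearly_images = with_double_sided(22 * PAGES_PER_BOX, 15)
pages_per_day = round(yearly_images / WORKDAYS_PER_YEAR)

print(stored_images)    # 161000
print(pages_per_day)    # 369
```

Swap in your own drawer count, box count, and sampled double-sided percentage to get a first estimate.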
Of course you can use our Sizer Program to calculate needs OR do it yourself using the following:
Total file drawers times 3,500 equals "history" pages
New file drawers (the number added each year) times 3,500 equals "current" year pages. To allow for future needs, triple (at a minimum) your current yearly pages: "current" year pages times 3 equals more "history" pages.
Take all the "history" pages and divide by 25,000 (pages per GB) and this gives an estimate of the gigabytes (GB) of storage required.
Next divide "current" year pages by 240 workdays to determine pages per day to be scanned.
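For those doing it themselves, the four steps above might be combined as follows. This is a minimal sketch and not the Sizer Program; the names `estimate_size` and `SizeEstimate` are hypothetical, and it assumes the tripled "current" year pages are added to the existing "history" pages before dividing by 25,000.

```python
from dataclasses import dataclass

PAGES_PER_DRAWER = 3_500
PAGES_PER_GB = 25_000
WORKDAYS_PER_YEAR = 240


@dataclass
class SizeEstimate:
    history_pages: int      # existing pages plus the allowance for growth
    storage_gb: float       # estimated storage required
    pages_per_day: float    # daily scanning load to stay current


def estimate_size(total_drawers: int, new_drawers_per_year: int,
                  growth_factor: int = 3) -> SizeEstimate:
    existing = total_drawers * PAGES_PER_DRAWER
    current_year = new_drawers_per_year * PAGES_PER_DRAWER
    # Assumption: the tripled yearly pages are added to the existing history.
    history = existing + current_year * growth_factor
    return SizeEstimate(
        history_pages=history,
        storage_gb=history / PAGES_PER_GB,
        pages_per_day=current_year / WORKDAYS_PER_YEAR,
    )


estimate = estimate_size(total_drawers=40, new_drawers_per_year=22)
print(estimate.history_pages, round(estimate.storage_gb, 2),
      round(estimate.pages_per_day))
```

The sketch leaves out the double-sided adjustment; if your files contain double-sided pages, increase the drawer page counts by that percentage first.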
CFC Standalone versus Network.
When it comes to determining required capacity there are three areas:
The number of concurrent users.
The pages to be scanned daily.
The pages (images) to be stored for a three-year period. Now you can see the need for all those questions and page counting.
If your firm can get by with only ONE person having access to the files at a time, then a "CFC Standalone" system is the least costly.
But, if there is a need for more than one person to have access to the files, then a "Server" system is required.
Software considerations
The images MUST be stored in a standard industry format. Beware, some firms use proprietary image formats. That is, they take a standard TIFF and embed controls or links that force you to use their viewer.
Any database software that is not OPEN architecture also locks you to that vendor…and if something happens to them…well. Know what database is used with the DIM software. You want a SQL type of database and even then, beware of databases that constantly need to be reorganized. When these types of databases become corrupt, data is normally lost.
Make sure that the backup software also has a backup "agent" for the database. This will allow the database, while active, to have a snapshot for the backup tape.
The software needs to be "obviously" easy to use. Even the training of the administrator should not require more than 1 day. Software that requires "days" of training may be too complicated for firms with fewer than 1,000 employees.
Remember, software not bundled with the hardware requires a competent staff to install: the DIM software, the database, the storage management, and the backup software as well as configuring the operating system on the server. Many extremely qualified MIS people have never handled some of these areas.
Hidden Costs or overlooked costs
Software worth having has an annual update charge. The developer needs this revenue in order to keep the software current with the constantly changing world of computers. Beware of firms that do not charge, as they will have to make up the difference in other ways. However, some firms make it a practice not to inform the buyer of this annual charge before the sale. Most software vendors bill for the annual update 30 to 90 days after the initial sale.
Support is a separate matter. Most firms will not provide support unless the customer has installed current upgrades. Support can be paid for on an annual basis or on a call-by-call basis. Make sure you know the costs of support before buying.
Maintenance of scanners and computers. Normally, workstations and scanners can be down for a day or two, while servers need rapid response. Most on-site contracts need to be initiated at time of purchase so make sure your server and scanners are covered when installed.
Scanners have a number of moving parts, and some will wear out (e.g. feeder pads, rollers, and lamps). Make sure you have coverage of these items.
Always anticipate that your needs will grow from the current requirements. Make sure the software can grow to meet those future needs.
Make sure the software allows for other interfacing, either by the developer or by other software programmers. Otherwise, you may want to interface other software with your DIM system in the future and not be able to.
General things to know before buying
A common question is "We already have a server, why do we need another?" Most data servers handle a number of tasks, and the amount of data selected at any one time rarely exceeds 4K. Image servers deal in images ranging in size from 25K to 250K, and often ten to fifty images are being processed at once, so image processing on an existing server would SLOW it down.
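To put rough numbers on that comparison, here is a small calculation using only the figures quoted above. It assumes "K" means kilobytes and that the ten to fifty images are in flight at the same time; the variable names are illustrative.

```python
DATA_REQUEST_KB = 4          # typical data-server selection
IMAGE_KB = (25, 250)         # smallest and largest image size
IMAGES_AT_ONCE = (10, 50)    # images being processed together

light_load_kb = IMAGE_KB[0] * IMAGES_AT_ONCE[0]
heavy_load_kb = IMAGE_KB[1] * IMAGES_AT_ONCE[1]

print(light_load_kb, light_load_kb / DATA_REQUEST_KB)   # 250 62.5
print(heavy_load_kb, heavy_load_kb / DATA_REQUEST_KB)   # 12500 3125.0
```

Even the light case moves about sixty times the data of a typical 4K request, which is why imaging is usually given its own server.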
If a "Server" system is required, then the question is how big a server. How many concurrent users are needed? Will they ALL be locally attached? The choices are between a "Client-Server" and a "Terminal Server". On a "Client-Server" only those workstations with the software loaded can run the software, whereas on a "Terminal Server" everyone who can log on can use the system. Here is a simple chart on how to estimate the server capacity.
Servers               Viewers           Terminal Server
Regular               5 - 10 Users      5 Users
Medium                15 - 25 Users     15 Users
Heavy                 25 - 50 Users     25 Users
Heavy extra memory    100+ Users        50 Users
Regular - single CPU, 256MB, 2-6 36GB HDD
Medium - single CPU, 512MB, 2-6 36GB HDD
All servers should have RAID controllers.
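The chart can also be read as a lookup from concurrent users to server class. The sketch below is one possible reading, not a sizing rule: it treats each figure as the most users that class supports, and it assumes that user counts falling in the gaps between rows (for example 12 viewers, or 75 viewers) move up to the next class. The name `pick_server` is hypothetical.

```python
# (server class, maximum users); None means no upper limit in the chart.
CHART = {
    "viewer": [("Regular", 10), ("Medium", 25), ("Heavy", 50),
               ("Heavy extra memory", None)],
    "terminal": [("Regular", 5), ("Medium", 15), ("Heavy", 25),
                 ("Heavy extra memory", 50)],
}


def pick_server(users: int, mode: str = "viewer") -> str:
    """Return the smallest server class in the chart that covers `users`."""
    if users < 1:
        raise ValueError("users must be at least 1")
    for server_class, max_users in CHART[mode]:
        if max_users is None or users <= max_users:
            return server_class
    raise ValueError(f"{users} {mode} users is beyond the chart")


print(pick_server(8))                # Regular
print(pick_server(20, "terminal"))   # Heavy
```

Anything beyond the chart raises an error rather than guessing, since that is a case to discuss with a dealer.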
You can store an average of 25,000 pages of normal office documents per gigabyte (GB). Don't buy more disks than you expect to need over the next three years, because three years from now the cost per GB will be much less. However, do allow for the possibility that your storage may increase more rapidly than you currently anticipate.
Selecting which scanner or scanners involves considering several variables:
How many pages need to be scanned daily.
How many indexes need to be key entered per document.
On average how many pages make up a document.
How often will the flatbed be required? (For pages too thin or too thick for the Automatic Document Feeder, "ADF".)
What is the maximum size of the page to be scanned?
Is color required?
What is the percentage of double-sided pages? As a rule of thumb, more than 15% requires a duplex scanner.
Can all the documents easily be brought into one location? In other words, do you need a scanner in more than one location?
Quality and condition of documents to scan. For example, if these are dark copies of an original, then special hardware and software connected to the scanner may be required to produce readable images.
There are several methods of setting up scanning operations.
The operator job may be to remove staples and clips and then scan. This is normally referred to as pre-scan prep work.
In some offices the pre and post scan work is assigned to someone other than the operator, thus allowing the scan operator to concentrate on the scanning. Post scan work is the reassembling of the pages after the scan, if they are going to be saved instead of destroyed.
Then consider how many indexes per document will be required. If, for example, only the account number is key entered (and the rest is imported from other databases, discussed later), throughput stays high. If the account number, date, name, phone number, etc. all need to be key entered from each document, then the scanning throughput is greatly reduced.
Another option used by some offices is to scan into a "batch" area and then later index and transfer the document to an appropriate folder.
The speed of the scanner is not the most important factor in selecting a scanner. Factors such as the size of the documents, color, or two-sided pages can force the choice of scanner. The following is a "guesstimate" of scanner throughput by a single operator doing minimal pre and post scan work.
                                ------------ Indexing ------------
                                Minimum      Average      Heavy
Keystrokes                      15           30           60
Pages per Document              10           10           10
Daily (6 hr) Throughput Using
  15 ppm                        1,888        1,588        1,350
  20 ppm                        2,069        1,714        1,440
  27 ppm                        2,236        1,827        1,519
  29 ppm                        2,272        1,851        1,535
  50 ppm                        2,500        2,000        1,636
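The throughput figures fit a simple model: each document costs its scan time plus a fixed handling and indexing time, and the operator works six hours. The sketch below is a reconstruction, not the method actually used to build the table. The per-document overheads (74.4, 96, and 120 seconds) are not stated anywhere in this guide; they were back-solved from the table's own numbers, and with them the function returns the table's figures after rounding.

```python
SHIFT_SECONDS = 6 * 60 * 60     # the table's 6-hour scanning day

# ASSUMPTION: seconds of handling and key entry per document, back-solved
# from the table for each indexing level (15, 30, and 60 keystrokes).
OVERHEAD_SECONDS = {"Minimum": 74.4, "Average": 96.0, "Heavy": 120.0}


def daily_pages(ppm: int, indexing: str, pages_per_doc: int = 10) -> int:
    """Estimated pages per day for one operator at a given scanner speed."""
    scan_seconds = 60 * pages_per_doc / ppm
    seconds_per_doc = scan_seconds + OVERHEAD_SECONDS[indexing]
    documents_per_day = SHIFT_SECONDS / seconds_per_doc
    return round(documents_per_day * pages_per_doc)


for ppm in (15, 20, 27, 29, 50):
    print(ppm, [daily_pages(ppm, level) for level in OVERHEAD_SECONDS])
```

The model also shows why scanner speed matters less than it seems: going from 15 ppm to 50 ppm saves 28 seconds of scan time per ten-page document, while the handling overhead of 74 to 120 seconds is untouched.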
These calculations presume that the operator is very diligent in their work. We know of a customer with a 50-ppm scanner who requires only 16 keystrokes per document, and they are pleased to get 1,900 pages per day. Apart from offloading the pre and post scan work, there are other ways to increase the scanner input.
There are several methods of automating the filing and filling of the cross-index fields.
The placing of the document in the DIM folder and the cross indexing of the document in some cases can be automated.
Barcodes are one method. Place barcode labels on documents and let the DIM system read them and automatically create a folder, name the document, and save the image. This type of operation can file individual pages, or the page with the barcode together with all the pages that follow until the next barcoded page.
Zone OCR (Optical Character Recognition) of a form is another method, worthwhile when there is a sufficient quantity of the same form to justify the cost of setting up and using this type of software. This software recognizes the form and knows to OCR a specific area on the form for, say, the account number. Using the account number, the software automatically locates the proper folder, names the document, and files the form along with all the pages that belong to it.
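The barcode-separator rule (a barcoded page starts a document, and every page after it belongs to that document until the next barcoded page) can be sketched as below. This is an illustration only; the function name, page identifiers, and barcode values are hypothetical, and reading the barcode itself is assumed to have been done already by the scanning software.

```python
from typing import Optional


def split_on_barcodes(
    pages: list[tuple[str, Optional[str]]],
) -> dict[str, list[str]]:
    """Group scanned pages into documents keyed by barcode value.

    `pages` holds (page_id, barcode) pairs in scan order, with barcode set
    to None for pages that carry no barcode.
    """
    documents: dict[str, list[str]] = {}
    current: Optional[str] = None
    for page_id, barcode in pages:
        if barcode is not None:
            current = barcode                  # a barcoded page starts a document
            documents.setdefault(current, [])
        if current is None:
            raise ValueError(f"page {page_id!r} comes before any barcode")
        documents[current].append(page_id)
    return documents


batch = [("p1", "ACCT-1001"), ("p2", None), ("p3", None),
         ("p4", "ACCT-2002"), ("p5", None)]
print(split_on_barcodes(batch))
# {'ACCT-1001': ['p1', 'p2', 'p3'], 'ACCT-2002': ['p4', 'p5']}
```

Rejecting pages that appear before the first barcode is a design choice for this sketch; a real system might route them to a review queue instead.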
Another way to increase the input into the system is to enter the very minimum amount of data at the time of scanning, then at the end of the day have the computer select all the new entries, connect to another database (such as customer or accounting) and retrieve the other information to be used in the cross-reference fields. For example, an insurance company might only enter the policy number and then, that night, retrieve the policyholder name, agent, effective date, type of coverage, etc.
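The end-of-day lookup might look like the following sketch. Plain dictionaries stand in for both the DIM index and the other database, and every field name and value here is invented for illustration; a real installation would query the actual customer or accounting database.

```python
def fill_cross_references(new_entries: list[dict], policy_db: dict) -> list[str]:
    """Fill missing index fields from another database, keyed by policy number.

    Returns the policy numbers that could not be found, for manual review.
    """
    unmatched = []
    for entry in new_entries:
        record = policy_db.get(entry["policy_number"])
        if record is None:
            unmatched.append(entry["policy_number"])
            continue
        for field, value in record.items():
            entry.setdefault(field, value)   # keep anything keyed in at scan time
    return unmatched


# Hypothetical data: only the policy number was keyed in during scanning.
todays_entries = [{"policy_number": "P-100"}, {"policy_number": "P-999"}]
policy_db = {
    "P-100": {"policyholder": "J. Smith", "agent": "A12", "coverage": "Auto"},
}

print(fill_cross_references(todays_entries, policy_db))   # ['P-999']
print(todays_entries[0]["policyholder"])                  # J. Smith
```

Returning the unmatched numbers matters in practice: a mistyped policy number would otherwise leave a document filed with empty cross-reference fields and no one the wiser.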
Some systems even allow for instant online verification of data as well as the populating of the cross-reference fields by linking the databases.
Workflow is the current hot button in imaging. Workflow developers give examples of firms saving up to 70% of their clerical costs. This has excited the imaginations of cost-conscious managers. Please DO NOT plan to install a workflow engine at the same time you install a DIM. People need to become comfortable dealing with images on their workstations. They need to build confidence in the storage and retrieval of documents before adding additional software. When everyone feels comfortable, then consider Workflow. It can do wonders, but requires a thoroughly thought out implementation. Workflow in effect puts your business practices (for handling documents) in the hands of the workflow engine.
This gives you some of the basics that need to be considered before installing a document image management system. Our dealers are specially trained to assist you in analyzing your needs. Some of the reasons why we only sell the software and hardware bundled should be obvious. One that is not obvious is that we want our dealers to spend their time helping you focus on getting the job done and not on the behind-the-scenes mechanics. Repeatedly our customers are amazed…the system is delivered and the next day they are using it.