Requirements Gathering
The following guidelines help you identify the application requirements and the details needed to design a capture system or solution with IBM Datacap Taskmaster Capture.
Requirements are classified into the following sections:
✓ Current capture or document processing environment
✓ Physical locations that receive and process documents
✓ Types of documents, their characteristics, and the data they contain
✓ Business rules that determine whether the data is valid
✓ Document volumes and time constraints
✓ Business requirements for dealing with exceptions
✓ Output requirements for data and documents
✓ Scanner requirements
✓ Hardware and software requirements
1. Requirements for current capture or document processing environment
a. Scanning
The questions in this section help you discover the characteristics and details of the business processes and systems that are currently in place. Identify the scanning requirements:
➢ Are paper documents currently being scanned?
➢ At what point in the business process are they scanned: upon arrival, in the middle of the process, at the end of the process, or a mixture?
➢ What equipment and software are being used to scan the documents?
➢ Will the current equipment be replaced, or will it be used with the new system?
➢ Can the scanners handle the projected peak volumes, based on a comparison of the scanner specifications with the scan volume?
➢ Will the scanner handle de-skewing and noise removal?
➢ For each location, will scanning be done by using thick-client ISIS or thin-client TWAIN scanner drivers? (The preferred practice is to test a specific scanner interface, driver, and scan hardware in a test environment.)
➢ What happens to the paper documents after they are processed? Are they stored on-site, returned, stored off-site, or destroyed?
➢ Will the new system change the way paper documents are handled after they are scanned?
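The comparison of scanner specifications with peak scan volume can be roughed out with simple arithmetic. The following is a minimal sketch, not a sizing tool: the function name, the 60% utilization factor, and the sample numbers are all assumptions made for illustration, and real throughput should be measured in a test environment.

```python
import math


def scanners_needed(peak_pages_per_day, rated_pages_per_minute,
                    scan_hours_per_day, utilization=0.6):
    """Estimate how many scanners are needed to cover a peak day.

    utilization is an ASSUMED fraction of the rated speed that is actually
    achieved once document preparation, jams, and rescans are accounted for.
    """
    effective_pages_per_day = (
        rated_pages_per_minute * 60 * scan_hours_per_day * utilization
    )
    return math.ceil(peak_pages_per_day / effective_pages_per_day)


# Illustrative numbers only: 40,000 pages on a peak day, a scanner rated at
# 60 pages per minute, and a 6-hour scanning window.
print(scanners_needed(40_000, 60, 6))
```

If the result exceeds the number of scanners available at a location, that points to either additional hardware, a longer scanning window, or moving part of the volume to another site.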
b. Processing
By explicitly documenting the current processes, you can identify specific areas for process improvement.
➢ How many people “touch” the document from arrival to completion, and in which departments or locations do these people work?
➢ What is the current document handling process?
➢ Are documents processed centrally or at remote locations?
➢ How many people are involved in processing documents?
➢ Which processing is currently being performed?
– Receiving documents, logging, counting, batching, and date-stamping
– Sorting documents for filing and distribution
– Preparing file folders
– Filing documents
– Distributing files or documents for processing
– Photocopying for distribution
– Manual typing of data
– Retrieving files from file cabinets
– Searching through files to find documents
– Matching documents against exception reports
– Refiling documents and files
– Pending or suspense file management
– Keeping calendars or diaries to track follow-up documents
– Searching for misplaced or lost files
– Reconstructing lost files
– Purging files and removing selected documents for disposition
– Transporting documents to and from storage rooms or off-site storage
– Filing internal forms or copies of correspondence
c. Policies and systems currently in place
Identify the policies and systems that are currently in place:
➢ Has our organization approved the destruction of original paper documents following scanning?
➢ What systems are used for tracking and inventory of paper documents and files?
➢ What ECM or other systems are involved in the current scanning or capture operation?
d. Time frames
Identify the requirements regarding time constraints:
➢ How long does it take for a document to be processed from arrival to completion?
➢ Are there significant differences in time depending on document type? If yes, identify the differences.
➢ What steps in the process take longer than desired?
2. Processing location requirements
a. Physical documents
Identify the requirements for physical documents:
➢ How many physical locations create or receive physical documents?
➢ Are the physical documents processed in the location where they are received, or are they moved to a central location for processing?
– How are they moved: by mail, internal courier, or external courier?
– Are photocopies or scanned copies made before they are moved?
b. Electronic documents
Identify the requirements for electronic documents:
➢ How many physical locations create or receive electronic documents?
➢ Are the electronic documents processed in the location where they are received, or are they moved to a central location for processing?
– How are they moved: by email, electronic media, file copying, or file transfer?
– Are copies made before they are moved?
3. Document type requirements
The questions in this section help to identify the document types, how they are created, and their characteristics. You must identify and gather single-page and multipage samples of all document types.
Identify the requirements for document type:
➢ What are the document types, and any subtypes, that we process? Consider the following examples:
– Packing slips for complete, partial, and back-ordered shipments
– Invoices, including purchase order invoices, non-purchase order invoices, preapproved invoices, trade invoices, non-trade invoices, and credit memos
– Attachments, including shipping confirmation notices and acknowledgement of receipt forms
– Loan applications, including the application form type by form number
– Insurance claims, such as the claim form by form number
– Tax forms, including the form number and year
➢ Who creates the documents?
➢ Can the design of the documents be changed if necessary to increase recognition accuracy?
➢ If documents are created by external parties, approximately how many sources are involved?
➢ What is the input source for each type of document: scanner, fax, email, or other systems?
➢ For each type of document, does it have a fixed number of pages or a variable number of pages?
➢ What is the number of pages per document?
➢ For images, what is the image resolution and format (black and white, color, gray scale)?
➢ What is the input file format for electronic documents?
➢ Do documents contain more than one business transaction?
➢ Do people stamp, mark up, or write on documents as they are processed?
4. Captured data requirements
With this information, you can determine the data recognition requirements and other aspects of handling the data, including validations, lookups, verification, indexing, and data entry.
Identify the requirements for captured data:
➢ What fields should be manually entered at the batch level (for example, Scan Date, Expected Number of Documents, or Expected Number of Pages)?
➢ What fields should be captured at the document level (for example, Invoice Number, Invoice Date, or Invoice Total)?
➢ What fields should be captured at the line item detail level (for example, Item ID, Item Quantity, or Item Price)?
➢ For each document type, is data primarily machine printed or hand printed?
➢ For hand printed documents, is the print constrained or unconstrained?
➢ Are there pages that do not have data that must be recognized, such as attachment pages? It is common for forms to have instruction pages that are scanned but that do not have data on them.
➢ How is data located on the pages where you need to use recognition to read the data?
– Fixed form layout. Fields are in specific zones, so the location can be used to find the data.
– Variable form layout. Fields have text labels, so a search for the text label can locate the field.
– Data is contained in a barcode.
➢ Is data validated by using an external database?
➢ What are the business rules for validating the values of the fields?
➢ Do fields have lists of valid values?
➢ Is the data optional or required?
➢ Does the data printed on the page conform to a repeatable pattern? (For example, Credit Memo Number starts with the letters CR followed by six numerics, a hyphen, and three numerics.)
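A repeatable pattern such as the Credit Memo Number example can be written down as a regular expression while requirements are being gathered, which makes the rule unambiguous for whoever configures validation later. The sketch below uses plain Python rather than any Taskmaster rule syntax; the function and constant names are made up for illustration.

```python
import re

# "CR", six digits, a hyphen, three digits. [0-9] is used instead of \d so
# that only ASCII digits are accepted.
CREDIT_MEMO_PATTERN = re.compile(r"CR[0-9]{6}-[0-9]{3}")


def is_valid_credit_memo_number(value):
    """Return True if value matches the documented Credit Memo Number pattern."""
    return CREDIT_MEMO_PATTERN.fullmatch(value.strip()) is not None


for candidate in ["CR123456-789", "CR12345-789", "cr123456-789"]:
    print(candidate, is_valid_credit_memo_number(candidate))
```

Whether lowercase prefixes or surrounding whitespace should be tolerated is itself a requirement worth confirming with the business; this sketch trims whitespace and rejects lowercase.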
5. Verification requirements
Verification is the point where users interact with the documents. You must understand where these users are located and what tasks they are authorized to perform on each type of document. Business rules need to be applied that might mirror existing practices for handling paper-based data entry. Verification might also be desired as a quality control step to ensure that every image is readable.
Identify the requirements for verification:
➢ Will verification be handled in a central location or from remote locations?
➢ Are there business rules or policies that will require multiple verification steps?
➢ Who will perform verification?
➢ Does verification need to restrict access to specific document types by different groups of users?
➢ Do we need to display every document or page, or can we display only the documents or pages where we have exceptions?
➢ Will some documents require manual page identification by an operator?
➢ Based on the information gathered on input documents, captured data, and export requirements, how should low confidence data, invalid data, unidentified documents, and incorrectly identified pages be handled?
➢ When recognition results are high confidence, do you want an operator to view the document anyway?
➢ Do operators need to visit all fields with low confidence characters?
➢ Under which circumstances can the operator split out a document from the batch to finish processing the other valid documents in the batch? How should the split-out documents be handled?
➢ Should operators be able to mark documents for deletion (so that the documents are not exported)?
➢ Should deletion trigger a follow-up process or automatic notification?
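The answers to the exception-handling questions above usually reduce to a small routing policy. The following sketch shows one way to record such a policy so that it can be reviewed with the business; it is not Taskmaster code, and the class, the field names, and the 0.9 confidence threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by a recognition engine
    valid: bool        # outcome of the business-rule validation


def verification_reasons(fields, page_identified, min_confidence=0.9):
    """Return the reasons a document should be shown to an operator.

    An empty list means the document could skip manual verification
    under this (hypothetical) policy.
    """
    reasons = []
    if not page_identified:
        reasons.append("unidentified page")
    for field in fields:
        if not field.valid:
            reasons.append(f"invalid data: {field.name}")
        elif field.confidence < min_confidence:
            reasons.append(f"low confidence: {field.name}")
    return reasons


sample = [
    FieldResult("InvoiceNumber", "INV-1001", 0.98, True),
    FieldResult("InvoiceTotal", "100.00", 0.55, True),
    FieldResult("InvoiceDate", "31/31/2020", 0.99, False),
]
print(verification_reasons(sample, page_identified=True))
```

If the business wants operators to view every document regardless of confidence, the policy is simply "always verify", and a function like this would only be used to highlight the fields that need attention.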
6. Export requirements
Identify the format, content, and target system or systems of the data and images for export:
➢ What is the document format for the exported documents for each type of document: TIFF, PDF, PDF with text, PDF/A, original input format, or other?
➢ Is the original image or the enhanced image used for export?
➢ Are color and gray scale images to be exported as color and gray scale?
➢ Is a specific file naming convention needed for the exported document?
➢ What are the document properties of the exported documents?
➢ Do the images have areas that need to be redacted?
➢ What are the target application systems for the exported data?
➢ What are the interfaces that are available in the target systems for ingesting the data?
➢ What data fields are exported to target application systems?
➢ Does the data need to be reformatted to accommodate the needs of the target application?
7. Volume and timing requirements
You must size and appropriately install and configure various Taskmaster components (remote/local clients, background processing, Fingerprint Service, Fingerprint Maintenance Tool). To assist with this task, create a matrix based on the following information:
➢ Sources of input
➢ Input volumes for each source
➢ Approximate number of unique documents (number of fingerprints)
➢ Peak periods
➢ Timing (processing window) requirements
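The matrix can be kept in a spreadsheet, but a small structured record makes it easy to derive figures such as the required hourly throughput. The sketch below is one possible layout; the column names and every number in it are placeholders, not measurements or Taskmaster settings.

```python
from dataclasses import dataclass


@dataclass
class InputSource:
    name: str
    avg_pages_per_day: int
    peak_pages_per_day: int
    fingerprints: int               # approximate number of unique documents
    peak_period: str
    processing_window_hours: float

    def peak_pages_per_hour(self):
        """Throughput needed to clear a peak day within the processing window."""
        return self.peak_pages_per_day / self.processing_window_hours


# Placeholder rows: replace with the figures gathered from each location.
sources = [
    InputSource("scanner", 20_000, 35_000, 150, "month end", 8),
    InputSource("fax", 2_000, 3_000, 40, "Monday morning", 12),
    InputSource("email", 5_000, 9_000, 300, "quarter end", 24),
]

for s in sources:
    print(f"{s.name:8} {s.peak_pages_per_day:>7} {s.fingerprints:>5} "
          f"{s.peak_pages_per_hour():>9.1f}")

total_peak = sum(s.peak_pages_per_day for s in sources)
print("total peak pages per day:", total_peak)
```

Summing the peaks assumes that all sources peak on the same day, which is a conservative simplification; if the peak periods differ, size against the worst combined day instead.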
Identify the volume and time requirements:
➢ What are the input sources: scanner, fax, email, or other systems?
➢ How many document or image files are processed per day from each source?
➢ How many documents per document type are processed per day?
➢ For highly variable documents, such as invoices, how many different document formats are processed?
➢ What is the peak volume of documents and image files? Are there peak processing cycles daily, weekly, monthly, or annually?
➢ Are there peak volume requirements per day, or are there specific service level agreements about how quickly documents will be processed?
➢ Do existing paper files need to be scanned (backfile)? Quantify the volume and time frame for digitizing. Are the processing requirements different for historical documents compared to new documents?
➢ Is the ability to prioritize batches or to change the sequence in which the batches are processed required?
➢ What are the availability requirements for the system?
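Quantifying the backfile time frame is a matter of dividing the backfile volume by whatever capacity is left over after day-forward work. The following sketch illustrates that calculation; the function name and the sample figures are assumptions, and it ignores factors such as document preparation effort that often dominate backfile projects.

```python
import math


def backfile_working_days(backfile_pages, daily_capacity_pages, daily_new_pages):
    """Estimate the working days needed to digitize a backfile.

    Only the capacity left over after processing newly arriving
    documents is assumed to be available for the backfile.
    """
    spare_capacity = daily_capacity_pages - daily_new_pages
    if spare_capacity <= 0:
        raise ValueError("no spare capacity for backfile scanning")
    return math.ceil(backfile_pages / spare_capacity)


# Illustrative numbers only: 1.2 million backfile pages, capacity for
# 50,000 pages per day, and 38,000 new pages arriving per day.
print(backfile_working_days(1_200_000, 50_000, 38_000))
```

If the estimate exceeds the time frame the business expects, the options are typically extra shifts, extra scanners, or outsourcing the backfile.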
8. Administration requirements
➢ What are the production reporting requirements? Compare them with the Taskmaster standard reports to determine whether custom report formats are needed.
➢ What information needs to be displayed for job monitoring? Compare it with the Taskmaster monitoring views to determine whether additional fields are required.
➢ What is the organization model for administering the system?
➢ What are the security requirements for authentication?