Document Imaging Glossary
Application Metadata
Application metadata is information that is not visible on the printed page but is embedded in the document file and remains with the file when it is copied. Microsoft Office routinely embeds many different types of metadata in word processing documents, spreadsheets and other files. The same is true of other computer applications. Important types of metadata that may be embedded in Microsoft Office files include: title, subject, author, comments, revision number, last print date, creation date, last save time and total editing time. Some documents may also include prior revisions and comments embedded in the metadata.
Bates - Apply A Bates Range
(Also “assign a Bates Range” or “Bates Number”) This term means to start a numbering sequence at a specific predefined number. THIS DOES NOT MEAN THAT THE IMAGES WILL HAVE A BATES LABEL APPLIED TO THEM, but simply that the documents, when loaded into your software, will be addressed by this number/name in that software.
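As a rough illustration of what a Bates range looks like as data, the sketch below generates a run of identifiers from a prefix and a predefined starting number. The function name, the prefix "ABC" and the seven-digit padding are assumptions made for this example only; your software may format the numbers differently.

```python
def bates_range(prefix, start, count, width=7):
    """Return `count` consecutive Bates numbers beginning at `start`.

    `prefix` and `width` are illustrative choices, not a fixed standard.
    """
    return [f"{prefix}{number:0{width}d}" for number in range(start, start + count)]


# Three documents addressed starting at the predefined number 1001.
numbers = bates_range("ABC", 1001, 3)
print(numbers)  # ['ABC0001001', 'ABC0001002', 'ABC0001003']
```

Note that nothing here touches the images themselves: the numbers are only the names by which the documents are addressed.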
Bates Label
Probably the most confusing of all of these terms, “Bates Label” has been used to describe several very different services provided by our company. By definition, a Bates Label is a physical label printed and attached to a paper document, or a mechanically ink-stamped name consisting of a prefix and a running sequence of numbers. Pre-printed Bates Labels may also include barcodes, as well as human-readable alphanumeric sequences. (Also see Electronic Bates Label or Endorsing)
Blowbacks
Blowbacks are paper copies that are produced from electronic copies. They are frequently produced from an electronic copy set that our clients receive on CD/DVD. They are often documents that we already have in our scanning software. Blowbacks are often confused with a copy set (see below). If we have scanned the documents or have acquired them electronically into our software, and you want a set of prints, these are blowbacks.
Boilerplate
We often ask if you would like to include or exclude boilerplate material. Boilerplate is ready-to-print copy or a form, the detailed standard wording of a contract, warranty, etc., or phrases or units of text that repeat verbatim multiple times in a production, as in correspondence produced by a word-processing system. Common boilerplate appears on the back of each copy of a carbon copy set. In many cases, eliminating the boilerplate can greatly reduce the page count.
Copy Set
A copy set is a set of paper copies produced by someone standing at a copy machine and feeding pages into the machine. This is the term used if you don’t need a load file for your legal software, or if you don’t need electronic endorsements. If you want these to be Bates labeled, the original is Bates labeled first, and the subsequent copy sets will then carry the label as well.
Deduped Data
Deduped (de-duplicated) data is data from which duplicate information has been removed using an approved method.
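The entry above does not say which method is used. One common approach, shown here purely as an assumption, is to compute a hash of each file's contents and keep only the first file seen for each distinct hash. The file names and contents below are made up for the example.

```python
import hashlib


def dedupe(files):
    """Keep the first file name seen for each distinct content hash.

    `files` maps a file name to its raw bytes. Hash-based comparison is an
    assumed method here, not necessarily the one used on any given project.
    """
    first_seen = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        first_seen.setdefault(digest, name)  # later duplicates are ignored
    return list(first_seen.values())


files = {
    "report.doc": b"quarterly report",
    "copy of report.doc": b"quarterly report",  # same bytes, different name
    "notes.doc": b"meeting notes",
}
print(dedupe(files))  # ['report.doc', 'notes.doc']
```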
DeNISTed Data
Removal of all files (system files are one example) that are understood to be non-responsive in nearly all legal proceedings.
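This kind of filtering is commonly done by comparing file hashes against a reference list of known system and application files, such as the list published by NIST's National Software Reference Library. The sketch below assumes such a list has already been loaded as a set of SHA-1 hex strings; the function name, the sample files and the tiny hash set are hypothetical.

```python
import hashlib


def remove_known_files(files, known_hashes):
    """Drop every file whose SHA-1 hash appears in `known_hashes`.

    `files` maps a file name to its raw bytes; `known_hashes` is an assumed,
    pre-loaded set of hex digests for known system/application files.
    """
    return {
        name: content
        for name, content in files.items()
        if hashlib.sha1(content).hexdigest() not in known_hashes
    }


collected = {
    "letter.doc": b"Dear counsel",
    "system.dll": b"operating system file",
}
# Stand-in for a real reference list: just the hash of the sample system file.
known = {hashlib.sha1(b"operating system file").hexdigest()}

remaining = remove_known_files(collected, known)
print(sorted(remaining))  # ['letter.doc']
```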
Endorse
Endorsing a scanned image means taking the digital image file, whether scanned or imported, and electronically “stamping” information onto a page or document. Commonly endorsed elements are Bates numbers and a confidential designation.
Export
When you request a project to be scanned by us, we will always need to know what export format you would like. Most people have an idea of how they would like their documents produced; some examples are Summation, Trial Director, Sanction, Concordance, IPRO, searchable PDF and many others. If you are unsure which export format(s) you would like, give us a call and we can help you decide what is right for your situation.
Electronic Discovery
Electronic Discovery is the process used to assess and incorporate Electronically Stored Information (ESI) into a legal dispute. The subject data is processed using powerful software packages that allow the user to define parameters and process data in large groups. The starting point is often bulk data stores produced by a party in a dispute, and the resultant data will include available metadata and retrievable text, usable in any number of litigation support software applications.
Electronically Stored Information (ESI)
Electronically Stored Information (ESI) has become the most abundant source of data that is the subject of legal disputes today. The term simply refers to information that exists in some form other than printed paper. ESI is found on hard drives, CDs, DVDs, floppies, all forms of telephones, thumb drives and other places that might go without mention. Harvesting data that is stored electronically is usually the most efficient way to bring data into a database. However, the large volumes of ESI usually cause the harvesting of that data to become expensive in the end.
When handled properly, ESI allows the parties to use advanced databasing to gain a thorough knowledge of the data that is available to search.
Logical Breaks
When we (PIC) are asked to apply logical breaks, we will break the documents into what we perceive to be logical break points in the documents. Note that this is subject to our interpretation of the documents only, and if asked to “Re-break” according to another parameter, we will apply additional charges for this request.
OCR (Optical Character Recognition)
When you scan a page, the file is saved as an image file. Image files are not searchable unless a process called OCR is performed. With OCR, the computer “reads” the image, identifies letters and numbers, and saves them as searchable text. In the case of a searchable PDF, it can embed or layer the OCR text behind the actual image file, making it appear as a true searchable file. One thing that should be pointed out is that OCR is only as good as the quality of the image. If you have a poor-quality page or scan, the OCR can be nearly useless. As a general rule, handwritten material will not produce usable OCR'd text.
Physical Breaks
When we (PIC) are asked to use physical breaks for scanning, we will break the documents into a new document at a physical barrier, such as a paper clip, staple, rubber band, folder, etc.
Redweld
Usually an accordion-style expandable folder, sometimes referred to as a redrope or a redbook.
System Metadata
System metadata is not embedded in the file, and instead is stored externally on the computer file system. System metadata does not remain with a file when it is copied. System metadata may include a file name, size, location, path, creation date and modification date. While application metadata can be modified, it is very difficult to modify system metadata.
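To show where this information lives, the sketch below asks the file system (not the file's contents) for a file's name, location, size and modification date using Python's standard `os.stat`. The helper name and the throwaway file "memo.txt" are assumptions for the example. Creation date is left out because the way it is reported varies by operating system.

```python
import datetime
import os
import tempfile


def system_metadata(path):
    """Collect file-system metadata for `path` without reading the file's contents."""
    info = os.stat(path)
    full_path = os.path.abspath(path)
    return {
        "name": os.path.basename(full_path),
        "location": os.path.dirname(full_path),
        "size_bytes": info.st_size,
        "modified": datetime.datetime.fromtimestamp(info.st_mtime),
    }


# Demonstration on a temporary file (the file name is hypothetical).
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "memo.txt")
    with open(sample, "wb") as handle:
        handle.write(b"hello")
    details = system_metadata(sample)

print(details["name"], details["size_bytes"])  # memo.txt 5
```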
Video Conversion
Usually video is provided to us on VHS tape or in DVD Movie format. In order to be useful in most legal applications, these will require conversion to MPEG-1. We understand that this is the most primitive of the MPEG formats, but it allows almost universal compatibility with the most commonly used legal applications. Conversion to other formats is available at your request, but we will always inform you of the compatibility issue.
Video Clip Creation - Hard Clips
When an excerpt is created from a larger MPEG file, we can produce an exportable, freestanding MPEG file that exists independently from the MPEG file it originated from. At PIC we call these hard clips, because they are independent, and we use hardware resources to create them.
Video Clip Creation - Soft Clips
When an excerpt is created from a larger file using a software application such as Trial Director or Sanction, there is not actually a clip created. The application simply records the start/stop points that the user has requested, and uses this information to play only the portion of the MPEG described within those start/stop points. The clip does not exist as a freestanding clip, and the original MPEG file must be present to play the clip. At PIC we call these soft clips, because they are a representation of a clip that exists only within the original application.
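A soft clip, as described above, amounts to a small record that points into the original file. The sketch below models that idea with a hypothetical `SoftClip` class. It is not how Trial Director or Sanction actually store clips; it only illustrates why the original MPEG file must be present to play one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftClip:
    """A pointer into an existing video file; no new video file is created."""

    source_file: str      # the original MPEG, which must still exist to play the clip
    start_seconds: float  # where playback begins
    end_seconds: float    # where playback stops

    def __post_init__(self):
        if not 0 <= self.start_seconds < self.end_seconds:
            raise ValueError("start must be non-negative and before end")

    @property
    def duration(self):
        return self.end_seconds - self.start_seconds


# Hypothetical file name and times, for illustration only.
clip = SoftClip("deposition.mpg", 95.0, 140.5)
print(clip.duration)  # 45.5
```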
Video Synching
When a video deposition is taken, there is the video portion and the transcript portion, recorded by the court reporter. Video synching is the process of causing the two components to play together, with the appearance of closed captioning. Traditionally, synching required the video to be an MPEG-1 file (.mpg) and the transcript to be in some form of text file. More recently, different applications have allowed an expanded range of video file types, so check your software's requirements for compatibility. In applications like Trial Director and Sanction, you are able to synch the transcript to the video using a manual process, and several vendors will synch the two components with an automated approach that can save a great deal of time. Give us a call, and we'll give you the pros and cons on any of the above approaches.
The main reason to synch video is NOT TO PLAY THE TEXT WHILE THE VIDEO IS RUNNING. Rather, the main reason to synch video is for searching and clip creation. If you have never seen these two aspects of video demonstrated, let us know, and we'll show you the REAL advantages of a synched deposition.
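To give a sense of why synching helps with searching, the sketch below treats a synched transcript as a list of (seconds, text) pairs and returns the video positions where a phrase is spoken. The data layout, the function name and the sample testimony are all assumptions made for this example; real applications use their own formats.

```python
def find_in_transcript(synced_lines, phrase):
    """Return the (seconds, text) pairs whose text contains `phrase`, ignoring case."""
    wanted = phrase.lower()
    return [(seconds, text) for seconds, text in synced_lines if wanted in text.lower()]


# Made-up synched transcript: each line carries its position in the video.
synced = [
    (12.0, "Please state your name for the record."),
    (47.5, "Did you sign the contract?"),
    (63.0, "Yes, I signed the contract in March."),
]

hits = find_in_transcript(synced, "contract")
print([seconds for seconds, _ in hits])  # [47.5, 63.0]
```

Each hit gives a time in the video, which is the natural starting point for creating a clip of that testimony.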