Those of you who do any sort of preservation reformatting or digitizing know how time-consuming the quality control process can be. Our best practice is to check the completeness and initial quality of the originals, especially if we are sending them to a vendor, and then to check the facsimile or digital version page by page or frame by frame. Over time, as we become more confident in our process or the vendor’s, we may choose to spot check or sample instead, particularly on a large project. Quality control is the step that is often overlooked when planning a project and budgeting staff time. It can seem like such a waste of resources, especially when there are no mistakes to be found.
Well, let me tell you a little story and provide a warning. Like many academic institutions, we sent our dissertations to UMI for microfilming, beginning in the 1930s. We did not receive copies of the microfilm because students were required to submit two paper copies to the library (one for general collections and the other for University Archives). In 2006, we caught up with the times and moved to electronic submission of both MA theses and PhD dissertations through ProQuest’s ETD process. At that time, ProQuest made an offer to members of the Greater Western Library Alliance to digitize older theses and dissertations at a reduced cost so that full-text versions could be accessed through ProQuest’s Dissertations & Theses database. Our administration decided to have all of our dissertations digitized. We sent nearly 2,000 print titles, and ProQuest used an additional 12,000 microfilm titles from their holdings for the project. The majority of the print titles were early dissertations that needed a little attention: graphs, charts, and photographs were re-adhered, pages were mended, and bindings were cut.
Because we did not receive digital copies, we never performed any post-production quality control. We also assumed that since ProQuest was making these available for sale, it would behoove them to be diligent and capture them accurately. Flash forward to the present. Our Digital Repository (DR) was established in 2012, giving us a place to provide open access to dissertations and theses. Administration purchased the digital dissertations from ProQuest, and our Metadata and Cataloging staff are adding them to the DR. Each title page is checked against the record to confirm that the PDF is what it claims to be. Well, so far our diligent Metadata and Cataloging staff have identified 15 ProQuest screw-ups.
Each dissertation usually begins with bibliographic information and a UMI statement indicating that the text was filmed directly from the original and that anything missing or of poor quality was submitted that way by the author, although missing pages would be noted. At first the catalogers were finding minor problems such as no title page, the wrong title page, or missing front matter. Then they started finding parts of other dissertations added in, the wrong dissertation (sometimes from other institutions!), or, it gets better, portions of two different dissertations, neither of which was the correct one, pieced together. So far it appears that all of the mistakes come from microfilm scans of dissertations from the 1970s-90s, and since we do not hold microfilm copies, I cannot determine whether the mistake lies with the microfilm original or the scanning process. (ILL requests for two microfilm copies had not been filled by the time of this post.) The incorrect digital versions we were sent are the same ones that ProQuest has made available.
Preservation is now rescanning the affected dissertations in-house and adding the corrected versions to our open access DR. In the near future, the OCLC MARC records for all ISU theses and dissertations will include the URL of the DR object and no URL for the ProQuest version. Researchers will be able to find complete and accurate representations in our DR for free.
I would suggest that if your institution has worked with ProQuest to convert microfilm versions, you may want to do some checking of your own. Maybe we should ask ProQuest if they would like to purchase correct digital files from us.
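If you want a head start on that checking, the title-page comparison our catalogers do by hand can be roughly triaged with a script. What follows is only a sketch, not what our staff actually use. It assumes you have already extracted the text of each PDF's first page by some other means, and the function names, the dictionary layout, and the 0.85 similarity threshold are all made up for illustration. It flags files whose first page does not appear to contain the title in the catalog record, so a person can look at those first.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def title_on_page(record_title, page_text, threshold=0.85):
    """Return True if the record title appears, at least approximately,
    in the extracted page text. The threshold is an arbitrary guess."""
    title = normalize(record_title)
    page = normalize(page_text)
    if not title or not page:
        return False
    if title in page:
        return True
    # Slide a window the length of the title (in words) across the page
    # and keep the best fuzzy match, to tolerate OCR errors.
    title_words = title.split()
    page_words = page.split()
    n = len(title_words)
    best = 0.0
    for i in range(max(1, len(page_words) - n + 1)):
        window = " ".join(page_words[i:i + n])
        best = max(best, SequenceMatcher(None, title, window).ratio())
    return best >= threshold


def flag_for_review(records, page_texts):
    """records: {id: catalog title}; page_texts: {id: first-page text}.
    Returns the ids whose first page does not seem to match the record."""
    return [
        rid
        for rid, title in records.items()
        if not title_on_page(title, page_texts.get(rid, ""))
    ]
```

A script like this would only catch a wrong or missing title page. It would not notice pages from another dissertation spliced into the middle, so it narrows the pile for a human reviewer rather than replacing the page-by-page check.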
Quality control, quality control, quality control!