Post by ferthalangur (aka Rob) on Mar 26, 2023 13:42:23 GMT -5
As far as the move to Bellefonte ... it's 12 minutes from State College where the APS headquarters used to be.
Wandering way off-topic big-time here, but ...
My CompSci + Archivist experience would respectfully disagree with your CompSci. It just is not as simple or inexpensive to digitize collections as most people believe it is. I have built several digital archives repositories (on a relatively small scale and seriously "on the cheap") and I have managed several digitization projects. It is massively complicated and very expensive.
The cost of digitization per cubic foot, in staff time (regardless of whether they are volunteers or employees, their time has a value) as well as the cost of an I.T. infrastructure to store the images in and make them accessible, is hard even to quantify. Materials must be selected (with copyright researched and cleared), prepped, then scanned. OCR text extraction must be verified, the text needs to be indexed (you do want to find stuff by searching, right?), the pages need to be arranged (you do want the pages from a book to be browsable, right?), metadata has to be added ... and then everything has to be uploaded to an online system that can be accessed by your members / the public via the Internet. You also have to have systems engineers to keep the computers and storage systems running.
I can't give you a dollar number, per page, to digitize and make accessible a collection of documents / books / catalogs. For an organization that has very large projects and plenty of funding, there are economies of scale, whether it's done in-house or outsourced to professionals. The actual numbers do not look like $0.05 to $0.25 per page, which is what I think your typical person would guess is the cost. I think that they are more in the range of $5.00 to $10.00 per page. What does that work out to per cubic foot? At roughly 2,500 pages per cubic foot, about $12,500 to $25,000. That is a total SWAG, I admit. Maybe it's possible to digitize a collection "soup-to-nuts" for $1 per page (I truly doubt it ... I have done projects using incarcerated prisoners and volunteers for the labor and there were still ongoing expenses and it took forever). So if I'm off by an order of magnitude, that's $1,250 to $2,500 per cubic foot. Plus annual maintenance costs for the systems.
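For anyone who wants to plug in their own guesses, here is a rough sketch of that arithmetic in Python. The 2,500 pages per cubic foot figure is only what the numbers above imply ($12,500 divided by $5.00 per page), not a measured value, and the function and variable names are made up for illustration. Prices are kept in whole cents so the results come out exact.

```python
# Pages per cubic foot implied by the figures in the post
# ($12,500 per cubic foot / $5.00 per page). An assumption, not a measurement.
PAGES_PER_CUBIC_FOOT = 2500


def cost_per_cubic_foot(cents_per_page, pages_per_cubic_foot=PAGES_PER_CUBIC_FOOT):
    """Return the digitization cost in dollars for one cubic foot of paper.

    cents_per_page: all-in cost per page, in whole cents.
    """
    return cents_per_page * pages_per_cubic_foot / 100


# Hypothetical scenarios: (low, high) cost per page, in cents.
scenarios = {
    "typical person's guess": (5, 25),
    "SWAG from the post": (500, 1000),
    "SWAG off by an order of magnitude": (50, 100),
}

for label, (low, high) in scenarios.items():
    low_total = cost_per_cubic_foot(low)
    high_total = cost_per_cubic_foot(high)
    print(f"{label}: ${low_total:,.0f} to ${high_total:,.0f} per cubic foot")
```

None of this includes the annual maintenance costs for the systems, which come on top of the one-time figure.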
Space for the growing volumes of paper documents of [potential] enduring value is a difficult problem, faced by everyone from your local neighborhood historical group up to the biggest repositories in the USA: National Archives / Library of Congress / Smithsonian Institution. You can't digitize your way to more space unless you rob a bank or two.
Buying current information resources in a "born digital" format stops the exponential need for more space, but the ongoing cost is also staggering, and it comes with the risk of losing those resources if the publisher goes out of business, etc.
OK, enough about that!