Tuesday, December 29, 2009

Basic Tools for Researchers

I have been on at least six trips to archives outside of my state (i.e., involving a significant investment in money and time) and countless trips to the NARA Seattle branch near where I live. I've built up what I consider a good set of tools to research with. Some may fall under the "duh!" category but some may surprise you. They are ordered loosely on a necessity/nice-to-have scale:

1) Laptop & scanner:

This might be the "Duh!" I mentioned earlier but it may also be a new concept for some, so it is at the top of the list. A researcher can reproduce documents at various archives (speaking of more than just the National Archives), but usually at cost. If one plans several trips where copies are going to be made, it may very well be a cost savings to buy these up front. Primarily though, it just gives you a better foundation to work with and makes it easier to use your research; you can back it up, trade it, send it to a publisher, etc., fairly easily in digital form.

The laptop does not need to be anything fantastic as its main function is a scan engine. Choose something reliable and to taste; perhaps you like a bigger screen or a smaller, more compact design. One thing to keep in mind is USB capacity; my Dell Latitude has four ports and there are times when I've used all of them (scanner, mouse, USB memory, headset). Hard drive space is something to consider; will the laptop be your primary storage area or will you keep copies elsewhere, such as a home computer or external hard drive? My preferred image format is TIFF, and my collection of research is over 150 gigs.

Scanners: multiple flavors with different abilities and costs. My current tool is a Microtek that was a gift from my wife. It is not as portable or rugged as the Canon LIDE 30 I used previously, but the bundled OCR software more than makes up for it with its better recognition of text (To be honest, it's a couple of years newer than the software that was bundled with the Canon).

I'm a big fan of the Canon line as they are compact and don't need a separate power cord. A couple of things to be aware of, however. In my experience the Canons have an extremely narrow focal point and the edge is raised; this means that if you have a photo with any sort of curl or ripple to it you have to take care to firmly press it down or you will have areas that are blurred and out of focus. Since it has a plastic "glass" plate, there were times when I was pressing down on areas too hard and bowing the scanner plate down such that it contacted the scanner arm as it passed. The raised lip mentioned above may be an annoyance when scanning documents larger than the scanner plate to digitally stitch together later.

Along with that, notes are crucially important. You may come across something that you don't have time to scan now but might in the future, and you will definitely want to make it quick to grab. Some people prefer to save things based on the subject, i.e., they will create a folder structure something like "US Navy/Battleships/North Carolina", but I prefer to keep my documents in the original structure I found them in, something like "Seattle NARA/Ship Files/Box 4/BB55 (Folder 1 of 5)." This is purely personal preference, but I find it helps me retain some familiarity with the records structure and provides a bit of a backup in case I lose my spreadsheet (not bloody likely!).

I have the spreadsheet organized with different tabs on the bottom for different archives (e.g., College Park, Seattle, San Bruno, Laguna Niguel) and three columns sized so that I can print them out if I need a hard copy without cutting off any text. I create a row or two where needed that lists the accession (e.g., Ship Files 1940-50, declassification review #12345), and then underneath that the leftmost column is the box number, the second is the folder, and the third and widest is the notes. If I have too many notes for one line I copy the folder down, indent it, add "(Continued)" and then write more.

This Excel spreadsheet is THE MOST important document I have; I keep multiple backups so that even if my house and the office where I keep an off-site backup both burn down and both copies of my scans are lost, I still know where I found stuff and can at least go back and get it. Additionally, things that I've seen but not scanned can be used in trade; perhaps you have contacts with another researcher who wants to research a topic. I have traded information in the past.
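For anyone who would rather script this than build it by hand, here is a rough sketch of the two ideas above: a folder path that mirrors the original records structure, and a three-column index (box, folder, notes) with an accession heading row and "(Continued)" rows for overflow notes. The function names and the note text are made up for illustration, and writing to CSV rather than an Excel workbook is my assumption to keep the example self-contained.

```python
import csv
import io
from pathlib import PurePosixPath


def scan_folder(archive, accession, box, folder):
    """Build a folder path that mirrors the original records structure."""
    return PurePosixPath(archive) / accession / f"Box {box}" / folder


def index_rows(accession, entries):
    """Yield three-column rows: box, folder, notes.

    entries is a list of (box, folder, notes) tuples, where notes is a
    list of strings. Extra notes get their own row with the folder name
    indented and marked "(Continued)". Repeating the box number on
    continued rows is an assumption, not something the post specifies.
    """
    yield [accession, "", ""]  # heading row for the accession
    for box, folder, notes in entries:
        for position, note in enumerate(notes):
            label = folder if position == 0 else f"  {folder} (Continued)"
            yield [box, label, note]


# Example values; the notes are hypothetical placeholders.
rows = list(
    index_rows(
        "Ship Files 1940-50",
        [("4", "BB55 (Folder 1 of 5)", ["Hypothetical note one", "Hypothetical note two"])],
    )
)

buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

print(scan_folder("Seattle NARA", "Ship Files", 4, "BB55 (Folder 1 of 5)"))
print(buffer.getvalue())
```

One CSV file per archive would stand in for the per-archive tabs; any spreadsheet program can open them.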

Well, that went a little longer than expected....

2) Digital Camera:

Depending on what you are after this may make a better tool than a scanner. I have known researchers after bulk documents to use a camera for the speed. If you are only after a deck log and know that you have 1,000 sheets to do, will a scanner that takes 20-30 seconds per scan or a camera that takes a second or two per shot work better? Additionally, in some cases items that are too large for the scanner can be easily photographed.
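To put rough numbers on that question, here is a quick back-of-the-envelope calculation using the figures above (1,000 sheets, 20-30 seconds per scan, one to two seconds per photo). It only counts capture time; page handling, focusing, and file naming are not included, so treat it as a lower bound rather than a real estimate.

```python
def total_minutes(sheets, seconds_per_sheet):
    """Capture time in minutes, ignoring page handling and setup."""
    return sheets * seconds_per_sheet / 60


SHEETS = 1000

# (label, fastest seconds per sheet, slowest seconds per sheet)
for label, low, high in [("scanner", 20, 30), ("camera", 1, 2)]:
    fastest = total_minutes(SHEETS, low)
    slowest = total_minutes(SHEETS, high)
    print(f"{label}: {fastest:.0f} to {slowest:.0f} minutes")

# scanner: 333 to 500 minutes
# camera: 17 to 33 minutes
```

In other words, the scanner is most of a day or more on that one deck log, while the camera is about half an hour.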

3) USB Memory Drive or external hard drive:

Once you have the data, how do you protect it? If you spend $500 and a week on a research trip, what is the value of what you find, and what does it "cost" if you lose it? I typically travel with my laptop and a 16 gig USB memory stick. Once the day's scanning is over I COPY the scans onto the USB drive so that I have a backup right then and there. Once home, copies go onto the home workstation and an external hard drive that mostly stays off-site. Copies of my aforementioned spreadsheet are saved to all four, plus a copy on a server that is out of state. The 16 gig memory stick was a cheap investment in 2009 at $40.
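If you want the end-of-day copy to be a little more than drag and drop, something like the following sketch would do it. It copies (never moves) every file from the scan folder to the backup drive and compares checksums afterward. This is not the tool I use, just one way it could be scripted; the function names are made up and the paths in the usage comment are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_scans(source_dir, backup_dir):
    """Copy every file under source_dir into backup_dir.

    The folder structure is preserved and the originals are left in
    place. Returns a list of source files whose copy did not match.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    mismatched = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        dest = backup_dir / src.relative_to(source_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also keeps file timestamps
        if sha256_of(src) != sha256_of(dest):
            mismatched.append(src)
    return mismatched


# Hypothetical usage at the end of a research day:
# problems = backup_scans("C:/Scans/Seattle NARA", "E:/Scans/Seattle NARA")
# if problems:
#     print("Recopy these:", problems)
```

An empty list back means every copy matched its original.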

4) Mental Diversion:

With the extended hours some archives have, it is possible to spend up to 60 hours in 5 days researching. If you are doing research that requires mental acuity I believe it pays to bring along something to keep your mind as fresh as possible. Since rubber chickens are disallowed, it's usually easiest to set up something on the computer, be it some music (using headphones of course) or something like solitaire. Don't be afraid to get up and walk around too; a five minute break will work wonders.

Friday, December 18, 2009

Mind Eraser, No Chaser

Proofreading while sick and foggy is always an interesting experience.... I'm on page 8 of the Passive Defense, about 24 of 98 paragraphs (they're numbered; I didn't count them). A little behind where I'd like to be, but I'm taking my wife and her puppy down to their first dog show tomorrow and am anticipating a little time then, if I can keep my mind focused. We'll be hitting the Centralia Veterans Memorial Museum on the way back, which I've driven past for a couple of years but have never been able to visit. I've been mulling over adding a museum section to the site as I have hit up quite a few and taken pictures.

Sunday, December 13, 2009

This month's haul

Not a lot of progress over the last couple of days with the Passive Defense proofreading, but yesterday was NARA Saturday so I do have some news there. One of the earlier finds was this memo from King Neptune that I was able to OCR and work up while waiting for other scans to run. I have a 35-page document on refueling instructions from 1942 that will hopefully be helpful to the modelers out there. Nothing else earth-shattering, but there will be some additions to the Miscellaneous section in the future.

One of the other researchers was having scanner issues. Not fun when you are on a limited-time research trip.

Sunday, December 6, 2009

Passive Defense Progress

I have mentioned Passive Defense before, and I've got a progress report for those interested. There are three parts to the Passive Defense Camouflage folder in Seattle NARA: the initial handbook, a supplement regarding camouflage paint, and a two-page memo with color chips in camouflage colors for fuel tanks. The first one has all of the text and figures done and just needs the color chips added and 50 pages of proofreading. The supplement has the textual pages done, but not all of the figures, and I haven't started the fuel tank memo at all. Still on track for a January release.

All of these will be linkable, so if you want to link to the section on the "Importance of Indirect Observation" for an online discussion you'll be able to.

In other camouflage news, I should be taking delivery of a large collection of US Navy camouflage documentation from Ron Smith later this year, which will hopefully expand the camouflage section quite a bit.

Saturday, December 5, 2009


I've had Google Analytics on my site for a bit now, and it's interesting to watch the results. For the last month, and fairly consistently before that, the top "page" on the site is the root, or main, front page. The Ship's index is second, and San Francisco's Damage report is third.

I had 1,703 unique visits last month, and of those one of the largest groups (204) had no source (i.e., they came from bookmarks or favorites). 232 came from Google searches, and 222 came from Navsource links. The two big ship modeling sites, ModelWarships and SteelNavy, sent me 154 and 103 respectively, 98 came from Google Image searches, and 97 from Wikipedia links. 86 came from Yahoo and 39 from Bing. Stats read that 15% came from direct sources, 23% from search engines, and 62% from links from other sites.

Almost 1,200 of the visitors were from the US, with Spain, the United Kingdom, Canada, and Germany rounding out the top 5. Within the United States, the top five states were California, Washington, Virginia, New York, and Texas. Lowest was New Mexico, with two.

The average visitor viewed 4 pages per visit and spent just over three minutes on my site. 71% of the visitors were new and had not visited before.

57% of the visitors were using Internet Explorer, with 44% of them being on version 8, 37% on version 7, and the pitiful rest on IE6 (for shame!). Firefox was the second largest at 30% with Safari in third place at 5% and Chrome at 2%. 87% of my visitors are running Windows, about 7% are on Macs, and the balance are on an assortment of *nix and smart phone OS's.

I size most of my pages no more than 800 pixels wide; 3% are running 800 x 600 and the top five are 1024 x 768 (33%), 1280 x 1024 (14%), 1280 x 800 (11%), 1400 x 900 (9%) and 1680 x 1050 (7%).

About 71% of my visitors are on broadband, 3% are on dial up, and 23% are "unknown." I spend a lot of time making my code as clean and small as possible for the dial up people; I hope y'all appreciate it!

Of the broadband, 9% are on Comcast (that statistic may be skewed as I am on Comcast and hit the site regularly to proofread, etc.), 7% are on Roadrunner, 5% are on Verizon's network, and 2.5% are on Bellsouth. The rest is incredibly fragmented.

It may be boring as hell but I find it interesting....

Updates are up and linked to, if you could make it through the above post ;)