Friday, December 10, 2010

Summing it up

This semester has been a whirlwind tour of various open source Content Management Systems and similar software packages that can be used as digital repositories. It's obvious that there is no obvious choice when it comes to selecting the best package for a repository - and even more problematic at large institutions where there are so many differing communities and needs.

My takeaway this semester is that there will be lots more to learn and that collaboration will be one of the keys to success in the digital information management field. I think it is a promising area and I'm really grateful to have started down this path.

I want to keep working on some form of digital repository, just to keep my irons in the fire. I think I'm going to try installing Omeka on a server and see how far I can get with it.

Sunday, November 28, 2010

Of Virtual Machines and Real Learning Experiences

I've been working with Virtual Machines a lot during the past semester and I think I've gained a lot of valuable experience in the process of repeatedly building and configuring an Ubuntu / Linux server. I'm not sure how many more times I really would want to do that (I'm up to five or six now, I think), but I certainly think that, for this kind of class, it is valuable to learn about some of the nuts and bolts of working with these kinds of systems. I'd even like to learn how to take the platforms online, but realize that time may prohibit that opportunity.


The possibility of a preconfigured VM might offer more time to work on other things, such as collection development, and that would be a good thing. I doubt that I'm going to have to serve as a system administrator anytime soon, and perhaps my time would be better spent on things that relate more to the collection and less on technical stuff. Still, I'm a bit of a computer geek and I really enjoyed learning the command line stuff, and I'm sure there will be times when that experience will give me the confidence and knowledge to boldly go where I might never have gone before.

Friday, November 26, 2010

Impressions of sites for digital repositories and related platforms

The Omeka home site is very nicely designed and implemented. It features both video and text instructions as well as easy access to forums and other support areas. I found it inviting and easy to use.

EPrints presents me with an overwhelming amount of information on the home page and as I go deeper in any given direction, I have the sense that I'm getting lost and that all these layers are just glommed onto each other. It is a much more confusing site than Omeka, perhaps reflective of the age and complexity of the EPrints platform itself. It's pretty utilitarian in a way, but not in a really good way, to my taste. It does give me a sense that this is a very BIG platform with a lot going on, which is a good thing. I think EPrints is due for a major overhaul of its website.

DSpace.org has a much more spacious and inviting atmosphere. The left column menu bar gives an early indication of the DSpace platform interface and the News and Upcoming Events column at right gives a sense of an active user community. The placement of the New User features front and center engenders a sense of welcome for prospective newcomers. The weakest point in the DSpace platform seems to be support. The User Forum has been down for weeks at least and the DSpace wiki has been in the process of cleanup after migration for as long. I personally feel lost in the DSpace support area and this single fact may mean the difference between using DSpace for my repository and not using it.

Drupal shows its strong points right from the home page. The design is clean, but contains lots of information. It offers good reasons to use Drupal with all the Modules, Themes, Active Developers, etc. etc. It also welcomes new users with a get started button and gives a feel for the active community of Drupal users with the various News, Updates, Forum Posts and Commits (sp?). The slogan is very persuasive: "Drupal: Come for the software, stay for the community" - wow. Indeed, there is a huge Drupal Community and it is easy to access and very active. Hmmm, maybe I need to reconsider Drupal as an option, now that I think about it!

JHOVE is pretty bare-bones and technical-looking, which is probably a valid reflection of JHOVE in general. You have to be a geek to hang with JHOVE. PKP Harvester has a much friendlier design, interface and implementation. There are a couple of support options, through manuals or discussion forums, and although the information is dense, the organization of the site really helps me navigate through so much information.

I think the various aspects of the home site for each of these platforms or applications give me important information about whether or not it might be the right platform for my digital collection. The key areas are support, support, support and community. After that, I would look at the functionality of the site, including design, navigation and indications of vibrant ongoing deployment and development.

Sunday, November 14, 2010

Exploring Open Archives and Service Providers for OAI

This week, after installing the PKP Open Archives Harvester application (in the same VM as my EPrints repository), I explored various service providers using links from The University of Illinois OAI-PMH Data Provider Registry, with the following impressions:


colLib: which then linked to http://libriotech.no/ which is a Norwegian company that provides installation and management for library software systems.


digitAlexandra: which took me to a dead page at http://www.digitalexandria.com/


DL-Harvest: which took me to a page with a CSRF security error and Error: Document Not Found notice, so I tried another link from the page for DList which linked to http://dlist.sir.arizona.edu/arizona/ and was able to access the University of Arizona Campus Repository. From there, since I was familiar with the DSpace format, I recognized that I could browse by Communities and collections and was able to make sense of what kinds of resources were there and how many. It is apparent that DList is the deepest area of the archive with 1477 resources. There was also a smattering of other data from the Tree Ring Lab, School of Music (Vern Yocum Collection, which actually got me hooked for a while) and Arizona Anthropologist.


Sydney College of the Arts Archive: which initially took me to http://gita.grainger.uiuc.edu/registry/details.asp?id=3537&sets=all#ListSets, where I could then find the link to the actual repository at http://va.library.usyd.edu.au/oai. It's not a very big repository, even smaller than DList, so again, the Browse feature came in handy. Now I'm looking for something larger scale that allows me to really search for something.


OAIster: taking my cue from the assignment, I figured I would take a look at this monster of library indexes at http://oaister.worldcat.org/. A search for Edward Weston gave a result of 2853 resources available, and took a whopping 19.51 seconds to deliver. Obviously, this kind of search would bring up a lot of records, but my attempts to utilize the OAIster faceted search to narrow the result were fruitless. My first attempt came up with an error message and the second attempt gave me a dialog window with a green spinning circle of death. I went back to the home page and tried again and got a message that the database I was trying to search was not responding. Try again. So, I tried the advanced search and didn't have much better luck.


My takeaway from the OAI section is that I have a better understanding of how resources are shared between institutions, especially libraries. I also get the feeling that it's not a smooth process and needs time and resources in order to make searching various collections easier and more reliable for the end user. The importance of metadata becomes more and more clear as I see the various pieces of the digital library/repository come together. There is much to do!
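
To make the sharing mechanism a little more concrete: an OAI-PMH harvester asks a repository for records with plain HTTP requests, and the repository answers in XML. The sketch below is only an illustration in Python, not how PKP Harvester is actually implemented. The base URL is the Sydney College of the Arts endpoint mentioned above, the helper names build_request and extract_titles are made up for this example, and the sample response is invented so that nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by Dublin Core elements inside oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def build_request(base_url, verb, **params):
    """Return an OAI-PMH request URL (hypothetical helper name)."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def extract_titles(xml_text):
    """Collect every Dublin Core title found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(DC_NS + "title")]


# Invented, heavily trimmed response; a real repository returns more fields.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The URL a harvester would request to list records as Dublin Core.
    print(build_request("http://va.library.usyd.edu.au/oai",
                        "ListRecords", metadataPrefix="oai_dc"))
    print(extract_titles(SAMPLE_RESPONSE))
```

The point of the example is how much rides on the metadata: the harvester only ever sees fields like title and creator, so whatever is missing or inconsistent there is missing from the search results too.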


Tuesday, November 9, 2010

I've been spending a LOT of time with EPrints this past week. More than I expected, even after I had my items entered in the database. This is because, after all that work, my host machine crashed (actually, it was being very slow and I got impatient and initiated a hard reboot) and the VM with EPrints on it got corrupted. Luckily, I had a snapshot that was not too old, but I still had to reconfigure EPrints, redo the subjects file (good practice with my Linux CLI, I suppose), and then re-enter my items. It went faster the second time through, and I did a better job, I think. Experience does make a difference, after all!

The other thing that really makes a difference is that EPrints allows me to make templates from records. Once I have a certain type of item, say image or video, most of the settings will stay the same. I can make a template and just change the link to the item, upload it, and then update the title, abstract and keywords. That usually takes care of it. I like that about EPrints!

One problem with EPrints, though, is that the results are returned alphabetically by author. I would prefer that the results be returned by subject, but since EPrints is really sort of library-centric, the results by author will have to do. I think this could be a real problem if I had a lot of records, but since I don't, it really doesn't slow down my searches.

I am still looking for the "killer app" that is really made for images and video, though. I see Omeka looming out there, but I have to do some harvesting first, I guess.

Wednesday, November 3, 2010

Getting started with EPrints - not the easiest install, but good CLI practice

This week has been spent installing and configuring EPrints. I found the EPrints installation to be slightly trickier than DSpace or Drupal. There were some things I had to wrap my head around in terms of setting up EPrints with the LAMP stack and SSH. One thing that tripped me up was that I had already set up the MySQL server with a username and password and needed to relay that through EPrints. Once I figured out what was happening there, things got easier. It does remind me that I need to be very careful about keeping track of passwords and I've been generally going along with what the installation instructions from Bruce recommend. It just makes it easier. The other thing about having problems is that I "get" to repeat the procedures multiple times, which actually helps me learn them a lot better. I genuinely try to make sure that when I'm typing in commands, I understand what I'm doing. My CLI skills are getting better, but I sure do get tired of typing in long strings of folders to get back and forth in Linux. I'm assuming that if I got better, I would find workarounds for this such as creating symbolic links for frequently used directories.
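
For what it's worth, the symbolic link idea can be sketched in a few lines. This is a minimal illustration in Python rather than the shell, and every path in it is hypothetical: it builds a deeply nested folder inside a temporary sandbox (standing in for a long configuration path) and then creates a short link that points to it. The helper name make_shortcut is made up for the example.

```python
import os
import tempfile


def make_shortcut(target_dir, link_path):
    """Create (or replace) a symbolic link at link_path pointing to target_dir."""
    if os.path.islink(link_path):
        os.remove(link_path)  # drop a stale link before recreating it
    os.symlink(target_dir, link_path)
    return link_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as sandbox:
        # Hypothetical long path standing in for a buried config folder.
        deep = os.path.join(sandbox, "opt", "repo", "archives", "demo", "cfg")
        os.makedirs(deep)
        with open(os.path.join(deep, "subjects"), "w") as handle:
            handle.write("placeholder\n")

        # One short name now reaches the same folder.
        link = make_shortcut(deep, os.path.join(sandbox, "cfg"))
        print(os.listdir(link))
```

On a real server the same thing is usually done once from the command line with ln -s, after which the short name can be used in place of the long path.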


I customized or "branded" the EPrints front page with the name of my collection and a logo. This all had to be done from the CLI, moving files into my user home folder in Linux and then copying them over to EPrints. I found this a little time-consuming and wished that there were more administrative features in EPrints that would allow better customization of it. The same goes for changing the taxonomy in EPrints. One has to go into the CLI and move things back and forth (or work in nano) in order to customize the taxonomy, which defaults to LCSH. I guess they assume that most people won't need a custom taxonomy, but I wonder about that. As more and more types and varieties of collections get digitized, it seems likely that custom taxonomies, folksonomies and tagging will be expected by users.


At any rate, I am looking forward to adding my collection to EPrints and seeing it in action. I've been impressed with some of the photo and video collections that I've seen in EPrints, which is a bit unexpected, but a pleasant surprise to be sure.

Friday, October 29, 2010

Deeper into DSpace

This past week I've been working more with DSpace and getting deeper into configuring thumbnails and previews. Whereas Drupal had modules such as ImageMagick, DSpace has the native ability to display thumbnails and previews; it just needs to be turned on in the configs. In the process of trying to accomplish this (I never did get thumbs and previews to work in DSpace), I spent a considerable amount of time trying to find support forums or other communities such as those available in Drupal, but with no luck. DSpace apparently has either very little community support or everyone who uses it is way smarter than I am and they just figure it out on their own. I am pretty sure I could eventually figure this out on my own, but what's the point, really? Why shouldn't we take advantage of the human impulse to share problems and solutions and thereby build relationships and support communities so that, someday, when someone has a problem that I know the answer to, I can reciprocate and tacitly ask them to "pay it forward"?


One positive thing about having these "problems" with DSpace is that I am learning a lot more about how the application is structured and where various folders and files are located. I am also coming to love the Virtual Machine's ability to take snapshots at various stages of implementation. If I'm not sure about something that I'm doing, I readily take a quick snapshot and make notes about where I am in the installation process and what's next, so that I can backtrack to a spot before problems start showing up and re-install or configure as needed to problem solve until I understand what needs to be done to get things working properly.


I've enjoyed working with DSpace more than working with Drupal. I appreciate both programs, but the fundamental approach of DSpace appeals to me as I contemplate ways to preserve multiple files from the same master file. That and the ability to run checksums on the files to check for possible corruption seem like really critical functions in a digital archive.
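
As a rough illustration of why checksums matter, here is a minimal Python sketch of the general idea. It is not DSpace's own checksum checker: the choice of SHA-256, the function names, and the manifest format (a plain dictionary of path to expected checksum) are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large masters do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(manifest):
    """Return the paths whose current checksum differs from the recorded one."""
    return [path for path, expected in manifest.items()
            if file_checksum(path) != expected]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as sandbox:
        master = os.path.join(sandbox, "master.tif")  # hypothetical file name
        with open(master, "wb") as handle:
            handle.write(b"pretend these are image bytes")

        # Record the checksum at ingest time.
        manifest = {master: file_checksum(master)}
        print(find_corrupted(manifest))   # nothing has changed yet

        # Simulate silent corruption by appending a stray byte.
        with open(master, "ab") as handle:
            handle.write(b"\x00")
        print(find_corrupted(manifest))   # the damaged file is reported
```

The same check run on a schedule is what lets an archive notice damage while a good copy still exists somewhere to restore from.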