Accepted Presentations for the DSpace User Group Sessions

Session 1: Governance and Architecture

  • DSpace Status Update
    MacKenzie Smith

    A review of and update on the current status of the DSpace open source software community governance plan, including a separate 501(c)(3) non-profit corporation, and a series of activities that the organization may undertake to help the community continue to use the DSpace platform more easily. This plan is the result of a series of recommendations by an ad-hoc DSpace advisory group that met in 2006 to discuss the long-term governance and sustainability of the project. Information about this is available on the project wiki at http://wiki.dspace.org/index.php/DspaceGovernance.

  • DSpace Architecture Update
    John Ockerbloom

    A review of the recently approved new technical architecture for DSpace that will improve key aspects of the system like its add-on mechanism and customizable workflows, as well as an improved data model with better support for versioning and FRBR-like structure. Detailed information from this group's deliberations is available on the project wiki at http://wiki.dspace.org/index.php/ArchReview.

  • Introducing Manakin: Overview and Architecture
    Scott Phillips, Cody Green, Alexey Maslov, Adam Mikeal, and John Leggett

    Manakin is the second release of the DSpace XML UI project. Manakin introduces a modular interface layer, enabling an institution to easily customize DSpace according to the specific needs of a particular repository, community, or collection. Manakin's modular architecture enables developers to add new features to the system without affecting existing functionality. First the project's goals will be introduced, followed by a discussion of Manakin's relationship with DSpace. Finally, an architectural overview of the primary components will be given.

Session 2

  • Using DSpace for Digitized Collections
    Lisa Spiro, Marie Wise, Sidney Byrd and Geneva Henry

    As organizations adopt institutional repositories (IR) to store and make accessible scholarly materials, they are finding new and expanded uses for these powerful tools. Institutional repositories can archive not only "born-digital" assets such as pre-prints and dissertations, but also digitized materials such as books, photographs, and recordings. Such primary source materials serve as building blocks for research, particularly in the humanities and social sciences. Although DSpace, one of the leading IR systems, was originally designed for born-digital resources, Rice University has adopted it as a platform for digitized materials as well. Using a single IR for different kinds of scholarly assets provides unified access to diverse materials and can be more efficient than running multiple systems. Making DSpace work for complex collections of digitized materials can require developing new tools and processes. Rice is using DSpace for several digitization projects, including the Travelers in the Middle East Archive (TIMEA), a collection of XML-encoded texts, images, and maps focused on Western interactions with the Middle East; the Shepherd School Archive of digital audio of performances at the music school; and the Rice Institute Pamphlets Archive, PDFs and XML-encoded text of a significant academic journal. Each of these projects poses unique challenges. This presentation will include a discussion of how Rice has confronted these challenges in employing DSpace for digitized assets.

  • Digital Repository Projects at the North Carolina State University Libraries
    James Jackson Sanborn and Jim Tuttle

    The North Carolina State University Libraries has undertaken a number of projects related to the development of digital repositories. In the creation of our open repository of scholarly content, we have taken an uncommon approach. Rather than develop or deploy a repository system and solicit contributions from campus researchers and organizations, the NCSU Libraries has worked to leverage existing collections and to identify other untapped resources already rich in content. This approach will result in three major collections: a collection of published faculty papers, which draws upon a pre-existing database of over 20,000 citations of papers published by NCSU-affiliated authors; a collection of NCSU-promulgated technical reports from departments and institutes across campus, which was created through harvesting content and automated ingest; and a collection of NCSU electronic theses and dissertations (ETDs) submitted through legacy ETD software. DSpace is also functioning as the repository for geospatial data acquired through the NCSU Libraries NDIIPP (National Digital Information Infrastructure Preservation Program) investigation. The NDIIPP DSpace instance is a controlled-access archive selected to investigate repository-agnostic pre-ingest workflows on highly complex content. This session will share the high-level architecture of our system, focusing on the ways we have worked to integrate DSpace with other systems and processes. Management, ingest and access tools created for these projects will be presented. We will also discuss the decisions made regarding planning, policy and implementation, as well as future goals for these and other digital repository projects.

Session 3

  • Introducing New Services with DSpace
    Julie Griffin, Kent Woynowski and Susan Wells Parham

    The Georgia Tech (GT) Library and Information Center established SMARTech (http://smartech.gatech.edu/), our DSpace Institutional Repository (IR), in August 2004. We envisioned an open access (OA) system of user-submitted scholarly faculty output. It soon became clear that few faculty members would submit their own work. Submitting to SMARTech was also not the highest priority for their graduate students or administrative assistants. To obtain initial content for the system, we began batch loading technical and research reports from departmental and lab web sites. We enriched the descriptive metadata by including our own subject headings and keywords. We began to think of SMARTech as less of a product, and more of a service. This service-oriented focus broadened the collecting scope of our IR, and expanded our use of DSpace as a tool for providing publishing and preservation services to the GT community. Our first service was to submit faculty research ourselves, supplying item-level metadata and reviewing copyright. We decided that adding more publishing services would make supplying content for SMARTech easier for faculty. We also decided it would be mutually beneficial to expand our use of DSpace to include these new conference and journal publishing services (http://epage.gatech.edu/) because faculty include publications, conference participation, and editorial positions in tenure and promotion packages. The new services would offer faculty a low-cost model for creating and maintaining conference web sites and OA journals, allowing them more time to focus on content rather than system support. These expanded services will reinforce the position of SMARTech as a valuable service to the GT community.

  • SPECTRa - Federated Data Reposition Using DSpace
    Jim Downing and Alan Tonge

    The SPECTRa (Submission, Preservation and Exposure of Chemistry Teaching and Research Data) project is a JISC funded collaboration between the university libraries and chemistry departments at the University of Cambridge and Imperial College, London. The project addresses the provision of open access to primary research data ("Open Data") in experimental chemistry through Institutional Repositories. This presentation will describe the project and its outputs, go in depth into the technical interactions with DSpace and investigate how SPECTRa could inform federation interactions between Institutional Repositories and institutional science research.

Session 4

  • DM-DSpace and PF-DSpace: Standards-based Peer-to-Peer DSpace Federation and Federating DSpace-Based Digital Museums in China
    Wei Liu, Xukun Shen, Yue Qi, Yuhong Xiong, Baoyao Zhou, James Rutherford, Xiaoyu Li, Weihua Huang, Shu Wang, Bailiang Chen, and John Erickson

    The China Digital Museum Project (CDMP) is an ongoing collaboration involving the Chinese Ministry of Education, HP Labs and several Chinese universities. The goal of CDMP has been to enable a federation of universities to provide a large-scale infrastructure based on DSpace to store, manage, preserve and disseminate the digitised versions of university museum artefacts. In the final phase of CDMP, there will be more than 100 university museums with digital artefacts stored in federated DSpace installations. The federation architecture of both DM-DSpace and PF-DSpace consists of a number of repository nodes exposing an OAI-PMH data provider interface to enable authorised repositories to harvest METS Dissemination Information Packages (DIPs). These DIPs are the currency that enables a harvesting repository to replicate the underlying digital object. We are also able to distribute OAI-PMH "friends" lists within a pool of nodes: given the OAI-PMH baseURLs for a set of repositories, a PF-DSpace instance asks each repository what repositories it knows of and adds those to the set, recursing through the total set. Our next steps will be to explore possible vocabularies for performing selective harvests of remote collections and to relate this to the METS ingest/dissemination mechanism that already exists in DSpace; to implement and verify a joint protocol for admitting new peers into federations; and to develop a set of "strawman" administrative best practices for distributed, peer-based repository federations. We also plan to explore the integration of PF-DSpace into the research environment ("LabSpace") by peering it with other elements of the research infrastructure, including departmental wikis, existing technical report repositories and the like.
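
    The recursive "friends" discovery described in this abstract can be illustrated with a minimal sketch. It is not PF-DSpace code: the function names, the example baseURLs, and the in-memory FRIENDS table (which stands in for asking each repository over OAI-PMH) are all assumptions made for illustration.

```python
from collections import deque


def discover_peers(seed_urls, get_friends):
    """Return every baseURL reachable from the seeds via friends lists.

    get_friends is a caller-supplied function (hypothetical here) that
    returns the baseURLs a given repository says it knows of.
    """
    known = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        url = queue.popleft()
        for friend in get_friends(url):
            # Only repositories not seen before are added and asked in turn,
            # so cycles between mutual friends terminate.
            if friend not in known:
                known.add(friend)
                queue.append(friend)
    return known


# Hypothetical federation: each repository lists the peers it knows of.
FRIENDS = {
    "http://a.example.edu/oai": ["http://b.example.edu/oai"],
    "http://b.example.edu/oai": ["http://a.example.edu/oai",
                                 "http://c.example.edu/oai"],
    "http://c.example.edu/oai": [],
}

peers = discover_peers(["http://a.example.edu/oai"],
                       lambda url: FRIENDS.get(url, []))
```

    Starting from a single seed, the sketch ends with all three example repositories in the set, because the second repository's list introduces the third.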

  • Configurable Submission System for DSpace
    Tim Donohue

    The DSpace Submission User Interface is somewhat limited in its abilities to be configured for locally developed policies and procedures. DSpace does allow for custom metadata schemas and metadata gathering interfaces, but there is little configurability beyond that. UIUC has developed what we call the Configurable Submission System, which modularizes the DSpace submission process into a series of "steps". Each "step" generally represents a single submission "module", in charge of gathering specific information important to constitute a single DSpace submission package. In this session, the presenter will discuss the benefits and usage of the Configurable Submission System for DSpace. He will provide high-level details of how to rearrange, remove and create new steps within the normal DSpace submission process, as well as how you can customize the submission process at the collection level. Finally, the presenter will discuss some of the upcoming ideas and plans that IDEALS has for providing more automation to the submission process by implementing custom non-interactive steps.
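
    As a rough illustration of the idea of a submission process built from reorderable "steps" with per-collection configuration, consider the sketch below. It is not the Configurable Submission System itself: the step names, the STEP_ORDER table, the "theses" collection key, and the dictionary-based submission are hypothetical.

```python
import hashlib


def describe(sub):
    # Gather descriptive metadata for the submission.
    sub["metadata"] = {"title": sub["title"]}


def upload(sub):
    # Attach the uploaded files to the submission package.
    sub["files"] = dict(sub["uploads"])


def checksum(sub):
    # A non-interactive step: compute a SHA-256 digest for each file.
    sub["checksums"] = {name: hashlib.sha256(data).hexdigest()
                        for name, data in sub["uploads"].items()}


def license_step(sub):
    # Record that the depositor accepted the distribution license.
    sub["license_accepted"] = True


# Registry of available steps (names are assumptions for this sketch).
STEPS = {"describe": describe, "upload": upload,
         "checksum": checksum, "license": license_step}

# Per-collection step order; collections not listed use "default".
STEP_ORDER = {
    "default": ["describe", "upload", "license"],
    "theses": ["describe", "upload", "checksum", "license"],
}


def run_submission(collection, sub):
    """Run the configured steps in order and return the names executed."""
    completed = []
    for name in STEP_ORDER.get(collection, STEP_ORDER["default"]):
        STEPS[name](sub)
        completed.append(name)
    return completed
```

    Rearranging or removing a step then means editing a list in STEP_ORDER, and adding one means registering a new function in STEPS.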

Session 5

  • If we build it, will they come?
    Philip Davis and Matthew Connolly

    Much of the work on institutional repositories has focused on their rationale, design, or implementation. While institutions have devoted significant resources to implementing IRs, there has been a scarcity of work on evaluating them. If IRs are to achieve the vision of "a universal service for author self-archived scholarly literature" (Ginsparg, Luce, & Van de Sompel, 1999), strong contributions from faculty are absolutely necessary. This presentation will present a multi-faceted approach to evaluating the success of one institution's implementation of DSpace in terms of faculty participation. First, we provide an empirical analysis of participation in and growth of the Cornell University DSpace, using item submissions and downloads as primary metrics. We identify three typical patterns of community growth and investigate the properties of the most highly-downloaded objects. Second, we provide a comparative analysis of data harvested from other institutional DSpace sites to compare patterns of growth and models of organization. For example, is it more effective to organize DSpace as a small number of general communities, or a large number of specific communities? Lastly, we report on a series of detailed interviews conducted with Cornell faculty across disciplines to better understand how faculty disseminate the findings of their research. We consider attitudes, motivations and rationale for behaviors such as sharing preprints with colleagues, depositing preprints in disciplinary archives, and posting published articles on personal websites. How scholars communicate is largely determined by the reward structures of their discipline. Based on these structures, we suggest why participation in digital repositories has become culturally ingrained in some disciplines and largely ignored in others.

  • A DSpace-based Preservation Repository Design
    Joseph Pawletko and Ekaterina Pechekhonova

    At NYU's Digital Library we are building a Digital Preservation Repository (PR) that uses DSpace as a core component. During the system design phase we were faced with the question "Should we build a monolithic application that does everything, or distribute the preservation functionality over a collection of components?" We decided upon the latter approach. In this talk we will discuss why we chose the component approach; the DSpace features and add-ons that enabled us to use DSpace as a component; the role DSpace plays in the overall PR architecture; our strategy for dealing with large files (> 4GB); other components and implementation technologies used in the PR (Java, Ruby, SRW/U, XML-RPC, Shibboleth, the Handle system, METS, MODS, LC-AV, SRB, and others); the current system development status; and future plans.

  • DSpace as a Platform: Creating Custom Interfaces with Content Packaging Plugins
    Don Gourley and Larry Stone

    The latest major release of DSpace, 1.4, introduced a simple plugin mechanism that provides a powerful means to extend and customize DSpace. This presentation includes two parts: First, I describe the new extension to the DSpace architecture and some of the plugins included in DSpace 1.4. Then, as an example of usage, I recount how content packaging plugins are used at the Washington Research Library Consortium (WRLC) to integrate DSpace with local tools that create and present digital objects. This case study illustrates how plugins can be used to define a custom network interface of simple HTTP GET and POST requests to access DSpace resources and services, opening DSpace up as a flexible and customizable repository platform.
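
    The general pattern of named packaging plugins behind simple GET and POST style requests can be sketched as follows. This is an illustration only and does not reproduce the DSpace 1.4 plugin API or the WRLC interface: the registry, the class and function names, the "json" and "text" package types, and the in-memory store are all assumptions.

```python
import json

# Registry mapping a package type name to a plugin class (hypothetical).
PACKAGERS = {}


def packager(name):
    """Class decorator that registers a packaging plugin under a name."""
    def register(cls):
        PACKAGERS[name] = cls
        return cls
    return register


@packager("json")
class JsonPackager:
    def disseminate(self, item):
        return json.dumps(item, sort_keys=True)

    def ingest(self, payload):
        return json.loads(payload)


@packager("text")
class KeyValuePackager:
    def disseminate(self, item):
        return "\n".join(f"{k}={v}" for k, v in sorted(item.items()))

    def ingest(self, payload):
        return dict(line.split("=", 1)
                    for line in payload.splitlines() if line)


def handle_get(store, handle, package_type):
    """GET-style request: disseminate an item in the requested format."""
    return PACKAGERS[package_type]().disseminate(store[handle])


def handle_post(store, handle, package_type, payload):
    """POST-style request: ingest a package and store the resulting item."""
    store[handle] = PACKAGERS[package_type]().ingest(payload)
```

    Because the request handlers look plugins up by name, supporting a new package format means registering one more class; the handlers themselves do not change.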

Session 6: Manakin Themes and Applications

  • Manakin Themes: customizing the look-and-feel of DSpace
    Alexey Maslov, Cody Green, Adam Mikeal, Scott Phillips, and John Leggett

    A cursory examination of the more than 150 registered DSpace instances reveals a striking degree of conformity in style. Although there are many reasons for institutions to customize the look and feel of their repository -- such as institutional branding or imparting context -- the current JSP paradigm makes this a tedious task. Furthermore, some customization tasks, such as applying a different style to a specific collection or community, are currently not supported. Manakin enables these customizations to be easily applied to communities, collections or the entire repository. This portion of the presentation demystifies the process of creating a Manakin theme and adapting it to the unique needs of your institutional repository. The presentation will be structured into four main sections: theme components, basic theme development techniques, complex theme development techniques, and an overview of advanced topics.

  • Manakin Case Study: visualizing geospatial metadata & complex items
    Adam Mikeal, Cody Green, Alexey Maslov, Scott Phillips, Kathy Weimer and John Leggett

    Increasingly, repositories are responsible for preserving complex items and items with specialized metadata, such as geospatial metadata. These collections present unique challenges for the repository interface, and traditional approaches often fail to provide adequate visualization mechanisms. This portion of the Manakin presentation is a case study of a particular collection that exhibits a Manakin solution to both of these challenges. The Geologic Atlas of the United States is a series of 227 folios published by the USGS between 1894 and 1945. Each folio consists of 10 to 40 pages of mixed content, including maps, text, and photographs, with an emphasis on the natural features and economic geology of the coverage area.

  • Content Interchange and the Invisible Repository
    Scott Yeadon

    The Australian National University (ANU) will be undertaking development work for the Australian Partnership for Sustainable Repositories (APSR) in 2007. Much of this work will be focused around repository interoperability and the integration of a repository service within the university's application infrastructure. This presentation will discuss and demonstrate some of the prototype DSpace-related development work undertaken so far and planned for further development in 2007. Specifically: a METS SIP/DIP profile intended to be used as a national standard for the meaningful exchange of digital objects between repositories; separation of concerns at a functional level so an institution can select best-of-breed software, with an example using Open Journal Systems (OJS) to manage publication workflow, DSpace to manage preservation and Manakin as an access/publication point; and a Manakin theme incorporating Google Earth and Google Maps functionality.