5CRMWG Recommendations Document
Five Colleges Record Merge Working Group Recommendations Document
Prepared by
Steve Bischof, UMass Amherst, FOLIO Implementation Team Liaison
Sharon Domier, UMass Amherst
Jennifer Eustis, UMass Amherst
Rebecca Henning, Amherst College
Colin Van Alstine, Smith College
Last Updated
2019
Introduction
The Five Colleges invested in the current integrated library system, Ex Libris’s Aleph, in 2005. It is now 14 years later and Aleph is showing its age. Though improvements continue, it is clear Aleph lacks crucial features. As a result of analyses and discussions on how to proceed, in 2018 the Five Colleges selected FOLIO as a beta partner with EBSCO and the FOLIO Implementation Team (FIT) was created. FIT decided to move to a single tenant system for FOLIO, meaning there will be one instance.
Currently, there exist overlaps between bibliographic records reflecting what each of the institutions in the Five Colleges has acquired over the years. This level of duplication can cause issues for access and discovery of resources which deduplication of bibliographic records can help alleviate. It should be noted that not all bibliographic records are created the same or for the same purposes. As a result, some are good candidates to be merged while others are not.
The Five Colleges Record Merge Working Group was formed to analyze and provide recommendations on the complex issues of merging records for the move to our new library service platform FOLIO. This working group has created this set of recommendations that attempt to provide solutions for the merge based on the knowledge of the working group members and advisors. These recommendations cover communication plan, scope, merge in relation to specific collections, preparing for the merge, merge workflow, matching algorithms, best practices/procedures as a result of a single tenant system, merge assistance, risks, and references.
These recommendations roughly cover the merge cycle, which includes pre-merge, during-merge, and post-merge actions. Actions include cleanup projects that prepare for an easier merge and migration, and post-merge cleanup projects such as re-establishing links between parent and child bibliographic records. The recommendations suggest both tasks and plans that form the merge project. They are not intended to be the final word on any task or plan. They will be used in formulating the merge project plan, which will detail the merge of recommended bibliographic records into FOLIO.
5CRMWG, Page 1
Working Group Charge
Background:
The Five Colleges invested in the current integrated library system, Ex Libris’s Aleph, in 2005. It is now 14 years later and Aleph is showing its age. Though improvements continue, it is clear Aleph lacks crucial features. As a result of analyses and discussions on how to proceed, in 2018 the Five Colleges selected FOLIO as a beta partner with EBSCO and the FOLIO Implementation Team (FIT) was created. FIT decided to move to a single tenant system for FOLIO, meaning there will be one instance. Currently, there exist overlaps between bibliographic records reflecting what each of the institutions in the Five Colleges has acquired over the years. This working group was created to analyze and provide recommendations on the complex issue of merging records for the move to our new library service platform FOLIO.
Charge:
This working group will draft a recommendations document on the following: people, scope, merge preparation, merge workflow, matching algorithms, best practices/procedures, EAST, communication plan, vendor, and risks.
● People
Recommendations should identify the key people involved in the project including those responsible for signing off on documents and/or parts of the project (i.e. sampling data, proceeding to whole project, vendor contracts).
● Scope
Recommendations should include which bibliographic records will and will not be merged. Those records that are determined to be candidates for merging will fall within the scope of the recommendations on merge workflow, matching algorithms, best practices/procedures, communication plan, and vendor. If there are exceptions to the scope, these will be noted appropriately in the document.
● Merge Preparation
Recommendations should include any clean up projects to be prioritized.
● Merge Workflow
Recommendations should include the determination of when and how the merging of bibliographic records fits into the general migration workflow which includes any impacts on the migration.
● Matching Algorithms
Recommendations should include an initial plan on how to match records within the scope of the merge project and provide determinations for selecting primary or winning matches, what data are to be merged (i.e. local fields, fields used for discovery).
● Best Practices/Procedures
Recommendations should provide an initial path forward for consortial best practices and procedures in relation to technical services operations.
● Merge & EAST (Eastern Academic Scholars’ Trust)
Recommendations should include an initial plan on how to deal with resources committed to be retained as part of the EAST program.
● Communication Plan
Recommendations should include a proposed communication plan that addresses questions about the risks and issues that arise pre, during, and post merge.
● Vendor
Recommendations will identify if a vendor should be used to help with the merge. If it is determined that work should be contracted out, recommendations should include the scope of work to be undertaken by the vendor and by the Five Colleges in addition to proposals from key identified vendors.
● Risks
Recommendations should identify the risks and issues involved with the merge ranging from data concerns such as data loss or mismatches to supporting financially and administratively the merge project going forward.
Membership:
Steve Bischof, UMass Amherst (FIT Liaison)
Sharon Domier, UMass Amherst
Jennifer Eustis, UMass Amherst
Rebecca Henning, Amherst College
Colin Van Alstine, Smith College
Membership represents the following areas of knowledge: deep MARC21 expertise, archives and rare books, music, non-Latin scripts, serials, boundwiths. For these knowledge areas and other relevant topics to the merge, the working group can call on knowledge experts within the Five Colleges to assist in our tasks.
Chair:
Jennifer Eustis, UMass Amherst
Meetings:
Weekly meetings or as needed at the call of the Chair or the request of the FOLIO Implementation Team. Other staff from the Five Colleges may be asked to join meetings when their specialized feedback or knowledge is needed.
Information Sharing:
Agendas and notes will be saved in the FIT Google Drive. Information will also be shared via the Slack channel #5crmwg. Additional updates will be provided to the FOLIO Implementation Team as requested, and to all staff of the Five Colleges as appropriate.
People
Recommendations should identify the key people involved in the project including those responsible for signing off on documents and/or parts of the project (i.e. sampling data, proceeding to whole project, vendor contracts).
The working group recommends the following:
● Working Group Members
The 5C Record Merge Working Group is responsible for preparing the recommendations document, communicating with stakeholders and the community, and coordinating with the Aleph Advisory Group on priority cleanup projects for the merge.
● FOLIO Implementation Team (FIT)
FIT is responsible for approving the recommendations document and use of a vendor for the merge.
● Five Colleges Librarians Council (FCLC)
The Five Colleges Consortium council is responsible for the final approval of the recommendations document and/or use of a vendor to be used for the merge.
Communication Plan
Recommendations should include a proposed communication plan that addresses questions about the risks and issues that arise pre, during, and post merge.
We recommend the following goals:
● Ensure stakeholders and community are aware of what is meant by a merge
● Communicate updates from the working group to stakeholders and community
● Address issues related to merging bibliographic records with the move to FOLIO
● Provide forums for open discussion on merging bibliographic records for the move to FOLIO
These recommended goals aim to communicate to these audiences:
● Stakeholders: These are the persons who are working directly on the FOLIO project, FIT, and those who work directly with MARC bibliographic records.
● Community: These are persons who do not work directly with MARC bibliographic records but still need to be aware of the upcoming changes.
We recommend using these tools:
● FOLIO Blog
● Five Colleges FOLIO Slack channel #5crmwg
● Email and Five Colleges Listservs
● Virtual web conferencing tool Zoom
● Google Drive
These recommended tools will be used to:
● Publish regular blog posts providing updates on the group’s progress
● Schedule open discussion meetings
● Ensure documents are available
● Coordinate with stakeholders and key people at the 4 institutions
These recommendations are based on communication pathways and tools that already exist. With this in mind, the working group recommends for the long term:
● A shared space that can be accessed by Five Colleges staff for shared documentation for best practices and FOLIO documentation
● A feedback mechanism that can be accessed by Five Colleges staff that tracks, manages, and organizes issues logged in the short term for the merge and in the long term for issues going forward.
Scope
Recommendations should include which bibliographic records will and will not be merged. Those records that are determined to be candidates for merging will fall within the scope of the recommendations on merge workflow, matching algorithms, best practices/procedures, communication plan, and vendor. If there are exceptions to the scope, these will be noted appropriately in the document.
The working group recommends that the following types and/or collections of records be merged:
● General collections that are not part of any linkage group or other designated type of bibliographic record not within scope
● Cataloged collections of the Special Collections & Archives (Please Refer to notes in section Merge & Special Collections)
● Bibliographic records that have EAST statements
● Bibliographic records for electronic resources that are not a part of any batch load process
The working group recommends that the following types and/or collections of records not be merged:
● Rented collections such as the McNaughton Leisure Collection at Mt. Holyoke
● Records that have an item material type of “Admin Record - Bib Suppressed”
● Batch loaded records that have an 035 9\ storing the collection code and OCLC/Vendor number match point
● Suppressed bibliographic records
● Bibliographic records for equipment, keys, and other like materials owned and circulated by each institution
● Theses & Dissertations including ETDs
● Bibliographic records that have linkages to other records, expressed through the use of a LKR field or a specific STA field status
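The exclusion list above can be read as a set of checks applied to each record before matching. The following is a minimal sketch of that idea, not a description of how Aleph or FOLIO stores these properties: the `BibRecord` class, its boolean attributes, and the `exclusion_reason` function are all hypothetical names, and in practice each flag would have to be derived from the exported MARC data (for example, the presence of an LKR field or an 035 9\ field).

```python
from dataclasses import dataclass


@dataclass
class BibRecord:
    """Hypothetical, simplified stand-in for an exported bibliographic record."""
    system_number: str
    rented_collection: bool = False
    item_material_type: str = ""
    batch_loaded: bool = False          # e.g. record carries an 035 9\ collection code
    suppressed: bool = False
    equipment: bool = False             # equipment, keys, and similar materials
    thesis_or_dissertation: bool = False
    linked: bool = False                # LKR field or a designated STA status


def exclusion_reason(rec):
    """Return the first matching out-of-scope reason, or None for a merge candidate."""
    checks = [
        (rec.rented_collection, "rented collection"),
        (rec.item_material_type == "Admin Record - Bib Suppressed", "admin record"),
        (rec.batch_loaded, "batch loaded record"),
        (rec.suppressed, "suppressed record"),
        (rec.equipment, "equipment or similar material"),
        (rec.thesis_or_dissertation, "thesis or dissertation"),
        (rec.linked, "linked record (LKR or STA)"),
    ]
    for matched, reason in checks:
        if matched:
            return reason
    return None


# Illustrative records only; the system numbers are made up.
records = [
    BibRecord("000000001"),
    BibRecord("000000002", suppressed=True),
    BibRecord("000000003", linked=True),
]
candidates = [r.system_number for r in records if exclusion_reason(r) is None]
```

Returning a reason rather than a plain yes/no would make it easier to report, per institution, how many records fall outside the merge and why.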
Merge & EAST (Eastern Academic Scholars’ Trust)
Recommendations should include an initial plan on how to deal with resources committed to be retained as part of the EAST program.
The working group recommends merging bibliographic records that indicate an EAST retention commitment where:
● The EAST retention statement in the 583 is kept in the bibliographic record and also copied to the holdings statement. Having this statement in both the bibliographic and holdings record makes it clear which institution retains the commitment for which bibliographic item.
● EAST statements need to continue to follow current practice of the 583 1 _ MARC field.
● Documentation should outline an example of EAST statements for each institution of the Five Colleges and procedures for withdrawing an EAST commitment (for instance, if a mismatch occurred).
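As a rough illustration of keeping the 583 in the bibliographic record while also copying it to holdings, the sketch below works on plain Python dictionaries rather than on any real MARC library. It assumes, as a labeled assumption, that each 583 carries the owning institution's code in subfield $5 and that holdings can be looked up by that code. The function name, the data layout, and the placeholder code `INST-A` are hypothetical.

```python
import copy


def copy_east_statements(bib_fields, holdings_by_institution):
    """Copy each 583 field with first indicator 1 to the matching holdings.

    The bibliographic fields are left untouched, so the statement ends up
    in both the bibliographic record and the holdings record.
    Returns the number of fields copied.
    """
    copied = 0
    for fld in bib_fields:
        if fld["tag"] != "583" or fld["ind1"] != "1":
            continue
        institution = fld["subfields"].get("5")  # assumed to hold the institution code
        holdings = holdings_by_institution.get(institution)
        if holdings is None or fld in holdings:
            continue  # no holdings for this institution, or already copied
        holdings.append(copy.deepcopy(fld))
        copied += 1
    return copied


# Illustrative data only; subfield values are placeholders.
bib = [
    {"tag": "245", "ind1": "1", "ind2": "0", "subfields": {"a": "Example title"}},
    {"tag": "583", "ind1": "1", "ind2": " ",
     "subfields": {"a": "committed to retain", "5": "INST-A"}},
]
holdings = {"INST-A": [], "INST-B": []}
first_pass = copy_east_statements(bib, holdings)
second_pass = copy_east_statements(bib, holdings)  # re-running adds nothing
```

Because the copy is skipped when an identical field is already present, the step can be re-run safely during testing.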
Merge & Special Collections
Recommendations should include an initial plan on how to deal with resources that are part of the five Special Collections & Archives departments and that are cataloged in Aleph.
The working group recommends that those resources currently cataloged in Aleph which are primarily rare books and monographs and that are part of one of the special collections and/or archives departments of the Five Colleges be merged with the following understanding:
● Ensure local notes in the bibliographic records include subfield $5 with the appropriate institution code, when possible, before the merge
● Representatives from special collections & archives departments agree to the spreadsheet that details what is to be merged and what is not to be merged, in relation to fields used frequently in special collections and/or archival cataloging
● Ensure language is consistent for local notes
Merge Preparation
Recommendations should include any clean up projects to be prioritized.
The working group recommends:
● Creating a spreadsheet of all MARC fields where there are instructions on whether to overlay/protect that field, whether to add $5, subject headings, and comments (pre merge)
● Creation of multiple Aleph test instances for testing (pre and during merge)
● Creation of FOLIO testing environment (pre, during and post merge)
● Create a baseline and testing procedures for merge (pre merge)
● Deduplicate each institution’s database (pre merge)
● Clean up malformed records (pre merge)
● Bound-with cleanup (post merge)
● Depository bibliographic record cleanup (pre merge)
● Create a project plan (pre merge)
● Continue the discussion on merge and migration and include discussions on cleanup projects and strategies (pre and post merge)
● Changing the record merge group into the merge and migration group
● Consider copying some local notes from bibliographic to holdings record
Merge Workflow
Recommendations should include the determination of when and how the merging of bibliographic records fits into the general migration workflow which includes any impacts on the migration.
The working group recommends the following:
● Export data from Aleph
● During export perform any batch processes that have been predetermined to help clean up data
● Deduplicate data
● Import into FOLIO
The working group recommends that this merged migration:
● Take place after fiscal year close
● Account for a planned downtime and a possible gap file if processes continue
● Prioritize cleanup projects to prepare for the merge
● Be preceded by tests with sample sets to refine the process and algorithms against a baseline and a set of testing procedures
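The export, cleanup, deduplication, and import steps recommended in this section can be pictured as a simple pipeline that is first run against sample sets. The sketch below is only an illustration under stated assumptions: `run_merge_migration`, `strip_whitespace`, and `dedupe_on_oclc` are hypothetical names, records are plain dictionaries, the FOLIO load is a stand-in list, and matching on the OCLC number alone is used purely to keep the example short.

```python
def run_merge_migration(exported, cleanup_steps, deduplicate, load):
    """Apply cleanup steps to each exported record, deduplicate, then load."""
    cleaned = []
    for record in exported:
        for step in cleanup_steps:
            record = step(record)
        cleaned.append(record)
    merged = deduplicate(cleaned)
    load(merged)
    return {"exported": len(cleaned), "loaded": len(merged)}


def strip_whitespace(record):
    """Example batch cleanup: trim stray whitespace from every value."""
    return {key: value.strip() for key, value in record.items()}


def dedupe_on_oclc(records):
    """Placeholder deduplication that keeps the first record per OCLC number."""
    first_seen = {}
    for record in records:
        first_seen.setdefault(record["oclc"], record)
    return list(first_seen.values())


folio_stub = []  # stand-in for the FOLIO import step
sample_set = [
    {"oclc": "111 ", "title": "Example title "},
    {"oclc": "111", "title": "Example title"},
    {"oclc": "222", "title": "Another example"},
]
summary = run_merge_migration(
    sample_set, [strip_whitespace], dedupe_on_oclc, folio_stub.extend
)
```

Comparing the returned counts against an agreed baseline for each sample set is one way the testing procedures could flag unexpected merges before the full migration.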
Matching Algorithms
Recommendations should include an initial plan on how to match records within the scope of the merge project and provide determinations for selecting primary or winning matches, what data are to be merged (i.e. local fields, fields used for discovery).
The working group will work with the vendor to refine and determine the algorithm used to merge records. The working group recommends the use of a more refined algorithm rather than matching only on the OCLC number. This means that the algorithm should also compare, for instance, title, author, publication date, and edition. Extending the algorithm beyond the OCLC number is particularly important for resources that belong to special collections and/or archives, where only the edition or publication date might differ.
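A minimal sketch of what matching beyond the OCLC number could look like is given below. It is not the vendor's algorithm, which is still to be determined: the dictionary layout, the `normalize`, `match_key`, and `group_candidates` names, and the choice to require an exact match on every normalized element are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Case-fold, unify Unicode forms, and drop punctuation and extra spaces."""
    text = unicodedata.normalize("NFKC", text or "").casefold()
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def match_key(rec):
    """Composite key: OCLC number plus normalized descriptive elements."""
    return (
        (rec.get("oclc") or "").strip(),
        normalize(rec.get("title")),
        normalize(rec.get("author")),
        normalize(rec.get("date")),
        normalize(rec.get("edition")),
    )


def group_candidates(records):
    """Return groups of two or more records that share the same match key."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative records; the OCLC number and institution labels are made up.
records = [
    {"id": "A-1", "oclc": "123", "title": "Moby-Dick; or, The Whale",
     "author": "Melville, Herman", "date": "1851", "edition": "1st ed."},
    {"id": "B-1", "oclc": "123", "title": "Moby Dick, or the whale",
     "author": "melville, herman", "date": "1851", "edition": "1st ed"},
    {"id": "C-1", "oclc": "123", "title": "Moby-Dick; or, The Whale",
     "author": "Melville, Herman", "date": "1892", "edition": "2nd ed."},
]
merge_groups = group_candidates(records)
merged_ids = [[rec["id"] for rec in group] for group in merge_groups]
```

In this example the first two records are grouped despite differences in case and punctuation, while the third stays separate because its date and edition differ, even though all three share an OCLC number. A production algorithm would likely need fuzzier comparison and rules for choosing the winning record.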
Best Practices/Procedures
Recommendations should provide an initial path forward for consortial best practices and procedures in relation to technical services operations.
The working group recommends:
● That a cataloging/metadata working group be formed. This group should include Five Colleges representation from cataloging/metadata units, special collections staff in charge of rare book and/or special collections monograph cataloging, and acquisition and ERM staff in charge of creating new bibliographic and inventory records.
○ Short term recommendations include creating shared consortial best practices for cataloging and metadata
○ Long term recommendations include maintaining those shared practices
● That an ERM working group be formed. This group should include Five Colleges representation from ERM or like units, acquisition staff in charge of some aspect of ERM, and cataloging/metadata staff in charge of some aspect of ERM.
○ Short term recommendations include creating shared consortial best practices for ERM.
○ Long term recommendations could include collection management of electronic resources.
● The Five Colleges website be expanded to include pages outlining the shared consortial practices for FOLIO. These pages outline the standing committees, links to discussion groups, issue tracking mechanism(s), documentation. An example can be found at Florida Academic Library Services Cooperative.
Merge Assistance
Recommendations will identify if a vendor and/or temporary help should be used to help with the merge and/or post merge/migration. If it is determined that work should be contracted out, recommendations should include vendor proposals submitted separately to FIT. If it is determined that temporary help should be used to help with the merge cycle (here, merge cycle means pre, during, and post merge), recommendations will outline the type and amount of assistance needed.
The working group recommends vendor assistance for the merge cycle. The working group has solicited two vendor proposals which include estimated pricing for the merge and post merge cleanup for 5 million records. These proposals will be shared separately with FIT and appropriate administrators.
The working group also recommends employing temporary labor to help with the merge cycle. This help will be particularly useful in the post merge cleanup for bound-withs and other complex bibliographic records that involve parent and child relationships based on system numbers, which will break with the introduction of new system numbering. The recommendations include:
● 2-4 temporary employees in technical services to work on specific projects identified by the Record Merge group
● Temporary employment should be made available before, during, and after the merge for a period of up to 5 years
Risk Tolerance
Recommendations reflect the level(s) of tolerance communicated during the open discussions and meetings with stakeholders.
The working group found two levels of risk tolerance in relation to the merge. The first is in relation to staff working in special collections and archives, the second to staff working in other technical services departments more broadly. These risk tolerances were determined based on the open discussions and speaking to staff in technical services, archives, and special collections.
The risk tolerance for staff working in special collections and archives is found in general to be medium to high, with only one of the Five Colleges institutions at high. The concern rests primarily on the potential for merging incorrect editions. In light of this, the working group recommends working with special collections and archives on:
● A correct scoping of fields for the merge
● Specific coding that needs to be implemented and/or cleaned up before and during the merge
● List of requirements and assumptions
● Continued discussions with representatives from the 5C special collections and archives
The risk tolerance for other staff, more broadly representing technical services units, is low to medium, with only one representative from one of the Five Colleges institutions at low. The concerns consist of determining what is and is not in scope of the merge, protecting those bibliographic records that need protecting, and mismatched merges. In light of this, the working group recommends the following:
● Create a project plan that outlines in more detail than these recommendations the scope, assumptions, and requirements of the merge
● Communicate cleanup projects that facilitate the merge
References
Bibliographic Control and Discovery Subcommittee (2015). State University System of Florida Libraries: Guidelines and Procedures for Shared Bibliographic Catalog, Version 1.02. Retrieved from
https://sharedbib.pubwiki.fcla.edu/wiki/images/SharedBib/4/48/SharedBibGuidelines_ver.1.02.pdf
University of West Florida (2018). 18ITN-06AJ Invitation to Negotiate: Next Generation Integrated Library System (ILS). Retrieved from
https://uwf.edu/media/university-of-west-florida/offices/procurement/bids/18itn-06aj/18ITN-06AJ Next-Generation-Integrated-Library-System-(ILS).pdf
University of West Florida (2019). 18ITN-06AJ Invitation to Negotiate: Next Generation Integrated Library System (ILS), Addendum Number 1. Retrieved from
https://uwf.edu/media/university-of-west-florida/offices/procurement/bids/18itn-06aj/18ITN-06AJ Addendum-Number-One.pdf
Broward County Board of County Commissioners (2018). Solicitation TEC2115735P1: Next Generation Integrated Library System (NGS/ILS) and Discovery Services (DS). Retrieved from http://cragenda.broward.org/docs/2018/CCCM/20180320_558/26071_Exhibit%201%20-%20RFP%20No.TEC2115735P1.pdf
Downey, Kay (2012). The OHDEP Project: Creating a Shared Catalog for the Northeast Ohio Depository. Collection Management, 37:3-4, 322-332. https://doi.org/10.1080/01462679.2012.685832
Five Colleges Consortium. FIT: Feature Prioritization Spreadsheet. Retrieved from https://docs.google.com/spreadsheets/d/1bxNH4TkKV6hWKt09-ywtL9D3N9skfAu9kUE4xxbio7g/edit#gid=1396876918
Florida Academic Library Services Cooperative. Services and Information for Florida’s College and University Libraries. Retrieved from https://libraries.flvc.org/standing-committees
Florida Virtual Campus. Technical Services Standing Committee (2019). Project List for FALSC. Retrieved from https://falsc.libguides.com/c.php?g=845752&p=6072335
Han, M.K., Carlstone, J., & Harrington, P. (2018). Cataloging Digitized Continuing Resources in a Shared Record Environment. Cataloging & Classification Quarterly, 56:2-3, 155-170. https://doi.org/10.1080/01639374.2017.1388324
Heron, S.J., Simpson, B., Weiss, A.K., & Phillips, J. (2013). Merging Catalogs: Creating a Shared Bibliographic Environment for the State University Libraries of Florida. Cataloging & Classification Quarterly, 51:1-3, 139-155. https://doi.org/10.1080/01639374.2012.722591
Koury, R. & Brammer, C. (2017). Managing Content in EBSCO Discovery Service: Action Guide for Surviving and Thriving. The Serials Librarian, 72:1-4, 83-86. https://doi.org/10.1080/0361526X.2017.1309828
Research Information Network (2009). Creating catalogues: bibliographic records in a networked world. Retrieved from
http://www.rin.ac.uk/system/files/attachments/Creating-catalogues-report.pdf
State University Libraries of Florida (2013). Deduplicating Records. Retrieved from https://sharedbib.pubwiki.fcla.edu/wiki/index.php/Deduplicating_Records
State University Libraries of Florida (2010). State University Libraries of Florida: Guidelines and Procedures for the Shared Bibliographic Catalog. Retrieved from
http://csul.net/sites/csul.fcla.edu/uploads/SULGuidelinesandProceduresforSharedBibCatalog.pdf
State University Libraries of Florida (2019). Cataloging wiki. Retrieved from https://sharedbib.pubwiki.fcla.edu/wiki/index.php/Cataloging
State University Libraries of Florida (2018). Cataloging Guidelines for the Florida State University System and State College System Bibliographic Database, adopted November 2018. Retrieved from https://falsc.libguides.com/ld.php?content_id=45779306
Van Kleeck, D., Langford, G., Lundgren, J., Nakano, H., O'Dell, A.J., & Shelton, T. (2016). Managing Bibliographic Data Quality in a Consortial Academic Library: A Case Study. Cataloging & Classification Quarterly, 54:7, 452-467.
https://doi.org/10.1080/01639374.2016.1210709