Thank you to all of you who offered your thoughts on my query; I am very grateful.

 

  damien

 

Damien McCaffery
Electronic Resources Librarian

Jefferson (Philadelphia University + Thomas Jefferson University)

4201 Henry Avenue

Philadelphia PA 19144
T 215-951-2674
mccafferyd@philau.edu

PhilaU.edu    

 


 

From: Serials in Libraries Discussion Forum [mailto:SERIALST@LISTSERV.NASIG.ORG] On Behalf Of Diane Westerfield
Sent: Tuesday, December 12, 2017 5:01 PM
To: SERIALST@LISTSERV.NASIG.ORG
Subject: Re: [SERIALST] Journal collections; help determining overlap/duplication

 

Another option in Excel is Conditional Formatting. Paste both entitlement lists into one spreadsheet. I usually “bold” one title list to differentiate it from the other. You may have to move columns around in one entitlement list to get all the ISSNs lined up in one column. (Be careful not to scramble your data while doing this.) Select the ISSN column and apply Conditional Formatting on duplicates. This defaults to shading the duplicate cells pink. Then select each entitlement list (not the entire spreadsheet) and sort by cell color; beware that this can take some time. This will show you the duplicates in pink and the unique titles with no shading.
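For anyone who would rather script this step, the same idea can be sketched in Python. This is only a rough illustration, not a replacement for the Excel workflow: the function name `flag_duplicates`, the `title`/`issn` keys, and the sample rows are all invented for the example, and the ISSNs are placeholders rather than real journals.

```python
from collections import Counter


def flag_duplicates(list_a, list_b):
    """Combine two title lists and flag every row whose ISSN appears more than once."""
    # Tag each row with its source list (the scripted version of bolding one list).
    combined = [dict(row, source="A") for row in list_a]
    combined += [dict(row, source="B") for row in list_b]

    # Count each non-blank ISSN across the combined column.
    counts = Counter(row["issn"] for row in combined if row["issn"])
    for row in combined:
        row["duplicate"] = counts[row["issn"]] > 1

    # Duplicates first, then unique titles (like sorting by cell color).
    combined.sort(key=lambda row: (not row["duplicate"], row["issn"], row["source"]))
    return combined


# Invented placeholder rows (assumption: each list has a title and an ISSN column).
package_a = [
    {"title": "Alpha Journal", "issn": "0000-0001"},
    {"title": "Beta Review", "issn": "0000-0002"},
]
package_b = [
    {"title": "Beta Review", "issn": "0000-0002"},
    {"title": "Gamma Letters", "issn": "0000-0003"},
]

for row in flag_duplicates(package_a, package_b):
    marker = "DUP" if row["duplicate"] else "   "
    print(marker, row["source"], row["issn"], row["title"])
```

Like Conditional Formatting, this flags an ISSN that repeats within a single list as well as one shared by both lists, so it is worth checking the source column before treating a flagged row as true overlap.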

 

Alternatively, you can use MS Access to match ISSNs from two data sources. You have to import the title lists as two “tables” and allow the column headings to become the column names in Access. Then run a Query on the two tables that matches on ISSN, and you’ll get a list of the duplicates. I can’t really explain this in a plain text email; I suggest looking up videos if you don’t know how to do this (thank you to DU’s Chris Brown for a workshop on Access).
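The two-tables-and-a-query approach can also be tried without Access. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Access, which is a substitution, not the Access procedure itself. The table names `package_a` and `package_b`, the function name `find_overlap`, and the sample rows are assumptions made up for the illustration.

```python
import sqlite3


def find_overlap(list_a, list_b):
    """Load two (title, issn) lists into tables and return the rows whose ISSNs match."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE package_a (title TEXT, issn TEXT)")
    conn.execute("CREATE TABLE package_b (title TEXT, issn TEXT)")
    conn.executemany("INSERT INTO package_a VALUES (?, ?)", list_a)
    conn.executemany("INSERT INTO package_b VALUES (?, ?)", list_b)

    # The query that matches on ISSN; blank ISSNs are excluded from the match.
    rows = conn.execute(
        """
        SELECT a.issn, a.title, b.title
        FROM package_a AS a
        JOIN package_b AS b ON a.issn = b.issn
        WHERE a.issn <> ''
        ORDER BY a.issn
        """
    ).fetchall()
    conn.close()
    return rows


# Invented placeholder data: (title, issn) pairs, not real journals.
overlap = find_overlap(
    [("Alpha Journal", "0000-0001"), ("Beta Review", "0000-0002")],
    [("Beta Review, The", "0000-0002"), ("Gamma Letters", "0000-0003")],
)
print(overlap)
```

Returning both titles side by side makes it easy to see where the two sources spell the same journal differently.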

 

Both methods rely on the ISSN being present, clean, and correct. Obviously there will be at least a few errors when working with large entitlement lists.
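One way to catch some of those errors before matching is to test each ISSN's check character: the eighth character of an ISSN is a modulus-11 check digit computed from the first seven. The sketch below assumes ISSNs arrive as text; the function name `is_valid_issn` is made up for the example. It only catches malformed or mistyped values, not a well-formed ISSN attached to the wrong title.

```python
import re


def is_valid_issn(issn):
    """Return True if the value has the NNNN-NNNC shape and a correct check character."""
    match = re.fullmatch(r"(\d{4})-?(\d{3})([\dXx])", issn.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the first seven digits 8, 7, 6, 5, 4, 3, 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3).upper() == expected


for candidate in ["0378-5955", "0378-5954", "378-5955"]:
    print(candidate, is_valid_issn(candidate))
```

Values that fail the check are good candidates for the hand-checking pile.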

 

You can do some messy matching on titles, but that requires a lot more cleanup. I usually do ISSNs first with Conditional Formatting for duplicates, then repeat with titles. Since publishers often don’t treat titles the same way, I insert another column called “Clean title”, copy and paste the journal titles into the Clean title column, and do a lot of Find & Replace operations, such as removing initial articles and turning ampersands (&) into “and”, then Conditional Format for duplicates. You can iterate this another time by lopping off subtitles demarcated with colons or semicolons, and by getting rid of any trailing spaces or double spaces. Some of this you can do in Python IDLE by copying columns into triple-quoted strings and performing simple string-slicing operations, then printing the output and pasting it (carefully) back into Excel.
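As a rough sketch of what those cleanup passes might look like in Python, the function below applies them in one go. The name `clean_title` and the list of articles are assumptions for the illustration. Lowercasing is an extra step not described above, added so that the comparison ignores capitalization, and only English articles are handled.

```python
import re

# English initial articles only (an assumption; other languages need their own list).
LEADING_ARTICLES = ("the ", "a ", "an ")


def clean_title(title):
    """Produce a simplified "Clean title" value for rough duplicate matching."""
    cleaned = title.strip().lower()
    cleaned = cleaned.replace("&", " and ")             # ampersand -> "and"
    cleaned = re.split(r"[:;]", cleaned, maxsplit=1)[0]  # lop off the subtitle
    cleaned = re.sub(r"\s+", " ", cleaned).strip()       # double and trailing spaces
    for article in LEADING_ARTICLES:
        if cleaned.startswith(article):
            cleaned = cleaned[len(article):]
            break
    return cleaned


# Paste a column of titles between the triple quotes, one title per line.
pasted = """The Journal of Physical Therapy: Research & Practice
Health  &  Society """

for line in pasted.splitlines():
    print(clean_title(line))
```

The printed lines come out in the same order as the pasted column, so they can be pasted back beside the original titles.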

 

Eventually you get into a long tail of acronyms, diacritics and other oddities, and it’s up to you how much you want to hand-check or clean up those journal titles.

 

Diane Westerfield, Electronic Resources & Serials Librarian

Tutt Library, Colorado College

diane.westerfield@coloradocollege.edu

(719) 389-6661

 

 

 

From: Serials in Libraries Discussion Forum [mailto:SERIALST@LISTSERV.NASIG.ORG] On Behalf Of Melissa Belvadi
Sent: Tuesday, December 12, 2017 12:05 PM
To: SERIALST@LISTSERV.NASIG.ORG
Subject: Re: [SERIALST] Journal collections; help determining overlap/duplication

 

Learn about using vlookup with spreadsheet software (Excel or Google Sheets) to combine title lists for comparison purposes.

In this case you'd be using the ISSNs as the matchpoint. 
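For readers more comfortable with a script than a spreadsheet, here is a minimal Python sketch of the same lookup idea, not of VLOOKUP's exact behavior. The function name `lookup_matches`, the `title`/`issn` keys, and the sample rows are invented for the example; the ISSNs are placeholders.

```python
def lookup_matches(publisher_rows, aggregator_rows):
    """Add the aggregator's title for the same ISSN to each publisher row (None if absent)."""
    by_issn = {}
    for row in aggregator_rows:
        if row["issn"]:
            # Keep the first row found for each ISSN, as an exact-match lookup would.
            by_issn.setdefault(row["issn"], row["title"])
    return [
        {**row, "aggregator_title": by_issn.get(row["issn"])}
        for row in publisher_rows
    ]


# Invented placeholder rows, not real journals.
publisher = [
    {"title": "Alpha Journal", "issn": "0000-0001"},
    {"title": "Beta Review", "issn": "0000-0002"},
]
aggregator = [
    {"title": "Beta Review (aggregator copy)", "issn": "0000-0002"},
]

for row in lookup_matches(publisher, aggregator):
    print(row["issn"], row["title"], "->", row["aggregator_title"])
```

A row that comes back with None is unique to the publisher list, which plays the role of the #N/A result in the spreadsheet.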

I assume you're talking about comparing Sage or Springer publisher packages with aggregate packages like EBSCO's Academic Search, because Sage and Springer's packages would never overlap with each other - either one or the other owns the title, but never both.

 

I recently did a 2-hour(!) webinar for my regional consortium about prepping Excel spreadsheets and using vlookup.

The recording is online and you are welcome to watch it: http://ca.bbcollab.com/recording/524a3f69f97e441d8e9e49a6b5118979

I explain some tips for dealing with ISSNs that get mangled (e.g., if there are no hyphens and leading zeros get dropped) and then using vlookup with wildcards if you have multiple ISSN columns and aren't sure which one to use to match on the other list.
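Outside of Excel, the same two problems could be approached roughly as follows in Python. This is a sketch under stated assumptions, not the method from the webinar: it swaps the wildcard vlookup for a plain set comparison, and the names `normalize_issn`, `issn_keys`, `rows_match`, `print_issn`, and `online_issn` are all hypothetical.

```python
import re


def normalize_issn(raw):
    """Restore a mangled ISSN to NNNN-NNNC form, or return '' if it cannot be one."""
    text = re.sub(r"[\s-]", "", str(raw)).upper()
    if not re.fullmatch(r"\d{1,7}[\dX]", text):
        return ""
    text = text.zfill(8)  # put back leading zeros that the spreadsheet dropped
    return f"{text[:4]}-{text[4:]}"


def issn_keys(row, columns=("print_issn", "online_issn")):
    """Collect every usable ISSN from a row, whichever column it sits in."""
    return {normalize_issn(row.get(column, "")) for column in columns} - {""}


def rows_match(row_a, row_b):
    """Two rows match if they share at least one normalized ISSN."""
    return bool(issn_keys(row_a) & issn_keys(row_b))


print(normalize_issn(3785955))      # number with its leading zero dropped
print(normalize_issn("03785955"))   # text with no hyphen
print(rows_match(
    {"print_issn": 3785955, "online_issn": ""},
    {"print_issn": "1111-1111", "online_issn": "0378-5955"},
))
```

Comparing sets of ISSNs sidesteps the question of which column to match on, at the cost of treating print and online ISSNs as interchangeable.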

 

 


Melissa Belvadi

Collections Librarian

University of Prince Edward Island

mbelvadi@upei.ca 902-566-0581

 

 

 

On Tue, Dec 12, 2017 at 1:25 PM, McCaffery, Damien <mccafferyd@philau.edu> wrote:

Hello all,

 

Relatively early-career librarian here – hope this is the appropriate forum for this query.

 

I am trying to learn:

 

1.       How I might determine if the journals in one subscription package are duplicated in another?

2.       If I find the collections do overlap, how I can ascertain which collection holds more full-text, peer-reviewed / scholarly journals than the other?

 

The journal publishers I am looking into are SAGE and Springer. I don’t believe that either of the journal collections we get via subscriptions with them comprises a standard, widely-offered package.

 

The instructor is looking for a way to help students do a systematic review (course in physical therapy). Objective is to filter our many databases down to the ones that will yield unique material (i.e. material that does not overlap with other collections), but also encompass a broad range of scholarly, full-text journals.

 

My objective is to avoid printing out title lists, placing them beside each other, comparing them, and crossing off dupes. Short of this, my only other ideas are to consult you or simply to contact the publishers themselves for methods (unlikely to yield impartial accounting?).

 

I have searched around to find if others have tackled this question recently, or even if there is a standard method for large-scale duplicate-spotting between journal publishers. I’ve turned up some likely prospects, but nothing definitive or accessible – mostly in the form of research papers.

 

If anyone can point me in the direction of either current research on this subject or a tested way to compare and evaluate the features of journal collections, I would be grateful.

 

Thanks in advance,

 

  damien

 

Damien McCaffery
Electronic Resources Librarian

Jefferson (Philadelphia University + Thomas Jefferson University)

4201 Henry Avenue

Philadelphia PA 19144
T 215-951-2674
mccafferyd@philau.edu

PhilaU.edu    

 


 

 


To unsubscribe from the SERIALST list, click the following link:
http://listserv.nasig.org/scripts/wa-NASIG.exe?SUBED1=SERIALST&A=1

 

 

