Identifying HathiTrust volume IDs suitable for inclusion in a corpus of images for identifying sheet music.

Run the scripts in this order:

1. Get a fresh HathiTrust tab-delimited metadata dump (at the time of writing, March 2019):

    ./HATHI-GET-TAB-DELIM-DUMP.sh

2. Winnow the file down to a more manageable size by keeping just the columns we're interested in:

    ./HATHI-EXTRACT-FORMAT.sh

3. Generate a list of Music Notation entries that are in the public domain and were not scanned by Google (so-called 'open-open'):

    ./HATHI-EXTRACT-PD-NON-GOOGLE.sh
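To illustrate the kind of filtering the final step performs, here is a minimal sketch of an 'open-open' filter over tab-delimited input. It is not the contents of HATHI-EXTRACT-PD-NON-GOOGLE.sh. The function name `filter_open_open`, the four-column layout, and the values `pd`, `Music Notation` and `google` are all assumptions made for the example; check them against the real output of the earlier steps before relying on them.

```shell
#!/bin/sh
# Sketch only. Assumed (hypothetical) column layout of the winnowed file:
#   1: volume ID   2: rights code   3: format   4: digitizing agent
# The real dump may order or name these differently.

filter_open_open() {
    # Reads tab-delimited rows on stdin and prints the volume ID of every row
    # that is public domain, is Music Notation, and was not digitized by Google.
    awk -F '\t' '
        $2 == "pd" && $3 == "Music Notation" && tolower($4) != "google" {
            print $1
        }
    '
}

# Usage with four made-up rows; only the second row passes all three tests.
printf '%s\t%s\t%s\t%s\n' \
    'vol.001' 'pd' 'Music Notation' 'google' \
    'vol.002' 'pd' 'Music Notation' 'yale' \
    'vol.003' 'ic' 'Music Notation' 'yale' \
    'vol.004' 'pd' 'Book' 'yale' \
    | filter_open_open
```

With the sample rows above, the sketch prints `vol.002`. Against a real file, the same function could be fed with input redirection, for example `filter_open_open < winnowed.tsv`, where `winnowed.tsv` is a placeholder file name.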