Sites Cleanup: The Clone Wars
Kara M. Lewis, Collections Information Program Manager
Patricia L. Nietfeld, Collections Manager
Smithsonian Institution, National Museum of the American Indian
Long, long ago…in 2006…
• NMAI migrated all geographical data from two legacy databases into the Sites module in EMu
• Much of the data was not standardized
• Much of the data was “duplicate” information
• Made the decision to migrate “as is” and use the tools in EMu to clean it up
• As a result…
The Plot
The Conflict
• ~39,000 unique combinations
• ~90,000 Sites records created
• ~337,500 Catalog records affected
• At least half were duplicates, or data was in the wrong field
• The rest were “variations,” obsolete place names, misspellings, or just plain wrong
The Plan of Attack
“Do or do not. There is no try.”
• Conventions:
  – No abbreviations
    • no St. for Saint
  – Names in language of country
  – Alternate versions in parentheses
    • Lac Saint-Jean (Lake Saint John)
  – Use 1st level political subdivision
    • Ecuador, Manabí Province
  – Use current names
“Do or do not. There is no try.”
• Conventions:
  – Country – Region? – State
    • Pará State, North Region
  – Subdivisions on case by case basis
  – Leave blank if can’t determine higher subdivision
    - Fill it in if known
    - Most specific info. in Provenience
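A minimal sketch of how two of these conventions could be checked by script before a person reviews the names. The abbreviation list beyond "St." and both function names are assumptions for illustration; the slides describe manual review, not this code.

```python
import re

# Assumed abbreviation list; the slides name only "St." for "Saint".
ABBREVIATIONS = re.compile(r"\b(St|Ste|Mt|Ft)\.")

def find_abbreviations(name):
    """Return abbreviations found in a place name so a person can review them."""
    return [m.group(0) for m in ABBREVIATIONS.finditer(name)]

def with_alternate(local_name, alternate=None):
    """Format a name in the country's language, alternate version in parentheses."""
    return f"{local_name} ({alternate})" if alternate else local_name

print(find_abbreviations("St. Louis"))
print(with_alternate("Lac Saint-Jean", "Lake Saint John"))
```

The check only flags candidates rather than expanding them, since "St." can stand for more than one word and the right expansion depends on the place.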
“You must unlearn what you have learned.”
• What Pat did not do:
  – Not a lot of energy spent on US state archaeological site numbers
  – This was cleanup, not verification
“Control, control, you must learn control!”
• Started with a spreadsheet of unique combinations of geographical data
• Split into smaller spreadsheets by state or country
• Learn about the country
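The first two steps above can be sketched in a few lines. This is an illustration only: the column names (`country`, `state`, `locality`) and the sample rows are made up, since the slides do not list the fields in the actual export.

```python
from collections import defaultdict

# Hypothetical column names; the real export's fields are not given in the slides.
FIELDS = ["country", "state", "locality"]

def unique_combinations(rows):
    """Return the distinct (country, state, locality) tuples in first-seen order."""
    seen = {}
    for row in rows:
        key = tuple(row[f].strip() for f in FIELDS)
        seen.setdefault(key, None)
    return list(seen)

def split_by_country(combos):
    """Group the unique combinations into one worksheet (list of rows) per country."""
    sheets = defaultdict(list)
    for combo in combos:
        sheets[combo[0] or "(blank)"].append(combo)
    return dict(sheets)

# Made-up sample rows; the second differs from the first only by trailing whitespace.
sample_rows = [
    {"country": "Ecuador", "state": "Manabí Province", "locality": "Manta"},
    {"country": "Ecuador", "state": "Manabí Province", "locality": "Manta "},
    {"country": "Canada", "state": "Québec", "locality": "Lac Saint-Jean"},
]
combos = unique_combinations(sample_rows)
sheets = split_by_country(combos)
for country, rows in sheets.items():
    print(country, len(rows))
```

Stripping whitespace is the only normalization done here; variations, misspellings, and obsolete names would still show up as separate combinations for a person to reconcile.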
“Ready are you? What know you of ready?”
• Content Resources:
  – General: Wikipedia, Statoids.com
  – International Travel Maps and Books of Vancouver, Canada
  – Country’s official website
  – Archaeological websites
  – Indigenous peoples’ websites
  – Government agencies
  – Maplandia.com
  – Google, JSTOR
  – MAI publications
“It is the future you see.”
• Nomenclature Resources:
  – US: Geographic Names Information System (GNIS)
  – Canada: Geographical Names Search Service (GNNS)
  – Others: GEOnet Names Server (GNS)
The worksheets (83 in total)
The Implementation in EMu
• Contractors do not have to be content experts
• Create new Sites, rather than “reuse”
• Practice first
• I do the actual deletions
The Confrontation
1. Start in the Sites module
2. Create list view with all fields
3. Search and group “old” Sites
The Confrontation
4. Open 2nd window and create “new” Sites.
5. Find the unique combos of Sites and Provenience in Catalog
6. Check the “Collection” field
The Confrontation
• Start with Objects – usually “one to one” replacement
• Sort & highlight those to receive new Site
• Replace old IRN with new IRN
• Replace not Replace All
• Replace the Provenience in those already changed
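The "Replace, not Replace All" step can be illustrated outside EMu as follows. This is a sketch under stated assumptions: the record layout (`id`, `site_irn`), the IRN values, and the function name are hypothetical, and the real work was done with EMu's own tools rather than a script.

```python
def reattach_site(catalog_records, selected_ids, old_irn, new_irn):
    """Point only the selected catalog records at the new Site IRN.

    Records outside selected_ids are left untouched even if they
    reference old_irn, which is the point of avoiding "Replace All".
    """
    changed = []
    for rec in catalog_records:
        if rec["id"] in selected_ids and rec["site_irn"] == old_irn:
            rec["site_irn"] = new_irn
            changed.append(rec["id"])
    return changed

# Made-up records: two share an old Site, but only one is highlighted.
catalog = [
    {"id": "A1", "site_irn": 1001},
    {"id": "A2", "site_irn": 1001},
    {"id": "A3", "site_irn": 1002},
]
changed = reattach_site(catalog, {"A1"}, old_irn=1001, new_irn=5001)
print(changed)
```

Returning the list of changed records gives a natural hand-off to the last bullet: the Provenience is then updated only on the records that were actually reattached.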
The Confrontation
• Photo Archives is a different story
• Each record created a new Site record = duplicates
• Many IRNs to replace per “new” Site record
• Instead, use periods to represent wildcards…
The Confrontation
• Replace not Replace All
• Then go through Provenience as before
• Start with the number of digits that matches the “new” IRN
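One reading of the wildcard approach, sketched with Python's `re` module rather than EMu's Replace tool: a pattern made of N periods matches any value of exactly N characters, so many different old IRNs of the same length can be swapped for one new IRN. The function names and IRN values are hypothetical, and the interpretation of "start with the number of digits" is an assumption.

```python
import re

def wildcard_pattern(n_digits):
    """Build a pattern of n periods; each period stands for any one character."""
    return re.compile("." * n_digits)

def replace_irn(value, new_irn):
    """Replace a value of the same length as new_irn; leave other lengths alone."""
    pattern = wildcard_pattern(len(new_irn))
    return new_irn if pattern.fullmatch(value) else value

print(replace_irn("123456", "654321"))  # same length, so it is replaced
print(replace_irn("12345", "654321"))   # different length, so it is left as is
```

Because the pattern matches anything of the right length, it is only safe on records already sorted and highlighted for that "new" Site, which is why the slide repeats "Replace not Replace All".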
The Climax
• Double check with View>Attachments>Selected Records
• When spreadsheet completed, retire the “old” Sites
The Climax
• Contractors let me know what is retired
• Double check that all are detached
• DELETE!
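The "double check that all are detached" step amounts to confirming that no catalog record still points at a retired Site before it is deleted. A minimal sketch, assuming the same hypothetical record layout (`id`, `site_irn`) and made-up IRNs; in EMu itself the check was done through the attachments view.

```python
def still_attached(retired_irns, catalog_records):
    """Map each retired Site IRN to the catalog record ids that still reference it."""
    retired = set(retired_irns)
    attached = {}
    for rec in catalog_records:
        if rec["site_irn"] in retired:
            attached.setdefault(rec["site_irn"], []).append(rec["id"])
    return attached

def safe_to_delete(retired_irns, catalog_records):
    """Return the retired IRNs that no catalog record references any more."""
    blocked = still_attached(retired_irns, catalog_records)
    return sorted(set(retired_irns) - set(blocked))

# Made-up data: Site 1001 is retired but record A2 still points at it.
catalog = [
    {"id": "A1", "site_irn": 5001},
    {"id": "A2", "site_irn": 1001},
]
retired = [1001, 1002]
print(still_attached(retired, catalog))
print(safe_to_delete(retired, catalog))
```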
Triumph!
• New data export to check unique values
• Checked with Pat on questions
• Final spreadsheet given to contractor
The Resolution
• We now have just under 15,500 Sites Records
• We finished in one year
• Averaged 2 contractors at a time
• Module is now tightly controlled
• Data is ready for the web
The End (or is it??)
• Sites was just the beginning…
• Kara M. Lewis, Collections Information Program Manager, [email protected]
• Patricia L. Nietfeld, Collections Manager, [email protected]