The purpose of this repository is to share Carnegie Hall's performance history as linked open data, and resources related to its creation and maintenance. For updates in 2019, follow our progress here.
🔴 Explore Carnegie Hall's linked open data here.
The Carnegie Hall Rose Archives believes in showing its work. To that end, this repository includes:
- A link to explore the CH LOD by querying its SPARQL endpoint
- An overview of Carnegie Hall's performance history
- Documentation about the structure and content of the Carnegie Hall linked open data
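For readers who would rather query the endpoint from a script than from a browser, here is a minimal Python sketch of a SPARQL 1.1 Protocol request. It rests on assumptions: the endpoint address `https://example.org/sparql` is a placeholder (use the address linked above), the helper names `build_request` and `run_query` are hypothetical, and the endpoint is assumed to accept GET requests and return JSON results. The query only counts triples, so it does not depend on any particular vocabulary.

```python
import json
import urllib.parse
import urllib.request

# Vocabulary-independent query: counts every triple the endpoint exposes.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint: str, query: str) -> list:
    """Send the query and return the result bindings (needs network access)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return json.load(response)["results"]["bindings"]


# Placeholder endpoint (an assumption): substitute the real address
# before calling run_query.
request = build_request("https://example.org/sparql", COUNT_TRIPLES)
print(request.full_url)
```

Building the request separately from sending it makes it possible to inspect the encoded URL without touching the network.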
The initial release encompassed performance history data from 1891 through the end of the 2015-16 concert season (July 15, 2016). Beginning in August 2019, the data was updated on a weekly basis to encompass performance history data from 1891 – present. As of May 2024, bi-weekly updates were implemented.
Since it opened in 1891, Carnegie Hall has been a center of cultural and political expression, presenting and providing a venue for many different types of music and culture across multiple performance spaces. Since its transition to a not-for-profit institution in 1960, Carnegie Hall has continued to deepen its commitment to music education and community outreach by presenting concerts and events in neighborhoods throughout New York City, across the United States, and worldwide.
The Carnegie Hall Rose Archives maintains a database, the Orchestra Planning and Administration System (OPAS), with the goal of tracking every event – musical and nonmusical – that has occurred in the public performance spaces of Carnegie Hall since 1891. Because our archives were not established until 1986, there are some gaps in these records, which we continue to fill using sources like digitized newspaper listings and reviews. Many missing pieces – concert programs, posters, etc. – are donated to us, or we buy them on eBay. This database now covers nearly 60,000 events across nearly all musical genres, as well as theatrical, dance and spoken word events, meetings, lectures, civic rallies, and political conventions. It also includes corresponding records for more than 115,000 artists, 27,000 creators and over 110,000 creative works.
Starting in 2013, Carnegie Hall began publishing some of these records to our online Performance History Search. The Performance History Search has records for nearly 60,000 events from 1891 to the present. Data cleanup efforts are ongoing, and new records are published each month to that HTML presentation. The Carnegie Hall linked data prototype uses this published data set.
How is the Carnegie Hall (CH) performance history represented as linked open data? Characteristics about CH performance events fall into two categories:
- Information that applies to the entire event.
- Information that applies to each presentation of a work during an event (a work performance).
The separation of a work performance from the event enables us to provide specificity. Statements link performers to a particular work performance, rather than generically to an entire event. Let's explore the event data structure further:
- Each event has its own Uniform Resource Identifier (URI) and includes metadata related to:
  - Date/Time (ISO 8601 date/time string)
  - Venue
  - Title (who performed or what took place)
  - Entities that participate in the entirety of the program, like a conductor and/or an orchestra.
- Each component of an event, e.g. each work performed, is a sub-event with its own URI. Work performance metadata includes:
  - Works (musical and non-musical)
  - Performers
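The event and work-performance split described above can be pictured with a small Python sketch. This is an illustration only, not the actual CH LOD schema: the class names, field names, and the `performers_of` helper are hypothetical, and every URI, name, and date in the sample is an invented placeholder.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class WorkPerformance:
    """One presentation of a work during an event (a sub-event with its own URI)."""
    uri: str
    work: str
    performers: list = field(default_factory=list)


@dataclass
class Event:
    """A whole event: date/time, venue, title, and event-wide participants."""
    uri: str
    start: datetime
    venue: str
    title: str
    participants: list = field(default_factory=list)
    work_performances: list = field(default_factory=list)

    def performers_of(self, work: str) -> list:
        """Return performers linked to one work, not to the event as a whole."""
        return [
            performer
            for wp in self.work_performances
            if wp.work == work
            for performer in wp.performers
        ]


# Placeholder data (assumptions): none of these URIs or names are real CH records.
event = Event(
    uri="http://example.org/events/1",
    start=datetime.fromisoformat("1900-01-01T20:15:00"),  # ISO 8601 string
    venue="Example Hall",
    title="Example Orchestra",
    participants=["Example Conductor", "Example Orchestra"],
    work_performances=[
        WorkPerformance("http://example.org/events/1/work_01", "Example Symphony"),
        WorkPerformance(
            "http://example.org/events/1/work_02",
            "Example Concerto",
            ["Example Soloist"],
        ),
    ],
)

print(event.performers_of("Example Concerto"))
```

Here the soloist is attached only to the concerto, while the conductor and orchestra are attached to the event because they take part in the whole program.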
Interested in the CH LOD data model, namespaces, URI schemas, vocabularies, and ontologies? Check out CH's in-depth data structure and schema documentation in this repository.
Although the CH LOD includes about 4.5 million triples, there is still information missing from or out of scope of this initial release. Below is a sample of excluded content and topics. See how to get involved if you have feedback about the list of information not currently in the dataset.
- Some past performance records are missing; such data will be added as it becomes available.
- Complete, accurate biographical data is not always available for performers and composers. To the extent that this information has been provided to Carnegie Hall or is available from published authority sources, it has been added to the dataset. Existing Carnegie Hall URIs will remain stable, but additional or revised statements (e.g. newly acquired birth/death dates, corrected spellings, etc.) may be added at any time.
- Additional external authority IDs - we plan to add more external authority IDs for entities and creative works
- Credited non-performing roles, e.g. choral/ensemble preparation, technical roles, etc., are not included in the initial release
- Building LOD at Carnegie Hall - How did the Carnegie Hall Archives get from an internal database to 4.5 million triples containing open data from a dozen ontologies and vocabularies?
Want to help Carnegie Hall improve the performance history data? Use the Issues page to share:
- Feedback - What was useful about the data or the resources in this repository?
- Recommendations - Do you have an idea for the content, structure, or resources we describe that you'd like to share with us?
- Sample Queries - Did you write an interesting SPARQL query others might find useful?
- Issues or Inaccuracies - Notice something out of place or incorrect?
Did you use the CH LOD to build an interesting visualization, or port it into a new project? We'd love to see it! Submit a link and a description of how you utilized the data set.
Carnegie Hall offers the CH Performance History as Linked Open Data dataset as-is and makes no representations or warranties of any kind concerning the contents. Please see the data license statement below.
If you have questions about the dataset or its usage, please submit a new 'Issue' or email archives at carnegiehall dot org.
This code is provided “as is” and for you to use at your own risk. The information included in the contents of this repository is not necessarily complete. Carnegie Hall offers the scripts as-is and makes no representations or warranties of any kind.
We plan to update the scripts regularly. We welcome any feedback. Please let us know if you have found the contents of this repository useful!
Carnegie Hall is releasing this performance history dataset with a Creative Commons CC0 1.0 Universal Public Domain Data Dedication.
The Carnegie Hall Performance History dataset includes data from the GeoNames geographical database, which is licensed under a Creative Commons Attribution 3.0 License.
The MIT License (MIT)
Copyright (c) 2017 Carnegie Hall
All contents are released under the terms described in the MIT License included in this repository.
Thank you to Matt Miller and Gabe Mangiante for their contributions to this project.
Thank you to the following organizations for inspiration and commitment to the open data community: