Data

Data from the Philadelphia Playbills Project is available on GitHub.

The Philadelphia Playbills Project (PPP) has produced two datasets in a Linked Open Data (LOD) format, specifically JSON-LD. One of these datasets was produced from playbills transcribed by the project, and the other was produced from existing metadata for the Furness Theatrical Image Collection. LOD is an approach to data that requires adding Uniform Resource Identifiers (URIs), a kind of permanent URL that has been agreed on as the authority for identifying a particular person, place, or work. The PPP started by adding URIs from the Virtual International Authority File (VIAF), which is a significant source for such identifying links. For the playbills project, this means determining whether there is an existing identifier for the names of people mentioned on the playbills, including actors, playwrights, and theater staff, as well as identifiers for the theaters on the playbills.

The reason for using these URIs is to disambiguate a particular person in the data. Women, whose names change with marriage, are especially challenging to identify. For example, Cornelia Frances Thomas Burke Jefferson is billed as “Miss Thomas,” “Mrs. Burke,” and “Mrs. Jefferson.” Or, across time, there may be multiple “Mr. Wallack” billings on the playbills, so it is necessary to differentiate between the father, son, and brother of that name. Once these links are added, a machine can differentiate between people with the same name, or recognize that multiple names belong to the same person. This also makes it easier to connect the playbills data with other data that include the same URIs, so that actors on the playbills can be connected with other information about them.
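The following is a minimal sketch of how URIs make this disambiguation possible for a machine. It does not use the project's actual schema: the field names (`billed_as`, `uri`) and the `example.org` identifiers are hypothetical placeholders standing in for real VIAF URIs.

```python
from collections import defaultdict

# Placeholder identifiers (assumptions for illustration, NOT real VIAF URIs).
JEFFERSON = "https://example.org/id/cornelia-jefferson"
WALLACK_A = "https://example.org/id/wallack-a"
WALLACK_B = "https://example.org/id/wallack-b"
WALLACK_C = "https://example.org/id/wallack-c"

# Hypothetical billing records; the field names are not the project's schema.
billings = [
    {"billed_as": "Miss Thomas", "uri": JEFFERSON},
    {"billed_as": "Mrs. Burke", "uri": JEFFERSON},
    {"billed_as": "Mrs. Jefferson", "uri": JEFFERSON},
    {"billed_as": "Mr. Wallack", "uri": WALLACK_A},
    {"billed_as": "Mr. Wallack", "uri": WALLACK_B},
    {"billed_as": "Mr. Wallack", "uri": WALLACK_C},
]


def names_by_person(records):
    """Group every billed name under the URI of the person it refers to."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record["uri"]].add(record["billed_as"])
    return dict(grouped)


def people_by_name(records):
    """Group every URI under the billed name, exposing ambiguous names."""
    grouped = defaultdict(set)
    for record in records:
        grouped[record["billed_as"]].add(record["uri"])
    return dict(grouped)


# One person, several billed names.
print(sorted(names_by_person(billings)[JEFFERSON]))
# One billed name, several people.
print(len(people_by_name(billings)["Mr. Wallack"]))
```

Grouping by URI collects the three billings of one woman under a single identifier, while grouping by name shows that “Mr. Wallack” points to more than one person.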

Meeting the challenge of documenting what it was possible to know about these people not only enables the data to be more easily shared and reused, but also contributes to the work of unraveling the identities of people in Philadelphia’s theatrical past. It also indicates where future historical and digital work may need to be aware of gaps in who is already documented and recognized in forms like VIAF IDs, and who is not. The documentation for this project makes a start on this kind of work for Philadelphia theater history.

These URIs were added to both the playbill transcriptions and the metadata for the Furness Theatrical Image Collection. This means that the two JSON-LD datasets can now be used together to model connections across datasets. For example, URIs added to the mentions of Edwin Booth across playbills in the transcription dataset can be matched with URIs added to the image collection for images of an engraving of Booth and of the skull used in Philadelphia productions of Hamlet, which is signed by Booth and other actors. In this way the data can bring together resources across collections.
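As a rough illustration of this kind of matching, the sketch below joins two small sets of JSON-LD-style records on a shared person URI. The record shapes, the `performer` and `about` property names, and all `example.org` identifiers are assumptions made for the example; the published datasets may be structured differently.

```python
from collections import defaultdict

# Placeholder person identifier (an assumption, NOT Edwin Booth's real VIAF URI).
BOOTH = "https://example.org/id/edwin-booth"

# Hypothetical records from the playbill transcription dataset.
playbill_mentions = [
    {"@id": "https://example.org/playbill/1", "performer": {"@id": BOOTH}},
    {"@id": "https://example.org/playbill/2", "performer": {"@id": BOOTH}},
    {"@id": "https://example.org/playbill/3",
     "performer": {"@id": "https://example.org/id/someone-else"}},
]

# Hypothetical records from the image collection metadata.
image_records = [
    {"@id": "https://example.org/image/engraving", "about": {"@id": BOOTH}},
    {"@id": "https://example.org/image/skull", "about": {"@id": BOOTH}},
]


def link_by_uri(mentions, images):
    """Return, per person URI found in both datasets, the linked record ids."""
    images_by_person = defaultdict(list)
    for image in images:
        images_by_person[image["about"]["@id"]].append(image["@id"])

    links = {}
    for mention in mentions:
        uri = mention["performer"]["@id"]
        if uri not in images_by_person:
            continue  # this person has no matching image record
        entry = links.setdefault(
            uri, {"playbills": [], "images": images_by_person[uri]}
        )
        entry["playbills"].append(mention["@id"])
    return links


links = link_by_uri(playbill_mentions, image_records)
for uri, linked in links.items():
    print(uri, len(linked["playbills"]), "playbills,", len(linked["images"]), "images")
```

Because the join key is the URI rather than the name string, the match does not depend on how the person happens to be billed or captioned in either collection.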

This same principle of connecting resources across these two experimental datasets can, in future work, be used to more easily connect data on the project with data elsewhere, from other theater history projects to Wikipedia. Together these datasets create the basis for future work that will allow us to more easily see connections between the names of people on the playbills and other archival material or information about them elsewhere. In this way the project advances both approaches to creating transcriptions of playbills and ways of meaningfully connecting the playbills to other relevant materials in the archives, contributing new ways of tackling challenges for research and discovery in the field of theater history.