The IsisCB as a Network Graph of the Discipline
The new interdependent relational database that we have built (which I described in the previous post) begins with a very different concept of what the information universe looks like. Instead of treating bibliographical citations as units of information, it treats them as objects that are made up of many different kinds of things. Citations are records of intellectual activities created and fostered by many actors doing different tasks. The books and journal articles—the physical manifestations of the intellectual work—are closely tied not only to the subject matter, but to the people and institutions that created them. This means that both the social and intellectual world of the history of science can be revealed when the bibliography is unpacked.
This data universe can be visualized as a large network graph, that is, a graphical representation of the relationships between nodes, in this case relationships between citations and authorities. Each citation and each authority becomes a node, and the links between them are defined by the specific role or function that the authority has with respect to the citation. This conception of the bibliography is especially powerful because it links the persons and institutions directly to the published works, which are in turn directly linked to the semantic concepts that I and other bibliographers have added. All of these entities are part of the same graph, all interconnected.
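To make the node-and-link idea concrete, here is a minimal sketch in Python. It is an illustration only, not the IsisCB code: the names and role labels are invented placeholders, and the graph is held as plain lists of (authority, role, citation) triples rather than in a database.

```python
from collections import defaultdict

# Hypothetical toy data: each triple is one link in the graph.
# The names and role labels are placeholders, not real IsisCB records.
edges = [
    ("Author X", "Author", "Citation 1"),
    ("Publisher P", "Publisher", "Citation 1"),
    ("Subject S", "Subject", "Citation 1"),
    ("Author X", "Author", "Citation 2"),
]

# Index the links in both directions so that either kind of node
# (citation or authority) can serve as the starting point.
by_citation = defaultdict(list)
by_authority = defaultdict(list)
for authority, role, citation in edges:
    by_citation[citation].append((role, authority))
    by_authority[authority].append((role, citation))

# Everything attached to one citation, then everything attached to one authority.
print(by_citation["Citation 1"])
print(by_authority["Author X"])
```

Because every link carries its role, the same structure can be read from the citation side (who wrote it, who published it, what it is about) or from the authority side (which works a person or term is attached to).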
In this networked system, the standard bibliographical record of an article looks like what you see in Figure 1. Here you see the citation with all of its links to the other entities that make it up: authors, institutional hosts (primarily publishers), and subject and classification terms. By itself, this is nothing new; indeed, most bibliographical databases have something like this. The EBSCO-hosted HSTM database looks somewhat similar to this page. By restructuring the data, however, I have been able to create a more dynamic citation record that allows me to do more with this data than I had been able to do before.
The real power of the system can be seen when you look at one of the authority records. In the authority records, it is possible to see the citation-authority network from a different vantage point, and this new perspective turns out to reveal a lot about the context of the record you are looking at.
Figure 2. Prototype of an authority record as it appears in the new IsisCB Platform.
The ability to see all of the citations related to this authority record is part of this context. The list on the right tells you immediately about the authority in question by showing you what was written about it, what it published, or what he or she authored. In this case the authority is an author, and we see the kinds of works she has written. Of course we could just as easily look at the authority record for a publisher or a subject term. In each case, the citation list indicates the context of the term within the larger world of history of science.
More revealing, however, are the lists on the left that show the authors, subjects, and institutions that co-occur. These lists reveal even more about the semantic and social place of the authority record in the network graph. Here you see all of the people who have been authors or contributors to the citation records that are listed here. Below that are the co-occurring subject terms and the institutions. These are social and semantic markers that help you further place the item in its context.
Together, the citation and authority records provide an immediate representation of the closely associated works for each of these objects. (See Figure 2.) In so doing, the bibliographic data helps us understand the intellectual universe in which we work: the people, institutions, and ideas that constitute the discipline of the history of science.
On a structural level, this system works by using several tables that refer to each other. The three core tables in the system are the Authority table, the Citation table, and the Relationship table (which links the other two together). The Relationship table contains one record for each relationship that exists. Currently I have about 160,000 authorities, 200,000 citations, and about 1.3 million relationships. Each one of those relationships shows that some Authority “X” is related to some Citation “Y,” and it describes the nature of that relationship. For example: the authority “Catherine Westfall” is related to the citation “The Physical Tourist” and the type of the relationship is “Author.” It is a very simple system, but that simplicity makes it very powerful.
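The three-table layout described above can be sketched with Python's built-in sqlite3 module. This is a simplified illustration under my own assumptions: the table and column names are hypothetical, and the real tables certainly carry more fields than these. The one row of data is the example relationship given above.

```python
import sqlite3

# In-memory database; table and column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authority (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE citation  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- One row per relationship: which authority, which citation, and in what role.
CREATE TABLE relationship (
    id           INTEGER PRIMARY KEY,
    authority_id INTEGER NOT NULL REFERENCES authority(id),
    citation_id  INTEGER NOT NULL REFERENCES citation(id),
    type         TEXT NOT NULL
);
""")

# The example from the text: an authority, a citation, and the link between them.
conn.execute("INSERT INTO authority (id, name) VALUES (1, 'Catherine Westfall')")
conn.execute("INSERT INTO citation (id, title) VALUES (1, 'The Physical Tourist')")
conn.execute(
    "INSERT INTO relationship (authority_id, citation_id, type) VALUES (1, 1, 'Author')"
)

# Joining through the relationship table recovers the statement
# "Authority X is related to Citation Y as type T".
rows = conn.execute("""
    SELECT a.name, r.type, c.title
    FROM relationship AS r
    JOIN authority AS a ON a.id = r.authority_id
    JOIN citation  AS c ON c.id = r.citation_id
""").fetchall()
print(rows)  # [('Catherine Westfall', 'Author', 'The Physical Tourist')]
```

Keeping the role in the linking table, rather than in separate author, publisher, and subject columns on the citation, is what lets any authority be connected to any citation in any capacity.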
There is nothing new in this sort of relational database. What is new is the way that I have employed this technique to produce the kind of records you see here, ones that give snapshots of each of the authorities. The same kind of information is shown for every authority record. The authority record for the periodical Physics in Perspective gives a list of all of the articles and reviews in the database from that journal, and it lists all of the authors who have written for it and all of the subject terms that have been used to tag all of the articles. Similarly, one can look at the subject authority “Cryogenics” and see the constellation of works, authors, co-occurring subjects, and publishing institutions for that term.
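The co-occurrence lists on an authority record can be derived from the relationship data alone. The sketch below shows one way this might be done, again with sqlite3. It is not the platform's actual query: the data are invented placeholders, the role labels are assumptions, and the three tables are flattened into a single one to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, flattened relationship table with placeholder names.
conn.executescript("""
CREATE TABLE relationship (authority TEXT, citation TEXT, type TEXT);
INSERT INTO relationship VALUES
  ('Subject S', 'Citation 1', 'Subject'),
  ('Author X',  'Citation 1', 'Author'),
  ('Journal J', 'Citation 1', 'Periodical'),
  ('Subject S', 'Citation 2', 'Subject'),
  ('Author X',  'Citation 2', 'Author'),
  ('Author Y',  'Citation 2', 'Author'),
  ('Author Z',  'Citation 3', 'Author');
""")

def co_occurring(conn, authority):
    """Return (type, authority, count) for every other authority that
    shares at least one citation with the given authority."""
    return conn.execute("""
        SELECT other.type, other.authority, COUNT(*) AS n
        FROM relationship AS mine
        JOIN relationship AS other
          ON other.citation = mine.citation
         AND other.authority <> mine.authority
        WHERE mine.authority = ?
        GROUP BY other.type, other.authority
        ORDER BY n DESC, other.authority
    """, (authority,)).fetchall()

result = co_occurring(conn, "Subject S")
for row in result:
    print(row)
# ('Author', 'Author X', 2)
# ('Author', 'Author Y', 1)
# ('Periodical', 'Journal J', 1)
```

The query joins the relationship table to itself on the citation, so the same function serves a subject term, a periodical, or a person; "Author Z" is absent from the result because that authority shares no citation with "Subject S".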
All in all, the radically new structure of the IsisCB database produces a tool that reveals a great deal of new information about each of the nodes in the network graph that I have created. This was achieved simply by exposing the relationships that already existed in the data from the start. In the next post, I will discuss some of the challenges that I faced in creating this new structure.