Women's Networks
Creating linked data implies analysing networks, and analysing networks when digital humanities researchers are involved implies making network graphs. Despite this, the Beyond Notability project has made remarkably few attempts at analysing networks using network analysis techniques, including network visualisation. The reason is that we had other proxies for networks in our data: co-habitation, co-education, co-location, co-working on publications or excavations, co-signing letters of nomination. But as we started putting together these data essays, one area that seemed particularly amenable to network visualisation was the data relating to event participation and committee membership.
Making Networkable Data
These data, whilst related, had distinct origins. 'Event' data constitute a series of statements relating to participation at events: this includes people speaking, attending, organising, and exhibiting at events (see Sharon's PPA Events blog for more info). 'Committee' data are much simpler, represented only by those statements recording when people 'served on' a committee or group.
Making the data into networks of people required slightly different approaches to what constituted a connection between two individuals.
For 'Event' data, this required a qualifier indicating that two or more people were at the same event at the same time, whether as the same type of participant - say, as both speakers or both attendees - or different types of participation - say, one person was an organiser whilst another was an exhibitor. This type of connection takes no account of how participants interacted, but rather takes co-location as a proxy for association (more on which later).
For 'Committee' data, we chose to create a networkable connection when two people served on a committee for the same association in the same year (thus creating the concept of a "service year"). This type of connection relies on two assumptions: first, that service is more likely to be an ongoing activity than participating in an event (which seems a reasonable assumption); and second, that committee membership involved engagement with other members (which for larger organisations like the Society of Antiquaries of London or Royal Archaeological Institute could be a risky assumption).
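The two connection rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual pipeline: the records, the function name `co_membership_edges`, and the decision to sum event and committee weights in a combined view are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records, invented for illustration only.
# None of these rows come from the project's wikibase.
event_records = [
    ("Person A", "Meeting 1913"),
    ("Person B", "Meeting 1913"),
    ("Person C", "Exhibition 1885"),
    ("Person A", "Exhibition 1885"),
]
committee_records = [
    ("Person A", "Society X", 1920),
    ("Person B", "Society X", 1920),
    ("Person B", "Society X", 1921),
    ("Person A", "Society X", 1921),
    ("Person C", "Society Y", 1920),
]


def co_membership_edges(memberships):
    """Count, for each pair of people, how many groups they share.

    `memberships` is an iterable of (person, group_key) pairs. Every
    shared group adds one to the weight of the edge between two people.
    """
    groups = {}
    for person, group_key in memberships:
        groups.setdefault(group_key, set()).add(person)

    weights = Counter()
    for members in groups.values():
        # Sorting gives each pair a stable (alphabetical) key.
        for pair in combinations(sorted(members), 2):
            weights[pair] += 1
    return weights


# Events: the group is the event itself, whatever the type of participation.
event_edges = co_membership_edges(event_records)

# Committees: the group is a "service year", i.e. a (committee, year) pair.
committee_edges = co_membership_edges(
    (person, (committee, year)) for person, committee, year in committee_records
)

# Assumed combined view: weights from both kinds of association are summed.
all_edges = event_edges + committee_edges
```

In this toy data, Person A and Person B share one event and two service years, so their combined edge has weight three, whilst Person C, alone on Society Y, gains no committee connection at all.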
Conflations and assumptions notwithstanding, this gave us something we could visualise as a network: the connections between people created by their participation at events and service on committees.
Visualising a Network
Network visualisations (often referred to as network graphs) require a little explanation before we can dig into what they might tell us. In this case we note the following features:
- Each 'node' (a round blob) represents an individual person. Nodes are coloured according to whether the data - drawn from our wikibase - contains statements about that individual participating in events (yellow), serving on committees (red), or both.
- The size of each node reflects the number of connections between that node and other nodes (called "degree"). Note that, in order to improve legibility, node size is scaled: this reduces the relative size of very large nodes and slightly increases the size of nodes with very few connections.
- Each node is connected to one or more other nodes by 'edges', and the width of each edge (called "link weight") is determined by the number of connections between a pair of nodes. In our visualisation, the default minimum link weight is two, meaning that edges between pairs of people who are connected only once are filtered out (more on which later). All isolated nodes - that is, those representing instances where an individual participated in an event or served on a committee without that activity connecting to another individual - are removed.
- Hovering over a node gives the number of 'appearances', a measure of the number of instances where an individual participated in an event or served on a committee. Note that this number is usually smaller than the number of connections between that node and other nodes ("degree"), and the ratio can vary considerably depending on the size of an event/committee or duration of service (e.g. a person who was in a single meeting with many participants might have a larger node than a person who was in several meetings with only a few other participants).
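The filtering and sizing rules in the list above can be sketched as follows. This is an illustration under stated assumptions rather than the code behind our visualisation: the edge data are invented, "degree" is taken to mean the number of distinct neighbours, and the square-root scaling in `node_radius` (with its `base` and `factor` values) is one hypothetical way of shrinking large nodes relative to small ones.

```python
import math
from collections import defaultdict

# Hypothetical weighted edges: (person, person) -> number of shared groups.
edges = {
    ("A", "B"): 3,
    ("A", "C"): 2,
    ("B", "C"): 1,
    ("C", "D"): 1,
}


def filter_edges(edges, min_weight=2):
    """Keep only edges whose link weight reaches the minimum."""
    return {pair: w for pair, w in edges.items() if w >= min_weight}


def degrees(edges):
    """Return each node's degree, counted as distinct neighbours.

    Nodes with no remaining edges never appear in the result, which is
    how isolated nodes drop out of the picture.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(ns) for node, ns in neighbours.items()}


def node_radius(degree, base=4.0, factor=3.0):
    """Assumed scaling: a fixed base plus the square root of the degree.

    The base lifts nodes with few connections; the square root compresses
    nodes with many.
    """
    return base + factor * math.sqrt(degree)


visible = filter_edges(edges, min_weight=2)
visible_degrees = degrees(visible)
radii = {node: node_radius(d) for node, d in visible_degrees.items()}
```

At the default minimum link weight of two, D disappears from this toy network because its only connection has weight one. The gap between 'appearances' and degree follows from the same logic: a single appearance at an event with ten participants yields nine connections.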
Reading a Network
The first thing to note about most network visualisations is that they are representations of both mathematical relationships between data and a visual organisation of those relationships designed to improve readability and legibility. In this case, we use a D3 force-directed graph for disconnected graphs, which is designed to "prevent detached subgraphs from escaping the viewport". As scholars such as Ahnert and Ahnert demonstrate, we don't need network visualisations to analyse networks computationally. So when we do, it is vital to consider what we are actually doing: we are analysing a series of choices about the modelling of historical reality (see our Education data essay for more on modelling) through a series of mathematical choices deployed to organise that modelling as a network and a series of presentational choices about representing that modelling as a network.
Got that? Good. So what does this particular network visualisation seem to be telling us?
- First, networks of women in archaeology, history, and heritage circa 1870 - 1950 were dominated by those individuals recorded frequently in our wikibase as both members of committees and participants in events. These individuals appear as a large and closely clustered network, with a second smaller sub-network that also centres on individuals who did both.
- Second, those individuals in our wikibase only recorded as working on committees are closely connected to that primary network.
- Third, individuals in our wikibase only recorded as having participated in events form the periphery of the network, often in isolated small sub-networks.
- Fourth, there are no obvious bridges between discrete networks: there is a centre, a periphery, and then isolated networks.
- But, fifth, if we toggle to the 'events' view and then to the 'committees' view, bridges do emerge. Margerie Venables Taylor seems to be a bridge between a large network centred on Kathleen Kenyon and a small sub-group containing Tessa Verney Wheeler, Rose Graham, Dorothy Liddell and Winifred Lamb.
We have, then, a story that starts to emerge: a big interconnected network, some bridges, lots of looser connections. The thing is, though, if we change the data presented by the network visualisation then different stories emerge. For example, if we stay on the 'events' view, set minimum link weight to one, and zoom out a little, we see a big ball, with large nodes close together on one side, smaller nodes on the other, and a smattering of one-time connection event nodes around the core. Here the interactive nature of our viz enables interpretation of further bridges. For example, hover on Margaret Alice Murray (centre right) and look at her leftward connection to Charlotte Sophia Burne and rightward connections to Kathleen Kenyon and Tessa Verney Wheeler. In this view, Murray is at the heart of two separate - if unevenly composed - networks. And if we hover on Burne we see that Murray is Burne's only connection into the large network. Burne then seems an important bridge. Burne was principally a folklorist. Murray, whilst a member and sometime President of the Folklore Society, had broader interests, specialising in Egyptology. In our wikibase Burne and Murray co-occur at the 1913 Annual Meeting of the British Association for the Advancement of Science in Birmingham. Murray spoke on 'Evidence for the custom of killing the king in Ancient Egypt', Burne on 'Souling, Clementing and Caturning: Three November Customs of the Western Midlands'. Two scholars with ample evidence of possible connections - Burne also served as President of the Folklore Society - linked by a single event, an event at which they spoke on very different subjects, and may never have interacted. What does the Burne-Murray case mean for interpreting this network?
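The bridges discussed above were identified by eye, through hovering and toggling. One way to make the idea more precise, offered here as a swapped-in technique rather than anything the visualisation itself computes, is to look for 'cut vertices': nodes whose removal splits the network into more pieces. The sketch below does this by brute force on an invented toy network; the single-letter node labels are placeholders and do not stand for anyone in our wikibase.

```python
from collections import defaultdict

# Hypothetical toy network: "B" hangs off "M", and "W" hangs off "K",
# whilst "M", "K" and "T" form a small cluster.
edges = [
    ("B", "M"),
    ("M", "K"),
    ("M", "T"),
    ("K", "T"),
    ("K", "W"),
]


def build_adjacency(edges):
    """Map each node to the set of nodes it is directly connected to."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def count_components(adjacency, skip=None):
    """Count connected components, optionally pretending `skip` is absent."""
    seen = set()
    components = 0
    for start in adjacency:
        if start == skip or start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour != skip and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
    return components


def bridging_nodes(edges):
    """Return nodes whose removal increases the number of components."""
    adjacency = build_adjacency(edges)
    baseline = count_components(adjacency)
    return sorted(
        node
        for node in adjacency
        if count_components(adjacency, skip=node) > baseline
    )


bridges = bridging_nodes(edges)
```

In the toy network, removing "M" strands "B" and removing "K" strands "W", so those two nodes are flagged. A flag of this kind only says that the recorded data route a connection through one person; it says nothing about whether the people involved ever interacted.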
A Network of Association
To answer this, stay on the 'events' view and move to link weight eight. We now see two nodes: Jessie MacGregor and Rosa Wallis connected by a thick edge. This connection is explained by both appearing regularly at the Royal Academy Summer Exhibition: MacGregor on 28 occasions between 1872 and 1904, Wallis on eight occasions between 1881 and 1918, with their co-occurrences in the 1880s. What our wikibase does not tell us is if they exhibited in close proximity, if they met, and if so what they discussed. Here then, as with the Burne-Murray case, we arrive at the conclusion that this network is a network of association. It is a network created by connecting nodes that belong to "groups", rather than from direct interactions such as senders and recipients of letters, which historians will be more familiar with. This type of network visualisation then involves some potentially risky assumptions: for example, just because two people went to the same event does not necessarily mean they knew each other. We aren't of course the first people to ponder this. Network science often grapples with the purpose and utility of such weak networking (such as this research on animal networks). And so whilst we can build interpretations from the knowledge that this network takes proxies for association to build a network, this involves knowing our data and not being fooled by the fact the visualisation looks like other visualisations we may be familiar with. Which works against the use of visualisation as a simplifying explanatory tool. Proceed then, as they say, with caution.
References
- D3 Force-Directed Graph, Disjoint
- Network Graph with d3.force grouping (for highlighting)
- D3 Force-Directed Graph with Input
- Drag Queens Netwerk Diagram
- Agents Network Visualisation
- Marvel Network
- Plot: Legends