“Maintaining the Duality of Closeness and Betweenness Centrality”

Ulrik Brandes, Stephen Borgatti, and Linton Freeman have an interesting paper in the latest volume of Social Networks: “Maintaining the Duality of Closeness and Betweenness Centrality.” Here’s the abstract:

Betweenness centrality is generally regarded as a measure of others’ dependence on a given node, and therefore as a measure of potential control. Closeness centrality is usually interpreted either as a measure of access efficiency or of independence from potential control by intermediaries. Betweenness and closeness are commonly assumed to be related for two reasons: first, because of their conceptual duality with respect to dependency, and second, because both are defined in terms of shortest paths. We show that the first of these ideas – the duality – is not only true in a general conceptual sense but also in precise mathematical terms. This becomes apparent when the two indices are expressed in terms of a shared dyadic dependency relation. We also show that the second idea – the shortest paths – is false because it is not preserved when the indices are generalized using the standard definition of shortest paths in valued graphs. This unveils that closeness-as-independence is in fact different from closeness-as-efficiency, and we propose a variant notion of distance that maintains the duality of closeness-as-independence with betweenness also on valued relations.

Ulrik Brandes, Stephen P. Borgatti, and Linton C. Freeman, “Maintaining the Duality of Closeness and Betweenness Centrality,” Social Networks 44 (2016): 153-159.

Filed under Uncategorized

New: List of English Personal Nouns

I’ve put together a list of English nouns that refer to people (or, less clunkily, “personal nouns”). I plan to use it alongside existing text analysis tools (like David Bamman’s excellent BookNLP) to detect unnamed characters in the Gospels and other ancient biography. It should also, hopefully, make automatic social network extraction easier and more accurate.

The list, along with the code and sources I used to generate it, is available on my GitHub.
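
As a rough illustration of how a list like this might be put to work, the sketch below flags every word in a text that also appears in a personal-noun list. It is not the code from the repository and does nothing BookNLP does. The function name `find_personal_nouns` and the file layout (plain text, one lowercase noun per line) are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct word in TEXT_FILE that also
# appears in NOUN_LIST (assumed format: one lowercase noun per line).
find_personal_nouns() {
    local noun_list="$1" text_file="$2"
    # Lowercase the text, split it into one word per line, then keep only
    # lines that exactly match (-x) a fixed string (-F) from the list.
    tr '[:upper:]' '[:lower:]' < "$text_file" \
        | tr -cs '[:alpha:]' '\n' \
        | grep -Fx -f "$noun_list" \
        | sort -u
}
```

Calling `find_personal_nouns nouns.txt luke.txt` (both file names hypothetical) prints the matching nouns one per line. The sketch only matches single words exactly as they appear in the list, so plurals and multi-word nouns would need extra handling.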

“Preparing Data is Most of the Work”

Data Science Central has an interesting article on “data janitor work,” the observation that the biggest hurdle to large-scale data analysis is wrangling the data into a usable form. It applies directly to text analysis in the “Million Book Library.”

Data scientist[s] spend a comparatively large amount of time in the data preparation phase of a project. Whether you call it data wrangling, data munging, or data janitor work, the [New York] Times article estimates 50%-80% of a data scientists’ time is spent on data preparation. We agree. . . .

Before you start your project, define what data you need. This seems obvious, but in the world of big data, we hear a lot of people say, “just throw it all in”. If you ingest low quality data that is not salient to your business objectives, it will add noise to your results.

The more noisy the data, the more difficult it will be to see the important trends. You must have a defined strategy for the data sources you need and the particular subset of that data, which is relevant for the questions you want to ask.

Troubleshooting: Installing Gephi on Ubuntu

After spending the better part of two days wrestling with Gephi to get it installed on Ubuntu, I wanted to share the solution I found.

# Open the sources file in an editor:
sudo gedit /etc/apt/sources.list
# Add this line to the end of the file (without the leading "# "), then save:
# deb http://ppa.launchpad.net/rockclimb/gephi-daily/ubuntu precise main

# You can't install gephi directly because it depends on libgoogle-collections-java,
# which is no longer packaged for Ubuntu (as of Trusty, at least).

# Download all three files to a temp folder:
# https://packages.debian.org/source/wheezy/libgoogle-collections-java
# Then navigate to that folder:
cd /path/to/folder

# Download keys for libgoogle-collections-java
gpg --keyserver keyserver.ubuntu.com --recv-keys 974B3E96

# Extract a source package
dpkg-source -x *.dsc

# Now try to build package.
cd libgoogle-collections-java-1.0/
dpkg-buildpackage -us -uc

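# The first build attempt fails because build dependencies are missing.
# For me, it needed the packages below (the Java one is probably default-jdk;
# adjust the list to whatever dpkg-buildpackage reports as unmet):
sudo apt-get install maven-repo-helper maven-ant-helper cdbs default-jdk libjsr305-java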

# Now, build the package.
dpkg-buildpackage -us -uc

# The build writes the .deb to the parent folder.
cd ..
sudo dpkg -i libgoogle-collections-java_1.0-2_all.deb

# Now install gephi.
sudo apt-get update && sudo apt-get install gephi

# Done

Modified from this AskUbuntu answer.
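
To confirm that the steps above worked, a quick check along these lines may help. It is a minimal sketch, not part of the original answer: the helper name `is_installed` is made up for the example, and it assumes a Debian-based system where `dpkg-query` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if dpkg reports the package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

# Report on the locally built dependency and on gephi itself.
for pkg in libgoogle-collections-java gephi; do
    if is_installed "$pkg"; then
        echo "$pkg: installed"
    else
        echo "$pkg: NOT installed"
    fi
done
```

If the first package shows as not installed, the `dpkg -i` step probably failed; if only gephi is missing, re-run the final `apt-get` step and read its error output.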

Data: Faceless but not Unbiased

Here’s an interesting think piece from Frank Pasquale in Aeon on the nature and role of data in society today.

Regulators want to avoid the irrational or subconscious biases of human decision-makers, but of course human decision-makers devised the algorithms, inflected the data, and influenced its analysis. No ‘code layer’ can create a ‘plug and play’ level playing field. Policy, human judgment, and law will always be needed. Algorithms will never offer an escape from society. . . .

An inference . . . may not be worth much on its own. But once people are so identified, it could easily be combined and recombined with other lists – say, of plus-sized shoppers, or frequent buyers of fast food – that solidify the inference. A new algorithm from Facebook instantly classifies individuals in photographs based on body type or posture. The holy grail of algorithmic reputation is the most complete possible database of each individual, unifying credit, telecom, location, retail and dozens of other data streams into a digital doppelganger.

However certain they may be about our height, or weight, or health status, it suits data gatherers to keep the classifications murky. A person could, in principle, launch a defamation lawsuit against a data broker that falsely asserted the individual concerned was diabetic. But if the broker instead chooses a fuzzier classification, such as ‘member of a diabetic-concerned household’, it looks a lot more like an opinion than a fact to courts. Opinions are much harder to prove defamatory – how might you demonstrate beyond a doubt that your household is not in some way ‘diabetic-concerned’? But the softer classification may lead to exactly the same disadvantageous outcomes as the harder, more factual one.

Luke Social Network on Terra Biblica

I’m excited to announce that my social network of the Gospel of Luke, which I created as part of my dissertation research, is now online within Terra Biblica, part of the Big Ancient Mediterranean project.

BAM is headed by Paul Dilley, Sarah E. Bond (both of the University of Iowa), and Ryan Horne (UNC-Chapel Hill). You can find out more about the project here.

Schmid, “The ‘Hellenisation’ of the Nabataeans: A New Approach” (2001)

Stephan G. Schmid, “The ‘Hellenisation’ of the Nabataeans: A New Approach,” Studies in the History and Archaeology of Jordan 7 (2001): 407-419.

In this article, Schmid aims to “give a short overview on what is known about Nabataean material culture in its best understandable categories today and to look for whether there is any common line of development or even a model that could fit to most of these categories” (407). He notes that, although the Nabataeans are historically attested from 312 BCE, there is no evidence of a Nabataean material culture until around 100 BCE; moreover, when it appears, it is thoroughly Hellenistic. Schmid argues, following Diodorus Siculus, that the Nabataeans were “nomads or semi-nomads frequenting once or twice a year the same place for trade and business” (415) until ca. 100 BCE, after which they sedentarized. Sedentarization led them to develop a material culture. Lacking one of their own, the Nabataeans simply “oriented their new material culture according to the mainstreams of the contemporary Hellenistic world in its Near Eastern variant” (415), into which they gradually incorporated Roman and “proper Nabataean” (416) elements.
