Thursday, October 20, 2011

Tending our garden

To an astonishing extent, Google really has changed the way the world's information is organized.

A pair of trivial examples:

[1] Liddell-Scott-Jones is the best lexicon of ancient Greek in any language, but I don't always find the search interface to the digital version from the Perseus project to be the easiest navigation system. I recently discovered a simple way to look up an entry in Perseus: type the lemma of the Greek word in Google. Duh.

The point is that Perseus does not — and should not — have to worry about how people search for articles in LSJ as long as the Perseus edition is cleanly organized by article with a recognizably tagged lemma. Since I'm frequently browsing the web from OS X, typing UTF-8 Greek is trivial, thanks to SophoKeys. Perseus does not — and should not — have to worry about keyboard input systems. The digital LSJ was available from Perseus long before Google or SophoKeys came along, but it's more readily usable and that much more valuable to me now because the Perseus project did take responsibility for its area of domain-specific knowledge, and properly organized the contents of the lexicon by article.

[2] urn:cts:greekLit:tlg0012.tlg001 is a URN referring to the Iliad. URNs are not direct addresses like URLs: they are technologically independent (although they follow a machine-parseable syntax). Enter that URN in Google, and you'll probably get some amusingly off-target advertisements for cremation urns, but I scrolled through several pages of Google's hit list without encountering a single irrelevant match.
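To make "machine-parseable syntax" concrete, here is a minimal sketch of splitting a URN like the one above into its colon-delimited parts. The function name, the field names, and the treatment of an optional fifth component as a passage reference are my own assumptions for illustration, not a definitive reading of the CTS URN scheme.

```python
from typing import NamedTuple, Optional, Tuple


class CtsUrn(NamedTuple):
    """Hypothetical container for the parts of a CTS URN."""
    namespace: str                  # e.g. "greekLit"
    work: Tuple[str, ...]           # dot-separated work hierarchy
    passage: Optional[str]          # assumed optional fifth component


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a URN on colons and check the fixed 'urn:cts' prefix."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work = parts[2], parts[3]
    if not namespace or not work:
        raise ValueError(f"empty component in URN: {urn!r}")
    # Treat a missing or empty fifth component as "no passage given".
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace, tuple(work.split(".")), passage)


iliad = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001")
print(iliad.namespace, iliad.work)
```

Nothing in the parsing depends on where the Iliad actually lives on the network, which is the sense in which the identifier is technologically independent.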

One implication of these examples is not trivial. If we can publish meaningful units of content in association with canonical identifiers like URNs, we can count on the ever-increasing momentum of the internet to create new ways of discovering and working with that content. Like Candide, we need to tend our own garden; Google or others will find it rapidly (whether or not we think, with Pangloss, that this makes for the best of all possible worlds).

Thursday, October 6, 2011

Steve Jobs on building things

Amid today's many recollections of Steve Jobs citing his quotable aphorisms, two particularly caught my attention:

People don't know what they want until you build it.


and related to that theme:

The builders are the real thinkers.


Jobs was immediately referring to engineers and consumers, but the statements are as true in scholarship as they are in any other endeavor.

Wednesday, September 28, 2011

Access to scholarly work

Members of Princeton University's faculty have unanimously voted for a policy guaranteeing open access to their scholarship (blogged with quotation of the key passage here).

Contrast that with the capital campaign of the American Philological Association (a professional organization purporting to represent the discipline of Classics). The APA's "Campaign for Classics" plans to offer access to digital resources, but in many cases that access will be restricted to APA members.

If the contrast is not pointed enough, think of it this way: as of September, 2011, Princeton faculty members risk violating their university's policy if they contribute scholarly work to the APA project.

If you believe that consistent principles should guide our behavior, then Princeton faculty members who are dues-paying members of the APA face a real ethical dilemma: how can they support the work of an organization that directly conflicts with the policies unanimously adopted by their university?

Tuesday, September 20, 2011

Druids in Oxford?

A colleague recently pointed me to the "Ancient Lives" project allowing members of the general public to view and transcribe papyri from the vast Oxyrhynchus collection that remains largely unpublished more than a century after its discovery.

I imagined something like the ground-breaking Suda Online (or, SOL) -- an astonishingly successful project that has now translated for the first time ever more than 29,000 of the 30,000-odd articles in a Byzantine encyclopedia of the ancient world.

Instead, when I followed the link on copyright on the "Ancient Lives" website, I read:

Images may not be copied or offloaded, and the images and their texts may not be published.

This reminded me not of the forward-looking SOL, but of Julius Caesar's description of Gallic Druids. From Caesar, Gallic Wars, 6.14:
Neque fas esse existimant ea litteris mandare ... Id mihi ... instituisse videntur, quod neque in vulgum disciplinam efferri velint ...

"They consider it wrong to commit these [sacred texts] to writing. I believe that they have established this practice because they do not want their professional knowledge to be published to the common people."

Tuesday, January 18, 2011

Chrome apps and extensions

You follow good practices, archive your priceless material in standard formats, and make it accessible through network services. Now where do you build applications to use those services?

In digital scholarship, your time is your single most valuable resource, so the cost of a full-blown cross-platform desktop application is usually prohibitively high. As CSS3 and JavaScript give Web browsers better and better support for visually rich interactive applications, the browser becomes more attractive as a platform for applications, rather than just a viewer for documents.

HTML5 clearly understands the browser this way. (If you're not familiar with HTML5 yet, see http://www.html5rocks.com/.) The blogosphere is full of comments about HTML5's audio and video capabilities, but its integration of local and networked resources raises more interesting architectural questions. HTML5 can give you access to local file storage or locally persistent databases as well as providing networked communication with remote processes. With an HTML5 platform, the browser is more like a local application with good access to remote data.

Google has taken this idea a step further with the definition of extensions and apps for its Chrome browser. They are written in HTML, JavaScript and CSS: it is perfectly possible to build a "Chrome app" that is nothing more than a web page displaying equally well in any browser. (For a web developer, the road to "Hello, world" has never been shorter.) The Chrome app is defined in a simple JSON manifest file. The manifest can define permissions for access to remote resources (no more work-arounds to deal with the browser's restrictions on cross-site requests). It also defines how your app or extension is integrated into Chrome. A single line in the manifest can tie your app to a button permanently available on your toolbar, accessible from your choice of user action (button click, key-press combination, etc.). Full Chrome apps can be distributed through Google's Chrome Web Store, where any Chrome user can install the app locally.
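As a rough illustration of how little a manifest needs to say, here is a sketch that assembles one and serializes it to JSON. The key names ("name", "version", "permissions", "browser_action", "default_icon", "default_title") are the ones I recall from the Chrome extension documentation and may differ between Chrome versions. The function name, the extension name, the icon file and the example.org URL are placeholders invented for this example.

```python
import json
from typing import Dict, List


def build_manifest(name: str, version: str,
                   remote_origins: List[str], icon_path: str) -> Dict:
    """Assemble a manifest as a plain dictionary (a hypothetical helper)."""
    return {
        "name": name,
        "version": version,
        # Remote sites the extension is permitted to contact.
        "permissions": list(remote_origins),
        # Ties the extension to a button on the browser toolbar.
        "browser_action": {
            "default_icon": icon_path,
            "default_title": name,
        },
    }


manifest = build_manifest(
    name="Lexicon lookup",                  # placeholder extension name
    version="0.1",
    remote_origins=["http://example.org/"],  # placeholder remote service
    icon_path="icon.png",                    # placeholder icon file
)
manifest_text = json.dumps(manifest, indent=2)
print(manifest_text)
```

The printed text is what would be saved as manifest.json alongside the HTML, JavaScript and CSS files.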

If you've ever looked at developing browser-based web applications, or have ever thought about extending Firefox with its extension mechanism, you owe it to yourself to take a quick look at http://code.google.com/chrome/extensions/index.html. That's all it takes to get started.