The numbers game, again.

I see that Mel Terras is collecting stats for something which the young people of today apparently call “an infographic” about Digital Humanities activities. This leads me to excavate a report I produced back in the day tabulating usage of the Humanist list in its first six months or so, between August 1987 and January 1988. I leave it to the reader to determine whether anything much has changed.

What happened on HUMANIST?

This necessarily brief and partisan report attempts to review about six months of strenuous activity within the Humanist discussion group sponsored by the ACH, the ALLC and the University of Toronto’s Centre for Computing in the Humanities. At the time of writing, participants in this electronic discussion group numbered nearly 180 spread across 11 countries (see table 1), largely, but by no means exclusively, in North American academic computing centres. Table 2 shows that fewer than half of these participants actually create the messages that all, perforce, are assumed to read; out of over 600 messages during the last six months, nearly 500 were sent by just eight people, and out of 180 subscribers, 107 have never sent a message. In this, as in some other respects, HUMANIST resembles quite closely the sort of forum with which most of its members may be presumed to be most familiar: the academic committee. Personality traits familiar from that arena – the aggressive expert, the diffident enquirer, the unsuppressable bore – are equally well suited to this new medium: both young turks and old fogies are also to be found.

Some of the rhetorical tricks and turn-taking rules appropriate to the oral medium find a new lease of life in the electronic one; indeed it is clear that this medium approximates more closely to orature than to literature. Its set phrases and jargon often betray an obsession with informal speech, and a desire to mimic it more *directly*, re-inventing typographic conventions for the purpose. As in conversation too, some topics will be seized upon while others, apparently equally promising, sink like stones at their first appearance; the wise Humanist, like the good conversationalist, learns to spot the right lull into which to launch a new topic. Perhaps because the interactions in an electronic dialogue are necessarily fewer and more spaced out (no pun intended) than those in face-to-face speech, misunderstandings and subsequent clarifications seem to occur more often than one might expect. However, the detailed functional analysis of electronic speech acts is an interesting but immense task, which I regretfully leave to discourse analysts better qualified than myself. (Needless to say, Humanist itself reported at least two such studies of “electronic paralanguage” during the period under review.)

For the purposes of this survey I identified four broad categories of message. In category A (for Administrative) go test messages, apologies for electronic disasters, announcements – but not discussion – of policy and a few related and oversized items such as the Humanist Notes for Beginners and the invaluable “Biographies”. These totalled 57 messages, 18% of all messages, or 25% by bulk.

In category C (for Conference) go announcements of all other kinds – calls for papers, job advertisements, conference reports, publicity for new software or facilities etc. The figures here totalled 39 messages, 12% of all messages, 20% of all lines. As might be expected, categories A and C are disproportionately lengthy and not particularly frequent. I do not discuss them much further.

In category Q (for query) go requests for information on specified topics, public answers to them, and summaries of such responses. These amounted to 20% of all messages but (again unsurprisingly) only 10% of all lines. I have been unable, as yet, to gather any statistics concerning the extent of private discussions occurring outside the main Humanist forum, though it is clear from those cases which are subsequently summarised that such discussions not only occur but are often very fruitful. What proportion of queries fall on stony ground is also hard, as yet, to determine.

In category D (for discussion) I place those messages perhaps most typical of Humanist: general polemic, argument and disputation. Overall, these messages account for nearly 50% of the whole (44% by line) and thus clearly dominate the network. With the curious exception of November, the proportion of D category messages remains more or less constant within each month. As table 5 shows, the relative proportions of other types of message are by no means constant over time.

Of course, assigning a particular message to some category is not always a clear-cut matter. Correspondents occasionally combine a number of topics – or kinds of topic – in a single message. Moreover, the medium itself is still somewhat unreliable. Internal evidence shows that not all messages always get through to all recipients, nor do they always arrive in the order in which they were despatched or (presumably) composed. This report is based only on the messages which actually reached me here in Oxford; concerning the rest I remain (on sound Wittgensteinian principles) silent. I am equally silent on messages in categories A and C above, which are of purely transient interest.

Space precludes anything more than a simple indication of the range of topics on which Humanists have sought (and often obtained) guidance. In category Q over the last six months I found messages asking for information on typesetters with a PostScript interface, on scanners capable of dealing with microform, about all sorts of different machine readable texts and about software for ESL teaching, for library cataloguing, for checking spelling, for browsing foreign language texts, and for word processing in Sanskrit. Humanists asked for electronic mail addresses in Greece and in Australia, for concordance packages for the Macintosh and the Amiga ST, for address lists and bibliographies; they wondered who had used the programming language Icon and whether image processing techniques might be used to analyse corrupt manuscripts; they asked for details of the organisational structure of humanities computing centres and of the standards for cataloguing of computer media.

Above all, however, Humanists argue. Back in August 1987 HUMANIST was only a few months old, yet many issues which have since become familiar to its readership were already on the agenda. Where exactly are the humanities as a discipline? What is their relation to science and technology? Correspondents referred to the infamous “Two cultures” debate of the late fifties, somehow now more relevant to the kind of “cross-disciplinary soup we are cooking”, but rather than re-flaying that particular dead horse, moved rapidly to another recurrent worry: did the introduction of computers change humanistic scholarship quantitatively or qualitatively? Does electronic mail differ only in scale and effectiveness from the runner with the cleft stick? Do computers merely provide better tools to do tasks we have always wanted to do? The opinion of one correspondent (“if computers weren’t around, I doubt very much if many of the ways we think about texts would have come to be”) provoked another into demanding (reasonably enough) evidence. Such evidence as was forthcoming however did concede the point that “it could all be done without computers in some theoretical sense, but certainly not as quickly”. Reference was made to a forthcoming collection of essays which might settle whether or not it was chimerical to hope that computers will somehow assist not just in marshalling the evidence but in providing interpretations of it.

A second leitmotiv of Humanist discussions was first heard towards the end of August, when an enquiry about the availability of some texts in machine readable form provoked an assertion of the moral responsibility the preparers of such texts should accept for making their existence well known and preferably for depositing them in a Text Archive for the benefit of all. A note of caution concerning copyright was also first sounded here, and it was suggested that those responsible for new editions should always attempt to retain control over the rights to electronic distribution of their material.

With the start of the new academic year, HUMANIST became more dominated by specific enquiries, and a comparatively low key wrangle about whether or not product announcements, software reviews and the like should be allowed to sully its airspace. Characteristically, this also provided the occasion for some Humanists to engage in an amusing socio-linguistic discussion of the phenomenon known as “flaming”, while others plaintively asked for “less chatter about the computer which is only a tool and more about what we are using it for”. It appeared that some far flung Humanists actually have to pay money proportionate to the size of the mailings they accept, recalling an earlier remark about the uniquely privileged nature of the bulk of those enjoying the delights of this new time-waster, which was (as one European put it) “surely *founded* for chatter”.

In mid October, a fairly pedestrian discussion about the general lack of recognition for computational activities and publications suddenly took off with the re-emergence of the copyright problems referred to above. If electronic publication was on a par with paper publication, surely the same principles of ownership and due regard for scholarly labours applied to it? But did this not militate against the current easy camaraderie with which articles, gossip and notes are transferred from one medium to another, as indeed are those more substantial fruits of electronic labours, such as machine readable texts? For one correspondent such activities, without explicit permission, were “a measure of the anesthetizing effect of the xerox machine on our moral sense”. For another, however, “asking concedes the other party’s right to refuse”.

In mid-November, after a particularly rebarbative electronic foul-up, minimal editorial supervision of all Humanist submissions was initiated. Other than some discussion of the “conversational style” appropriate to the network, this appears to have had little or no inhibitory effect on either the scale or the manner of subsequent contributions.

An enquiry about the availability of some Akkadian texts led to a repeated assertion of the importance to scholarship of reliable machine readable texts. Conventional publishers were widely castigated for their short-sighted unwillingness to make such materials available (being compared on one occasion to mediaeval monks using manuscripts for candles, and on another to renaissance printers throwing away Carolingian manuscripts once they had been set in type). Humanists were exhorted to exert peer pressure on publishers, to pool their expertise in the definition of standards, to work together for the establishment of a consortium of centres which could offer archival facilities and define standards. More realistically perhaps, some humanists remarked that publishers were unlikely to respond to idealistic pressures and that a network of libraries and data archives already existed which could do all of the required tasks and more were it sufficiently motivated and directed. At present, said one, all we have is “a poor man’s archive” dependent on voluntary support. Others were more optimistic about the possibility of founding a “North American text Archive and Service Center” and less optimistic about the wisdom of leaving such affairs to the laws of the marketplace. One intriguing proposal was that a national or international Archive might be managed as a giant distributed database.

Following the highly successful Vassar conference on text encoding standards in mid-November, a long series of contributions addressed the issue of how texts should be encoded for deposit in (or issue from) such an archive. No one seems to have seriously dissented from the view that descriptive rather than procedural markup was desirable, nor to have proposed any method to describe such markup other than SGML, so that it is a little hard to see quite what all the fuss was about – unless it was necessary to combat the apathy of long established practice.

One controversy which did emerge concerned the desirability (or feasibility) of enforcing a minimal encoding system, and the extent to which this was a fit role for an archive to take on. “Trying to save the past is just going to retard development” argued one, while another lone voice asserted a “rage for chaos” and praised “polymorphic encoding” on the grounds that all encoding systems were inherently subjective (“Every decoding is another encoding” to quote Morris Zapp). Anxiety was expressed about the dangers of bureaucracy. Both views were, to the middle ground at least, equally misconceived. In the first case, no-one was proposing that past errors should dictate future standards, but only that safeguarding what had been achieved was a different activity from proposing what should be done in the future. In the second case, no-one wished to fetter (or to “Prussianize”) scholarly ingenuity, only to define a common language for its expression.

There was also much support for the common sense view that conversion of an existing text to an adequate level of markup was generally much less work than starting from scratch. Clearly however, a lot depends on what is meant by “generally” and by “adequate”: for one humanist an adequate markup was one from which the “original form of a document” could be re-created, thus rather begging the question of how that “original form” was to be defined. To insist on such a distinction between “objective text” and “subjective commentary” is “to miss the point of literary criticism altogether” as another put it.

One technical problem with SGML which was identified, though not much discussed, was its awkwardness at handling multiply hierarchic structures within a single document; one straw man repeatedly shot down was the apparent verbosity of most current implementations based on it. However, as one correspondent pointed out, the SGML standard existed and was not going to disappear. It was up to Humanists to make the best use of it by proposing tag sets appropriate to their needs, perhaps using some sort of data dictionary to help in this task.

At the end of 1987 it seemed that “text markup and encoding have turned out to be THE issue for humanists to get productively excited about”. Yet the new year saw an entirely new topic sweep all others aside. A discussion on the styles of software most appropriate for humanistic research soon focussed on an energetic debate about the potentials of hypertext systems. It was clear to some that the text analysis features of most existing software systems were primitive and the tasks they facilitated “critically naive”. Would hypertext systems, in which discrete units of text, graphics etc. are tightly coupled to form an arbitrarily complex network, offer any improvement on sequential searching, database construction, concordancing visible tokens and so forth? Participants in this discussion ranged more widely than usual between the evangelical and the ill-informed, so that rather more heat than light was generated on the topic of what was distinctively new about hypertext, but several useful points and an excellent bibliography did emerge.

A hypertext system, it was agreed, did extend the range of what was possible with a computer (provided you could find one powerful enough to run it), though whether or not its facilities were fundamentally new remained a moot (and familiar) point. It also seemed (to this reader at least) that the fundamental notion of hypertext derived from a somewhat primitive view of the way human reasoning proceeds. The hypertext paradigm does not regard as primitive such mental activities as aggregation or categorisation (this X is a sort of Y) or semantic relationships (all Xs are potentially Yd to that Z), which lie at the root of the way most current database systems are designed. Nevertheless it clearly offers exciting possibilities – certainly more exciting (in one humanist’s memorable phrase) than “the discovery of the dung beetle entering my apartment”.

Considerations about the absence of software for analysing the place of individual texts within a larger cultural context led some humanists to ponder the rules determining the existence of software of any particular type. Was there perhaps some necessary connexion between the facilities offered by current software systems and current critical dogma? One respondent favoured a simpler explanation: “Straightforward concordance programs are trivial in comparison to dbms and I think that explains the situation much better than does the theory of predominant literary schools”. It seems as if humanists get not just “the archives they deserve” but the software that’s easiest to write.

-----------Tables for the Humanist Digest------------------------

Table 1 : Humanist Subscribers by Country 

|country     |nsubs        | 
|?           |            2| 
|Belgium     |            3| 
|Canada      |           54| 
|Eire        |            1| 
|France      |            1| 
|Israel      |            4| 
|Italy       |            1| 
|Netherlands |            1| 
|Norway      |            3| 
|UK          |           37| 
|USA         |           73| 
|Total       |          180| 

Table 1a. Subscribers per node 

|nusers       |nsuch        | 
|            1|           70| 
|            2|           17| 
|            3|           11| 
|            4|            2| 
|            5|            1| 
|            7|            1| 
|            8|            1| 
|           13|            1| 

Table 2. Messages sent per subscriber 

|n_mess_sent  |number_such  |messages     | 
|            0|          107|            0|
|            1|           31|           31| 
|            2|           10|           20| 
|            3|            9|           27| 
|            4|            3|           12| 
|            5|            2|           10| 
|            6|            3|           18| 
|            7|            3|           21| 
|            8|            2|           16| 
|           10|            1|           10| 
|           12|            1|           12| 
|           14|            1|           14| 
|           17|            1|           17| 
|           18|            1|           18| 
|           20|            1|           20| 
|           71|            1|           71| 
|Totals       |          177|          316| 
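
The participation figures can be rechecked from the rows of Table 2. What follows is a minimal Python sketch, not part of the original report: the variable names are my own, and the data are simply the table rows typed in by hand as (messages sent, number of subscribers) pairs.

```python
# Rows of Table 2 as (messages sent per subscriber, number of such subscribers).
# Transcribed by hand from the table; all names here are illustrative only.
table2 = [
    (0, 107), (1, 31), (2, 10), (3, 9), (4, 3), (5, 2), (6, 3), (7, 3),
    (8, 2), (10, 1), (12, 1), (14, 1), (17, 1), (18, 1), (20, 1), (71, 1),
]

subscribers = sum(count for _, count in table2)
silent = sum(count for sent, count in table2 if sent == 0)
posters = subscribers - silent
messages = sum(sent * count for sent, count in table2)

# Expand to one entry per subscriber so the most prolific senders can be picked out.
per_subscriber = sorted(
    (sent for sent, count in table2 for _ in range(count)), reverse=True
)
top_eight = sum(per_subscriber[:8])

print(f"subscribers listed:     {subscribers}")
print(f"never posted:           {silent} ({silent / subscribers:.1%})")
print(f"posted at least once:   {posters} ({posters / subscribers:.1%})")
print(f"messages (rows summed): {messages}")
print(f"sent by top eight:      {top_eight} ({top_eight / messages:.1%})")
```

Summing the rows in this way gives a message count that need not agree exactly with the printed totals line or with the round figures quoted in the text, which may have been drawn from a different tally.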

Table 3 Messages by origin 

|country     |Total message| 
|?           |            8| 
|Canada      |          130| 
|Israel      |            6| 
|UK          |           40| 
|USA         |          132| 

Table 4: Messages by type 

|tag   |messages     |% messages|linecount    |%lines    | 
|A     |           57|    17.981|         3867|    25.306| 
|C     |           39|    12.303|         3078|    20.143| 
|D     |          156|    49.211|         6707|    43.891| 
|Q     |           64|    20.189|         1616|    10.575| 
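
The shares in Table 4 can be recomputed from its message and line counts. This is a minimal Python sketch, not part of the original report, with names of my own choosing. It uses only the four rows shown, so its figures may differ a little from the printed percentages, which sum to slightly under 100 and so appear to rest on a marginally larger base.

```python
# Rows of Table 4 as tag -> (messages, lines), transcribed by hand from the table.
table4 = {
    "A": (57, 3867),
    "C": (39, 3078),
    "D": (156, 6707),
    "Q": (64, 1616),
}

total_messages = sum(n_messages for n_messages, _ in table4.values())
total_lines = sum(n_lines for _, n_lines in table4.values())

# Percentage of messages and of lines for each category.
shares = {
    tag: (100 * n_messages / total_messages, 100 * n_lines / total_lines)
    for tag, (n_messages, n_lines) in table4.items()
}

for tag, (pct_messages, pct_lines) in shares.items():
    n_messages, n_lines = table4[tag]
    print(
        f"{tag}: {pct_messages:5.1f}% of messages, {pct_lines:5.1f}% of lines, "
        f"{n_lines / n_messages:5.1f} lines per message"
    )
```

The lines-per-message column is an addition of mine; it is one way of seeing why categories A and C are described as disproportionately lengthy.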

Table 5:  Messages by type within each month 

      |type  |messages     |% in month|lines        |% in month| 
AUG87 |A     |           10|    23.256|         1230|    32.031| 
SEP87 |A     |            7|    17.500|          105|     9.722| 
OCT87 |A     |            9|    30.000|          428|    36.992| 
NOV87 |A     |           16|    34.783|          863|    48.840| 
DEC87 |A     |           10|    11.494|         1178|    25.732| 
JAN88 |A     |            2|     4.000|            5|     0.256| 

AUG87 |C     |            3|     6.977|         1712|    44.583| 
SEP87 |C     |            6|    15.000|          208|    19.259| 
OCT87 |C     |            1|     3.333|           93|     8.038| 
NOV87 |C     |           13|    28.261|          526|    29.768| 
DEC87 |C     |            6|     6.897|          218|     4.762| 
JAN88 |C     |            6|    12.000|          112|     5.744| 

AUG87 |D     |           22|    51.163|          694|    18.073| 
SEP87 |D     |           17|    42.500|          577|    53.426| 
OCT87 |D     |           13|    43.333|          518|    44.771| 
NOV87 |D     |            4|     8.696|          131|     7.414| 
DEC87 |D     |           52|    59.770|         2649|    57.864| 
JAN88 |D     |           37|    74.000|         1678|    86.051| 

AUG87 |Q     |            8|    18.605|          204|     5.313| 
SEP87 |Q     |           10|    25.000|          190|    17.593| 
OCT87 |Q     |            7|    23.333|          118|    10.199| 
NOV87 |Q     |           13|    28.261|          247|    13.978| 
DEC87 |Q     |           18|    20.690|          520|    11.359| 
JAN88 |Q     |            5|    10.000|          155|     7.949| 
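
The monthly share of discussion (D) messages, and the November dip remarked on in the text, can be rechecked from the message counts in Table 5. Again this is only a sketch in Python with names of my own; it takes each month's base to be the sum of the four categories shown, so a figure may differ slightly from the printed one wherever the printed percentage rests on a different monthly base.

```python
# Message counts per month and category, transcribed by hand from Table 5.
table5 = {
    "AUG87": {"A": 10, "C": 3, "D": 22, "Q": 8},
    "SEP87": {"A": 7, "C": 6, "D": 17, "Q": 10},
    "OCT87": {"A": 9, "C": 1, "D": 13, "Q": 7},
    "NOV87": {"A": 16, "C": 13, "D": 4, "Q": 13},
    "DEC87": {"A": 10, "C": 6, "D": 52, "Q": 18},
    "JAN88": {"A": 2, "C": 6, "D": 37, "Q": 5},
}

# Share of each month's messages that fall in category D.
d_share = {
    month: 100 * counts["D"] / sum(counts.values())
    for month, counts in table5.items()
}

lowest = min(d_share, key=d_share.get)

for month, pct in d_share.items():
    marker = "  <- lowest" if month == lowest else ""
    print(f"{month}: {pct:6.3f}% discussion{marker}")
```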

