Assessing a text-critical database using AI techniques(1)

Yo Tomita
School of Music, Queen's University of Belfast
Belfast BT7 1NN

A paper read at RMA Study Day, 30 November 1996, at King's College, London


This paper examines some of the common problems identified in the field of text-criticism in manuscript studies, with a view to assessing both the particular variant readings we find in the sources and the sources themselves, in particular how they can be located in the stemma of sources. To resolve these problems, we should make various attempts to ascertain the origins and authenticity of these readings. The statistical analysis of the gathered evidence has always played a vital role in our studies.

This paper then proposes a 'new' conceptual architecture that can be applied to the way we work in order to solve the problems more efficiently. This architecture externalizes various levels of human knowledge and their cognitive activities in relation to the copying of music manuscripts. Various problems and limitations in our conventional methods in the domain will be addressed as to how they can be dealt with by using AI techniques, even though we are using the same data set obtainable through the traditional methods.

Source Studies: Traditional Approaches and their Limitations

The recent source studies of J. S. Bach's Well-Tempered Clavier (hereinafter WTC), Book II have reached the stage where we now confront the remaining, unresolved issues, which are so complex that they seem beyond comprehension by traditional scholarship.(2) This article discusses the problems, our past endeavours, and how the problems may be resolved by using the more powerful facilities that computers may be able to offer.

The background of the problems is two-fold: firstly, it appears that Bach did not complete WTC II in the way he had done with WTC I by producing the Fassung letzter Hand; rather he left WTC II incomplete, somewhere in a semi-final stage, where he kept on entering many sporadic revisions in either or both of two separate sets of autographs, and for some reason, Bach did not write the final definitive version.(3) This type of source situation creates enormous difficulties for the musicologists whose primary objective is to produce an edition of a single authentic text reflecting the composer's final thoughts.

Secondly, the study of the variants attested in the surviving MSS indicates that the majority of over 130 surviving sources of WTC II were copied from those no longer extant.(4) This means that there are few clues in the sources about the context in which the variants were first introduced, and that, as a result, it becomes much harder to identify their precise origins. Doubtless this affects our interpretation of the issue of authenticity.

In spite of such predicaments, we are not too distressed with the progress we have made so far. Besides the recognition of the problems themselves, which was the first really significant step forward, we have also come to possess an overall impressionistic understanding of what actually might have happened in the history of the work's transmission. For example, we now believe that Bach took care not to mix up two different sets of autograph scores, for he seems to have distinguished them for the strategic purpose of compiling the work; we also believe that he frequently made his own MSS available for copying when so desired or required, even during the compilation of the work, and that he entered revisions not only into his own scores but also into his pupils'. The work soon acquired the fame and admiration that resulted in its extensive dissemination through the copying of MSS. This must be a fair assessment of the work's transmission history, albeit much generalised, for, on the one hand, there is evidence of Bach's continual revision processes recorded in the autograph, which are fairly clearly reflected in the surviving copies, and, on the other hand, complex arrays of variant readings are clearly identifiable in the form of a genealogical tree with many branches of distinct characters.

However, this kind of approximate reconstruction of the events is not compatible with our ultimate goal, namely to reconstruct the text intended by the composer as accurately as possible. When we zoom into the problem, the image we capture is often completely blurred, instead of showing much finer detail.

The fact of the matter is that we often do not have sufficient information about the variants we find in the sources, because we cannot estimate securely the following aspects:

  1. who introduced the variant (whether the composer himself as later revision or by someone else whose copying skill and competence may be doubtful);
  2. where it was first introduced (whether it was in the source we have identified or in one of the previous generations of sources which are lost); and,
  3. how and why it was introduced (by accident or on purpose; in what musical and notational context).

Our foremost task, therefore, is to ascertain both the origin and authorship of individual variants. To achieve this, we need to obtain sufficient information about every variant reading so that we can assess it accurately. How can this be possible? Can we achieve this without the need to increase the amount of information we currently have? My hypothesis is that it should be possible by using more rigorous and powerful mechanisms to assess and evaluate information in variants in both systematic and objective fashions.

Compilation of database containing text-critical information

In order to achieve the above goal, a database of variant readings was compiled and published.(5) In compiling the database, I have selected all the text-critical information which I considered potentially useful for the assessment of sources from a statistical point of view. In practical terms, this involves the collation of diverse types of variants from individual sources, whilst analysing them to reveal a 'character' for each source. The latter, namely the data obtained from the analysis of variants, is also considered part of the data in the database, though this has left some unresolved problems.(6)

The data were presented in the form of a large two-dimensional table: data belonging to a particular MS are kept along the horizontal axis, while the points of examination in the music are kept in a set sequence on the vertical axis. In the table, all the manuscript sources are grouped and ordered according to their genealogical affiliation.

Because of the way in which the data are organised, the presentation itself illuminates certain textual characteristics reflected in various levels of genealogy, from the higher level of tradition to the individual group of sources.
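The table layout described above can be sketched in Python as follows. This is only an illustration: the source sigla, examination-point labels and readings are hypothetical placeholders, not entries from the published database, and the two helper functions are assumptions about how such a table might be queried.

```python
# Hypothetical sigla and examination points; not taken from the published database.
SOURCES = ["MS-A", "MS-B", "MS-C"]              # ordered by assumed genealogical affiliation
POINTS = ["Fugue 1, bar 5", "Fugue 1, bar 9"]   # fixed sequence of examination points

# One cell (datum) per (examination point, source) pair.
table = {
    ("Fugue 1, bar 5", "MS-A"): "f#",
    ("Fugue 1, bar 5", "MS-B"): "f#",
    ("Fugue 1, bar 5", "MS-C"): "f",
    ("Fugue 1, bar 9", "MS-A"): "tie",
    ("Fugue 1, bar 9", "MS-B"): "no tie",
    ("Fugue 1, bar 9", "MS-C"): "no tie",
}


def readings_at(point):
    """Return the readings of all sources at one examination point, in source order."""
    return [table.get((point, src)) for src in SOURCES]


def agreement(src_a, src_b):
    """Count the examination points at which two sources share the same reading."""
    return sum(1 for p in POINTS if table.get((p, src_a)) == table.get((p, src_b)))
```

Because the sources are kept in genealogical order, reading one examination point across all sources (`readings_at`) tends to place related readings side by side, which is the effect the presentation relies on.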

As in any other text-orientated database, the manner in which the data are selected, stored and presented dictates, to a certain extent, the way they are to be extracted and evaluated. In other words, the data make the intended sense only when they are viewed in the correct context. Each datum is tied to its unique locational information by means of the record headers. This is the lowest level of the context where one can identify the purpose of the enquiry, or 'query' in database-specific terms. The data syntax is largely determined by the nature of the query, and identifies the noteworthy character in the reading of one source or group of sources. The data viewed at this magnified level may amount to no more than a description of the error or variants at a particular examination-point. However, when a finding is assessed together with the other aspects of text-critical examination, namely, its musical significance and its contribution to the overall character of the source, one can see the wider context and establish its impact on the assessment of the source.

Study of Variants

Variant readings were studied from two different angles. Firstly, variant readings were classified according to musical grammar into 'errors' and 'variants' in order to measure the scribe's performance. Under normal circumstances, the errors are easily identifiable. Thus 'variants' are valid readings, both musically and notationally, even under such situations when they appear to distort the original intentions of the composer. Sometimes the distinction between the two could not be made explicitly, due to the inherent nature of errors as being misrepresented readings, which may appear to have been valid, and sometimes inspired, alternatives.

In the same way, amended readings were classified into two equivalent categories, 'corrections' and 'revisions'. As both of these involved at least two layers of readings, the sequence of events was 'coded' into a single cell of information (datum) within which the multiple layers of readings were separated with arrows showing the direction of the change. The datum also included information about the types of ink and quill employed, when such information was available, so that we could use it to distinguish the layers of revision in the statistical analysis. This led to the revelation of the inherent chronological information. Also integrated in the data structure were the scribal techniques with which these amendments were recorded, as these must be closely associated with the purpose underlying such action. All this information is systematically organised and tightly packed into the space allocated in the table, using abbreviated codes and syntax.
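A coded datum of this kind could be unpacked along the following lines. The paper does not reproduce the actual codes, so the syntax assumed here (layers separated by `->`, with an optional ink note in square brackets) is purely hypothetical, as is the function name `parse_datum`.

```python
def parse_datum(datum):
    """Split a coded cell into its chronological layers of readings.

    Assumed (hypothetical) syntax: 'reading -> reading [ink note]'.
    """
    layers = []
    for part in datum.split("->"):
        part = part.strip()
        ink = None
        # An optional trailing '[...]' is taken to describe the ink or quill.
        if part.endswith("]") and "[" in part:
            part, ink = part[:-1].rsplit("[", 1)
            part = part.strip()
        layers.append({"reading": part, "ink": ink})
    return layers
```

For instance, `parse_datum("c# -> c [brown ink]")` yields two layers in order of entry, the second carrying the ink note, so that layers of revision can be separated before any statistical treatment.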

Secondly, variant readings were also assessed from a non-musical aspect of the sources, namely 'notational' information. This furnishes important supportive evidence to extend the faculties in this text-critical enquiry, because that which normally appeared to be 'musical information' was in fact viewed 'statically' or 'notationally' on the score by the scribe. Based on this hypothesis, some of the notational information was also extracted as evidence for our interpretation. In this study notational information covered any notational preference relating to musical notation, such as the layout of the music, the use of temporary change of clefs or staff and a particular style of beaming and stemming, and above all, the validity of the effective duration of accidentals.

Thus there are two basic types of errors or variants: musical and notational. These basic types could be further subdivided into smaller branches and sections according to the genealogical origins of sources and the musical significance of the readings. In most cases the listing of data itself became self-explanatory, as each reading is projected among others in its musical context, thus exposing the essential nature of the reading.

Pitch indication in the database should, for instance, be considered to contain information of both musical and notational significance. In principle, the pitch indication transcribes what is represented in a manuscript without losing these valuable qualities: for example, if an accidental played an important part in the query, its presence or absence was specified. This is because in some instances pitch is neither absolutely indisputable nor intended by the copyist to be so, and sometimes examples of pitch were very puzzling due to the confusing array of notational conventions used in the manuscripts reflecting multiple layers of notational conventions piled up as copies were being made.

Pitch is also staff dependent, and so the origin of a pitch error can often be ascertained from the notational perspective. For example, there are cases where notes are so casually written in the space between R.H. and L.H. staves that one cannot easily determine the staff to which the note should belong. As a result, there is a high chance that the pitch of such notes will be read incorrectly. To deal with this kind of text-critical matter, we must accept the nature of data as being 'ambiguous' and, at the same time, permit the recording of such locational information in the data entries, so that we can trace any possible genealogy or consequence that this vague notational disposition might have, being either influenced by a particular lost source, or, conversely, erroneously influencing the texts of the later generation of sources. It may be worth supplementing here that the same problem also occurs when R.H. and L.H. staves are drawn too close together.

Under the flag of text-critical enquiries, the notation in the manuscript sources is therefore expected to contain various kinds of invaluable information:

  1. the environment and situation under which the initial copying, as well as later revisions, took place;
  2. the diplomatic policies that might have influenced certain presentational aspects of the score; and
  3. the musical tradition that is reflected in the sources, such as the established theories of various times and regions, which must have surely influenced not only the way the music was written but also the resultant variant readings resolutely or inadvertently introduced.

Having examined some issues concerning the qualitative value that one can find in the data presented in this database, it is important to realise its limitation. As the notational information in fact covers several discrete aspects of copyists' considerations and states of mind, there is currently no pragmatic way to quantify accurately the degree of probability of the origin of every error or variant, which now seems beyond our comprehension. Without further facilities to calculate the complex array of data and to measure the degree of uncertainties associated with each event, what remains possible for us would only be the tentative structuring of statistical information and the tentative reconstruction of the historical environments from the available contextual evidence, where, admittedly, a subjective judgement has to play a deciding role.

In order to discover unforeseen possible explanations of an event both objectively and accurately, we ought to change the way we work, so that we can use more powerful facilities to calculate a complex array of musicological data and to give a more realistic outlook on the events which took place a few centuries ago.

Expert System and its Architecture

To overcome the problems in text-critical studies discussed so far, we need to devise radically different assessment-mechanisms that not only possess the expert-knowledge of the domain but also have both the abilities for reasoning and the capacity to process complex arrays of statistical information.

Computer programming in the field of Artificial Intelligence has opened new possibilities in assisting data assessment, such as quantitative assessments that are drawn from a complex array of qualitative data. The computer's ability to work with the assessment process and knowledge building to produce robust, reliable results is particularly suitable to the type of situation where a set procedure is predetermined by musicologists. Building an Expert System seems an obvious answer.

Stage 1: Diagnosis of the Problems and the Solving Techniques

Our first task would be to determine whether the problems can be solved adequately by using conventional methods in AI, such as statistics and algorithms. The most fundamental problem in our study resides in the fact that the assessment has to be conducted at two levels, namely the assessment of individual variant readings and the evaluation of the sources themselves. These two phases of examination are sometimes dependent on each other: for example, the use of specific kinds of cautionary accidentals is related to the chronological information of the source itself, which in turn becomes valuable evidence for its location in the stemma of sources. Under such circumstances, failing to accrue sufficient knowledge from one side to reason about the other means failure in the assessment. There is a potential predicament in assessing the variants or in validating their relationship, in that the database may contain either inconsistencies on the part of scribes or errors in the data-set. The treatment of 'noise' therefore has to be an integral part of this Expert System.

During the process of assessment, we need to work with diverse issues concerning the qualitative value one can find therein. This often leaves numerous unresolved questions. For example, a pitch variant has to be evaluated from many different angles, such as the surrounding melodic materials, the harmonic function of the note, and the structural significance of the note in the Schenkerian sense. Here we require a mechanism that can reason under uncertainty when dealing with such topics, such as a probabilistic reasoning mechanism to perform statistical calculations.
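One elementary form of such probabilistic reasoning is Bayes' rule, sketched below. The paper does not specify a particular method, so the choice of Bayes' rule, the function name and the numbers in the usage note are all illustrative assumptions.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Update the probability of a hypothesis after one piece of evidence (Bayes' rule).

    prior               -- probability of the hypothesis before seeing the evidence
    likelihood_if_true  -- probability of the evidence if the hypothesis holds
    likelihood_if_false -- probability of the evidence if it does not hold
    """
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence
```

With a hypothetical prior of 0.5 that a pitch variant is a deliberate revision, and evidence judged four times as likely under that hypothesis as under its negation (0.8 against 0.2), the updated probability is 0.8. Several musical angles could be combined by applying the update repeatedly, provided they can reasonably be treated as independent.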

In addition to such purely musical matters, the notational information also becomes an essential part of the data for analysis, not only because it often bears chronological information reflecting the changing theory of musical notation, but also because it can reflect several discrete aspects of copyists' considerations and states of mind, such as copying policies and psychology at the time of copying. These invisible non-musical matters are manifested in the notational form, whereupon some sort of mental activity in the scribe's mind operates in two contrasting directions: (1) a positive, pursuing mode, and (2) a negative, constraining mode. We need to measure them as accurately as possible, perhaps using certainty factors, similar to the mechanism used in the MYCIN expert system (Shortliffe, 1976),(7) so that we can estimate consistently what the scribe who initially introduced the variant meant at the time when s/he wrote it, and so that we may be able to determine the level of her or his consciousness, or the likelihood of this particular variant being deliberate or in fact an accident. The inherently uncertain, complicated and ambiguous nature of the data, which has been the musicologists' central dilemma, can thus be assessed by using AI techniques.
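The combining function commonly given for MYCIN-style certainty factors can be sketched as below. Certainty factors range from -1 (strong evidence against) to +1 (strong evidence for); mapping the 'pursuing' mode to positive values and the 'constraining' mode to negative values is an assumption made here for illustration, not something the cited system prescribes.

```python
def combine_cf(a, b):
    """Combine two certainty factors, each in the range [-1, 1]."""
    if a >= 0 and b >= 0:
        # Two pieces of supporting evidence reinforce each other.
        return a + b * (1 - a)
    if a < 0 and b < 0:
        # Two pieces of opposing evidence reinforce each other.
        return a + b * (1 + a)
    # Conflicting evidence partly cancels out.
    denominator = 1 - min(abs(a), abs(b))
    if denominator == 0:
        raise ValueError("total conflict: +1 and -1 cannot be combined")
    return (a + b) / denominator
```

For example, two independent indications that a variant was deliberate, with hypothetical strengths 0.6 and 0.4, combine to 0.76, whereas an indication of 0.6 in favour and one of 0.4 against combine to about 0.33.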

The system should be capable of handling source situations where the relationship between sources is very obscure (whereby too many intermediate sources are apparently no longer extant) as well as extremely complex (whereby a lot of contamination has occurred in the lost intermediate sources).

Stage 2: Knowledge Elicitation and Representation

Having identified the problems, the next task is to decide what knowledge to include. We need to formulate clearly two types of knowledge, declarative and procedural: we know what note-heads are; we also know the fact that 'the position of note-head determines the pitch'. The capture and compilation of this sort of factual knowledge can be relatively straightforward. On the other hand, we also need to have knowledge about how an ambiguously positioned note-head causes a copyist to write a note at the wrong pitch and how this happens. This type of knowledge consists of a set of specific instructions, which are more difficult to define. Exhaustive compilation of the knowledge is required for the successful operation of the Expert System. We would also need to work out the selection of appropriate reasoning mechanisms and search strategies in the meta-level knowledge architecture proposed by John Self.(8) The process of knowledge elicitation can prove to be disastrous if some essential knowledge is missed out.

The definition of knowledge has to be clearly expressive and effective. Some knowledge can be gathered using the idea demonstrated by version spaces (Mitchell, 1978).(9) It shows how we can find rules (algorithms) from working examples, namely from the database in our case. For example, the rules to classify the accidentals can be sampled in this way; the essential attributes for the symbol are as follows: type, whether it affects the pitch, whether it cancels the modified pitch, cautionary, superfluous, harmonically valid. When sufficient samples have been gathered, they can be notated in rule-based procedural form, as shown in Example 1:


what is the type of the accidental? (*, $, #)
does it affect the pitch?
    if yes, then does it cancel the previously altered semitone,
      or re-alter the modified pitch?
        if it cancels, then
            what is its significance?
                (1) distance from the previous accidental (specify)
                (2) harmonic activity between the notes (specify)
        if it re-alters, then
            what is its significance?
                (1) distance from the previous accidental (specify)
                (2) harmonic activity between the notes (specify)
    if no, then is it cautionary or superfluous?
        if cautionary, then is there a note previously modified?
            if yes, then
                where is the previously modified pitch?
                is it truly essential in the context?
            if no, then what is the significance of the accidental as cautionary?
                (1) unusual interval with the preceding note (specify)
                (2) unusual interval with the following note (specify)
                (3) spelling out a specific minor scale (specify)
        if superfluous, then
            what is the reason behind it?
                (1) confusion of key-signature at the passage (specify the distance)
                (2) misplaced accidental from the surrounding area (specify)
                (3) must be an error
does it make sense in its harmonic context?
    if yes, then
        name the chord
    if no, then
        note the pitch in relation to key-signature
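The top levels of the rule set in Example 1 might be encoded as in the following Python sketch. The class and field names are hypothetical, and the deeper "specify" questions (distance, harmonic activity, chord name) are deliberately left out; this is a simplified reading of the example, not the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class Accidental:
    symbol: str                       # type of accidental, as recorded in the source
    affects_pitch: bool
    cancels_previous: bool = False    # only meaningful when affects_pitch is True
    cautionary: bool = False          # only meaningful when affects_pitch is False
    harmonically_valid: bool = True


def classify(acc):
    """Follow the first two questions of Example 1 and the final harmonic check."""
    if acc.affects_pitch:
        role = "cancelling" if acc.cancels_previous else "re-altering"
    else:
        role = "cautionary" if acc.cautionary else "superfluous"
    context = "harmonically valid" if acc.harmonically_valid else "harmonically invalid"
    return role, context
```

An accidental that changes nothing and is not marked as cautionary, for example `Accidental("#", affects_pitch=False)`, is classified as superfluous, at which point the further questions about its likely cause would apply.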

However, the decision process of copyists can be very complicated, for there can be multiple factors for their consideration, some of which could be weighed differently and others of which could pull in contradictory directions. Thus it is necessary to measure the probabilities of all the possible ideas that contribute to the decision whether or not to supply an accidental at the very spot on the stave. By adopting a neuro-scientific model to simulate the decision-making process, our example can be represented in the form of neural connections shown in Example 2:


Among the various factors a copyist may consider when deciding what to do, we may be able to identify three possible factors in decision making, each of them holding its own input strength (the degree to which it contributes to the decision of the action). When the total strength exceeds the threshold, the decision takes place. It is therefore essential that a mechanism be developed to measure the strength of each factor. Thus analysing copyists' activities into all the possible behavioural factors that contribute to their decision making seems a more suitable approach than using a conventional Expert System whereby a pre-defined result is extracted from a database of collected samples of knowledge. Therefore the analytical module should be designed and implemented to extract an algorithm from each session of data assessment.
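The threshold mechanism just described can be sketched as a single decision unit. The factor names and strengths in the usage note are invented for illustration; how the strengths would actually be measured is precisely the open problem identified above.

```python
def decides_to_act(strengths, threshold):
    """Return True when the summed input strengths exceed the threshold.

    strengths -- mapping from factor name to its input strength
    threshold -- level the total must exceed for the decision to take place
    """
    return sum(strengths.values()) > threshold
```

With three hypothetical factors, `decides_to_act({"exemplar has accidental": 0.4, "unusual interval": 0.3, "habit": 0.2}, 0.8)` is true because the total of about 0.9 exceeds 0.8; lowering the second strength to 0.1 leaves the total below the threshold and the accidental would not be supplied.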

Stage 3: Programming an Expert System

When the two previous stages are cleared, an Expert System will finally be programmed. Although no firm prediction is possible until all the problems have been clarified, the relevant knowledge formulated and the reasoning mechanisms established, the actual Expert System will probably be best constructed by using a forward-reasoning (bottom-up) technique, for the database of variant readings already contains all the information about the variants, and we attempt to move from it towards the desired, but unknown, goals.
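Forward reasoning of this kind starts from the recorded facts and applies rules until nothing new can be concluded. The following minimal sketch shows the mechanism only; the facts and rules in the usage note are hypothetical and much simpler than anything a real system for this domain would need.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived.

    facts -- iterable of known statements (strings)
    rules -- iterable of (premises, conclusion) pairs; a rule fires
             when all of its premises are already known
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known
```

Given the facts "note written between staves" and "pitch differs from exemplar", a first rule might derive "staff assignment ambiguous" from the former, and a second rule might then combine that conclusion with the latter to derive "probable misreading of pitch". The goal is not stated in advance; it emerges from the data, which is why this strategy suits an existing database of variants.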

The result of the work can ultimately be implemented as a computerised reference system (an encyclopaedia) of the work, in which all the variant readings are extracted from the sources. It is envisaged that the system will be capable of explaining specific variant readings from many different angles, such as the background of their origins and their musical and notational significance in context, together with the on-line presentation of score and sound, with further innovative facilities to demonstrate the most minute details of source characteristics, the relationship between sources, and the work's dissemination and transition in historical perspective.
Converted to HTML on 23 March 1997