Issues of philological interpretation

The texts (McQuown 1971) appear in the form of a database as shown in (1).

(1) IBM codification
a. &TO MAT|7AK|$|NI|I’| $|7AN|QUU’| PA$|TA|XUU|NAN|QUU’|
PA$|NI’7| , N|AA|L’|$|TU|KUT|QUU’| . &ID (2.6.) &=SP DIZQUE
b. &TO WAN|QUU’| PII|NII|4AAN|H|$|WA|NII’T| . &ID (2.2.) &=SP
c. &TO WA|A’| TUU| $|WA’N| PAALI’|, 4AAN|H|NII|4AA’N|, WA|A’|

Once consistent changes are applied, the lines look like (2). The texts, as the reader can observe in (1), are bilingual Totonac-Spanish. The English translation is mine.

(2) a. mat|ʔak|š|ni|í| š|ʔan|qúː| paš|ta|xuː|nan|qúː| paš|níʔ|_, n|aː|ɬ|š|tu|kut|qúː|_.
‘dizque cuando se iban a meterse a bañar los puercos_, ya no se salían (del agua)_.’ [2.6]
‘it is said that when the pigs went in to bathe, they didn’t come out of the water anymore.’
b. wan|qúː piː|niː|λaːn|h|š|wan|íːt_.
‘Dicen que no era bueno._’ [2.2]
‘They say that he was not good.’
c. wa|á| tuu| š|wán| paːlí|_, λaːn|h|niː|λáːn|_, wa|á| š|lá|y
‘lo que dijeran los padres, con razón o sin ella, (su dicho) se cumplía.’[11.9]
‘whatever the priests said, right or wrong, it was complied with.’
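
The step from (1) to (2) can be illustrated with a minimal sketch. The symbol correspondences below (7 for ʔ, $ for š, 4 for λ, L’ for ɬ, a doubled vowel for length, an apostrophe after a vowel for the graphic stress mark, and a blank before a comma or period for the low dash) are assumptions inferred only from comparing (1a) with (2a); they are not McQuown's coding table, and the function name `transliterate` is hypothetical. The actual conversion evidently involved more than this: the segmentation of (2b), for example, does not follow mechanically from (1b).

```python
import re
import unicodedata

ACUTE = "\u0301"  # combining acute accent (graphic stress mark)
LENGTH = "ː"      # vowel length mark

# Symbol correspondences inferred only from comparing (1a) with (2a);
# they are assumptions, not a documented coding table.
SYMBOLS = {"l'": "ɬ", "7": "ʔ", "$": "š", "4": "λ"}


def _vowel(match):
    """Rewrite V, VV, V' and VV' as short/long, plain/stressed vowels."""
    vowel, doubled, stressed = match.groups()
    out = vowel + (ACUTE if stressed else "") + (LENGTH if doubled else "")
    return unicodedata.normalize("NFC", out)


def transliterate(record):
    """Convert the Totonac field of one IBM-coded record to the format of (2)."""
    # Keep only the text between the "&TO" tag and the "&ID" tag.
    text = record.split("&ID")[0].replace("&TO", "", 1).strip()
    # Treat typographic and plain apostrophes alike, then lowercase.
    text = text.replace("\u2019", "'").lower()
    for old, new in SYMBOLS.items():
        text = text.replace(old, new)
    text = re.sub(r"([aeiou])(\1)?(')?", _vowel, text)
    # A blank before a comma or period becomes the low dash "_".
    return re.sub(r"\s+([.,])", r"_\1", text)


record_1a = ("&TO MAT|7AK|$|NI|I’| $|7AN|QUU’| PA$|TA|XUU|NAN|QUU’| "
             "PA$|NI’7| , N|AA|L’|$|TU|KUT|QUU’| . &ID (2.6.) &=SP DIZQUE")
print(transliterate(record_1a))
```

Applied to the record in (1a), the sketch should reproduce the Totonac line of (2a). Records with other symbols would need further correspondences that cannot be inferred from the examples given here.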

McQuown’s analysis of the language is implicit in this system of graphic representation—the white spaces, the blank space before periods or commas, the graphic stress marks, and the vertical lines in (2).

In my own interpretation of the texts, I have departed from McQuown’s use of vertical lines, which, from a contemporary standpoint, conflates different levels of analysis and quite often results in hypersegmentation, well beyond a synchronic morphemic analysis. The vertical lines in (2) separate items that are not homogeneous: sometimes the segments represent morphemes, but sometimes they indicate postlexical prosodic processes. To show what I mean, in (3) I gloss two small stretches of the examples in (2), providing a second line with the phonetic representation and a third line with the underlying phonological and morphosyntactic representation.

(3) a. wa|á|
b. λaːn|h|š|wan|íːt
/λaːn š–wan–niːt/
good PST–be–PF
‘was good’

In (3a), the second “a” does not represent a morpheme or a segment; it represents instead the phonetic lengthening of vowels that occur at the end of Accentual Phrases (AP) that are in medial position within Intonational Phrases (IP). What is represented as “|á|” is clearly not a morpheme. CTot has both long and short phonemic vowels. However, a prosodically lengthened long vowel measures more than double the duration of the same long vowel (in the same lexical item) in a context without the prosodic lengthening. The phonetic fluctuation of vowel duration is far more intricate than that, though, because the phonetic duration of both phonemically short and long vowels depends on both the foot structure and the larger prosodic context (see Levy and Hernández-Green 2018).

In (3b), the “|h|” in McQuown’s representation indicates a process of lengthening and devoicing of resonant consonants (at the end of Lexes that are in medial position within Accentual Phrases). Again, this “segment” is not a morpheme; it represents the output of a postlexical prosodic process.

From the point of view of contemporary practice, the morphemic analysis intended by the vertical lines hypersegments. Consider, for instance, (4), taken from (2a):

(4) ʔak|š|ni|í|

McQuown gives four “segments” for this word, since elsewhere in the grammar the partial forms either correspond to productive morphemes or are segmentations consistently recognized in the lexicon, even though synchronically they no longer represent productive or transparent processes. The first one is ʔak-, elsewhere in the language a body-part prefix denoting ‘head’. In fact, McQuown systematically encloses any sequence /ʔak/ in two vertical bars, irrespective of lexical context; we might speculate that this was his hypothesis of the historical source. The second element enclosed in vertical bars is š-, which in many other contexts appears in sound-symbolic alternations of /s~š~ɬ/, graded for intensity, endearment, etc.¹ The third element, –ni, has the shape of a very frequent nominalizer. Finally, /i/ represents the AP-final, IP-medial prosodic lengthening. The synchronic meaning ‘when’ clearly cannot derive compositionally from these putative components, so the combination is completely lexicalized. The morphological analysis in the annotation of the texts presented here is my own.

Most puzzling was that the strings that McQuown wrote between blanks are not homogeneous from the point of view of their lexical components. Some are monosyllabic words. Others are polysyllabic, but recognizably single words. However, many of them are clearly phrases, composed of several items that are each words by morphosyntactic criteria, as in (5a) from (2b) and (5b) from (2c):

(5) a. wan|qúː piː|niː|λaːn|h|š|wan|íːt
wan–quː piː niː λaːn š–wan–niːt
say–PL.PRT that NEG good PST–be–PF
‘they say that it was not good’
(5) b. λaːn|h|niː|λáːn|
λaːn niː λaːn
good NEG good
‘good or not good’

At first glance, one might think that McQuown wrote elements like the subordinator piː or the negation niː attached to what follows because he considered them proclitics. That is certainly one possible criterion for writing. Analysis of the audio was necessary to test whether this hypothesis was tenable. It was not. Besides being clearly morphosyntactic words, these elements are also phonological words. In Levy and Hernández-Green (in press) we show that the phonological word in CTot is defined as the domain of foot formation, the domain of the main accent and, under well-defined conditions, of the secondary accent. Components like the subordinator piː or the negation niː are their own domain of foot formation and of primary stress.

Aside from elements that could be thought of as clitics, it was apparent from cursory inspection that many of the sequences between blank spaces contain several lexemes belonging to content-word classes. (5a) contains a sequence of the adjective λaːn ‘good’ followed by the inflected copula, š–wan–niːt, which starts with the outermost prefix of a verbal word, the past tense prefix š-; the copula, therefore, is not an affix or the second element of a (typical) compound. By morphosyntactic criteria (5a) has four words: piː niː λaːn šwaníːt (that_neg_good_he.had.become). (6) shows a more extreme example of another copular construction, a subordinated nominal predication, written as one sequence, without any medial blank spaces (“_”).

(6) _maːskinaːniːlakwanhlak¢amaxanhšwanquːníːt_
maːski naː niː lakwan lak–¢amaxan š–wan–quː–niːt
although also NEG best PL–girls PST–be–PL.PRT–PF
‘although also they were not the best girls (of their town)’ (Mc 8.42)

Notice that there is only one graphic stress mark in the first line of (6). Furthermore, not only are there sequences of 13 syllables between blank spaces bearing a single graphic stress mark, but the stress mark also appears in monosyllabic sequences between blanks, for instance the first and third sequences in (2c): waá /waʔ/ ‘that’ and šwán /š–wan–yaː/ PST–say–PFV ‘he said’. Whatever the stress diacritic was meant to represent, it furnishes a consistent graphic clue in McQuown’s representation: there is one per sequence between blanks, with a few lexical exceptions. The major puzzle was what McQuown’s criteria had been for the blank spaces in his graphic representation.
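
The generalization of one graphic stress mark per sequence between blanks can be checked mechanically. The following is a minimal sketch, assuming lines in the format of (2), in which stress is written as an acute accent; the function name `stress_marks_per_sequence` is hypothetical.

```python
import unicodedata


def stress_marks_per_sequence(line):
    """Count acute accents in each blank-delimited sequence of a line."""
    counts = []
    for sequence in line.split():
        # Decompose accented vowels so that the acute accent becomes a
        # separate combining character (U+0301), then count it.
        decomposed = unicodedata.normalize("NFD", sequence)
        counts.append(decomposed.count("\u0301"))
    return counts


line_2a = "mat|ʔak|š|ni|í| š|ʔan|qúː| paš|ta|xuː|nan|qúː| paš|níʔ|_, n|aː|ɬ|š|tu|kut|qúː|_."
print(stress_marks_per_sequence(line_2a))  # [1, 1, 1, 1, 1]
```

For (2a) every sequence carries exactly one mark. Run on (2c), the same function would report 0 for the sequence tuu|, which bears no graphic stress mark in the text as given.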

The short answer to the conundrum is that McQuown transcribed using segmental phonological clues to the exclusion of everything else. CTot has a host of postlexical segmental boundary phenomena that signal four levels of its prosodic hierarchy, and McQuown devised a graphic representation that quite elegantly conflates the four in one line (see Levy 2015). He put blanks at one of these prosodic levels, recognizable among other things by the epenthesis of a nasal if the sequence starts with an obstruent consonant, and by a prosodic lengthening of a final vowel when the sequence ends in a vowel. Both the epenthetic nasals and the prosodically lengthened vowels are perceptually very prominent. While McQuown defines the stretch between blanks as the word, a phonological unit of CTot (1990:93), in prosodic terms it is rather akin to the level of the Accentual Phrase (Jun 1998; Igarashi 2014).

Two factors may have led McQuown to choose such a representation: 1) having trained with a heavy emphasis on Indo-European, he must have been conversant with Sanskrit sandhi processes and was therefore sensitized to such phenomena; 2) even though phonologists of the era discussed juncture phenomena extensively (Aronoff 1980), the late thirties and early forties of the last century were a time in which phonology was considered an autonomous component of language, to be described prior to and independently of morphosyntax. Methodologically, then, he could not appeal to what today we would call the morphosyntactic word. McQuown took the segmental phenomena that define the Accentual Phrase as his cue to put in blanks. Furthermore, from the examination of a printout of the IBM texts proofread by McQuown, it was clear that he insisted on the low dash before commas and periods, inserting it by hand whenever it was missing. So the blank spaces before punctuation marks were clearly meaningful. In fact, they signal phonological processes that define the domain of the Intonational Phrase.
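
The two prosodic levels that are marked graphically can be read off a line mechanically. The sketch below assumes lines in the format of (2), takes a low dash followed by a comma or period as the end of an Intonational Phrase, and takes blanks as the boundaries of the sequences that correspond roughly to Accentual Phrases. The function name `prosodic_domains` is hypothetical, and the sketch ignores the lower levels signalled by the vertical lines.

```python
import re


def prosodic_domains(line):
    """Split a line into Intonational Phrases, each a list of
    blank-delimited sequences (roughly, Accentual Phrases)."""
    phrases = []
    # "_" followed by a comma or period closes an Intonational Phrase.
    for chunk in re.split(r"_[.,]", line):
        sequences = chunk.split()
        if sequences:
            phrases.append(sequences)
    return phrases


line_2a = "mat|ʔak|š|ni|í| š|ʔan|qúː| paš|ta|xuː|nan|qúː| paš|níʔ|_, n|aː|ɬ|š|tu|kut|qúː|_."
for number, phrase in enumerate(prosodic_domains(line_2a), start=1):
    print(number, phrase)
```

For (2a) this yields two Intonational Phrases, the first containing four sequences between blanks and the second containing one.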

The collection furnishes a corpus that is implicitly marked for the prosodic domains of the language. I present the texts here with one line added to the traditional interlinearized format: a second line in which I make McQuown’s implicit prosodic analysis explicit for the reader.

  1. Sound symbolism is pervasive in Totonac-Tepehua languages, and it has been noted and sometimes described for languages in all branches of the family: Levy 1987:115-130 for Papantla Totonac; Bishop 1984 for Xicotepec de Juárez Totonac; McQuown 1990:66 for Coatepec Totonac; MacKay 1999:113-114 for Misantla Totonac; Beck 2008 for Upper Necaxa Totonac; McFarland 2009:62-66 for Filomeno Mata Totonac; Watters 1980 for Tlachichilco Tepehua; Smythe Kung 2006 for Huehuetla Tepehua.