Wednesday, October 13, 2010

My currently-stewing model of morphology-syntax-discourse (and phonetics and phonology)

This is a very rough sketch, but I thought I'd toss it out as is to see what responses it engenders....

Morphology, syntax, and discourse are all the same thing: structure builders. What qualitative differences exist between them emerge strictly from the scale-derived type of elements they are combining; their fundamental combinatorial principles are the same.

One qualification to this is that morphology is actually two things: a combinatorial component, and a lexical component. These are roughly the morphosyntactic and the morphophonological. The latter in particular covers aspects of morphology that are not directly structure-building, namely, paradigmaticity effects...which are probably attributable to acquisition and retention constraints and strategies.

It's useful here to see that traditional syntax, as understood through generativist tree diagrams, is quite explicitly the interface between the generic-encyclopedic and the discourse-specific. Functional structure can be of either kind: the former realizes event-argument structure and aspect (both verbal and nominal; nominal aspect is measures, quantification, etc.), and the latter realizes all the features interpretable only relative to the specifics of the discourse, i.e. of the speech act itself. This includes voice, tense, mood and modality, pronominal features, focus, topicality, and clause type.

To understand this clearly: man bite dog is a generic, non-discourse-specific event-argument structure. What we currently call light elements---i.e. stackings of minimal predicates---are enough to constrain the semantics to this realm. Aspectuality doesn't change this: man having bitten dog, man regularly biting dog: all of these are still generic, encyclopedic, non-discourse-specific concepts.

Add in voice, and we begin to have discourse-determined priorities:

man bite dog
dog bitten by man

Add in tense, mood, etc., and we definitely have discourse-determined material, since the semantics added are calculated with respect to NOW, with respect to our REAL WORLD, etc.

man did bite dog
dog would be bitten by man

Definiteness of arguments, relativization of arguments, pronominalization of arguments, ellipsis of arguments: all of these refer to pre- or elsewhere-established reference of arguments...which is of course discourse. Same again of course for focus, topicalization, etc.

the man did bite the dog

Clause-type is of course exactly the same thing as the above, even subordination, as subordination indicates NOT being the discourse-Main proposition. Same again for imperatives, which of course are discourse-specific par excellence.

man, bite the dog!
the man that did bite the dog
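
To see the whole progression at a glance, here's a toy staging in Python---the layer labels and the generic vs. discourse-specific tagging are just my shorthand for the sketch above, not any kind of formal implementation:

```python
# A toy staging of the structure-building layers sketched above;
# the example strings are the ones from the post itself.
GENERIC = {"event-argument", "aspect"}  # encyclopedic, discourse-free

derivation = [
    ("event-argument", "man bite dog"),
    ("aspect",         "man having bitten dog"),
    ("voice",          "dog bitten by man"),
    ("tense/mood",     "dog would be bitten by man"),
    ("definiteness",   "the man did bite the dog"),
    ("clause-type",    "man, bite the dog!"),
]

for layer, form in derivation:
    kind = "generic" if layer in GENERIC else "discourse-specific"
    print(f"{layer:15} {kind:18} {form}")
```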

Phonetics and phonology of course do relate to this: prosody in particular tracks word-level and phrase-level structure quite intensely, and has an obviously substantial role at the discourse level. For example, the prosodic weight and the prosodic specification (clitic, etc.) of morphemes determine their distribution, their position, their availability, their well-formedness in a given configuration. Prosody does the same thing at the phrasal level, acting as the real agent behind the parameterization of word order, among other things. And at the discourse level, it of course manifests quite a bit of discourse morphology, i.e. old/new information and topic/focus and topic/comment contrasts, and very often interrogativity and a thousand other emotional and affective stances.

Phonetics and phonology also share properties with the morphology-syntax-discourse complex in that they too have combinatoriality and configurationality, and often the same modeling tools for the latter work well for the former. Locality, for example. We often think of this as the syntax of phonology, etc., but it's better understood as syntax and phonology both drawing from a common pool of cognitive-computational processes and constraints thereon.

Tuesday, October 5, 2010

Sapir-Whorf for the umpteenth time

So people keep sending me this article from the New York Times regarding what we conventionally refer to as the Sapir-Whorf hypothesis.

And here, slightly edited, is what I wrote to my mother about it:

Man, every time I read these things, I just wanna say: of course no, and of course yes. No in the sense that it doesn't fundamentally limit the range of your cognitive potential. Yes in the sense that a specific hand tool you use every day is going to develop some specific neuromuscular tasks/skills more than others. This question is structurally similar to the nature/nurture debate, and the answer there seems pretty obvious: we are products of both. And as with that issue, the interesting question is not which extreme is correct, but to what extent, and in what specific ways, each of these factors shapes what we experience. Hence the effects of what Jakobson is quoted here as having observed, about what our languages oblige us to attend to, are well worth thinking about. More on that later.

Overall, I don't see why this kind of perspective is so hard to reach, and why people continue to obsess over one side or the other.

Also, the line "But it does mean they are not obliged to think about timing whenever they describe an action" is a bit inaccurate: it means they're not obliged to think about time relative to the point of speaking (which is what tense is/can be, among other things); Chinese grammar still requires you to be specific about relative timing, i.e. the relative timing of one reported event to another, the relations of before, during, and after, and so forth.

And again, lots of languages do this, or only specify tense when discursively relevant---Penobscot and Passamaquoddy are like the latter---probably because this kind of information is in fact usually eminently recoverable from discourse context anyways.

I remember asking Jay Keyser about WHY we end up with these weird obligatorinesses in language, like tense in English, and so forth (I was specifically asking him about grammatical gender, actually). He told me he thinks it may be that the computational system of language just needs something to grab on to, some way to tag things to make them readily manipulable. I've been taking this to mean something like uniformizing labels for elements, so that the system doesn't have to take every object as wholly unique, wholly on its own terms. Same principle, then, behind the convenience of stereotyping and the efficiency of mass production using interchangeable parts.

But why these particular tags, of tense, of gender, etc.? My conversation with Jay didn't get that far, but I'd venture that it's just because these happen to be some of the more salient categorical features of the things we're manipulating. Which (among others) are, basically, events (= verbs) and entities (= nouns). Relative location in time is a feature that can at least be imputed to all events; as is gender for entities, provided you wander through the universe anthropomorphizing the hell out of everything, which is what humans sure seem to do.

And regarding this: "More recently, psychologists have even shown that "gendered languages" imprint gender traits for objects so strongly in the mind that these associations obstruct speakers’ ability to commit information to memory." I think that what this tells us is something more about what kinds of tools we bring to bear to construct and maintain memory. One of them clearly can be narrative---which, generally, involves language. I wonder if they've in fact shown language-specific gender to actively obstruct memory, or simply fail to be available in the relevant contexts as a powerful aid to memory.

And the Guugu Yimithirr matter, well, you know what my native tongue is [Ed: Well, my mom does], and I don't actually even speak any languages that work like Guugu Yimithirr or Tzeltal. Yet I pretty much always know which way is north, south, east, and west. Even though my language(s) never demand that of me on a regular basis. I'm not sure why this is, but it's true, and oddly enough, rather recent for me. Somewhere in the last ten/fifteen years or so I just started to always pay attention to cardinal directions, and I still don't know why. Maybe from moving to so many new towns or something...I can remember all of Bandung, Wellington, Ithaca, Beijing, Boston, Indian Island, Sipayik, and Birkat Al Mouz this way, for example. And Portland [Maine, my hometown], of course. And even Manhattan, which I have hardly spent any real time in---though there, as in Beijing, the fact that the whole city is laid out on a cardinal-directions grid makes that pretty simple anyways.

So I think this is actually a pretty simple cognitive skill that is probably more natural to have than not. I'm thinking that there's of course likely a feedback relationship between this skill and the language-specific demand for it in Guugu Yimithirr and Tzeltal, to be sure---but I think the linguistic phenomenon almost certainly piggybacked on the pre-existing common cognitive phenomenon. Just the same as, say, linguistically formalizing off of our attention to gender, animacy, and other crucial landmarks in our cognitive world. And yes, the orienteering metaphor is intentional.

I'm also suspicious of this example of evidentiality in Matses. In most cases of evidentiality that I know of, it's more of a rhetorical stance than a strict factual requirement. I.e. the guy will tell you that he has four wives, if he's as sure of it as if they were standing there evidently. Perhaps Matses is a language more strict in its evidentiality than those I've studied, but again, most languages use evidential hedges to reflect how the speaker personally views or wishes to portray the proposition's relation to the evidence, and not some purely legalistically objective relation between the two. This flexibility is what, again in my experience at least, makes evidentiality so tricky to research. And I would point out that only in the previous sentence did I realize just how many evidential hedges I've already included in this paragraph. You could attribute it to my working with Penobscot and Passamaquoddy-Maliseet, both of which have evidentiality as an active component of their grammar, but I don't think any native English speaker would find this paragraph unusually epistemologically sensitive.

So there you go. I think that there is a wonderful wealth of cool and nifty things to study about all of these phenomena, and all of the interesting kinds of normative perspectives that different speech communities have developed around the world. And I'm wicked glad that scholars are making these kinds of respectful, thoughtful, and earnest efforts to share all of this with the general public, and thereby give us all useful material with which to happily blow our minds. But I worry that people still over-exoticize this stuff...failing, in particular, to notice just how much of this seemingly exotic behavior they themselves also carry out on a regular basis.

Sunday, March 14, 2010

words as people

I noticed some time ago that we never really forget people. Maybe temporarily a name, or even a face (especially if it's changed from our expectation), but in general: we forget anything else, but other humans---their identities, their basic relationships and features---we just seem to be very good at remembering.

This stands to reason. As deeply social primates for whom other members of the species are perhaps the most important entities out there in the world, having a memory that reliably and robustly remembers other humans is more than just a good idea; it's pretty much crucial to survival.

So what if we tried to bring this great capacity to bear on that most notorious of language-learning memory problems: recalling vocabulary?

What if you started to think of every word you learned as a person? What if you developed a human-like emotional relationship to it? And just generally cultivated a very social-relational sense of it, as we have for humans? Remembering how, when, and where we first met it, who (i.e. other words) it is related to and how?

This makes sense to me in large part because even before this explicit idea, I think I was already doing this to a fair degree. As a lover of words, I often do tend to remember exactly the situation where I first learned a word: who, what, when, where, how, and why. I've no doubt at all that personal relational detail (not to mention the obvious grounding in real-life experience rather than book-learning) greatly helps me recall the word when needed, and also to FEEL the full range of its use and sense (to have the necessary relevant "language-feeling" for it, as they say in Chinese with the term 語感 yǔgǎn). And on top of that, I have always, constantly taken any new word and looked for how it relates to others: by derivation, by shared semantics/semantic domain, by collocation, by humorous phonological similarity (both within the same language and without)...all of these have just been what makes sense to do to meet the word and know it well. As one would with a person. And I would guess that placing that word in this dynamic web of multiple relationships helps fix the form in my memory (with lots of redundancy and reinforcement), and offers many robust roads to its retrieval.

So there you go. Conceptualizing words (and other lexemes, up to phrasal idioms and beyond) as people might be a helpful mnemonic approach, piggybacking off of our handy primate memory for our own kind.

Monday, March 8, 2010

the Person-Case Constraint as feature-structural-level crossover effects

Here's a new idea recently tweaked off of my dissertation work: that Person-Case Constraint phenomena are driven by crossover effects operating not at the phrasal level, but at the feature-structural level.

That is, in the model I developed back in 2006, Person contrasts are built up out of referential dependencies---the most salient of which is that non-SAP Person status (i.e. 3rd person) is in a relevant sense referentially dependent upon the establishment of SAP referents. There are a number of ways to conceptualize the reasons for this; one of the simpler ones is that 3rd person is defined negatively as the other, as the non-SAP---even as the reverse relation is NOT the case---and so is in that sense dependent upon SAP referent establishment to be defined at all.

With this basic idea, we can see that, essentially, there is a little SAP component implicit in 3rd person status. Such that if 3rd person is introduced above SAP elements within the relevant locality domain---e.g. the Goal-Theme structure of a ditransitive, among others---a crossover effect results, because you are essentially demanding the representation and interpretation of a SAP-derivative feature structure before establishing the SAP feature structure itself. In short, [3] scoping over [1|2] in the relevant locality domain is bad for the same reasons that a pronoun scoping over its referential source (i.e. its antecedent R-expression) in the relevant locality domain is also uninterpretable.
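
Just to make the geometry of this concrete, here's a toy rendering in Python. The integer "derivativity" depths and all the names are my own illustrative conveniences, nothing like a serious formalization (the obviative case anticipates the bracketed note below):

```python
# Person feature structures as layers of referential derivativity:
# depth 0 = SAP (1st/2nd person), depth 1 = 3rd person (defined off
# of SAP), depth 2 = obviative 3rd (defined off of proximate 3rd).
DERIVATIVITY = {"1": 0, "2": 0, "3prox": 1, "3obv": 2}

def pcc_violation(goal: str, theme: str) -> bool:
    """Crossover-style check in the Goal-Theme locality domain: the
    higher (scoping) element may not be more referentially derivative
    than the element it scopes over."""
    return DERIVATIVITY[goal] > DERIVATIVITY[theme]

assert pcc_violation("3prox", "1")       # *[3] scoping over [1|2]
assert not pcc_violation("1", "3prox")   # SAP over 3rd: fine
assert pcc_violation("3obv", "3prox")    # obviative over proximate: same effect
```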

That's pretty much it for the idea. What I've been struggling with for the past four years or so are, among others, two questions:

-Why are the Goal-Theme construction and its ilk the locality domain for feature-structure-level crossover effects? All I can think here is that these are co-phasal, even as other possible relations involving, say, external arguments, engage enough extra structure to make possible PCC candidates no longer relevantly local. But I need more than that.

-How, how, how to properly formalize this sense of referential derivativity? All along I've known that it's not exactly the same as referential dependency as traditionally discussed, but that it still has all the same basic properties---and indeed, with this new idea equating it to crossover configurations, even more so now.

And indeed, that's what I really like about this idea: a nice example of the kind of fractal self-similarity---in structure, and in the interpretational constraints thereon/therefrom---at each iteration of representational structure-building that Boeckx 2008 talks about, and that I find so compelling.

So, any ideas?

[By the way, the general idea here was inspired by something anyone who works on PCC phenomena would be interested to know about: that the Algonquian split between "proximate" and "obviative" 3rd person feeds into this system: obviative 3rd person over proximate 3rd person gives rise to the same PCC effects as 3rd person in general does over SAP persons. It was realizing that the proximate/obviative contrast is just a further iteration, within the 3rd person domain, of the same "referential derivativity" structure-building that creates the SAP/non-SAP contrast that pushed me towards this whole model of Person feature contrasts in the first place.]

Sunday, March 7, 2010

some ideas (not all novel) about writing systems

Writing systems don't have to show phonological contrasts, they have to show morphemes. Alignments of form with meaning.

That's why logographic systems do just fine, from the familiar Mandarin one (though see the note to the preceding post) to the fascinating and still-woefully-underappreciated-for-its-significance-to-the-linguistics-of-orthography Mi'kmaw komqwewi'kasikil system. They write morphemes, or subsets or combinations thereof.

That's also why even alphabetic writing systems dribble with morphological sensitivity.

English has the -'s vs. -s' singular and plural genitive contrast so beloved by pedants and so hated by regular folks (especially with the added fun of -s-final wordforms).

Arabic has its tāʾ marbūṭa, its dropping of the alif with li- but not with bi- (justified, yes, but largely on graphic grounds rather than phonological ones), its floaty little alif after verbal 3plural -ū.

Irish writes "an" all the time for definite article a(n), even as it applies basically the same rules as Welsh y(r), whose system goes the purely phonological route.

Doubtless many more examples exist---certainly English heterography is one: while often historically motivated by an original phonological difference, it is now exploited purely morphologically. I have a three-way merry-Mary-marry distinction, but I wonder how many speakers who phonologically contrast two or fewer of these have that contrast just filed under morphological graphic (morphographic?) contrast.

Graphic elements need to contrast morphology. They can do it by representing phonological contrasts---since those carry out said task pretty well in speech---but they are not limited to that. And they tend to happily mix back and forth between this kind of phonologically mediated graphic-to-morphemic contrast and a more direct graphic-to-morphemic contrast.
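
A minimal Python sketch of those two routes, with every mapping an invented stand-in:

```python
# Route 1: graph -> phonology -> morpheme (phonologically mediated)
# Route 2: graph -> morpheme directly (morphographic)
GRAPH_TO_PHON = {"cat": "kat"}      # alphabetic spelling
PHON_TO_MORPHEME = {"kat": "CAT"}
GRAPH_TO_MORPHEME = {"猫": "CAT"}   # logographic character

def read(graph: str) -> str:
    """Both routes end at the same place: a morphemic contrast."""
    if graph in GRAPH_TO_MORPHEME:                 # direct route
        return GRAPH_TO_MORPHEME[graph]
    return PHON_TO_MORPHEME[GRAPH_TO_PHON[graph]]  # mediated route

assert read("cat") == read("猫") == "CAT"
```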

Perhaps if modern linguistics hadn't developed largely from a scholarly world steeped in strong phonology-to-orthography mappings---by this I mean Pāṇini, not English, for example---we would not tend to think, rather misguidedly, that writing is for marking down the sounds of languages.

It is, rather, for marking down words. In the loose sense of the term, i.e. morpholexemes of all shapes and sizes. And whatever works, works. As the evolution (convergent and divergent) and diversification of human-used orthographies tends to show.


And, as Rich Rhodes has helpfully pointed out, when it comes to representing phonology in writing, salience very often outweighs contrastiveness.

-Hence no prosody, either of the lexically contrastive types incompletely listed below, or especially of intonation. [Pretty much everybody, barring performance-detailing religious texts]
-Rarely if ever tones, and when they are written, most often via their original segmental sources, and/or serious post hoc scholarly metalinguistic devisings. [Thai, Burmese, etc.]
-Often no vowels. [Tifinagh]
-And if vowels, only the salient ones, namely the long ones. [Arabic, etc.]
-No codas, even when the language has at least some. [Sulawesi-area lontarak, and lots of other syllabaries]
-Weird treatments of obstruent phonation contrasts. [Old Irish, Mongolian (in both cases perhaps just because the donor orthography isn't well-matched, but still....)]

syntactically categorizing light elements as computationally classifier elements; and why we have classifiers at all

In neoconstructionist models, syntactic categorization is arrived at through the addition of light elements to categorially unspecified Roots or Root complexes. The effect of these elements is strikingly similar to that of nominal classifiers, which explicitly constrain a lexical element to nominal syntax and interpretation. Light verbs do comparable work, and indeed many light verb systems are described as verbal classifiers---predicate classifiers, that is, not to be confused with the nominal classifiers that can incorporate into verbs.

So who cares? Anything that systematically changes and/or reduces something else to a new set of properties (particularly in constraining its semantic range) can be loosely splattered with the "classifier" label, to no usefully precise effect. Combining [green] with [book] creates an entity that belongs to the class of [green things]. True, but no real help.

But.

There's something more to this. We need to step back for a moment and ask why languages so recurrently develop classifier systems of the more familiar, unambiguous type. I think it is because they present a pretty optimal solution to problems both of lexical retrieval and of syntactic combination.

In lexical retrieval, classifiers optimize a lexical search algorithm. If you have a lexicon of 1000 distinct lexical items divided into 10 classes (for the sake of argument, evenly divided into 100 members each), then at the first sign of a classifier, the search algorithm can immediately ignore 900 out of 1000 possibilities. That's pretty sweet, but add the neuropsychological component to boot: obviously, through association, a classifier is going to have a rich priming effect for whatever its classifiee is. That's a double whammy of lexical retrieval optimization.
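
Here's a toy Python rendering of that pruning arithmetic (the lexicon, class labels, and numbers are of course invented for the example):

```python
# 1000 invented lexical items, evenly split across 10 classifier classes
CLASSES = [f"cl{i}" for i in range(10)]
LEXICON = [(f"word{n}", CLASSES[n % 10]) for n in range(1000)]

def candidates(classifier: str):
    """At the first sign of a classifier, discard every item outside
    its class: 900 of the 1000 possibilities gone at once."""
    return [word for word, cl in LEXICON if cl == classifier]

print(len(candidates("cl3")))  # 100, down from 1000
```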

And for the computational system, which just manipulates relations between these elements, reducing the full lexicon to a few categories means that said system can contentedly process the same set of five or so kinds of boxes all day long, ignoring any complications from the richer semantic contents inside each. This is why stereotypes are so pervasive in human cognition: because they're really, really efficient, provided they're actually accurate.

So that's the value of a classifier system in two directions. And you can see from the above fairly easily that the features of a classifier system that facilitate syntactic computation/manipulation could be written again for syntactic categories with no trouble.

The idea, then, is not to equate syntactic categories with classifiers, but to simply say that they are the same kind of system, operating at different scales of complexity in syntactic representation, but always driven by the same basic information-theoretical and biosystemic properties. And in so doing, making for a rather simple mechanism for grabbing lexical items, putting them together, and reading off the resulting relations that hold between them.


[It's worth noting that this lexical access optimization effect of classifiers is rich enough that it arose not just in the spoken Sinitic languages (among others), but also in the graphic lexicon of Chinese. The vast bulk of the lexicon of Chinese characters consists of a phonological element (a pre-existing character, notionally familiar for its sound, though now in practice often a fair bit afield from the user's actual spoken form) accompanied by an additional character/component that acts as a semantic diacritic.

In short, when learning for the first time and/or recalling a previously learned Chinese graphic lexeme, you see a phonological cue (usually not terribly worse than the incomplete/inconsistent phonological information supplied by your average English alphabetic spelling), and then a semantic classifier that allows you to radically narrow your search space to [THING WITH THIS RANGE OF MEANING] that has [THIS ROUGH PHONOLOGICAL FORM]. So again, a classifier system optimizes lexical retrieval: our Encyclopedia can be, and usually is, organized into thematic chapters with nice bold headings.
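
And a toy rendering of that two-cue lookup---the four sample characters are real, but the data structure and the lookup function are just my illustration:

```python
# Each entry: character, semantic classifier (radical), and the rough
# sound cued by its phonetic component (often a fair bit afield from
# the character's actual modern reading, as noted above).
GRAPHIC_LEXICON = [
    ("河", "WATER", "ke"),    # 氵 + phonetic 可 kě; read hé 'river'
    ("湖", "WATER", "hu"),    # 氵 + phonetic 胡 hú; read hú 'lake'
    ("松", "TREE",  "gong"),  # 木 + phonetic 公 gōng; read sōng 'pine'
    ("柳", "TREE",  "mao"),   # 木 + phonetic 卯 mǎo; read liǔ 'willow'
]

def look_up(meaning_class: str, sound_cue: str):
    """Intersect the cues: [THING WITH THIS RANGE OF MEANING] that has
    [THIS ROUGH PHONOLOGICAL FORM]."""
    return [ch for ch, cls, snd in GRAPHIC_LEXICON
            if cls == meaning_class and snd == sound_cue]

print(look_up("WATER", "hu"))  # ['湖']
```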

Interestingly, adaptation after adaptation, innovation after innovation, and error after error over the many years of use of Chinese characters shows that the phonetic component is perhaps the most robust, salient aspect of the system (thank you, John DeFrancis, among others): nonstandard usages regularly borrow familiar characters strictly for their phonological similarity, ignoring the original semantics also encoded with them. A triumph of form over content, if you will, and a highlight of the tendency for this cognitive system (like many) to happily manhandle any "real" material into an abstracted symbol. Just as in spoken languages, classifiers aren't entirely crucial, at least at the purely lexical retrieval level; our spoken and graphic lexicons can get by without them just fine. They just help.

Not so sure if our syntactic systems could, though.]

teiscinn

Okay, so now a bit more intro.


My thinking tends to be a bit 雜 zá 'random, eclectic'; i.e. in my case a nicer way of saying 'not very disciplined'. Hence this blog will likely tilt towards---"careen around" feels a bit more honest---the surface gloss I like to give to the Chinese literary genre called 雜記 zájí. Namely, 'random notes'.

An undisciplined mind: this is why I tend to perish more than I publish.

But the 雜, that's why we are here, way out in the open ocean---to slosh and splash about, and see what floats and what sinks. (I'm in the basement, mixing up the metaphors....)

So this is a place to hoot and holler about the fun-ness of this wacky phenomenon we call language. Here is where I will happily gibber about the ideas that pop into my head, and encourage others to similarly let loose their voices as linguists.

Wednesday, March 3, 2010

namings

Oh, and the name. Well, it's likely to be modified, because I'm not even sure that the grammar is correct. You try getting genitive case and plurality correct after numerals in Irish, especially with funky little subclasses of nouns with regard to quantifications, and idiomatic exploitation of all the morphological possibilities.

So yes, this might change.

And as for why, well, Lameen, this one's for you.

startings

So I find that so much of what I want to say in my field isn't formal enough (yet!) to put in the exacting form of a decent publishable paper. Welcome, then, to my workshop. Mind the dust; I think there's a chair over there that you can sit on if you clear all the stuff off of the top of it. No, it's okay, you can just dump it right on the floor there. If you can find an open space.