The perils of fieldwork, part two (now of three)

Well, the promised tale of a scrape with death is going to have to wait a day or so, as I have an intervening update to make. Fortunately, it stays within the theme of poison and pain; but this time the poison is self-administered and the pain well-deserved: I got drunk.

Since getting here — with the exception of a weekend visit to the coast — I have been the model of moderation, taking no alcohol and generally behaving myself. To what extent this has been due to the slowness with which my contacts network has developed, and to what extent due to my own wisdom and careful living, I leave the reader to judge. But an additional factor certainly is locality: on my block there are three bars, and two of them are, well, probably a little too scuzzy for a nice, well-turned-out boy like me. The third just never seemed to be open until the last couple of weeks when, in walking past, I noticed that not only was it open, but the floor seemed to be unsticky, the tables to be clean and a general smell of stale cachaça to be absent from the place. Additionally, and highly temptingly, the guy running it seemed to regularly sport the previously-mentioned mineiro-style hat.

So on Saturday afternoon I popped in for a quick beer and a chat. And on Sunday morning, I gently staggered out [+beers], [+amigos] and [+informants].

Mike, the hat-wearer (I think it is safe to name him, as the chances of him being an informant are zero), is Swedish, married to a Brazilian, Ana, and has just taken over running the bar. I got on extremely well with both of them and both were very interested in my research. Ana has lived here on and off most of her life (her father is a politician) and, though unsuitable herself as an informant, took down detailed notes of my selection criteria and promised to work her way through her address book. Mike is a musician, meaning that the bar has a lot of regulars who are musicians, and apparently many are mineiro as well. Of course, telling me this could simply be a ploy to encourage my custom.

But promisingly, I have three potential informants from the evening itself, all of whom are regulars at the bar and so will be easy to pin down. One of them, whom we shall call “Chilly Willy,” moved here at the age of nine, almost in the first days of the Brasília project, and is now 64. We had quite a long chat about language attitudes and the like, and he had some interesting opinions. He is there most days from fairly early, as he generally gives a hand with the opening of the place, and Mike has told me that if I have any interviews around opening time, he is happy for me to turn the music off and use the back room of the bar for recording. Catching at least two of the new potentials early in the day is going to be essential, unless I want to have to factor the effect of three or four glasses of cachaça into my analysis.

I should stress now that, of course, one boozy night does not necessarily great and lasting friendships make. So like a good scientist I popped back in again yesterday for lunch (a very strange fist-size dwarf chicken) to take a second observation, and all the readings seemed pretty much the same. So, tomorrow I shall be trying to catch Chilly Willy, before he gets stuck into the cachaça, and record him.

In general, having a friendly local bar-owner could be extremely useful, particularly for recruiting from the lower socio-economic sections of the populace. I am aware that this is, of course, Brazil, and there is something of a divide between the middle and working classes. A slightly down-at-heel bar stuck between two very down-at-heel ones actually, to be aggressively realistic, strikes me as a perfect place to meet and get to know a much wider range of people whilst minimizing the very real risk in urban Brazil of getting myself targeted by bandidos.

So, a highly productive — not to mention fun — weekend was had, and after a slow initial few months, I feel much refreshed and more optimistic about completing all of this on time.

Next up, the delayed tale of a brush with death, which is what I know you all wanted to hear about anyway, instead of my minor inebriate jaunts.

The perils of fieldwork, part one

To be honest, the “work” part of the title is not accurate: both of the events to be discussed occurred during a couple of days’ break I took to visit friends on the coast. Both feature rather poor-quality images, which I shall — did you guess? — use as a starting point for a bit of a linguisticy chat. But as an unusual added extra both also feature narrowly-avoided horrendous pain and, in part two, possible death.

Part one now, and this features a couple of beasties you may already have seen if you attend to the other place as well: spider-hunting wasps. There are actually two of them in the photo, busy in the process of creating a third (click on it to see this dodecapedal filth full-sized). Once the egg is fertilized, the female goes out, finds a decent-sized spider, and paralyses it with a specially-adapted sting. She then drags the spider to her burrow (really and truly, she does), where she lays the egg upon its still-living, but immobile, abdomen. The larva hatches, and finds itself on a substantial and fresh source of nutrients, which it proceeds to consume, even avoiding the essential organs until the very end. All of which has nothing to do with linguistics, but is totally, totally amazing, and more than a little gross.

I disturbed this couple whilst trying to photograph them, and they broke up and whirred around angrily, at which point I strategically ran for it as they’re about three or four inches in length, and I have a tendency to swell up like a balloon at just an ordinary British wasp sting. When I showed the photos to friends in the coastal village where I was staying, I was told that these were called marimbondo, and that their sting was, indeed, extremely painful. All well and good, new word learnt, excruciating pain narrowly avoided.

The interesting part arose when I returned to Brasília, and mentioned to a few friends that I had seen some marimbondos mating, disturbed them and was glad I avoided their sting. This was met with polite indifference, so to stress the extreme peril I had been in I brought up the whole spider-hunting business. What? No! Marimbondos don’t hunt spiders, marimbondos are just ordinaryish wasps; slightly large but not three-or-four inchers. The photos were produced for confirmation and I was duly and authoritatively informed that these were not marimbondos. What’s going on here?

On the one hand we are long accustomed to different words for the same thing, particularly across country boundaries. British English speakers know that when Americans say diaper they really mean nappy, and similar differences occur between Brazilian and European Portuguese. Sometimes there are odd or amusing coincidences — such as the potentially eye-watering difference in reference of Durex between the UK and Australia.

In fact, alternative names happen on a far more local scale than across countries: I grew up in Plymouth, where (keeping with the original theme) wasps were known as jaspers. I have no recollection of hearing this term, which must be said in the broadest of Janner accents, anywhere else. However, unlike the US/British difference, this local term is supplementary to the general term: no Plymouthian would fail to recognize the word wasp.

As a converse phenomenon, one can come across localized taboos, where the general word for something is prohibited for cultural or superstitious reasons. The most common taboo words are, of course, swear-words and blasphemy (however construed in the culture in question) but other, more apparently arbitrary taboos can exist. They are not necessarily geographic in their distribution, but associated with a particular sub-culture or profession, such as the actors’ bar on saying Macbeth; often, in small communities with a single industry, the two conflate.

As the prevalence of swearing in the public arena has expanded in the west, and the total prohibition of blasphemy reduced, we are unaccustomed to the idea that there may be words in a language which some individuals completely avoid. However, it is to be found not necessarily so far from home. A colleague at the University of York, Damien Hall, works on the Accent and Identity on the Scottish-English Border project, and in the town of Eyemouth, at the south-east point of Scotland, encountered a number of older fishermen who would not in any circumstance use the words pig or pork, even avoiding them in a written text they were asked to read as part of the interview. In this case a local term, guffie, was substituted. In other systems phrasal euphemisms are used as an avoidance strategy, such as the practice in Judaism of saying adonai (“my lords”) where the text contains the prohibited YHWH.

What’s happening with my marimbondos, however, is a slightly different phenomenon again: the term appears, if we generalize, to refer to wasps of some type, but the detail of the reference varies from place to place. I want to stress that we should not consider the term to be vague in the way that, say, bush or shrub may be vague in English, where we cannot precisely delimit where it ceases to apply. For any individual speaker, there seems to be a fairly clear schema of the insects marimbondo covers; it just varies between speakers. That the same word can fall within the same hypernym (higher-level category) and yet locate totally discrete items should be unsurprising to anyone of British origin who has had to negotiate the geographical and social variation in the reference of the meals termed tea, dinner, and supper, which has even been the topic of some published research.

Although, in one respect, this kind of lexical variation is beyond the original plan of my research, this has sparked my interest. Is it a gradient phenomenon or a categorical one? In other words, do caiçara speakers identify species A, B and C as marimbondo, while mineiro speakers identify species D, E and F, with no overlap? Or is there an intervening dialect which, say, uses it for B, C and D, or even all of A–E? Additionally, and now specifically related to my research, if the various migrants to Brasília have brought different ranges of reference of the term, has a specific range won out in the speech of the youth? Or does their speech cover all ranges? Or, as pointed out by the ever-helpful Jen Nycz, as the youth group are city-dwellers, do they even have these detailed species terms, as opposed to just, for instance, vespa (the broadest term for wasp)?

Although this is, with respect to the underlying themes of my research, a fairly trivial question, thinking upon it has given me a solution to a problem I had still not solved satisfactorily to my mind — and that is the problem of word lists. It is a fairly standard procedure, in sociolinguistic research, to end an interview with a request that the interviewee read a short text or list of words. The purpose of this is to ensure that the researcher has at least a set of tokens of the same type for every speaker, and also to elicit speech in a more formal manner. Although I have prepared such a list, I have been deeply uncomfortable about using it, because Brazil still has relatively high rates of illiteracy, and — particularly amongst my older age-group — I run the risk of asking them to perform a task that they cannot easily or comfortably do. Nevertheless, it would be nice to elicit some similar tokens from all individuals.

In variant names for flora and fauna, I have found a perfect solution. When those who disagreed with my original understanding of marimbondo were pressed, they turned out to have a wider awareness that in other regions the terms were not the same; this seems to be generally known. It is early enough in my fieldwork, I think, to allow me to source a number of higher quality images of a range of flora and fauna with disputable ranges of reference, and show each of them to each interviewee, with the explanation that I would like their term for this creature or plant. This will be supplementary to the existing wordlist (in which it is easier to control for desired sounds), but will mean that, for speakers who show discomfort with the written materials at the start of the interview (all interviewees are asked to read and sign a consent form), I need not ask them to undertake the additional wordlist task.

So, this post ends with a call-out to Brazilians: I need species (animals or plants) with contentious or frequently localized names! Send me details and get credit in my acknowledgements! What more could you want?

Up next: the perils of fieldwork, part two. This time with added mortal peril!

A near-minimal pair

Apologies for the delay in updating. I said that I would post content on some of my research themes, but when I started writing these sections I found that they would largely overlap with the introduction chapter to my thesis. Having spent a good portion of the last ten years advising publishers on effective repurposing of content, I could not ignore my own mantra — that it is easier to remove content and scale down technically than to add and increment — and have been occupied with writing that chapter first. Selected highlights of it to come.

Additionally, and regrettably, things are still moving slowly on the informant front. I have secured a few more promises, and made some recordings. But potential informants often have lives of their own, curse them, and turn out to have done things or lived in places that disqualify them from your sampling criteria.

Just because they have been disqualified from the sample criteria does not, however, mean these sessions have been entirely useless. One can learn from them about the location, and language attitudes, even if the speech data itself cannot be added to the pool to be analysed. Today’s post is about a near-minimal pair I have encountered in two such recordings.

The term MINIMAL PAIR is used in linguistics to refer to two items — usually words — that contrast in only one feature or element. They are thus used as evidence that that element can be used contrastively in the language or system under investigation. For instance, in English the words beet vs bit form a minimal pair, indicating that the difference between the vowels ee and i is contrastive in English — that different meanings arise when each is used. In Portuguese, however, these two vowels do not contrast. Conversely, Portuguese contrasts nasalized vowels with their oral versions: thus vi and vim (“I saw” vs “I came”) form a minimal pair — note that the m is not pronounced as an English m but instead makes the preceding vowel strongly nasal, similar to what happens in the more familiar French non.[1]

Minimal pairs are much favoured by phonologists to get at the sound system of a language, but in principle you can have a minimal pair of anything — just as long as you can get two items which are identical in all but one respect. In fact, although the term minimal pair is a linguistic one, this is really not a particularly linguistics-specific idea: that in investigating a system, if we can find two items that differ in only one feature, and are treated differently by the system, then that feature matters to the system.

If there are a couple of features in which they differ, we term this a near-minimal pair. These are also useful, but we have to be more careful with them: we have to decide whether there is any relationship between the variable features. Is it necessary that both features differ for contrast? Does one of the features depend upon the other, or can they alternate freely? Near-minimal pairs are not, in themselves, best considered evidence of an aspect of a system; however they are useful for directing one’s attention towards features which may be of interest, and questions we need to ask of the system.
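For the programmatically inclined, the idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: words are represented as equal-length lists of segments, the function names are my own invention, and anything with two or more differences is lumped together as "near-minimal", which is looser than a linguist would want.

```python
def differing_positions(a, b):
    """Return the indices at which two equal-length segment lists differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same number of segments")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def classify_pair(a, b):
    """Label a pair 'identical', 'minimal' (one difference) or 'near-minimal'."""
    count = len(differing_positions(a, b))
    if count == 0:
        return "identical"
    return "minimal" if count == 1 else "near-minimal"


# English beet vs bit: only the vowel differs.
beet = ["b", "i", "t"]
bit = ["b", "ɪ", "t"]

# Portuguese vi vs vim: oral vs nasal vowel.
vi = ["v", "i"]
vim = ["v", "ĩ"]

# English beet vs pit: both the first consonant and the vowel differ.
pit = ["p", "ɪ", "t"]

print(classify_pair(beet, bit))
print(classify_pair(vi, vim))
print(classify_pair(beet, pit))
```

The same comparison works for anything that can be written as a list of features, which is what lets the notion stretch beyond sounds.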

So what is my near-minimal pair? Well, it’s rather more metaphorical, and the system in question is social rather than a language: two guys whom I recorded as informants who have remarkably similar backgrounds. Both guys work at the same place in the same type of job, have the same level of education, share a house, are from the same town in Minas Gerais and are roughly the same age. One of them — “Mighty Mouse” — came here to find work about six years ago and the other — “the Don,” who knew Mighty Mouse and was a friend from their hometown — came later, about three years ago. (That both guys are also really quite short — a good few inches less than me — only adds to their suitability to be classed as a minimal pair.)

So, Mighty Mouse and the Don have remarkably similar backgrounds. However they differ in two notable respects: their attitudes towards Taguatinga, and towards the brasiliense dialect. Mighty Mouse has been here longer, but dislikes Taguatinga, considering people here to be cold. He is here to work and get money, but intends to return to Patos de Minas when he can. The Don, however, very much likes Taguatinga, and intends to stay here, with no desire to return to Patos. When I asked them questions concerning the brasiliense dialect, there was also a difference. Mighty Mouse was clear that there was a brasiliense dialect, and that he would be able to identify someone as being from the Federal District by their voice. The Don denied that a brasiliense dialect had emerged. Both agreed that speech here is very mixed, with aspects of north-eastern, mineiro and other dialects, but whereas to Mighty Mouse there is now an emergent accent, which contains features of each but is identifiable on its own, to the Don there simply remains the mix of original migrant accents.

Now, this raises a question: can we postulate a reason why this might be the case? We can, of course, prove nothing with just a couple of observations, but from the above, two possible hypotheses arise:

  • There could be a temporal reason for the difference. Mighty Mouse has been here for six years compared to the Don’s three: it could simply be that it takes time to recognize the brasiliense accent.
  • There could also be an attitudinal reason. Mighty Mouse does not identify with the brasiliense folk, the Don does. Could the Don’s desire to be part of the community motivate his denial of a dialectal difference between him and that community, and Mighty Mouse’s distancing himself from them allow him to assert the contrary?

We should note that these hypotheses are not mutually exclusive: both could be contributory factors, or neither. We also do not know whether reported perception of the brasiliense dialect corresponds to an actual ability to distinguish speakers from the region, and whether or not it also corresponds to anything in the interviewee’s own speech behaviour. Obviously, from just two informants, we can draw no conclusions.

But questions such as these are interesting ones to sociolinguists, and we can try to frame them in a way that is quantifiably testable, rather than simply slightly fluffily observational, as I have been so far. Fluffily observational is a reasonable starting point, but we want to ask some more scientific questions, and for that we need data and research questions! We need to find ways to measure the phenomena, and then get measurements from a larger sample — a representative one — and frame some questions in ways that we can statistically analyse. The kind of thing we might want to ask is:

  • A perception question: does acknowledgment of the existence of a brasiliense dialect by a speaker correspond to an actual ability to correctly identify if other speakers are from Taguatinga/Brasília?
  • An attitudinal question: does acknowledgment of the existence of a brasiliense dialect by a speaker correspond to the speaker’s attitudes towards Taguatinga/Brasília in general?
  • A production question: does acknowledgment of the existence of a brasiliense dialect by a speaker correspond to presence or absence of particular features in their own speech?
  • Does time spent in Taguatinga by a speaker correspond to the same things?
  • If both acknowledgement of a brasiliense dialect and time spent in Taguatinga have effects on production and/or perception, is this because they themselves co-vary? Or do they differ in the amount of influence they have?

If we can find ways to measure the issues under consideration across the population, there are mathematical techniques that we can use to try and answer questions like these. Working out what to measure is not always an easy task, however. Time spent in Taguatinga is easy enough to measure, and a claim to recognize the brasiliense dialect is also simple enough: a basic yes/no question, “Do you think there is a distinctive brasiliense dialect?” (which we express mathematically as 1 or 0).
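As a rough sketch of what that coding might look like, here are the two informants from above turned into numbers in Python. The class, field and function names are hypothetical ones chosen for illustration, and the years are the approximate figures given earlier, not precise measurements.

```python
from dataclasses import dataclass


def code_yes_no(answer: str) -> int:
    """Map a yes/no answer (English or Portuguese) to 1 or 0."""
    normalized = answer.strip().lower()
    if normalized in {"yes", "sim"}:
        return 1
    if normalized in {"no", "não", "nao"}:
        return 0
    raise ValueError(f"cannot code answer: {answer!r}")


@dataclass
class Informant:
    pseudonym: str
    years_in_taguatinga: float  # approximate, as reported in interview
    recognizes_dialect: int     # 1 = yes, 0 = no


# Approximate values taken from the two interviews described above.
mighty_mouse = Informant("Mighty Mouse", 6, code_yes_no("yes"))
the_don = Informant("the Don", 3, code_yes_no("no"))

for person in (mighty_mouse, the_don):
    print(person.pseudonym, person.years_in_taguatinga, person.recognizes_dialect)
```

Raising an error on anything other than a clear yes or no is a deliberate choice here: a hedged or mixed answer is better flagged for a human decision than silently forced into 1 or 0.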

But what of attitudes towards Taguatinga? If we want to apply mathematical techniques to analysis, we need to get a number for this. We could simply ask “do you like Taguatinga?” as for the dialect question. But there are a couple of problems with this. Liking or disliking is more of a gradient phenomenon: to reduce it to a binary opposition is somewhat simplistic. It would be nicer if we could find some kind of a scale. Additionally, people’s attitudes towards a place can be rather more complex than simply like/dislike. My colleague Jennifer Nycz, who has worked on the speech of Canadian migrants to New York, commented to me that “there’s the national attitude of Canada vs. US, where typically people love Canada and didn’t really like US; but then on a local level they loved NYC while not really feeling at home in whatever small area they came from in Canada.” One solution to the former problem could be to ask people to grade their liking on an arbitrary scale, say from 1 to 10. However, this kind of approach can be problematic, as different people will interpret the scale in different ways, and it also does not tackle the problem of mixed attitudes. Alternatively, I could code this myself, thus dealing with the problem of different interpretations of the scale, but instead introducing subjectivity, in that we are now dealing with my judgments of the respondents’ attitudes rather than a direct questioning of them.

The approach which I will be taking in my thesis attempts to deal with this by the use of proxy variables and principal components analysis. Wait … wait … don’t go running yet. The maths is horrible, but the concepts are actually quite simple. The idea is that, instead of asking one question from which we attempt to deduce a gradient phenomenon, we ask a large number of simple yes/no questions which can cover an extremely wide range of topics, as long as they relate to the local community and attitudes towards it or participation in it: “Do you plan to stay here?,” “Have you/will you bring up your children here?,” “Did you vote in the last local election?”

These questions we call “proxy variables” because they stand for what we actually want to measure. We then apply a technique called principal components analysis, or PCA. As I say, the maths behind this is pretty complex, but essentially what PCA does is to take all of the variables, and produce a number of weighted groupings of them, and each grouping is then ranked according to the amount of the total variation that it accounts for.

That sounds nasty, but it’s not. Imagine that one of my questions was the rather foolish “Do you speak Portuguese?” Given that all respondents will answer “yes”, this question will account for none of the variation in the population — and so its entire weight will be assigned to the bottom-ranked grouping. Now imagine I ask “Do you think that there is a brasiliense accent?” and also “Do you like the brasiliense accent?”. Now whilst these both will vary between respondents, there will be a certain lack of variation between the questions themselves, for (almost) no-one who claims that there is no accent will also claim to like it! So, we might see their weightings correspond quite closely, and appear in the same ranked groups.
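The two ingredients in that paragraph, a question that never varies and two questions that vary together, can be seen with nothing fancier than a covariance calculation. The sketch below uses invented answers from six hypothetical respondents; it is not real data, and it is only the raw material that PCA works from, not PCA itself.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)


# Invented answers from six hypothetical respondents (1 = yes, 0 = no).
speaks_portuguese = [1, 1, 1, 1, 1, 1]
accent_exists = [1, 1, 1, 0, 0, 1]
likes_accent = [1, 0, 1, 0, 0, 1]  # nobody likes an accent they deny exists

# A question everyone answers the same way has no variation to explain.
print(covariance(speaks_portuguese, speaks_portuguese))

# Two linked questions move together, so their covariance is positive.
print(covariance(accent_exists, likes_accent))
```

The covariance of a list with itself is just its variance, which is why the first call doubles as a check that the Portuguese question contributes nothing.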

Because weightings are applied, the resultant groups have gradient scores, rather than just one or zero, and because more than one grouping is returned, the question of whether there are complex and conflicting attitudes will be reflected in the amount of variation accounted for by each group. If attitudes are fairly simple, there will be a large amount of co-variation between the variables (e.g. all the people who like it here also intend to bring up their children here, like the sound of the speech here, and, let’s say, spent last Christmas here) and so the process would return one group that accounted for maybe 75% of all the variation, and all the others accounting for a minute amount. We would then use this group and the informants’ scores on it as our “attitude toward Taguatinga” variable. If, however, people’s attitudes are more complex and there is less co-variation, the process might return two groups that both accounted for, say, 40% of all the variation. In this case, we would have two resultant variables, and we would have to inspect the individual weightings to decide what each of them accounted for.
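To make the weightings, the share of variation and the gradient scores a little more concrete, here is a minimal Python sketch. Several things in it are assumptions of mine: the answers are invented for eight hypothetical respondents, the question labels are paraphrases, and instead of a full PCA it uses power iteration, a simple technique that recovers only the top-ranked grouping. A real analysis would use a statistics package and look at all of the groupings.

```python
import math

QUESTIONS = ["plans to stay", "raising children here",
             "voted locally", "likes local speech"]

# Invented answers for eight hypothetical respondents (1 = yes, 0 = no).
ANSWERS = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def center(rows):
    """Subtract each column's mean so every question averages zero."""
    n, k = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    return [[row[j] - means[j] for j in range(k)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance between every pair of questions."""
    n, k = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(k)] for i in range(k)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def leading_component(cov, iterations=200):
    """Approximate the top-ranked weighting and its variance by power iteration."""
    k = len(cov)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variation at all in the data
            break
        v = [x / norm for x in w]
    variance = sum(x * y for x, y in zip(v, mat_vec(cov, v)))
    return v, variance


centered = center(ANSWERS)
cov = covariance_matrix(centered)
loadings, top_variance = leading_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
share = top_variance / total_variance

# Each respondent's gradient score: their centered answers times the weights.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]

for name, weight in zip(QUESTIONS, loadings):
    print(f"{name:>22}: weight {weight:.2f}")
print(f"share of variation in first grouping: {share:.0%}")
print("respondent scores:", [round(s, 2) for s in scores])
```

On this invented data the first grouping accounts for roughly two-thirds of the variation, the respondents who answered yes to everything get positive scores, and those who answered no to everything get negative ones. That is the "simple attitudes" scenario; making the answers less consistent with one another would push the share down.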

Once we have scores for all the variables we think might affect the answers to our research questions, there are a number of mathematical techniques we can use (regression, analysis of variance and similarly scary-sounding processes) to try to tease out the comparative effect of each of these variables. But an explanation of how that works, I think, should be left for a future post.

[1] Should you care for a bit of linguistic detail, English contrasts /i/ (usually long) with /ɪ/. In Portuguese they occur in complementary distribution, with [ɪ] as an unstressed allophone of stressed /i/. The Portuguese system, however, contrasts oral /i/ with nasal /ĩ/.