Interrogating openness

Open data and qualitative research

Aug 03, 2022

As a qualitative researcher, I’ve observed developments around open science and open data with interest, but also with some vague sense of … caution. Sometimes, particularly when discussing open science with its advocates, that caution has taken on the form of more explicit concerns. I’d like to explore some of those concerns here.

In the context of contemporary scientific discourse, the preferred meaning of ‘open-ness’ has settled on some very specific qualities. In the interests of considering how and where the movement for open-ness might be relevant for qualitative research, it seems sensible to begin by considering what is meant by open-ness, and also what is excluded.

Firstly then, open-ness is good, right? It’s a universal assumption in this discourse that this is the case. There are certainly lots of everyday examples of open-ness as good: open arms, open hearts, open minds, open borders. We can – and should – consider alternative formulations: a mineshaft, disused and left open, is not good; open access to distressing material and imagery is not good; a kindergarten which is open to any random visitor is not good; an open wound is not good. So we should at the outset be wary of simply assuming that open-ness is always good. Some things call for us to be responsible and protective, too.

There is a secondary assumption associated with this, which is, I think, more nuanced, and this is the idea that the alternative to open-ness is closure. I think many formulations and practices within the open science movement itself do show an understanding that open-ness might sometimes be a matter of degree. One problem for qualitative researchers, however, is that some gatekeepers (funders, journal editors and reviewers) can be over-zealous and may not always enact their roles with these nuances in mind. So it is important to consider that between open-ness and closure, there are also varying kinds of permeability, varying degrees of transparency, and varying forms of guardianship. How do we articulate these clearly, and which positions are relevant to qualitative research?

I think it’s also worth noticing the visual metaphors which underpin this commitment to open-ness: transparency looms large here. There are other ways of thinking about how open-ness might be mobilised: the opportunity to be involved and engaged in the primary research as a citizen collaborator, for example, or the ease with which research can be accessed and understood. Some of these alternatives are fellow travellers with open-science (open access publication, for example) and some are not (it seems that science is still largely for experts, in this discourse). When we ‘see through’ the transparency that is sought by open science, what are we looking at? Largely, we are looking at data. This is important because being able to see data is held to be a good thing: the data are taken to be meaningful in their own terms, and potentially valuable for both replication and secondary analysis. It’s also important because of what we are not looking at, and what we are not looking at is any sort of robust response to the problem of reflexivity.

Whether one prioritises problems of replication, or problems of reflexivity, is contingent largely on one’s epistemological commitments. Qualitative researchers have long been concerned with striving for open practices of their own, to engage with the problem of reflexivity. However, qualitative research has never been in a position of power which has enabled such practices to be required of quantitative researchers. Indeed, in very few publication and conference spaces are qualitative researchers even in a position to include those open practices in their presented work: it is a commonplace that reflexive work is done in the thesis, or in ‘the process,’ but reduced to a minimum in the ‘output.’ When we ‘see through’ via a reflexive lens, we are not looking at data: we are looking at the researchers, and we are looking at the context and practice of knowledge production itself.

Replication and Science

The problem of replication is simply that confidence in science is undermined if procedures of design and analysis can be distorted by errors, or by malpractice, within a research team. Allowing others to scrutinise data and repeat analyses, independently of the primary research team, is therefore held to solve this problem. So far, so good – if science is what you’re doing, and if we can all agree what that means (more on that later).

The problem of reflexivity is more foundational: scientists are humans, and not separate from the system which they study. Their motives are inevitably also human, and complex, and so bias creeps into their observations through the careers that they pursue, the organisations who pay them, the methods that they choose, the measures that they apply, and so on. The people affect the work, and the work affects the people; it’s a loop. Quantitative researchers within the empiricist tradition[1] typically respond to this problem by aiming for closure. The reasoning usually goes that if we close the system (through appropriate sampling, by using standardised measures, and in applying recognised statistical procedures), we keep ourselves out of the loop so far as is possible. This is usually presented pragmatically: we strive for objectivity through closure, even when we recognise that it might not be wholly attainable.

This is interesting, right? Are we calling what we see here a … dialectic, or a juxtaposition or something clever like that? Empirical science sits on a core practice of closure, but in its current moment of crisis it is strongly espousing a commitment to open-ness. This is almost exclusively an open-ness about design commitments and data – it is a managed open-ness which does nothing to facilitate discussion about leaks in the closed system itself. I know it sounds like I’m going somewhere with this bit, but honestly – do what you like with that, I’m just glad it’s not my problem.

What is my problem? Qualitative researchers within the interpretivist[2] tradition typically respond to the problem of reflexivity by engaging in reflexive practice. What does this look like? In short, it involves embracing subjectivity. It is an alternative tradition of practising open-ness, but open-ness about the researchers and the research process. Strategies include: writing in the first person; reflecting on experience, identity and process (how it went, what you brought to the work, and how the work interacted with that); engaging in dialogue (with peers, with supervisors, with experts-by-experience) and drawing on that to inform the work and to ensure that it is understandable and meaningful for its intended audiences; supporting permeable and interdisciplinary research teams; attending to context and reflecting on its consequences; exposing challenges and errors; and sometimes, abandoning work which feels unethical or fatally compromised. There is a great paper[3] encouraging quantitative researchers to think and work in these ways, but there is nothing which matches the pressures currently applied to qualitative researchers, with regard to signing up to open science practices.

Why does this matter? It matters because we haven’t yet described the response of qualitative researchers working within the interpretivist tradition to the problem of replication. This is why the open-closed contradiction is not my problem: we don’t care. We don’t care about replication. We are not trying to produce context-free knowledge. We are not trying to produce objective knowledge. This is because, when we write a qualitative paper, we are not announcing, ‘Objectively, this is how things are.’ If we were, it would be important that someone could indeed check that this is how they are. But instead we are saying, ‘Stand here with me for a moment, and see how things look, in that context, from this perspective. Does this view seem plausible and helpful? If so, here are some consequences to consider[4].’

Are other interpretations possible? Of course they are, but we foreground the one (or more than one) that best survives our reflexive work. This might be because it made sense to our collaborators and stake-holders, or because it is so insistently explicated by our participants in the raw data, or because it resonated with the research team, or because it completely up-ended our theoretical expectations – but we can be open about those reasons, if we have the opportunity and space to write reflexively.

It will be evident by now that these two traditions have different ideas about what the main epistemological threat to ‘doing good research’ looks like, and what the main qualities of ‘good research’ should be in response to that. I’ve been doing qualitative research – on and off – for a little over thirty years. In that time, I’ve been around a lot of conversations about how what we are doing in qualitative research is scientific, it’s just that, you know, it’s not empiricism, or it’s not positivism – you know, it’s just a ‘different model of science.’ If that was ever helpful, I don’t think it is helpful now. The idea and practice of ‘science’ is too tightly bound up with empiricism and closure to accommodate any alternative models under its umbrella. If there ever was the cultural capacity and willingness to accommodate us as fellow scientists, then over the last 30 years, there would have been some progress. I don’t think there has been much progress – a growing interdisciplinary space, yes, but not a more flexible and accommodating understanding of ‘science.’

If you think this can still change, just look at the sleight of hand going on with managed ‘open-ness’ in the context of a discipline whose fundamental principle is closure. No ground has been given, and no ground will be given. This might be harder for some than it is for me: I don’t think of myself as a scientist. I do value the work that scientists do, and I do collaborate with lots of scientists in my work, and you know, some of my best friends are … Anyway: it would be more healthy, I think, to develop an interdisciplinary conversation about what we call this thing which is science-adjacent, which is systematic, which is epistemologically-grounded, and which contributes useful and important insights to formal human knowledge. The Germans, as the saying goes, have a word for it. By ‘it,’ I mean, formal scholarship in a way which is more inclusive than ‘science’. So we could use that word: Wissenschaft. Or you know, someone on the internet could make up a word, and we could end up with that word instead, if we want a word that we all hate using. That’s what usually happens. (I was going to make up a word here, as a joke, and it made me laugh, but I am afraid it would travel.)

So those are my conceptual concerns about applying the current understanding of scientific open-ness to qualitative research. There are obviously some serious problems with transferring the concept uncritically, and without adaptation, from one domain to the other.

Tediously, I also have a number of practical concerns. These are discussed (much more briefly) below:

Secondary data analysis and Context

Often, when I raise concerns about the relevance of replication to qualitative researchers, I am asked, ‘What about secondary data analysis?’ I mean, yeah, what about it?

Oh, OK. Well, hardly anyone does this, and while there might be some important methodological exceptions (I’m thinking of EMCA), the reason that it isn’t widely pursued is the set of conceptual issues above. Closure requires context-stripping, so of course secondary data analysis can be conducted usefully in quantitative work. But context is data in qualitative work, and some of that context is not visible to secondary researchers looking at anonymised transcripts. It’s in the field notes of the primary research fellow. It’s in the head of the principal investigator. It’s suspended in the unrecorded conversations that take place as researchers discuss the developing analysis with experts-by-experience and other stakeholders. Even where the primary work ‘simply’ involves analysis of extant documents (adoption records, say, or incident reports), there is a lot of contextual information which cannot be archived, and which makes the task of a secondary data analysis team very difficult. There may well be occasions on which such efforts are warranted, but these can be addressed via permeable data guardianship – they do not justify the expectation that everything is held in mass open data archives.

And as an aside: list the most important 10 qualitative studies in your field. How many of them are secondary analyses of data that were previously analysed and published by another team? Can you think of any studies that do that [excluding EMCA work for a minute]? What would the rationale for such a study look like, in your field? We can imagine such a rationale, but it is usually a unique methodological case. We may need to develop clear protocols for such possibilities, whilst recognising that they are likely to be relatively rare.

Ethics and Anonymity

The complexity of issues around anonymity in qualitative research is often overlooked or oversimplified in requests for data to be archived. There are lots of examples of why, but here are five:

1. We might have a small sample who share a common context, and who could recognise each other from a mis-handled secondary analysis. Avoiding this requires careful management of the presentation of the data (i.e. someone who knows them, and can carefully select data extracts for wider discussion which are safe for sharing). This is often described as a threat to internal anonymity.

2. We might have a sample who have shared incidents or described contexts that are sufficiently unique and identifiable that the primary team has chosen not to quote certain details, so that people who know them outside of the research cannot identify them from the research – what if the secondary team do not honour this? This is often described as a threat to external anonymity.

3. We typically have a relational ethics underscoring qualitative studies, which resides in the trust that participants place in their researchers, on the expectation that they will act in good faith. What if a secondary team come to the dataset with a very different aim? This would be a breach of ethics. Consent to participate is often contingent on the aim of the work.

4. This kind of relational ethics connects also to the composition and ethos of research teams. For example, what if women who are domestic violence survivors agree to participate in a study because it is ‘lived experience-led’, and then the data are subsequently shared with a team which is led by, say, an academic from a criminal justice background with a track record on ‘men’s rights advocacy’? Consent to participate is sometimes contingent on who is doing the work.

5. Arrangements often include opportunities for the participants to withdraw subsections of data, but also to mark some subsections of data as ‘suitable for analysis but not for quotation.’ This is a useful arrangement which strikes a good balance between the needs of the respondent and the needs of the researcher. It resides in the trust between this participant and that researcher. Can such trust be extended to some future, unknown third party? It seems unlikely.

It’s important to place these concerns about ethics in the context of the earlier concerns about limits to the benefits. In all of these examples, I think there is a greater role for managed permeability, and responsibility for primary researchers to take on guardianship of that, than there is for straightforward open access and transparency arrangements. I’ve seen solutions suggested where participants sign up in advance to complex hypothetical future arrangements. I’m not sure that these are ethical. I don’t know how Future Me will feel about Past Me’s decisions, and so personally, I would rather that the burden fell to the researcher, to ask Future Me’s consent, if and when the need arises.

Research designs

Some research designs are more amenable to this kind of thing than others. Qualitative research involves many different forms of data collection and analysis. If there are expectations about how we navigate the spectrum of positions between ‘open’ and ‘closed’, then these need to be shaped by a clear understanding of the different modalities of qualitative work. It may be that there are some modalities where open access arrangements transfer reasonably well, where the benefits are clear, and where the ethical issues can be easily managed. For other modalities – perhaps most – I think these are complex issues, and that the qualitative research community needs to be building an alternative conception of ‘open-ness’, from the foundations up, rather than once again finding itself trying to accommodate the ill-fitting frame of a scientific model.

[1] I’m using empiricist here in the sense of: a/ drawing on structured observations to b/ identify empirical regularities in order to c/ explain the causes of events.

[2] I’m using interpretivist here in the sense of: a/ exploring human sense-making and communication in b/ specific contexts to c/ understand the meaning of particular events for certain people in well-specified contexts.

[3] Jamieson et al. 2022. https://europepmc.org/article/PPR/PPR460659

[4] Thanks to Dave Harper for this analogy.

Useful further reading:

Branney et al. - 'Three steps to open science' - some of this fits with what I've suggested here, and on other points I think we take rather different views: https://psyarxiv.com/ahdcu/

Pownall - 'Is replication possible for qualitative research?' - more optimistic than me, I think, about the potential for space within science which is amenable to the commitments of qualitative research: https://psyarxiv.com/dwxeg/

Prosser et al. - 'When open data closes the door' - more extensive examples of areas where the commitments of qualitative research don't align well with the requirements of open data: https://psyarxiv.com/5yw4z/

Chauvette et al. - 'Open data in qualitative research' - share a lot of my concerns, I think, but have written a properly referenced academic paper to express them! https://journals.sagepub.com/doi/full/10.1177/1609406918823863

Michael
