Thursday, March 8, 2012

Research data as artificial object

These past few months have been so busy that I've not been in a position to write or submit anything even mildly academic, and it's been bugging me that, whilst I do have some ideas, I just have no time. So I'm turning to this neglected blog to dull the guilt. Here goes...

Last year I put together a paper (rejected) for an information systems conference. I'm of the mind that, whilst the IS discipline focusses on business and government, there is an increasingly complex set of developments occurring around research, specifically research data. The past few years have seen a lot of development, and growing expectations, around research publishing models; data management, sharing and reuse; collaboration; and so on. The IS field has models for evaluating this kind of work and a body of knowledge that (potentially) can inform those of us constructing research systems and research administration systems. Furthermore, the work being undertaken on research systems can contribute to the IS discipline.

In building research data systems, it is useful to understand the central concept being studied. One question I keep coming back to is: "what is research data?" Should we cover just the raw data, or also include the methods of collection, instrument details, decision-making notes, data cleaning procedures, and on and on? Most likely there's not so much a single answer to "what is research data" as a need to define who's looking at the data.
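To make the "who's looking" point a little more concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the `ResearchDataRecord` class, its field names, the three audiences and the `view_for` helper are hypothetical and not drawn from any real research data system. It only shows that one record can yield different answers to "what is the data?" depending on the viewer.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchDataRecord:
    """Hypothetical container; the field names are illustrative only."""
    raw_data: list
    collection_method: str = ""
    instrument_details: str = ""
    decision_notes: list = field(default_factory=list)
    cleaning_procedure: str = ""


# Hypothetical audiences, each mapped to the parts of the record
# that audience might regard as "the research data".
VIEWS = {
    "reader": ("raw_data",),
    "reuser": ("raw_data", "collection_method",
               "instrument_details", "cleaning_procedure"),
    "auditor": ("raw_data", "collection_method", "instrument_details",
                "decision_notes", "cleaning_procedure"),
}


def view_for(record, audience):
    """Return only the fields the given audience treats as the data."""
    return {name: getattr(record, name) for name in VIEWS[audience]}


record = ResearchDataRecord(
    raw_data=[12.1, 12.4, 11.9],
    collection_method="hourly manual reading",
    instrument_details="handheld thermometer",
    decision_notes=["dropped first reading after recalibration"],
    cleaning_procedure="removed readings taken during calibration",
)

reader_view = view_for(record, "reader")
auditor_view = view_for(record, "auditor")
```

Here the "reader" sees only the raw values, whilst the "auditor" sees all five parts of the same record, so neither view is the single definition of the data.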

Acknowledging an inability to generate a single definition of “research data” does not prevent research data from being studied as an object, regardless of its constituent parts. In considering research data in such a manner, Simon (1996) provides a useful starting point with his “sciences of the artificial”. Research data is an artificial (man-made) construct that describes or reports on natural or artificial occurrences. To support this assertion we can consider Simon’s four indicia for delineating the concept of artificial (as opposed to natural):
  1. Artificial things are synthesised (though not always or usually with full forethought) by human beings.
  2. Artificial things may imitate appearances in natural things while lacking, in one or many respects, the reality of the latter.
  3. Artificial things can be characterised in terms of functions, goals, adaptation.
  4. Artificial things are often discussed, particularly when they are being designed, in terms of imperatives as well as descriptives.
(Simon 1996 p. 6)

Applying these indicia to the notion of research data reveals its status as artificial:
  1. research data is created by humans and does not exist in the natural world;
  2. whilst the data may record a sample of the natural or artificial world, the data is a partial simulacrum of the subject being studied; and
  3. the creation and use of data serves a purpose determined by human objectives.
Regarding the fourth indicium, the process of creating research data is an exercise in determining how the subject may best be studied and which attributes should be collected; this is the imperative side of the discussion. On the descriptive side, research data is often described through metadata created as a specific descriptor or through publications referring to the data.

Simon, H. A. 1996. The Sciences of the Artificial (3rd ed.), MIT Press, Cambridge, Mass.