Another New Kind of Science?
Last weekend's Cultural Studies conference reminded me of a vicious cycle that many humanities-oriented researchers are being subjected to. Disciplines such as educational research, ethnography, anthropology, cultural studies, and sociology have effectively been colonized by the methodology of the social sciences, and they are being forced to play a numbers game they may not be suited for.
Many projects striving for credibility are subjected to the tyranny of statistics - forced to transform their qualitative information (interviews, transcripts, first-person accounts) into quantitative information through the process of coding. This reduction forces the data into buckets and creates a significant degree of signal loss, all in the name of a few percentages and pie charts.
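To make the reduction concrete, here is a minimal sketch in Python of what coding typically amounts to. The codebook, keywords, and excerpts are all made up for illustration; the point is that free text gets mapped onto a handful of categories and then counted.

```python
# A minimal sketch of the coding step, assuming a made-up codebook and
# made-up interview excerpts: free text is mapped to a handful of
# categories and then counted. Everything not captured by a keyword is lost.
from collections import Counter

codebook = {
    "frustration": ["frustrated", "annoyed", "gave up"],
    "engagement": ["excited", "curious", "kept going"],
}

excerpts = [
    "I was so frustrated with the interface that I eventually gave up",
    "She seemed excited and curious about the new material",
    "He kept going even after the first failure",
]

def code_excerpt(text):
    """Return the first matching code, or 'uncoded' if nothing matches."""
    lowered = text.lower()
    for code, keywords in codebook.items():
        if any(keyword in lowered for keyword in keywords):
            return code
    return "uncoded"

bucket_counts = Counter(code_excerpt(text) for text in excerpts)
print(bucket_counts)  # three rich excerpts reduced to two small numbers
```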
Perhaps we have lost sight of the motivation for this reduction - the substantiation of a recognizable narrative account of a phenomenon, in support of an argument. Arguably, the purpose of the number crunching is to provide supporting evidence for a demonstrable narrative. Modern visualization techniques may be able to provide that evidence without all the hassle.
True, this is not the only reason qualitative data is transformed into quantitative data, but advanced visualization techniques may provide a hybrid form that is more palatable to many of the researchers active in this area and is still a credible methodology. It seems as if many people are being forced into coding and quantification when they aren't thrilled to be doing so. The signal loss that coding introduces, all in the name of measuring, might be unnecessary if people considered data visualization tools that present the data in all of its richness and complexity, instead of boiling it down to chi-squared confidence levels. (And does this false precision actually make any difference? Do results of 0.44 vs. 0.53 tell significantly different stories?)
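For contrast, here is roughly what those coded buckets get boiled down to in the end. The groups and counts below are invented; the test itself is scipy's standard chi-squared contingency test.

```python
# A hypothetical illustration of the end point of the reduction: once the
# transcripts are just counts per code and per group, the "finding" is a
# single statistic and a p-value. The table below is invented.
from scipy.stats import chi2_contingency

observed = [
    [12, 30],  # group A: frustration, engagement
    [18, 25],  # group B: frustration, engagement
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")  # everything else about the interviews is gone
```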
In a thought-provoking post on the future of science, Kelly enumerates many of the ways new computing paradigms and interactive forms of communication might transform science. The device I am proposing here might lead to some of the outcomes Kelly describes.
For a better idea of the kinds of visualization tools I am imagining, consider some of the work on large email corpora coming out of the M.I.T. Media Lab, or the history flow tool for analyzing wiki collaborations. Even the humble tag cloud could be adapted for these purposes, as the power of words and visualizing the state of the union demonstrate.
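As a rough sketch of what adapting the tag cloud could look like: a few lines of Python that turn raw transcripts into an HTML cloud where the data's own vocabulary stays visible. The stopword list and font scaling are ad hoc choices of mine, not any particular tool's behavior.

```python
# A sketch of a tag cloud over interview transcripts: word frequency drives
# font size, so the vocabulary of the data stays visible instead of
# disappearing into coded buckets. Stopwords and scaling are ad hoc.
import re
from collections import Counter

STOPWORDS = {"the", "and", "a", "to", "of", "i", "it", "that", "was", "in"}

def tag_cloud_html(transcripts, top_n=50, min_px=10, max_px=36):
    words = Counter()
    for text in transcripts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                words[word] += 1
    top = words.most_common(top_n)
    if not top:
        return ""
    hi, lo = top[0][1], top[-1][1]
    spans = []
    for word, count in sorted(top):  # alphabetical, sized by frequency
        scale = (count - lo) / (hi - lo) if hi > lo else 1.0
        size = int(min_px + scale * (max_px - min_px))
        spans.append(f'<span style="font-size:{size}px">{word}</span>')
    return " ".join(spans)
```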
Crucially, tools analogous to Plone's haystack Product (built on top of the free libots auto-classification/summarizer library) might help do for social science research what auto-sequencing techniques have done for biology (when I was a kid, gene sequences needed to be painstakingly discovered "manually").
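I don't have haystack or libots code to show, but an analogous sketch with commodity tools gives the flavor. Here I'm substituting scikit-learn's TF-IDF vectorizer and k-means purely for illustration: a pile of documents gets grouped automatically and each group gets a crude keyword summary, with no hand-coding up front.

```python
# Not the haystack Product or libots themselves -- just an analogous sketch
# with commodity tools: automatically grouping documents and summarizing
# each group by its most characteristic terms.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def auto_group(documents, n_groups=5):
    """Cluster raw documents and return (labels, keyword summaries per group)."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(documents)
    model = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
    labels = model.fit_predict(matrix)

    # Label each group with its most characteristic terms, as a crude summary.
    terms = vectorizer.get_feature_names_out()
    summaries = {}
    for group in range(n_groups):
        center = model.cluster_centers_[group]
        summaries[group] = [terms[i] for i in center.argsort()[::-1][:5]]
    return labels, summaries
```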
The law firms that need to process thousands of documents in discovery and the commercial vendors developing the next generation of email clients are already hip to this problem - when will the sciences catch up?
For any of this to happen, the current academic structure needs to be challenged. The power of journals is already under attack, but professors who already have tenure can take the lead here and pave the way for their students to follow.
2 Comments:
The problem with statistical analysis in the hands of many is that they expect the statistics to yield the truth, and this leads to the mistake of reporting their findings in a theory-deprived context. Whenever you are dealing with the human sciences, whether the information is statistical, visual or otherwise, you still have to build a meaningful narrative, and that requires a point of view with either overt or covert theoretical assumptions. Without that you are in danger of reporting your views in what Marcuse calls operational language, a language derived from the tools of discovery rather than from a serious point of view.
Of course. I am just questioning the impulse to resort to statistics to support a theory in an era when there may be alternative, emerging techniques that can support the assertions just as well.
I left the NYU Tech & Learning Symposium thinking that we could understand more about human memory and cognition from an armchair (inner empiricism?) and that 2k years ago they had a better theory of the mind than we do now.
Interestingly, Ulises just made a related post questioning the ways in which technology impacts epistemology:
"it is true that technology alters our ways of knowing and thinking in irreversible ways. These shifts in epistemic stances are particularly pronounced in the use of technologies that manipulate language. If the manipulation of numerical data by computers fundamentally changed how we construct knowledge in the sciences, the manipulation of language by technology had a similar effect for other disciplines"
Complex visualizations won't yield truth any more readily than statistics. But could they become powerful and persuasive enough to displace statistics?