A chapter from Digital Humanities Pedagogy

Teaching Computer-Assisted Text Analysis: Approaches to Learning New Methodologies.
By Stéfan Sinclair and Geoffrey Rockwell
The chapter "Teaching Computer-Assisted Text Analysis: Approaches to Learning New Methodologies" by Stéfan Sinclair and Geoffrey Rockwell, in the Creative Commons book Digital Humanities Pedagogy, gives insight into teaching computer-aided text analysis to novices. I chose to detail and summarize this chapter because the theme intrigued me as an interdisciplinary student of computer science and literary and linguistic computing. It is interesting to read how light is shed on text analysis tools and on teaching them to novices. Text analysis is the main theme throughout the chapter: why it should be taught, introductory recipes for text analysis are described, and the concept of a notebook for advanced students is explained. Though it is stated that humanities students may be intimidated by using computers for text analysis, its application is omnipresent in our digital society, as shown by the Google search engine or predictive texting, and it is not exclusive to digital humanists or computational linguists. With digitization, a transformation of how written text can be handled emerges, and humanities students are especially well positioned to employ such methods when everyone around them does.
Computer-aided text analysis is illustrated as an aid to the interpretation of electronic texts, helping to find passages with interpretable words conveniently. Moreover, texts are broken into basic units for synthesis, so they can be counted, manipulated and reassembled. With the help of the computer, counting and statistics can be more sophisticated, with a larger scope and an easy way to visualize results adapted to the phenomena of interest. Searching large texts quickly, indexing and fast interactive finding can advance the analysis without frustration. Subsequently, complex explorations of word patterns or co-occurrences become accessible. Traditional reading practices cannot offer the possibilities that become feasible with the help of the computer. This leads to new approaches for developing interpretative questions and formalizations for the computer without reading the complete text, which is also called distant reading.
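As a rough illustration of the kind of counting and co-occurrence exploration described here, a minimal Python sketch might look like this (the function names and the sample sentence are mine, not from the chapter):

```python
from collections import Counter
import re

def tokenize(text):
    # Break the text into lowercase word tokens, the basic units of analysis.
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, top=5):
    # Count every token and return the most frequent ones.
    return Counter(tokenize(text)).most_common(top)

def cooccurrences(text, target, window=2):
    # Collect words appearing within `window` tokens of the target word.
    tokens = tokenize(text)
    pairs = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    pairs[tokens[j]] += 1
    return pairs

sample = "The whale surfaced. The white whale dived and the whale was gone."
print(word_frequencies(sample, top=2))   # most frequent tokens
print(cooccurrences(sample, "whale"))    # words appearing near "whale"
```

Real tools of course work on whole corpora and add visualization, but the underlying operations — tokenizing, counting, and relating words by proximity — are of this simple kind.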
Computer-aided text analysis should be taught because the scale of its practice on the web makes it unavoidable for those whose profession is to work with text. Examples mentioned include word clouds, Wordles and a New York Times tool for interactively scanning and working with the transcripts of a political debate. Basic analytical literacy to evaluate and interpret such visualizations and graphs is essential for humanities students. Students could also learn about the limits and potential of literary and surveillance analytics by inspecting how companies mine data from social sites.
To explain how to integrate text analysis into a course, three models are presented, ranging from simple assignments to a complex model in which students create their own text. The first task involves using a pre-populated analytical tool with an indexed text loaded for interactive reading, to trace the unfolding of themes throughout a text or carry out other simple exploratory tasks.
The second explains how to build a corpus around one's own research questions and texts. For this task, popular phenomena should be identified and studied with various text analysis techniques using numerous online texts. The procedure should be noted in a journal so it can be recreated. Additionally, hermeneutical questions can be posed to help formalize them, ideally before starting the computer-aided analysis with a dedicated tool, with a presentation of how the results were achieved in mind.
Thirdly, an introduction to text analysis is proposed to make students think about how computers can be used to analyze data. Approaches to experiment with are presented, such as reading an introduction or working with an online workbook by the Text Analysis Developers Alliance. The authors' freely available recipes, which describe text analysis as an interpretative task for humanities researchers without jargon, are referenced. The essence is to encourage methodical thinking and to develop interpretations with descriptions of how insights were achieved, which is also important in other fields.
Apart from presenting their own tool, the authors discuss the very interesting concept of literate programming by Donald E. Knuth. It is described as useful for thinking through and documenting the intellectual processes of analytical research. Essentially, it is a paradigm for writing a highly blended mix of documentation and computer code, favoring prosaic and exegetical documentation. The goal is to convey the narrative context of the code, showing human readers the intention behind it, which alters the code style to become symbiotic with the narrative. Thus, the code is said to be better: the documentation serves human understanding, while the code itself is primarily for the computer. A literate version of the typical "Hello world" program is illustrated as an example.
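The chapter's own example uses Knuth-style notation; purely to convey the spirit, a Python rendering in which the narrative leads and the code follows might look like this (the wording is mine, not the chapter's):

```python
# A literate rendering of "Hello world": prose first, code second.
#
# Our aim is to greet the reader. A greeting, in this telling, is a sentence
# assembled from two parts, a salutation and an addressee, so we first name
# those parts and only then combine them.

salutation = "Hello"   # the conventional opening word
addressee = "world"    # whom we are addressing

# With both parts in hand, the greeting itself is a simple juxtaposition,
# closed by the customary comma and exclamation mark.
greeting = f"{salutation}, {addressee}!"

print(greeting)
```

The point is not the trivial program but the inversion of priorities: the narrative explains why each piece exists, and the code merely realizes it.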
Literate programming is designed to make a program more self-explanatory and easier to understand for novice programmers, serving a pedagogical purpose, but also to make them more self-aware about structures, coding decisions and code alternatives. The program code is only secondary, which makes it a good fit for humanists. Though learning programming languages, with their formalism and fixation on details, is arduous, the real challenge is to learn to think algorithmically about texts, as many regard a text as a singular analogue object. Through computer analysis, digital texts are chunked and can be reconfigured in infinite ways, which can also be perceived negatively. Questions can be raised about whether this serves speculative or hermeneutic goals, and about incremental and iterative exploration.
Nevertheless, the approach of literate programming is only marginally used in development environments, mainly in mathematical tools, where capturing a reproducible process and its reasoning is fundamental. Meanwhile, most of these tools present a high technical barrier for humanists, as running code and understanding its paradigm is required.
For the most part, the approaches seem well suited to instruct students and draw them toward these tools. Nonetheless, the appeal of literate programming seems to be a wish exclusive to non-programmers. I do not believe fundamental algorithmic thinking to be the main problem when starting to program, but rather expression in code and the expertise it requires. It may be really interesting to link code with a narrative, but the essence of coding is to articulate an algorithmic idea in a fixed vocabulary. Often algorithmic ideas are inherent in the natural environment, and their distinct recognition is only a small missing step. An example would be the knapsack algorithm, which tries to fill a container with differently sized items to its maximum by incrementally "trying" to find the best solution. Moreover, literate programming in computer science would lead to a derailed focus on results and a drastic stretching of implementation cycles. An application of literate programming might only be reasonable for pedagogical reasons. Normally, documentation and textual commentary in the code are explanatory enough for programmers to understand it and reproduce the underlying idea. Ultimately, the cardinal and intrinsic purpose of programming is frankly not to present a narrative, but to deliver a useful final implementation or to process meaningful results.
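The knapsack idea mentioned above can be sketched as the standard 0/1 dynamic-programming solution, which tries each item incrementally against every remaining capacity (the function name and sample data are mine, chosen for illustration):

```python
def knapsack(capacity, items):
    # items: list of (size, value) pairs. best[c] holds the best total value
    # achievable with capacity c; it is improved incrementally as each item
    # is considered, "trying" the item at every capacity it fits into.
    best = [0] * (capacity + 1)
    for size, value in items:
        # Iterate capacities downward so each item is used at most once.
        for c in range(capacity, size - 1, -1):
            best[c] = max(best[c], best[c - size] + value)
    return best[capacity]

items = [(3, 4), (4, 5), (2, 3)]   # (size, value) pairs
print(knapsack(5, items))          # → 7 (items of size 3 and 2 fit exactly)
```

Here the comments document the code in the ordinary way; whether wrapping this in a full narrative would add much for a practiced programmer is exactly the question raised above.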
[2] Stéfan Sinclair and Geoffrey Rockwell, "Teaching Computer-Assisted Text Analysis: Approaches to Learning New Methodologies," in Digital Humanities Pedagogy, pp. 241-263.

16.1.15 23:07

