Spencer Bennington

Text Mining in Martial Arts Studies

Welcome back to the Digital Humanities section of the Rhetorical Roundhouse blog. I haven't had many digi-things to talk about in the past few weeks (last week I recounted the double conference-palooza in Pittsburgh and the week before I was...well, I was conferencing pretty hard). Also, as many of you know, last week I broke my foot and, as much as I didn't want it to, that injury has slowed me down quite a bit. Today and in the next few weeks I'll be writing more about that so look out for some one-legged action coming soon...

No, my one-legged action won't be nearly this cute.

Today though, I'll be discussing the work of text mining as it relates to Digital Humanities, Rhetoric and writing pedagogy, and Martial Arts Studies. First, you might be wondering, what exactly do I mean by "text mining"? In its simplest form, this is a process by which researchers use computers to query large corpora of words in order to discover trends not necessarily apparent to a single human reader. For example, any undergraduate English major can read Othello closely and find themes of race and gender inequity, but could that same undergraduate do a close reading like this on EVERY WORD Shakespeare ever wrote?

Of course not. Not even an extremely talented graduate student like Spencer Todd Bennington IV could do that...

But there are plenty of programs (many free to use) which can. Voyant is one such free and user-friendly platform for those first interested in the dynamic relationship between text mining and data visualization techniques. You can log on right now (if the server isn't too crammed) and play with their suite of mining/visualization tools either with your own corpus of text or with their pre-loaded Shakespeare and Jane Austen corpora.

What I like about Voyant is that it serves as a wonderful introduction for students new to this form of research and can entice them to pursue further projects (academic, artistic, etc.) in fields related to their interests. For example, when I was first introduced to tools like these through Digital Humanities and Quantitative Methods coursework, I thought about how insightful it might be to create a corpus of Taekwondo training manuals to get a zoomed-out idea of what kinds of themes trend between different editions, decades, authors, etc. Of course, the daunting part of this was actually digitizing these texts to be mined and visualized. I learned a lot about the dos and don'ts of text digitization (and how time-consuming it is!) when I first experimented with this project. In the end, all that I can find is a clean sample of one page of one manual, Advancing in Taekwondo by Grandmaster Richard Chun.

The above visualizations are fairly simple, but they're a good way to explore ideas that may not have seemed key in the short description. For example, Voyant picked up on the word "imaginary" as being important (because of the way it's repeated in different contexts) and highlighted it in the word cloud.
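For anyone curious about what sits underneath a word cloud, here's a rough Python sketch of the basic idea: count how often each word appears, skip the filler words, and see what floats to the top. To be clear, this is not Voyant's code. The tiny stop list, the function names, and the sample sentence are all made up for illustration, and I'm assuming plain frequency counting is a fair stand-in for how the cloud picks its biggest words.

```python
import re
from collections import Counter

# A tiny, hand-picked stop list for illustration only (an assumption;
# a real tool would use a much longer list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "as",
              "your", "you", "with", "at"}

def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def top_terms(text, n=5):
    """Count every non-stop-word and return the n most frequent (word, count) pairs."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return Counter(words).most_common(n)

# Made-up sample text, NOT a quotation from any training manual.
sample = (
    "Face the imaginary opponent. Block the imaginary strike, "
    "then strike the imaginary opponent with a punch."
)

print(top_terms(sample, 3))
```

Run on the made-up sample, "imaginary" comes out on top, which is the same kind of nudge the word cloud gives you visually.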

I noticed this and remembered just how important the concept of the "imaginary threat" was to my personal meditations when practicing this form, so I chose to focus on the term in the line graph to the right. The graph shows that the word "imaginary" appears in the second and third "segments" of the text (out of four), which suggests that the theme is concentrated in the portion of a text we normally associate with the climax.
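The segment idea behind that line graph can be sketched in a few lines of Python too. This is only a sketch under my own assumptions: I'm splitting the text into four equal-sized runs of words, which may not be exactly how Voyant draws its segment boundaries, and the function name and sample text are hypothetical.

```python
import re

def segment_counts(text, term, segments=4):
    """Split the text into equal-sized runs of words and count `term` in each run."""
    words = re.findall(r"[a-z']+", text.lower())
    # Ceiling division, so leftover words land in the final segment
    # instead of being dropped.
    size = -(-len(words) // segments)
    return [
        words[i * size:(i + 1) * size].count(term.lower())
        for i in range(segments)
    ]

# Made-up sample text (16 words), NOT a quotation from any training manual.
sample = (
    "Stand ready and breathe. Turn toward imaginary attacker. "
    "Block the imaginary kick. Return to ready stance."
)

print(segment_counts(sample, "imaginary"))
```

Each number in the returned list is one point on the line graph: zeros at the start and end, with the hits clustered in the middle segments.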

Again, with a text this short, you could come to these realizations through close reading, but, as I just mentioned, you may not have the impetus to do so. The truth is, subconsciously, I knew that the theme of the imaginary was an important concept to understand when performing this pumsae, but I didn't look more closely at that concept textually until prompted by Voyant's visualizations. And I think that's the point--tools like this and methods like textual analysis allow us to see our work differently and ask research questions we might not have asked otherwise. That is valuable all by itself.

I hope that in the future I can work with some other talented scholars in Martial Arts Studies, Digital Humanities, and Rhetoric to create a corpus of training manuals to better understand the integration of bodily, ethical, mental, and spiritual training embedded in these massive texts. For now, because of the limited amount of time left in my PhD program and the other fish that need frying (**cough cough** finish dissertation **cough cough**) I'll put this project on the back burner for a future when I have a team and grant funding :P

In the meantime, I'll just leave you with one of the more aesthetic functions Voyant offers--the TextArc, first developed by W. Bradford Paley. As a static image, the circumference of the circle is populated by every word of the text. As a dynamic visualization, you can see where the words appear and reappear through the cycle of a text. For this image, I chose Bruce Lee's famous manual, Tao of Jeet Kune Do.

Thanks as always for reading and happy mining :)


Further Reading

Jockers and Underwood, “Text Mining the Humanities,” in Schreibman, et al., A New Companion to Digital Humanities.
Rockwell and Sinclair, “Text Analysis and Visualization,” in Schreibman, et al., A New Companion to Digital Humanities.
Nowviskie, “What Do Girls Dig?”
Rhody, “Why I Dig,” in Gold and Klein, eds., Debates 2016.
Drucker, “Graphical Approaches to the Digital Humanities,” in Schreibman, et al., A New Companion to Digital Humanities.
Underwood, et al., “The Transformation of Gender in English-Language Fiction,” Journal of Cultural Analytics (February 13, 2018).
