Monday, December 7, 2009

Clausewitz: What data mining can reveal!

Is warfare just the continuation of politics by other means? Carl von Clausewitz's bold theoretical assertion in On War has enthralled many a military historian since its publication in the 19th century. Clausewitz wrote in the wake of his own experiences as a Prussian officer in the Napoleonic Wars, and his analysis is strikingly contemporary with the rise of nationalism and imperialism in Europe; it has been argued to be a major ideological foundation for the First World War. If we want to learn more about his infamous work, we must explore new ways to encounter texts. In the digital age, historians have many new tools with which to develop new research techniques. One of the most accessible is text and data mining software. What new themes can we find by examining the metadata of Clausewitz? The one that I would really like to explore in this blog is the most central to Clausewitz's theory and also his practice of warfare: warfare is the continuation of politics by other means.

The first task I performed was a simple text search to see which words were mentioned the most in the text. While unsure what this would show me, I figured it would be a great way to start. The three most used words (other than pronouns, prepositions and general verbs) were battle (266), means, and general (both 259). Words that speak to the interrelationship between politics and warfare also occur a great deal. Examples are power and influence (both 94), political (77) and superiority (62). This initial step was illuminating, but not very revealing in terms of interpretation. I next tried a concordance analysis using the words politics and warfare. I did not find any information of value using these words. However, I realized that the word politics is not used by Clausewitz, and I checked my initial text search to confirm this suspicion. I then used the word policy instead of politics and got a much better result. The concordance reads as follows:

renewal of the era of     warfare      be a change for the
ONLY A CONTINUATION OF STATE     POLICY      BY OTHER MEANS . This
always be the aim of     Warfare      . Now War is always
therefore , will action in     warfare      be stopped , as indeed
it is called forth by     policy      it would step into the
step into the place of     policy      , and as something quite
a want of harmony between     policy      and the conduct of a
prior right to consideration .     Policy      , therefore , is interwoven
IS A MERE CONTINUATION OF     POLICY      BY OTHER MEANS . We
the tendencies and views of     policy      shall not be incompatible with
if we regard the State     policy      as the intelligence of the
only if we understand by     policy      not a true appreciation of
War may belong more to     policy      than the first . 27
object of particular acts of     Warfare      , and therefore also the
an intimate knowledge of State     policy      in its higher relations .
of the War and the     policy      of the State here coincide
at the different scenes of     Warfare      , or to send there
is still more like State     policy      , which again , on
scale . Besides , State     policy      is the womb in which
end of the act of     warfare      , and modify or influence
at the present state of     warfare      , we should say that
displays itself most in mountain     warfare      , where every one down
lie towards the province of     policy      . The preparations for a
SUSPENSION OF THE ACT IN     WARFARE      IF one considers War as
suspension in the act of     Warfare      , strictly speaking , is
Nevertheless in this kind of     Warfare      , there is also a
which , with a shilly-shally     policy      , and a routine-ridden military
as the real activity in     Warfare      , which , by its
are cases also in modern     Warfare      in which this has not
The results display every instance in which warfare or policy is mentioned in the text. If tweaked a little in terms of grammar, this result could be abridged to produce an overview of the overall theme of the book. The concordance works well to illustrate how to identify a theme in the book. When using a word pair search with policy and warfare, however, there are no results. So text mining seems to me like a great tool once the book has already been read. One would need to read the book to ascertain this specific theme before beginning to ask questions.
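For readers curious about what a concordance tool is doing, the idea can be sketched in a few lines of Python. This is only a rough illustration, not TAPoR's actual implementation: the function name concordance, the five-word context width, and the short sample string (a stand-in built from phrases in the concordance above, not the full text of On War) are all my own assumptions.

```python
import re

def concordance(text, keywords, width=5):
    """Return (left context, keyword, right context) for every keyword hit."""
    # Split into words and punctuation marks, much like the output above.
    tokens = re.findall(r"[A-Za-z'-]+|[.,;:!?]", text)
    wanted = {k.lower() for k in keywords}
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() in wanted:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

# Illustrative stand-in text, not the real book.
sample = ("War is a mere continuation of policy by other means. "
          "Policy, therefore, is interwoven with the conduct of War.")

for left, kw, right in concordance(sample, ["policy", "warfare"]):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>40}  {kw.upper():^9}  {right}")
```

Matching is case-insensitive here, which is why Policy and POLICY would land in the same list. A search for politics would come back empty for the same reason mine did: the tool only finds the exact word it is given.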
Like all good historians, I know that the hardest and most crucial part of our inquiry is asking the right questions. In terms of text mining, I simply do not know the questions to ask in order to find the information I desire. Luckily for me, the TAPoR software which I used for my text mining offers a great 'recipe guide' to finding useful information through text mining. I decided to try the recipe for Exploring Themes Within a Text, since I had been searching for Clausewitz's theory. First, I had to generate a word search using words related to the theme I was searching for. This is certainly an area where I am not very comfortable. Word associations remind me of grade 12 English, not historical research. But herein lies the challenge of this new type of interpretation: it forces us to think outside of our comfort zone. With enough practice, historians could be forging new links between themes and theories using these novel digital tools. For Clausewitz and the theme of nationalism engendered by policy and warfare, I chose to enter only four words: warfare, policy, nationalism, superiority. Next I had to determine the senses of the words. Again, deja vu of grade 12 English. After determining the senses, I then had to find synonyms and antonyms of my chosen words. There were no synonyms or antonyms for any of my words. Using all my words, I then used the concordance tool to find all the instances of use for each word. I came up with a result very similar to that of my first concordance test. I then needed to finish the recipe by doing a collocates search. Unfortunately, my words came up with no result.
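A collocates search, the last step of the recipe, simply counts which words turn up near a chosen keyword. The sketch below shows the general idea in Python. It is not how TAPoR works internally: the function name collocates, the window of four words on each side, the tiny stopword list, and the sample string are all assumptions made for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = {"the", "of", "a", "an", "and", "is", "in", "to", "by", "with", "it", "as"}

def collocates(text, keyword, window=4):
    """Count the words that appear within `window` words of `keyword`."""
    keyword = keyword.lower()
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword:
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(w for w in neighbours
                          if w not in STOPWORDS and w != keyword)
    return counts

# Illustrative stand-in text, not the real book.
sample = ("State policy is the womb in which War is developed. "
          "War is a mere continuation of policy by other means.")

print(collocates(sample, "policy").most_common())
```

A word that never occurs in the text has no neighbours to count, so a search on it returns an empty result. Widening the window picks up more distant neighbours, at the cost of more noise.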
Overall, I can definitely see how useful text mining can be for historians and scholars. However, I simply do not know the right questions to ask to uncover any meaningful information for new interpretations. I started on this quest hoping to find something new about On War. I did not learn anything new about the text, but I did learn something about how I have interpreted it. Perhaps I should have expanded my initial observations on the book's themes and not focused on such a narrow aspect. Perhaps the best lesson that I learned was the difference between human language and machine language. If we are going to harness the power of computers to advance scholarly research in the humanities, we had better learn how to speak effectively with them. The biggest obstacle that I ran into was asking the right questions of the software. I needed to be more descriptive in my word selections and better attuned to the use of language in On War. The real skill of text mining is knowing what information the computer requires to produce the results you want. In my first attempt, I was way off! The computer and I were speaking different languages. The next step for historians is to figure out how to ask the right questions of computers. We have gotten pretty good at questioning humans; now it's time to face a new challenge.
I was able to perform text mining of this book with the help of TAPoR. This website has a text analysis portal that can be accessed by scholars and students with great ease. The site is easy to use and has a wonderful tutorial to help all new users. I used Project Gutenberg to obtain the text of On War. This digitization initiative offers over 30,000 e-books for free download and use, and continues to add to its extensive collection.
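For anyone who would like to repeat my very first step, the word count, on a plain-text download without TAPoR, here is a minimal sketch in Python. The function name top_words, the short stopword list, and the sample sentence are assumptions for illustration only; the counts it produces on the real text will depend on the stopword list and may not match the figures TAPoR gave me.

```python
import re
from collections import Counter

# A small stand-in for "pronouns, prepositions and general verbs".
STOPWORDS = {"the", "of", "a", "an", "and", "is", "was", "in", "to", "by",
             "it", "that", "which", "as", "be", "we", "this", "with"}

def top_words(text, n=10, stopwords=STOPWORDS):
    """Return the n most frequent words, ignoring case and stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Illustrative sample sentence, not taken from the book.
sample = "The battle is the means. The battle was won by the general."
print(top_words(sample, 3))
```

To run it on the whole book, read the downloaded file into a string first (for example with open("on_war.txt", encoding="utf-8").read(), where the filename is hypothetical) and pass that string to top_words.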
