Do a Thousand Words Make a Great Picture? Using Word Clouds to Analyze Text


This word cloud represents this blog post. You can use a word cloud to see repeated words within your text.

Even if today’s headline is the first time you’ve heard the term “word cloud,” you’ve likely seen them floating around the web. Online tools create word clouds by converting text into an image in which each word’s size reflects how often that word is repeated. Besides being eye candy, these images let professionals quickly analyze their content and identify areas of emphasis. Why does this matter and why should you care? Take a moment and read on to discover how you can improve your Governance, Risk and Compliance (GRC) writing by using a word cloud.

Are GRC Authors Paid by the Word?
My late grandmother, an English teacher of 40 years, gave me her copy of James Fenimore Cooper’s Last of the Mohicans when I was 12 years old. To this day, I have yet to finish the book (and no, watching the 1992 movie version does not help). She did not give me the book to immerse me in mid-18th-century America. She gave it to me along with a key insight: “Cooper wrote like he got paid by the word; don’t you do the same.” A more famous critique comes from Mark Twain, who wrote that Cooper should “use the right word, not its second cousin,” in reference to Cooper’s flowery, long-winded style. For evidence, read this sentence straight out of Last of the Mohicans:

When, therefore, intelligence was received at the fort which covered the southern termination of the portage between the Hudson and the lakes, that Montcalm had been seen moving up the Champlain, with an army “numerous as the leaves on the trees”, its truth was admitted with more of the craven reluctance of fear than with the stern joy that a warrior should feel, in finding an enemy within reach of his blow.

If you’re a fan of dense literature, you have my respect. However, being flowery or long-winded in literature is one thing: with literature, your readers read for pleasure. What happens when your readers read to finish a task? Text in the GRC universe too often shows the traits of a literary style. How often have you suffered through writing with the following style traits:

  • Multiple words all representing the same concept, just for variety’s sake – procedure/response/task/method
  • Too many ideas in a single sentence
  • Information indifferent to the context when it’s read – think multiple, long-winded paragraphs on the proper exit locations during a crisis (in a crisis, readers won’t have time to read long paragraphs)

While GRC authors may think their content is thorough, readers are left frustrated when they can’t promptly find the proper action buried within the dense content. In addition, readers may not know the precise task to complete when authors put the emphasis on variety and their thesaurus skills rather than on the intent of the document.

To the Cloud! Using Word Clouds to Identify Repetition
Back to the beginning: how can word clouds help GRC authors? When you identify a problem, you can begin fixing it. Analyzing your text through a word cloud image allows you to quickly visualize the underlying repetition (or the lack of it) within your text. For example, I created this word cloud image from the page where I captured the Last of the Mohicans quote:


Word cloud created from Chapter 1 of Last of the Mohicans.

What key terms stick out? Where is the emphasis? This text does not appear to stress any consistent theme. (I’m not saying it should; literature has a different intent. I’m using this only for contrast.) Word variety may be flowery and make for an expressive story, but it’s not the best style for adding emphasis to an important point or communicating instructions. This text wasn’t written for the world of work, so let’s cut Cooper some slack. In contrast, check out the word cloud from the Occupational Safety and Health Administration (OSHA) policy guidelines for the holiday season, content featured on last week’s blog post.


Word cloud created from OSHA’s “Crowd Management Safety Guidelines for Retailers.”

Look at what words stand out: crowd, store, emergency, management, event. The repetition of these words adds emphasis to the intent of the document – retail store owners should plan for emergencies due to large crowds. While no one will be gifting copies of OSHA guidelines for the holidays, this writing style effectively communicates the intent of the message, using repetition of the key words to add emphasis.
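
For the curious, the counting behind a word cloud can be sketched in a few lines of Python. This is only an illustration of the idea that a word’s size follows its repetition, not a description of how Wordle or any other tool works. The stop-word list, the function names, the sample text, and the 10-to-48-point size range are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, very short stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "at", "for", "is", "are"}

def word_counts(text):
    """Count how often each non-stop word appears, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

def font_sizes(counts, smallest=10, largest=48):
    """Scale each word's count linearly to a font size (assumed range, in points)."""
    if not counts:
        return {}
    top = max(counts.values())
    low = min(counts.values())
    spread = max(top - low, 1)  # avoid dividing by zero when all counts match
    return {
        word: smallest + (largest - smallest) * (n - low) // spread
        for word, n in counts.items()
    }

# Made-up sample text, loosely in the spirit of crowd-safety guidance.
sample = (
    "Plan for the crowd. Manage the crowd at the store entrance. "
    "Store staff should know the emergency plan for a crowd."
)
counts = word_counts(sample)
sizes = font_sizes(counts)

# Print the three largest words with their counts and sizes.
for word, size in sorted(sizes.items(), key=lambda kv: -kv[1])[:3]:
    print(word, counts[word], size)
```

The most repeated word gets the largest size and words used only once get the smallest, which is the effect you see in the images above.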

Conducting Your Own Writing Analysis
Google “word cloud” and you’ll discover a wide variety of online tools for creating your own image. For this article, I used Wordle. It’s easy to use; paste your text and click “Go.” (Note: Please consider your own corporate policies before pasting internal, proprietary text into an external, online tool.) When reviewing your word cloud, ask yourself these questions:

  • Does my text have any word emphasis? If not, consider revising your text to incorporate some keyword repetition. 
  • Do my repeated words relate to the intent I’m trying to convey?
    • Are there words I’m overusing that distract from my intent? For example, the word cloud for the draft of this blog post had the word “following” as the biggest word. Since this term didn’t relate to my key points, I went back and took out a few uses of the word. 
    • Are there words I’m under-using that need more emphasis? A key point of word clouds is “emphasis.” This word barely registered in my original word cloud. My revision made a conscious effort to include “emphasis.” 
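
If you would rather not paste proprietary text into an online tool, the questions above can be roughly approximated offline. The sketch below compares the most repeated words in a draft against the terms you meant to stress. The function name `review_emphasis`, its parameters, and the sample draft are hypothetical, and the “overused” and “underused” labels are only prompts for your own judgment.

```python
import re
from collections import Counter

def review_emphasis(text, key_terms, top_n=5, stop_words=frozenset()):
    """Compare the most repeated words against the terms you meant to stress."""
    words = [
        w for w in re.findall(r"[a-z']+", text.lower())
        if w not in stop_words
    ]
    counts = Counter(words)
    top_words = [w for w, _ in counts.most_common(top_n)]
    keys = [t.lower() for t in key_terms]
    return {
        # Frequent words that are not among your intended key terms.
        "overused": [w for w in top_words if w not in keys],
        # Intended key terms that did not make the top of the list.
        "underused": [t for t in keys if t not in top_words],
        # Raw counts for each intended key term.
        "counts": {t: counts[t] for t in keys},
    }

# Made-up draft text for illustration.
draft = (
    "Use the following steps. The following list shows the following terms. "
    "Emphasis matters."
)
report = review_emphasis(draft, ["emphasis"], top_n=1, stop_words={"the", "use"})
print(report)
```

In this made-up draft, “following” dominates and “emphasis” appears only once, which mirrors the revision described in the list above.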

Like any technology in the GRC space, a word cloud is just a tool. It’s not a substitute for wisdom. Do not artificially inflate your text with terms like a first-year web page designer trying to score big on search engine optimization. Use word clouds simply as a way to view your writing from a different perspective. If nothing else, remember this:

In task-based writing – such as authoring policies or instructions for an incident response – it’s essential that your writing be precise and clear. Don’t cloud your intent with a wild variety of terms.

With GRC writing, we can be the best authors when we put our thesauruses down. If your word variety is as “numerous as the leaves on the trees,” take a moment and simplify your text. When you pay attention to the content of your prose, your readers will thank you. Now, if your intent is to write the next great American novel, disregard all of these tips and follow your whims and your passions. However, if your next writing task is to author testing procedures for your control standards, stick to the direct approach.

–Jonathan Kitchin, OrangePoint

Cooper, J. F. (1928). Last of the Mohicans. Philadelphia: David McKay Company.
