Monkey Testing
QUICK LOOK
• Types of test monkeys
• Costs and benefits of random testing
• Guidelines to choosing the right monkey
We get conflicting opinions about the efficacy of monkey test tools. Boris Beizer suggests in Black Box Testing that test monkeys aren't very useful for testing today's professionally created software. His analysis concludes that the use of good testing practices will find more bugs than keyboard-scrabbling (also called Rachmaninoff testing). But James Tierney, former Director of Testing at Microsoft, has reported in internal presentations that some Microsoft applications groups have found ten to twenty percent of the bugs in their projects using monkey test tools.
Which assessment of monkey testing is correct? Probably both.
There is no universal test tool that will find all the bugs in any software. Each tool has its uses, and some tools are more useful for certain projects, or at specific points in a project cycle, than others. Test monkeys are no exception. Use them wisely, and you'll have a cost-effective way to find new bugs. Use them carelessly, or exclusively, and you'll release a buggy product. In this article we'll look at monkey test tools, examine in detail the class of monkeys I've used most often, and provide guidelines to help you make wise choices.
Six monkeys pounding on six typewriters at random for a million years will recreate all the works of Isaac Asimov.
www.stqemagazine.com
This article is provided courtesy of STQE, the software testing and quality engineering magazine.
While many of us find the monkey name appealing, others prefer the more technical-sounding stochastic testing. Regardless, the essential elements are:
• The monkey is relatively ignorant of how humans use the product. It doesn't know, for example, how to build a Web page or create an amortization table.
• The monkey can randomly choose from among a large range of inputs for testing, and may be able to recreate all possible inputs for some applications.
We'll consider two types of monkeys: smart monkeys and dumb monkeys.
Smart monkeys have some knowledge about how to access the user interface in the product they're testing. They know at a simple functional level what can be done and, more important, they understand what should happen when they do it. For example, they may know that choosing the New item on the File menu creates a new document, and they know that the new document will be displayed as a window with a particular class and text. If no new document window appears, or the window has the wrong caption or class, the monkey can identify the problem and report a bug.
Smart monkeys usually get their product knowledge from a state table or model of the software they test. Randomly traversing the state model, they choose from among all the legal options in the current state for moving to another state, and then verify that they have reached the next expected state. You can add illegal inputs to the monkey's repertoire if the model includes error-handling states.
Dumb monkeys act differently. ("Ignorant monkey" is technically more accurate, but the term "dumb" is far more common.) They don't use a state table; they have no idea what state the test application is in, or what inputs are legal or illegal. Most important, they can't recognize a bug when they see one. The pure dumb monkey exemplifies Beizer's keyboard-scrabbling test tool, and it isn't very useful for most projects.
What can be useful is a not-quite-dumb monkey that's ignorant about your project, but understands its environment enough to find very obvious bugs like crashes and hangs. Such tools have been in use for some time. In the early eighties the Lisa and Macintosh project teams developed a dumb monkey test tool with some limited knowledge of the Apple operating systems. Some developers required that their products survive a specified amount of monkey test time before they were released. Modern test monkeys know even more about their operating systems than those early Apple simian tools did. For this discussion, dumb monkeys are application-ignorant but environment-savvy.
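To make the contrast concrete, here is a minimal sketch of the smart monkey's random walk over a state model, as described earlier: pick any legal action in the current state, perform it, and verify that the expected next state was reached. Everything named here is hypothetical and chosen only for illustration: the three-state `MODEL`, the `FakeApp` stand-in for the product under test, and the `smart_monkey` function. A real monkey would drive the product through an automation tool instead.

```python
import random

# Hypothetical state model: state -> {action: expected next state}.
MODEL = {
    "no_document": {"file_new": "document_open"},
    "document_open": {"type_text": "document_dirty", "file_close": "no_document"},
    "document_dirty": {"type_text": "document_dirty", "file_save": "document_open"},
}


class FakeApp:
    """Stand-in for the application under test (an assumption of this sketch)."""

    def __init__(self, broken_action=None):
        self.state = "no_document"
        self.broken_action = broken_action  # this action silently does nothing

    def perform(self, action):
        if action != self.broken_action:
            self.state = MODEL[self.state][action]
        return self.state


def smart_monkey(app, steps, seed):
    """Randomly walk the model; return (step, action, expected, actual) mismatches."""
    rng = random.Random(seed)  # a fixed seed makes a failing run repeatable
    state = "no_document"
    bugs = []
    for step in range(steps):
        action = rng.choice(sorted(MODEL[state]))  # any legal action in this state
        expected = MODEL[state][action]
        actual = app.perform(action)
        if actual != expected:
            bugs.append((step, action, expected, actual))
            break  # model and application now disagree, so stop the walk
        state = expected
    return bugs
```

Calling `smart_monkey(FakeApp(broken_action="file_save"), 500, seed=1)` reports the first step at which saving failed to return the document to its clean state. A dumb monkey has no `MODEL` to consult, so it could not make that comparison.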
the commercially available load and stress testing tools depend on this smart monkey technology.
As Brian Marick says in The Craft of Software Testing, complex tests find more bugs than simple tests. But most of our automated tests are simple. We look for one major outcome after applying one input. Then we return the application to a known base state and execute another simple test. If the tests are well thought out, they'll find good bugs. But they remain simple tests.
When we return the application to a base state, we discard any history from previous tests. Real users seldom do that. Instead they chain many simple activities, one after another, to create complex situations. Our simple tests don't emulate that user behavior. So if one simple activity sets up another activity for failure, our simple tests won't find that bug, but our users will find it.
Using a smart monkey, however, allows us to make our simple automated tests into complex user scenarios. Remove the return-to-the-known-base-state routine from the tests. Then let the monkey decide which tests to run, and in what order. The monkey will create very complex tests for as many hours as you want, and it will make different series of complex tests every time you run it.
Another advantage of this simple-turned-complex testing is that we can make sure the application handles memory and resource allocations well over time. Running the same series of tests, even complex tests, in the same sequence over and over again seldom finds new memory or resource bugs. Instead, we need to use complex sequences that we've never used before. Monkeys do this more efficiently than humans.
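The idea of dropping the return-to-base-state step and letting the monkey pick the order can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular tool: `ToyApp`, the three small actions, and `monkey_session` are hypothetical names, and the toy application simply keeps a list so that state visibly carries over from one action to the next.

```python
import random


class ToyApp:
    """Toy application whose state persists across actions (hypothetical)."""

    def __init__(self):
        self.items = []


def add_item(app):
    app.items.append("x")
    assert app.items[-1] == "x"


def remove_item(app):
    size_before = len(app.items)
    if app.items:
        app.items.pop()
    assert len(app.items) == max(size_before - 1, 0)


def clear_items(app):
    app.items.clear()
    assert app.items == []


SIMPLE_ACTIONS = [add_item, remove_item, clear_items]


def monkey_session(app, steps, seed):
    """Run randomly ordered simple actions against one long-lived app instance."""
    rng = random.Random(seed)  # log the seed so a failing sequence can be replayed
    history = []
    for _ in range(steps):
        action = rng.choice(SIMPLE_ACTIONS)
        history.append(action.__name__)
        action(app)  # deliberately no reset to a base state between actions
    return history
```

Each seed yields a different chain of actions, while rerunning with the same seed replays the same chain, which helps when a long sequence finally exposes a failure.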
new features results in state explosion in which the number of nodes increases geometrically. So creating the model is seldom a one-time cost; for large models or tables, maintenance becomes a major cost factor.
A good state table based on Petri nets (an automation modeling technique for expressing concurrent events in discrete parallel systems) or Markov chains (a weighted graph in which all weights are non-negative and the total weight of outgoing edges is positive) may have value beyond the smart monkey utility, and that may help justify some of the expense. Even so, the cost of creating the table, and the monkey to run tests using it, often outweighs the value of the additional bugs found.
The sad fact is that most smart monkeys are not easily adapted to other projects. Your monkey must pay back all its costs by finding bugs on the specific project it was designed to test.
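A rough way to see why new features inflate a state model: if features vary independently, the number of combined states is at most the product of each feature's value count, so every added feature multiplies the total. The feature names below are hypothetical, and real models usually prune unreachable combinations, so treat the result as an upper bound rather than a measurement.

```python
from math import prod


def max_state_count(features):
    """Upper bound on model states: the product of each feature's value count."""
    return prod(len(values) for values in features.values())


# Hypothetical features of a small document editor.
features = {
    "document": ["none", "clean", "dirty"],
    "dialog": ["closed", "open"],
    "toolbar": ["shown", "hidden"],
}
before = max_state_count(features)   # 3 * 2 * 2 combinations

features["zoom"] = ["100%", "200%"]  # add one new two-valued feature
after = max_state_count(features)    # the bound doubles
```

Here one extra two-valued feature takes the bound from 12 to 24 states, each of which may need its own transitions maintained in the table.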
several popular automation tools. Although my team's interest is the Windows operating system, similar monkeys can be developed for other GUI operating systems using versions of automation tools specific to other operating systems.
Monkeys with GUI savvy can manipulate many Windows applications. But a few applications rely on custom controls to expose their functions to users. Most automation tools have trouble testing those applications because the tools can't find the controls the user must manipulate. If the automation tool can't find the controls, the monkey can't find them either. We deal with that problem in several ways:
• We tell the monkey to click randomly a few times in every new window it sees. Occasionally the monkey clicks on one of those invisible elements and changes the application state.
• If the application has interesting areas such as toolbars that are invisible to the monkey, we tell it to focus its random clicks in those areas.
• We can also ask the monkey to randomly perform mouse actions, such as left-clicks, right-clicks and drags, or enter random text at the current insertion point, if the application relies on human users doing those things often. (A monkey with those skills can make some weird and futuristic drawings in Microsoft Paint or Corel Draw!)
We sometimes call these tools generic state monkeys, because to be effective they need to know five states:
1. The test application is not running.
2. The test application is running and is probably ready to accept test input.
3. A new window appeared.
4. The new window has Windows controls on it that the monkey recognizes.
5. The new window went away.
Given a state table with just these five generic states, our monkey can't log much useful information about an application's faults and failures. Most of the errors it sees are ambiguous; a human must examine the error log to decide what really happened. We call these monkey noise bugs and we try to avoid them, most often by ignoring them entirely. Instead, the monkey starts the application in a debug session and we monitor the monkey's tests with a debugger. We want to find nasty crashing bugs that display the dreaded Blue Screen of Death; a debugger is very good at trapping those bugs. It automatically halts the monkey and allows a developer to examine the machine state when the bug occurs.
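The five generic states and the focused random clicks described above can be sketched as a small classifier plus a click generator. This is only an illustration: the `GenericState` names, the `classify` inputs, and `random_clicks` are hypothetical, and the observations (whether the application is running, which windows exist, how many controls were recognized) are assumed to come from whatever automation tool drives the monkey.

```python
import random
from enum import Enum


class GenericState(Enum):
    NOT_RUNNING = 1      # the test application is not running
    READY = 2            # running and probably ready for input
    NEW_WINDOW = 3       # a new window appeared
    KNOWN_CONTROLS = 4   # the new window has controls the monkey recognizes
    WINDOW_CLOSED = 5    # the new window went away


def classify(app_running, windows_before, windows_now, known_controls):
    """Map simple observations onto the five generic states.

    windows_before and windows_now are sets of window identifiers;
    known_controls is the count of recognized controls on the newest window.
    """
    if not app_running:
        return GenericState.NOT_RUNNING
    if windows_now - windows_before:
        if known_controls:
            return GenericState.KNOWN_CONTROLS
        return GenericState.NEW_WINDOW
    if windows_before - windows_now:
        return GenericState.WINDOW_CLOSED
    return GenericState.READY


def random_clicks(rng, region, count):
    """Pick click points inside region = (left, top, right, bottom), inclusive."""
    left, top, right, bottom = region
    return [(rng.randint(left, right), rng.randint(top, bottom)) for _ in range(count)]
```

Passing a toolbar's rectangle as `region` concentrates the random clicks where the invisible controls are believed to be, while passing the whole window's rectangle gives the "click randomly a few times in every new window" behavior.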
coverage analysis and compare the results with a full pass of your non-monkey tests. If the monkey tests a function that's not touched by your traditional tests, you need to re-examine your test plan.
If you have a state table for your application, teach the monkey to read it and check off each state as it tests your application. If it finds one new state that's not defined on your state table, the monkey has exposed a whole new universe of untested bug possibilities in your application, something like discovering a wormhole into the heart of the Beta Quadrant!
At least one commercial tool (Rational's TestFactory) uses the dumb monkey method to explore applications and create automation to maximize coverage while minimizing test time. (You might be surprised at the level of test coverage that dumb monkeys can achieve. On one internal Microsoft application, with complexity similar to Microsoft WordPad, we got 65% code function coverage in less than fifteen minutes of dumb monkey tests.)
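Both comparisons described above reduce to set differences. The sketch below assumes you can export, from whatever coverage tool you use, the names of the functions each test pass touched and the states the monkey visited; `coverage_gaps`, `check_off_states`, and the sample names are hypothetical.

```python
def coverage_gaps(monkey_functions, traditional_functions):
    """Return functions the monkey reached that the traditional tests never touched."""
    return sorted(set(monkey_functions) - set(traditional_functions))


def check_off_states(model_states, visited_states):
    """Return (states never visited, visited states missing from the state table)."""
    model, visited = set(model_states), set(visited_states)
    return sorted(model - visited), sorted(visited - model)


# Hypothetical data for illustration only.
gaps = coverage_gaps(
    monkey_functions={"open_file", "print_preview", "undo"},
    traditional_functions={"open_file", "undo", "save_file"},
)
unvisited, undefined = check_off_states(
    model_states={"no_document", "document_open", "document_dirty"},
    visited_states={"no_document", "document_open", "print_dialog"},
)
```

A non-empty `gaps` list points at holes in the test plan, and anything in `undefined` is a state the monkey reached that the state table does not describe.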
usually cost effective. Our target is to spend no more than thirty minutes teaching a dumb monkey about a new application.
Once you've given the dumb monkey the minimum information it needs to explore your application, set it up in a corner of your lab or office on an old, slow computer no one wants to use for regular testing. Have it start testing the application under a debugger and check its progress every day or so. If the monkey finds just one good bug, it will be the least expensive bug your team reports.
Like any test tool, a good dumb monkey can be expensive to develop. But unlike many test tools, a mediocre or beginner dumb monkey has a good chance of finding some bugs, if you use it at the right time and for the right reasons. As the monkey proves its worth, you can add features and give it more skills.
If you use Rational Visual Test on the Windows platform, you can start experimenting with dumb monkeys using a simplified monkey based on one of our Microsoft internal testing simians. (The Freddie dumb monkey is available on the compact disc accompanying Thomas R. Arnold's Visual Test 6 Bible [IDG Books]. Chapter 14 of the book describes monkey testing in more detail and shows you how to add features to Freddie.)
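An unattended monkey run like the one described above relies on something noticing crashes and hangs. A debugger does that for a GUI application; where none is attached, a small watchdog can at least classify each run. The sketch below is a stand-in under loud assumptions: it launches a command-line process rather than a GUI application, it treats any non-zero exit code as a crash, and it treats a timeout as a hang. `run_and_classify` is a hypothetical helper name, not part of Freddie or any tool mentioned here.

```python
import subprocess
import sys


def run_and_classify(command, stdin_bytes=b"", timeout_seconds=5.0):
    """Run one command and classify the outcome as 'ok', 'crash', or 'hang'."""
    try:
        result = subprocess.run(
            command,
            input=stdin_bytes,            # random monkey input could be fed here
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return "hang"  # subprocess.run kills the child before raising
    # Simplification: any non-zero exit code is counted as a crash.
    return "ok" if result.returncode == 0 else "crash"


# Example: a child process that exits with an error code.
outcome = run_and_classify([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Logging the input and seed alongside each "crash" or "hang" result keeps the daily progress check down to reading a short list.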
Choosing Wisely
Monkey testing should not be your only testing. Monkeys don't understand your application, and in their ignorance they miss many bugs. Monkeys won't add much value to embedded systems, software running in simple environments, or projects that are difficult to automate.
Unless you already have an automation-readable model or state table, smart monkeys will be very expensive to develop. They may be cost effective, however, for critical parts of a project where the state table can be kept small. They're also valuable for load and stress testing. When used in the right places, smart monkeys will find a significant number of bugs.
Dumb monkeys that understand your operating system can be used on any application to get some basic testing done. A small amount of training on your specific application greatly improves the monkey's chances of finding bugs. Dumb monkeys will not find many bugs, but the bugs they do find will be crashes and hangs, the bugs you probably least want to have in your product.
Noel Nyman, software test engineer for Microsoft's Windows 2000 Certification (noeln@microsoft.com), has worked in software product development and testing for over twenty years and is a member of the Los Altos Workshop on Software Testing. He tests, therefore he is.