As software systems grow ever more complex, managing their quality plays an increasingly important role in development, and the key element of software quality management is product testing. This paper takes us through an ingenious approach suggested by the author, who builds on the availability of test case generators for producing random test cases. He argues for the effectiveness of correctness measurement by way of some hypothetical tables and by comparing it with other approaches.
The article also discusses defect detection and identification using two methods, extensive and adaptive testing, and states the situations in which each is used. It further suggests that product testing and software productivity are interconnected and cannot be dealt with in isolation, and it clearly points out the limits of the assurance offered by the various testing methods. Since 100% correctness is unattainable most of the time, it is important to know the degree of correctness of a program and to test and correct it as far as possible in the most cost-effective manner.
This is where the author has successfully integrated the concept of correctness measurement. He suggests the use of various tables recording defects and the percentage of successful test cases for the various functions in different products. Interpreting tables of this type lets us draw inferences about the possible errors and narrows down the scope of the search, and the measures tell us clearly whether the software should be approved or not. For example, a table giving the percentage of successful test cases for each function used in a product can suggest widely different things.
It may be that the test cases themselves are wrong, or that changes made elsewhere in the program have caused a deviation in the operation of a particular function by altering a value here and there. To ascertain the problem we need information on the other products that use the function, and on the historical performance of the particular module. Inferences about the progress of the software can also be drawn, and the tables tell us about the nature of the modules being used. This is essentially a form of validation, since the success of the same functions is being compared across different products.
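The kind of cross-product comparison described above can be sketched in a few lines. The table, the function names, the products, and the 20-point gap threshold below are all invented for illustration; they are not figures from the article.

```python
# Hypothetical correctness-measurement table: percentage of test cases
# passed for each function, across several products that reuse it.
table = {
    # function: {product: percent_successful}
    "parse_input":   {"ProductA": 98.0, "ProductB": 97.5, "ProductC": 55.0},
    "format_output": {"ProductA": 99.0, "ProductB": 99.2, "ProductC": 98.8},
}

def flag_suspect_functions(table, gap=20.0):
    """Flag functions whose success rate in one product falls far below
    the same function's rate elsewhere -- a hint that the defect lies in
    that product's integration rather than in the shared function."""
    suspects = []
    for func, rates in table.items():
        best = max(rates.values())
        for product, rate in rates.items():
            if best - rate > gap:
                suspects.append((func, product, rate))
    return suspects

print(flag_suspect_functions(table))
# parse_input in ProductC stands out, narrowing the search
```

A single anomalous cell, read against the function's record in other products, points the search toward the one product's use of the function, exactly the narrowing effect the tables are meant to provide.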
The utility of this approach can be extended to the Mecomb life cycle project that we are doing. It would be beneficial to prepare a log book recording the defects and the percentage of successful test cases at each stage of coding and for the various modules present. Information about similar modules can be collected from all the companies participating in the project, and a comparison can be made of their success levels. The subsequent analysis can reveal the mistakes being made and suggest their correction.
The author, however, brings out the defects in this method by saying, "Sometimes the test case content is challenged as not being representative of usage" and "What to count for test and failure units are the other correctness measurement problems". For the first problem the author suggests the use of increasingly difficult tests. This, however, seems a rather juvenile way of dealing with the difficulty: the complexity of the test cases will only create further problems at the time of maintenance and service.
Moreover, it is not always the case that the person or organisation making the software also periodically checks it for errors. Those checking may be employees of the customer's company, who may only be intrigued by the complexity and will eventually be dissatisfied with the performance. Thus my point of view is to keep things as simple as possible. The difficulty with identifying failure units is that there may be no definitive answer to the question of where exactly the defect lies, and until it is found and corrected the program remains uncertain. Here the marketing perspective of management comes in.
The need to deliver the product on time calls for identifying the problems that will affect the customer the most and correcting them before release. The other problems, whose probability of surfacing is lower, should gradually be corrected in the continuation phase, during maintenance, or in the next version of the product; thereafter these products should be marketed with the stress on the parts known to be correct. The article suggests that coverage considerations should not be ignored. However, I feel that it is not coverage that really matters.
The article "Comparing the Effectiveness of Software Testing Strategies" by Basili and Selby suggests that there is no correlation whatsoever between program coverage and the number or percentage of faults found (alpha > 0.05). It also brings out the inadequacy of using a 100% coverage criterion for structural testing. The article suggests that a structural approach to deciding on testing may be accurate, but the user is not always aware of the structure. Thus, if he wants to know the degree of correctness of the program, it will not be possible for him to interpret the structure.
Thus the article suggests an alternative in the correctness measurement methods. My opinion is that code reading is also a way to improve the hit rate while testing; it has been shown to be more efficient than structural and functional testing in the article by Basili and Selby. The article also suggests that for defect detection random testing is better, which reiterates the finding of the article by Hamlet and Taylor ("Partition Testing Does Not Inspire Confidence"). The costs incurred are also much less in this method.
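A minimal sketch of random testing for defect detection might look as follows. The function under test, its seeded off-by-one defect, and the trusted oracle are all invented for illustration; the point is only that cheap random inputs checked against an oracle expose the defect without any structural analysis.

```python
import random

def buggy_clamp(x, lo, hi):
    """Intended to clamp x into [lo, hi]; the upper bound is handled
    wrongly on purpose to stand in for a real defect."""
    if x < lo:
        return lo
    if x > hi:
        return hi - 1   # seeded defect
    return x

def oracle_clamp(x, lo, hi):
    """Trusted reference behaviour to compare against."""
    return max(lo, min(x, hi))

random.seed(0)
failures = []
for _ in range(1000):
    x = random.randint(-100, 100)
    if buggy_clamp(x, -10, 10) != oracle_clamp(x, -10, 10):
        failures.append(x)

print(len(failures) > 0)  # random inputs quickly expose the defect
```

No partitioning of the input domain is needed here, which is the low-cost appeal that Hamlet and Taylor's result lends to random testing.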
However, the fire-fighting approach of adaptive testing, which the author seems to favour, appears unethical under the moral code of conduct. If the failing test cases are by chance tried and the problem is identified, the manufacturer of the product may lose repute and standing in a highly competitive industry like software, apart from the damages that have to be paid. The article also points to the statement made by Ackerman and Musa in "When to Stop Testing": if one does not find new defects at a cost-effective rate, one should stop extensive testing, because it is not worth the money being spent.
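The stopping rule attributed to Ackerman and Musa can be expressed as a simple rate check. The threshold, the weekly figures, and the cost units below are illustrative assumptions of mine, not data from the article.

```python
def should_stop(defects_found, cost_spent, min_defects_per_unit_cost=0.5):
    """Return True when the recent defect-detection rate has dropped
    below the rate at which extensive testing still pays for itself."""
    if cost_spent == 0:
        return False
    return defects_found / cost_spent < min_defects_per_unit_cost

# Hypothetical (defects found, cost spent) per testing week:
history = [(12, 10), (7, 10), (3, 10), (1, 10)]
for week, (defects, cost) in enumerate(history, start=1):
    print(week, should_stop(defects, cost))
# Weeks 1-2 still pay off; from week 3 the rule says to stop.
```

The declining weekly yield is exactly the signal the rule acts on: once defects per unit cost falls below the chosen break-even rate, extensive testing stops and the remaining risk is handled during maintenance.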
While discussing productivity the author stresses the need for computers for test case generation, suggesting the futility of hand-written test cases. My take is that we should know exactly when to use generators and when to use human intellect for generating cases, taking into consideration the CPU time cost and the complexity. As complexity increases, the cost of using test case generators comes down while that of hand-written test cases goes up. The graph below demonstrates the fact.
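The trade-off the graph depicts can be modelled with two simple cost curves. The linear manual cost, the amortized setup cost of the generator, and all constants below are invented assumptions for the sketch, not figures from the article.

```python
def manual_cost(complexity, per_unit=2.0):
    """Per-case cost of hand-written tests: rises with complexity."""
    return per_unit * complexity

def generator_cost(complexity, setup=50.0):
    """Per-case cost of generated tests: the fixed setup (including
    CPU time) is amortized over the harder, larger case load, so the
    effective cost falls as complexity grows."""
    return setup / complexity

def crossover(max_complexity=100):
    """First complexity level at which the generator becomes cheaper."""
    for c in range(1, max_complexity + 1):
        if generator_cost(c) < manual_cost(c):
            return c
    return None

print(crossover())
```

Below the crossover point human-written cases are cheaper; above it the generator wins, which is the decision rule I am arguing for rather than using generators unconditionally.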