Publications catalog - books
Cooperative Bug Isolation: Winning Thesis of the 2005 ACM Doctoral Dissertation Competition
Ben Liblit
Abstract/Description – provided by the publisher
Not available.
Keywords – provided by the publisher
Software Engineering/Programming and Operating Systems; Software Engineering; Logics and Meanings of Programs; Algorithm Analysis and Problem Complexity
Availability
Detected institution | Publication year | Browse | Download | Request
---|---|---|---|---
Not detected | 2007 | SpringerLink | |
Information
Resource type:
books
Print ISBN
978-3-540-71877-2
Electronic ISBN
978-3-540-71878-9
Publisher
Springer Nature
Country of publication
United Kingdom
Publication date
2007
Publication rights information
© Springer-Verlag Berlin Heidelberg 2007
Subject coverage
Table of contents
Introduction
Ben Liblit
Real software is buggy. Real users can make it better. Cooperative Bug Isolation (CBI) seeks to leverage the huge amount of computation done by the end users of software. By gathering a little bit of information from every run of a program performed by its user community, we are able to make inferences automatically about the causes of bugs encountered in the field.
Keywords: User Community; Feedback Report; Crash Reporting; Sampling Transformation; Basic Sampling Strategy.
Pp. 1-6
Instrumentation Framework
Ben Liblit
This chapter describes the process of going from unmodified application source code to native executables with sampled instrumentation. This process is managed by the instrumentor: a software tool whose external behavior mimics that of the native compiler, but that internally applies the instrumentation injection and sampling transformation steps depicted at the top center of Fig. 1.1. Our instrumentor, sampler-cc, is implemented as a source-to-source transformation for C using the CIL C front end [48]. Transformed code then proceeds to GCC for native compilation. From the developer’s perspective, the sampler-cc command behaves exactly like the gcc command with a few extra instrumentation-related command line flags. Section 2.1 presents the basic strategy for managing fair, randomly sampled instrumentation. This sampling transformation is quite general, with potential applications beyond bug hunting. However, bug hunting is the focus of this book, and Sect. 2.2 describes several instrumentation schemes that may be used with the sampling transformation and that we have found to be helpful for bug isolation. Section 2.3 considers performance issues and examines several optimizations that may be applied atop the basic sampling transformation. Section 2.4 describes an adaptive, non-uniformly sampled generalization of the core random sampling model, while Sect. 2.5 closes the chapter with an informal discussion of realistic sampling rates in truly large-scale deployments.
Keywords: Fast Path; Threshold Weight; Nonuniform Sampling; Slow Path; Interprocedural Analysis.
Pp. 7-38
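The chapter's fair random sampling can be sketched in miniature. The following Python fragment is illustrative only (the names and the 1/100 rate are invented, not taken from sampler-cc): it implements Bernoulli sampling with a geometrically distributed countdown, so the common "fast path" does nothing beyond a decrement while the rare "slow path" records data and redraws the countdown.

```python
import math
import random

SAMPLING_RATE = 0.01  # hypothetical rate: observe roughly 1 in 100 site visits

def next_countdown(rate=SAMPLING_RATE):
    """Draw a geometrically distributed countdown: how many site visits
    to skip before the next sampled one. Statistically equivalent to an
    independent coin flip at every site, but far cheaper per visit."""
    u = 1.0 - random.random()  # uniform in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - rate)) + 1

countdown = next_countdown()
counts = {}  # per-predicate counters, updated only on sampled visits

def instrumentation_site(site_id, predicate_value):
    """Called at every instrumentation site. The common case merely
    decrements a counter (the fast path); recording happens only when
    the countdown expires (the slow path)."""
    global countdown
    countdown -= 1
    if countdown == 0:
        if predicate_value:
            counts[site_id] = counts.get(site_id, 0) + 1
        countdown = next_countdown()
```

Because the countdown is geometric, each visit is sampled with the same independent probability as naive coin flipping, which is what makes the resulting data statistically fair.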
Practical Considerations
Ben Liblit
We believe that CBI and related research efforts have great potential to make software development more responsive and efficient by giving developers accurate data about how software is actually used in deployment. However, testing this idea requires significant experimentation with real, and preferably large, user communities using real applications. This chapter reports on our experience in preparing for such experiments. We have selected several large open source applications, listed in Table 3.1, comprising some two million lines of code before instrumentation. We have built instrumented packages using the strategy described in Chap. 2, made these packages available to the public, and are now in the process of collecting feedback reports. We have not yet identified any bugs using these reports: our user base is still too small, and does not provide reports in the quantities needed by our statistical debugging techniques. However, we have demonstrated a complete, end-to-end CBI system and feel comfortable in claiming that our approach is technically feasible. While aspects of our system could certainly be improved, at this point all components are good enough to support the deployment of realistic instrumented applications and the collection of feedback reports from a large user community. The design of a CBI system involves interesting challenges, both technical and social. In the next several sections, we focus on the solutions to technical problems most likely to be useful to the designers of similar systems and experiments: integration with existing native compilers (Sect. 3.1), management of static and dynamic linkage (Sect. 3.2), and correct execution in the presence of threads (Sect. 3.3). Moving toward the social domain, Sect. 3.4 discusses the privacy and security facets of widespread monitoring of deployed software. Sect. 3.5 considers CBI from the user’s perspective, and presents our approach to ensuring that users remain fully informed about and fully in control of their participation in the CBI system. Lastly, Sect. 3.6 briefly reviews the current status of our public deployment, and offers general information about the state of this ongoing experiment.
Pp. 39-54
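One of the technical issues the chapter names, correct execution in the presence of threads, can be illustrated with a per-thread sampling countdown. This is only a sketch of one plausible approach, not the book's actual mechanism; `should_sample` and its `next_countdown` parameter are invented names.

```python
import threading

class _SamplingState(threading.local):
    """Each thread gets its own private countdown, so concurrent
    decrements never race on shared state."""
    def __init__(self):
        self.countdown = 0  # 0 forces a fresh draw on first use

_state = _SamplingState()

def should_sample(next_countdown):
    """Decrement this thread's private countdown; return True on the
    rare visits whose data should actually be recorded."""
    if _state.countdown <= 0:
        _state.countdown = next_countdown()
    _state.countdown -= 1
    return _state.countdown == 0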
Techniques for Statistical Debugging
Ben Liblit
Thus far we have focused on techniques for collecting sparsely sampled data from large numbers of users. However, this data is only as good as the sense we can make of it. This chapter presents several techniques for using sparsely sampled data to isolate the causes of bugs. Sampled data is terribly incomplete. With sampling, 99% of everything that happens is not even seen. Thus, we do not give strict causes and effects as one might look for using a symbolic debugger. Instead we use statistical models to identify those behaviors that tend to be strongly predictive of failure over many runs. We refer to this body of techniques as statistical debugging. Statistical debugging reaps the benefits of the Bernoulli sampling transformation developed in Sect. 2.1.1. While the data is incomplete, it is incomplete in a fair, statistically unbiased way. Thus the observed data is a noisy but representative sample of the complete behavior, and failure trends identified in the former are equally applicable to the latter. Section 4.1 defines some basic notation and terminology that we will use throughout the remainder of this chapter. In Sect. 4.2 we describe an algorithm for isolating single, deterministic bugs using a process of elimination. Section 4.3 extends our scope to non-deterministic bugs using a general-purpose statistical regression model. This approach has certain limitations, which we discuss in greater depth in Sect. 4.3.4 and Sect. 4.4. Better understanding of these limitations leads us to develop an improved algorithm in Sect. 4.5 that combines statistical ranking techniques with an iterative bug elimination process to manage multiple unknown deterministic and non-deterministic bugs. This combination of ranking and iterative elimination is the best algorithm we have developed to date. Section 4.6 offers several case studies demonstrating how the algorithm has been used to successfully isolate both known and previously unknown bugs in real applications.
Keywords: Elimination Strategy; Elimination Algorithm; Candidate Predicate; Program Crash; Buffer Overrun.
Pp. 55-88
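The flavor of this predicate ranking can be sketched with a toy scoring function: a predicate is interesting when its being true raises the probability of failure beyond what merely reaching its site already implies. The counts and predicate names below are invented, and the chapter itself develops considerably more careful statistics than this sketch.

```python
def increase_score(f_true, s_true, f_obs, s_obs):
    """How much does the predicate being true raise the probability of
    failure, beyond merely reaching the predicate's site?
    f_true/s_true: failing/successful runs in which the predicate was true;
    f_obs/s_obs:   failing/successful runs in which its site was reached."""
    failure = f_true / (f_true + s_true)  # P(crash | predicate true)
    context = f_obs / (f_obs + s_obs)     # P(crash | site reached)
    return failure - context              # positive => candidate bug predictor

def rank_predicates(stats):
    """stats: {name: (f_true, s_true, f_obs, s_obs)}.
    Keep only failure-predictive predicates, best score first."""
    scored = [(increase_score(*c), name) for name, c in stats.items()]
    return sorted((s, n) for s, n in scored if s > 0)[::-1]
```

A predicate that is true only in failing runs scores high; one that simply sits on a frequently executed path scores near zero and is pruned, which is the intuition behind eliminating "innocent" predicates iteratively.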
Related Work
Ben Liblit
Here we discuss a cross section of related work, loosely organized into three broad topics. We briefly visit static analyses that examine code without running it. We consider earlier approaches to profiling and tracing running code, most of which have concentrated on performance profiling. Lastly we review dynamic analyses that focus more directly on the problem of debugging, including several that use statistical methods.
Keywords: Bayesian Belief Network; Performance Overhead; Fast Path; Software Model Check; Instrumentation Scheme.
Pp. 89-93
Conclusion
Ben Liblit
It is an unfortunate fact that essentially all deployed software systems have bugs, and that users often encounter these bugs. The resources (measured in time, money, or people) available for improving software are always limited. Widespread Internet connectivity makes possible a radical change to this situation. For the first time it is feasible to directly observe the reality of a software system’s deployment. Through sheer numbers, the user community brings far more resources to bear on exercising a piece of software than could possibly be provided by the software’s authors. Coupled with an instrumentation, reporting, and analysis infrastructure, these users can potentially replace guesswork with real triage, directing scarce engineering resources to those areas that benefit the most people. The Cooperative Bug Isolation project represents one effort to leverage the strength in these users’ numbers. We have designed, developed, and deployed a debugging support system that encompasses a complete feedback loop from source to users to feedback to bug fixes.
Pp. 95-96