Publications catalog - books



Privacy in Statistical Databases: CENEX-SDC Project International Conference, PSD 2006, Rome, Italy, December 13-15, 2006, Proceedings

Josep Domingo-Ferrer ; Luisa Franconi (eds.)

In conference: International Conference on Privacy in Statistical Databases (PSD). Rome, Italy. December 13, 2006 - December 15, 2006

Abstract/Description (provided by the publisher)

Not available.

Keywords (provided by the publisher)

Data Encryption; Database Management; Probability and Statistics in Computer Science; Computers and Society; Legal Aspects of Computing; Artificial Intelligence (incl. Robotics)

Availability
Detected institution | Year of publication | Browse | Download | Request
Not detected | 2006 | SpringerLink

Information

Resource type:

books

Print ISBN

978-3-540-49330-3

Electronic ISBN

978-3-540-49332-7

Publisher

Springer Nature

Country of publication

United Kingdom

Publication date

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Table of contents

A Method for Preserving Statistical Distributions Subject to Controlled Tabular Adjustment

Lawrence H. Cox; Jean G. Orelien; Babubhai V. Shah

Controlled tabular adjustment preserves confidentiality and tabular structure. Quality-preserving controlled tabular adjustment in addition preserves parameters of the distribution of the original (unadjusted) data. Both methods are based on mathematical programming. We introduce a method for preserving the original distribution itself, and a fortiori the distributional parameters. The accuracy of the approximation is measured by minimum discrimination information (MDI). MDI is computed using an optimal statistical algorithm: iterative proportional fitting.

- Methods for Tabular Protection | Pp. 1-11
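The abstract above names iterative proportional fitting and minimum discrimination information without further detail. The following is a minimal sketch of the textbook two-way version only, not the authors' procedure: it assumes a positive seed table and target margins with equal totals, and the function names `ipf` and `discrimination_information` are illustrative. Discrimination information is taken here to be the Kullback-Leibler divergence between the two tables' cell proportions.

```python
import math


def ipf(seed, row_targets, col_targets, max_iter=1000, tol=1e-12):
    """Rescale rows and columns of `seed` in turn until both margins match."""
    table = [list(row) for row in seed]
    for _ in range(max_iter):
        # Row step: scale each row so that it sums to its target.
        for i, row in enumerate(table):
            total = sum(row)
            if total > 0:
                table[i] = [v * row_targets[i] / total for v in row]
        # Column step: scale each column so that it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns are exact after the column step, so only rows are checked.
        if all(abs(sum(row) - t) <= tol
               for row, t in zip(table, row_targets)):
            break
    return table


def discrimination_information(p, q):
    """Kullback-Leibler divergence of table p from table q (as proportions)."""
    p_total = sum(map(sum, p))
    q_total = sum(map(sum, q))
    result = 0.0
    for p_row, q_row in zip(p, q):
        for a, b in zip(p_row, q_row):
            if a > 0:
                result += (a / p_total) * math.log((a / p_total) / (b / q_total))
    return result


seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[4.0, 6.0], col_targets=[5.0, 5.0])
print(fitted, discrimination_information(fitted, seed))
```

The fitted table keeps the odds ratio of the seed while matching the new margins, which is the sense in which it stays as close as possible to the seed.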

Automatic Structure Detection in Constraints of Tabular Data

Jordi Castro; Daniel Baena

Methods for the protection of statistical tabular data, such as controlled tabular adjustment, cell suppression, or controlled rounding, need to solve several linear programming subproblems. For large multidimensional linked and hierarchical tables, such subproblems turn out to be computationally challenging. One of the techniques used to reduce the solution time of mathematical programming problems is to exploit the constraint structure using a specialized algorithm. Two of the most common structures are block-angular matrices with either linking rows (primal block-angular structure) or linking columns (dual block-angular structure). Although constraints associated with tabular data intrinsically have a lot of structure, current software for tabular data protection neither details nor exploits it, and simply provides a single matrix, or at most a set of smaller submatrices. We provide in this work an efficient tool for the automatic detection of primal or dual block-angular structure in constraint matrices. We test it on some of the complex CSPLIB instances, showing that when the number of linking rows or columns is small, the computational savings are significant.

- Methods for Tabular Protection | Pp. 12-24
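To make the notion of primal block-angular structure concrete, here is a small sketch under strong assumptions: the sparsity pattern is given as one set of column indices per row, and the candidate linking rows are already known. Choosing the linking rows is the hard part and is not attempted here, and this is not the detection tool described in the abstract. The function name `find_blocks` is hypothetical. Rows that share a column are merged into the same block with a union-find structure.

```python
def find_blocks(rows, linking):
    """Group non-linking rows into blocks that share no columns.

    rows    -- list of sets; rows[i] holds the column indices of the
               nonzero entries in constraint i
    linking -- set of row indices treated as linking rows (assumed known)
    """
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path halving
            col = parent[col]
        return col

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Columns that appear in the same non-linking row belong together.
    for i, cols in enumerate(rows):
        if i in linking or not cols:
            continue
        ordered = sorted(cols)
        find(ordered[0])
        for col in ordered[1:]:
            union(ordered[0], col)

    # Assign each non-linking row to the block of its columns.
    blocks = {}
    for i, cols in enumerate(rows):
        if i in linking or not cols:
            continue
        blocks.setdefault(find(min(cols)), []).append(i)
    return sorted(blocks.values())


pattern = [{0, 1}, {1, 2}, {3, 4}, {4, 5}, {0, 3}]
print(find_blocks(pattern, linking={4}))    # two independent blocks
print(find_blocks(pattern, linking=set()))  # one block: nothing decouples
```

In this toy pattern, setting aside the single row that touches columns 0 and 3 splits the remaining constraints into two independent blocks.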

A New Approach to Round Tabular Data

Juan José Salazar González

Controlled rounding is a technique to replace each cell value in a table with a multiple of a base number such that the new table satisfies the same equations as the original table. Statistical agencies prefer a solution where cell values that are already multiples of the base number remain unchanged, while the others are replaced by one of the two closest multiples of the base number (i.e., rounded up or rounded down). This solution is called zero-restricted rounding. Finding such a solution is a very complicated problem, and for some tables it may not exist. This paper presents a mathematical model and an algorithm to find a good-enough near-feasible solution for tables where a zero-restricted rounding is complicated. It also presents computational results showing the behavior of the proposal in practice.

- Methods for Tabular Protection | Pp. 25-34
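The two conditions in the abstract above (zero-restricted cell values, and a rounded table that satisfies the same equations) can be checked mechanically. The sketch below only verifies a candidate rounding; it does not search for one and is not the paper's model or algorithm. It assumes a two-way table of non-negative integers whose last column and last row hold the totals, and the function names are illustrative.

```python
def is_zero_restricted(original, rounded, base):
    """Each cell stays put if already a multiple of base, otherwise it
    moves to one of the two adjacent multiples of base."""
    for orig_row, new_row in zip(original, rounded):
        for x, y in zip(orig_row, new_row):
            lower = (x // base) * base
            allowed = {lower} if x % base == 0 else {lower, lower + base}
            if y not in allowed:
                return False
    return True


def is_additive(table):
    """Interior cells sum to the totals in the last column and last row."""
    *body, totals = table
    rows_ok = all(sum(row[:-1]) == row[-1] for row in table)
    cols_ok = all(sum(row[j] for row in body) == totals[j]
                  for j in range(len(totals)))
    return rows_ok and cols_ok


original = [[3, 7, 10],
            [9, 6, 15],
            [12, 13, 25]]

# A controlled rounding to base 5: adjacent multiples, still additive.
controlled = [[5, 5, 10],
              [5, 10, 15],
              [10, 15, 25]]

# Rounding every cell to its nearest multiple breaks the first column.
nearest = [[5, 5, 10],
           [10, 5, 15],
           [10, 15, 25]]

print(is_zero_restricted(original, controlled, 5), is_additive(controlled))
print(is_zero_restricted(original, nearest, 5), is_additive(nearest))
```

The second table shows why the problem is not trivial: rounding each cell independently satisfies the zero-restricted condition but no longer adds up.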

Harmonizing Table Protection: Results of a Study

Sarah Giessing; Stefan Dittrich

The paper reports results of a study aimed at the development of recommendations for harmonization of table protection in the German statistical system. We compare the performance of a selection of algorithms for secondary cell suppression under four different models for co-ordination of cell suppressions across agencies of a distributed system for official statistics, like the German or the European statistical system. For the special case of decentralized across-agency co-ordination as used in the European Statistical System, the paper also suggests a strategy to protect the data on the top level of the regional breakdown by perturbative methods rather than cell suppression.

- Methods for Tabular Protection | Pp. 35-47

Effects of Rounding on the Quality and Confidentiality of Statistical Data

Lawrence H. Cox; Jay J. Kim

Statistical data may be rounded to integer values for statistical disclosure limitation. The principal issues in evaluating a disclosure limitation method are: (1) Is the method effective for limiting disclosure? and (2) Are the effects of the method on data quality acceptable? We examine the first question in terms of the posterior probability distribution of original data given rounded data and the second by computing expected increase in total mean square error and expected difference between pre- and post-rounding distributions, as measured by a conditional chi-square statistic, for four rounding methods.

- Utility and Risk in Tabular Protection | Pp. 48-56

Disclosure Analysis for Two-Way Contingency Tables

Haibing Lu; Yingjiu Li; Xintao Wu

Disclosure analysis in two-way contingency tables is important in categorical data analysis. The disclosure analysis concerns whether a data snooper can infer any protected cell values, which contain privacy-sensitive information, from available marginal totals (i.e., row sums and column sums) in a two-way contingency table. Previous research has addressed this problem from various perspectives. However, there is a lack of systematic definitions of the disclosure of cell values. Also, no previous study has focused on the distribution of the cells that are subject to various types of disclosure. In this paper, we define four types of possible disclosure based on the exact upper bound and/or the lower bound of each cell that can be computed from the marginal totals. For each type of disclosure, we discover the distribution pattern of the cells subject to disclosure. Based on the distribution patterns discovered, we can speed up the search for all cells subject to disclosure.

- Utility and Risk in Tabular Protection | Pp. 57-67
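For a two-way table of non-negative counts, the exact bounds on a cell given only the row and column sums are the classical Fréchet bounds: the cell in row i and column j lies between max(0, r_i + c_j - n) and min(r_i, c_j), where n is the grand total. The sketch below computes these bounds and flags cells whose bounds coincide. The four disclosure types defined in the paper are not given in the abstract, so this flag is only one plausible illustration, and the function names are hypothetical.

```python
def cell_bounds(row_sums, col_sums):
    """Return (lower, upper) Frechet bounds for every cell."""
    n = sum(row_sums)
    if n != sum(col_sums):
        raise ValueError("row and column sums must have the same total")
    return [[(max(0, r + c - n), min(r, c)) for c in col_sums]
            for r in row_sums]


def exactly_determined(bounds):
    """Cells whose value is fixed by the margins (lower == upper)."""
    return [(i, j)
            for i, row in enumerate(bounds)
            for j, (lower, upper) in enumerate(row)
            if lower == upper]


bounds = cell_bounds(row_sums=[5, 1], col_sums=[4, 2])
print(bounds)
print(exactly_determined(bounds))

# A row with zero total forces every cell in the table here.
forced = cell_bounds(row_sums=[4, 0], col_sums=[3, 1])
print(exactly_determined(forced))
```

In the first example no cell is pinned down exactly, although the cell in the first row and first column is known to be at least 3, which may already matter for a sensitive count.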

Statistical Disclosure Control Methods Through a Risk-Utility Framework

Natalie Shlomo; Caroline Young

This paper discusses a disclosure risk – data utility framework for assessing statistical disclosure control (SDC) methods on statistical data. Disclosure risk is defined in terms of identifying individuals in small cells in the data, which then leads to attribute disclosure of other sensitive variables. Information loss measures are defined for assessing the impact of the SDC method on the utility of the data and its effects on standard statistical analyses. The quantitative disclosure risk and information loss measures can be plotted onto an R-U confidentiality map for determining optimal SDC methods. A user-friendly software application has been developed and implemented at the UK Office for National Statistics (ONS) to enable data suppliers to compare original and disclosure-controlled statistical data and to make informed decisions on the best methods for protecting their statistical data.

- Utility and Risk in Tabular Protection | Pp. 68-81

A Generalized Negative Binomial Smoothing Model for Sample Disclosure Risk Estimation

Yosef Rinott; Natalie Shlomo

We deal with the issue of risk estimation in a sample frequency table to be released by an agency. Risk arises from non-empty sample cells which represent small population cells and from population uniques in particular. Therefore risk estimation requires assessing which of the relevant population cells are indeed small. Various methods have been proposed for this task, and we present a new method in which estimation of a population cell frequency is based on smoothing using a local neighborhood of this cell, that is, cells having similar or close values in all attributes.

The statistical model we use is a model which subsumes the Poisson and Negative Binomial models. We provide some preliminary results and experiments with this method.

Comparisons of the new approach are made to a method based on , in which inference on a given cell is based on classical models of contingency tables. Such models connect each cell to a ‘neighborhood’ of cells with one or several common attributes, but some other attributes may differ significantly. We also compare to the Negative Binomial method in which inference on a given cell is based only on sampling weights, without learning from any type of ‘neighborhood’ of the given cell and without making use of the structure of the table.

- Utility and Risk in Tabular Protection | Pp. 82-93

Entry Uniqueness in Margined Tables

Shmuel Onn

We consider a problem in secure disclosure of multiway table margins. If the value of an entry in all tables having the same margins as those released from a source table in a database is unique, then the value of that entry can be exposed and disclosure is insecure. We settle the computational complexity of detecting whether this situation occurs. In particular, for multiway tables where one category is significantly richer than the others, that is, when each sample point can take many values in one category and only a few values in the other categories, we provide, for the first time, a polynomial time algorithm for checking uniqueness, allowing disclosing agencies to check entry uniqueness and make informed decisions on secure disclosure. Our proofs use our recent results on universality of 3-way tables and on n-fold integer programming, which we survey on the way.

- Utility and Risk in Tabular Protection | Pp. 94-101

Combinations of SDC Methods for Microdata Protection

Anna Oganian; Alan F. Karr

A number of methods have been proposed in the literature for masking (protecting) microdata. Nearly all of these methods may be implemented with different degrees of intensity, by setting the value of an appropriate parameter. However, even parameter variation may not be sufficient to realize appropriate levels of disclosure risk and data utility. In this paper we propose a new approach to protection of numerical microdata: applying multiple stages of masking to the data in a way that increases utility but controls disclosure risk.

- Methods for Microdata Protection | Pp. 102-113