cellKey - An R Package to Perturb Statistical Tables

Authors

  • Bernhard Meindl, Statistik Austria

DOI:

https://doi.org/10.17713/ajs.v54i4.2131

Abstract

National statistical institutes (NSIs) routinely publish aggregated data in the form of statistical tables. Before release, these tables must be anonymized to safeguard the privacy of individual data contributors and to prevent unauthorized inference about specific units from the published outputs, as is often required by law. The R package cellKey addresses this challenge by implementing a post-tabular perturbation method: table cell values are modified after aggregation so that sensitive information is masked. The method is suitable for both frequency tables and magnitude tables. A key feature of the cellKey package is that it maintains consistency across multiple tables that share identical cells, so a cell that appears in several tables receives the same perturbed value in each of them. This makes the package especially useful for complex datasets with interrelated tables. The cellKey package is designed to be user-friendly and can help NSIs and other data holders publish statistical outputs that balance data utility and privacy, meeting the growing demand for secure and accessible data dissemination.
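
The consistency property described above can be illustrated with a minimal sketch of the general cell-key idea. It is written in Python for illustration only and is not the cellKey package's API: the function names, the integer record keys, and the toy noise rule are all assumptions made here. Each record receives a fixed random key, a cell's key is derived from the keys of the records it contains, and the noise is looked up from that cell key, so the same set of records yields the same perturbed count in any table.

```python
import random

KEY_SPACE = 2**32  # assumed size of the integer record-key space


def assign_record_keys(n_records, seed=1):
    """Draw one fixed random integer key per record (hypothetical helper)."""
    rng = random.Random(seed)
    return [rng.randrange(KEY_SPACE) for _ in range(n_records)]


def cell_key(record_keys, members):
    """Combine the keys of a cell's records into a value in [0, 1).

    Integer addition is order-independent, so the result depends only
    on which records contribute to the cell.
    """
    total = sum(record_keys[i] for i in members) % KEY_SPACE
    return total / KEY_SPACE


def perturb_count(count, ckey):
    """Toy noise lookup (an assumption, not a real perturbation table)."""
    if count == 0:
        return 0  # empty cells are left unchanged in this sketch
    if ckey < 0.25:
        noise = -1
    elif ckey < 0.75:
        noise = 0
    else:
        noise = 1
    return max(count + noise, 0)


# The same three records form a cell in two different tables.
record_keys = assign_record_keys(10)
cell_in_table_a = [0, 3, 7]
cell_in_table_b = [7, 0, 3]

key_a = cell_key(record_keys, cell_in_table_a)
key_b = cell_key(record_keys, cell_in_table_b)
perturbed_a = perturb_count(len(cell_in_table_a), key_a)
perturbed_b = perturb_count(len(cell_in_table_b), key_b)
```

Because the noise depends only on the cell key and the original count, `perturbed_a` and `perturbed_b` are equal regardless of the table in which the cell is computed. A real implementation would replace the toy rule with a calibrated perturbation table.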

Published

2025-05-28

How to Cite

Meindl, B. (2025). cellKey - An R Package to Perturb Statistical Tables. Austrian Journal of Statistics, 54(4), 136–156. https://doi.org/10.17713/ajs.v54i4.2131