Our Disclosure Risk Calculator allows anyone to quickly and easily explore disclosure risks in a dataset. Your data stays in your browser unless you explicitly choose to upload it to our server for more detailed results.
Elliott, Milne, Roberts, Simpson, and Sporle (2022). Disclosure Risk Calculator. https://risk.terourou.org.
The app reads the dataset and asks you to select the variables that might be used for identification and to provide either a population size or a sampling fraction.
Using the chosen variables, the software calculates the number of unique combinations (i.e., individuals in the sample with a unique set of responses for the chosen variables) and pairs of combinations. This is combined with the sampling fraction to calculate the disclosure risk.
Since this calculation is simple, it can be performed in your browser without uploading the data to our server.
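The counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function names are hypothetical, "pairs" is interpreted as combinations shared by exactly two records, and the risk formula shown (sampling fraction times uniques, divided by that quantity plus twice the remaining fraction times pairs) is one published estimator that uses exactly these ingredients. Whether the app uses this exact formula is an assumption.

```python
from collections import Counter


def count_uniques_and_pairs(rows, key_vars):
    """Count key-variable combinations held by exactly one or exactly two records."""
    combos = Counter(tuple(row[v] for v in key_vars) for row in rows)
    sizes = Counter(combos.values())  # how many combinations occur 1, 2, ... times
    return sizes[1], sizes[2]


def estimated_risk(n_unique, n_pairs, sampling_fraction):
    """Assumed estimator: pi * n1 / (pi * n1 + 2 * (1 - pi) * n2)."""
    numerator = sampling_fraction * n_unique
    denominator = numerator + 2 * (1 - sampling_fraction) * n_pairs
    return numerator / denominator if denominator > 0 else 0.0


# Made-up sample: the first two records share a combination, the rest are unique.
sample = [
    {"age": "30-39", "sex": "F", "region": "North"},
    {"age": "30-39", "sex": "F", "region": "North"},
    {"age": "40-49", "sex": "M", "region": "South"},
    {"age": "20-29", "sex": "F", "region": "South"},
    {"age": "20-29", "sex": "M", "region": "North"},
]
n_unique, n_pairs = count_uniques_and_pairs(sample, ["age", "sex", "region"])
risk = estimated_risk(n_unique, n_pairs, sampling_fraction=0.1)
```

With this made-up sample there are three unique combinations and one pair, so a 10% sampling fraction gives a risk of 0.3 / 2.1, or about 0.14 under the assumed formula.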
If you choose to upload your data to the server by clicking the ‘Upload’ button, the encrypted version of the data (in which values are replaced by random labels) is uploaded to a secure process that only your current connection can access. That process then uses functions from the ‘sdcMicro’ R package to calculate:
You can adjust the chosen variables and sampling fraction to see the effect on the estimated disclosure risk(s).
Once you close your browser, the data will be released from memory and deleted.
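To illustrate what "values are replaced by random labels" could look like, here is a minimal Python sketch. It is an assumption about the general approach, not the app's implementation: the function name and label format are hypothetical. Equal values within a column receive the same label, so counts of combinations are preserved while the labels themselves carry no meaning.

```python
import secrets


def encode_with_random_labels(rows, columns):
    """Replace each distinct value in every column with a random label."""
    mappings = {col: {} for col in columns}  # original value -> label, per column
    encoded = []
    for row in rows:
        new_row = {}
        for col in columns:
            labels = mappings[col]
            if row[col] not in labels:
                label = secrets.token_hex(8)
                # Regenerate in the unlikely event of a clash within this column.
                while label in labels.values():
                    label = secrets.token_hex(8)
                labels[row[col]] = label
            new_row[col] = labels[row[col]]
        encoded.append(new_row)
    return encoded, mappings


rows = [
    {"sex": "F", "region": "North"},
    {"sex": "F", "region": "North"},
    {"sex": "M", "region": "North"},
]
encoded, mappings = encode_with_random_labels(rows, ["sex", "region"])
```

In this sketch only `encoded` would need to leave the browser; `mappings`, which links labels back to the real values, would stay on the client.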
We run an R server using Rserve, which the client (your browser) connects to over websockets. When you load the app, the websocket connection creates a new R instance on the server that is tied directly to your connection: once the websocket closes (i.e., if you close your browser or refresh the page), the associated R session is killed and any data contained within it is lost. Importantly, no one else can access this R session.
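The session lifecycle described above can be modelled with a small Python sketch. It uses no real networking and is not how Rserve is implemented: `SessionRegistry` and its methods are hypothetical names, and a dictionary entry stands in for the per-connection R process. It shows only the key property, namely that a session's data disappears when its connection closes.

```python
class SessionRegistry:
    """Toy model: one private session per open connection."""

    def __init__(self):
        self._sessions = {}

    def open(self, connection_id):
        # A new, empty session is created when the connection opens.
        self._sessions[connection_id] = {}
        return self._sessions[connection_id]

    def get(self, connection_id):
        # Raises KeyError once the connection has been closed.
        return self._sessions[connection_id]

    def close(self, connection_id):
        # Closing the connection discards the session and everything in it.
        self._sessions.pop(connection_id, None)


registry = SessionRegistry()
session = registry.open("conn-1")
session["data"] = ["encoded", "rows"]
registry.close("conn-1")
```

After `close`, asking the registry for "conn-1" raises a `KeyError`, mirroring the way the uploaded data is lost once the websocket closes.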
Once the connection is made and the user clicks the ‘Upload’ button, the encrypted version of the data is sent to the R process and stored in memory as an R data frame. The process then returns a function (calculate_risk()) which can be called by your app session (and only yours) to estimate the risk information. This function retains the data in its enclosing scope, so the data is available to it without being re-uploaded on each call.
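The idea of a function that carries the uploaded data in its scope can be sketched as a closure. The sketch below is in Python rather than R and is only an illustration: `make_risk_calculator` is a hypothetical name, and the returned summary is a placeholder for the sdcMicro-based results the real calculate_risk() produces.

```python
from collections import Counter


def make_risk_calculator(encoded_rows):
    """Store the data once and return a function that can query it repeatedly."""
    data = [dict(row) for row in encoded_rows]  # private copy held by the closure

    def calculate_risk(key_vars, sampling_fraction):
        # Placeholder summary; the real server-side function does more than this.
        combos = Counter(tuple(row[v] for v in key_vars) for row in data)
        n_unique = sum(1 for count in combos.values() if count == 1)
        return {
            "n_records": len(data),
            "n_unique": n_unique,
            "sampling_fraction": sampling_fraction,
        }

    return calculate_risk


uploaded = [
    {"a": "x1", "b": "y1"},
    {"a": "x1", "b": "y2"},
    {"a": "x2", "b": "y2"},
]
calculate_risk = make_risk_calculator(uploaded)  # data is "uploaded" once
first = calculate_risk(["a"], 0.1)               # later calls reuse the stored data
second = calculate_risk(["a", "b"], 0.1)
```

Each call passes only the chosen variables and sampling fraction; the data itself is reachable solely through the returned function.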
If you want more details on how this all works, this project is completely open-source:
If you have questions or feedback, please either open an issue on GitHub or send us an email at terourounz@gmail.com.