“Privacy by Design”: the term sounds logical and simple. In practice, it is neither. A major misconception is that Privacy by Design is a technical issue and that anonymization and encryption will suffice. Privacy by Design is much more than that: it is primarily an issue for process design and management. In this article, Dr. Henk Jan Jansen describes what “Privacy by Design” means and how organizations can translate it into their own business operations.
What is Privacy by Design?
There is no unequivocal answer to this question: Privacy by Design is described in anything but unambiguous terms. Reference is often made to the 7 foundational principles of Ann Cavoukian or to the publication by ENISA.
It is striking that the interpretations differ considerably. Closer to home, the Dutch Data Protection Authority provides the following explanation:
“Privacy by Design” means that you, as an organization, first pay attention to privacy-enhancing measures, also known as Privacy Enhancing Technologies (PET), already during the development of products and services (such as information systems). Second, you take data minimization into account: you process as little personal data as possible, i.e., only the data that are necessary for the purpose of the processing. This way you can technically enforce careful and responsible handling of personal data.”
Privacy by Design therefore entails two things, although we would prefer to reverse the order for the sake of efficiency:
1. Processing as little personal data as possible.
2. Subsequently ‘building in’ privacy by using technologies that protect personal data.
But how do you make this concrete? We describe this below, on the basis of the eight OECD principles for privacy [see below]. We have opted for the OECD principles because as a generic framework they are better suited to the objectives pursued by privacy legislation.
The OECD Privacy Principles
1. Collection Limitation Principle
2. Data Quality Principle
3. Purpose Specification Principle
4. Use Limitation Principle
5. Security Safeguards Principle
6. Openness Principle
7. Individual Participation Principle
8. Accountability Principle
Collection Limitation Principle
The Collection Limitation principle basically means that you only collect the personal data that you need. But what is the minimum you need to be able to do your job properly? And what information do you collect mainly to serve the customer better?
Translation into practice: a well-known example is the application form that asks for all kinds of information, leaving you to wonder why the organization actually needs certain data. In administrative processes, the Collection Limitation principle is at odds with the principle of one-off data collection. The idea behind the latter is that an administrative process consists of several steps, but that all data for all steps are requested in the first step, so that a person does not have to visit the (digital) counter several times to provide additional pieces of information.
In this case, an organization has to choose between collecting as little data as possible and asking the customer for additional information later, or asking for everything in one go and receiving data that will not always be used. This requires a case-by-case consideration: how much data would be collected unnecessarily, and how difficult is it to approach the customer again later for additional information?
A further, freer interpretation of the principle of “collection limitation” is the question: what information do you actually need? Organizations regularly want to apply a certain test and request more personal data for it than is necessary.
Examples:
An income test asks for the income of a person (an absolute value, with some sensitivity), instead of asking whether someone earns more or less than threshold x (a yes / no value, with lower sensitivity).
When determining the discount percentage for children’s admission, the visitor is asked for the date of birth (some sensitivity). An alternative could be: whether the visitor was born before or after the XYZ date (much lower sensitivity).
So, the question with these types of examples is: do you literally need this information, or do you only use it to make an assessment, and would it also be sufficient to ask the person concerned for the outcome of that assessment?
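The two examples above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name `DiscountApplication`, the function `build_application`, and all thresholds and dates are hypothetical. In practice the form could ask the yes / no question directly, so that the raw value never reaches the organization at all.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DiscountApplication:
    """What the organization stores: only the yes/no answers it needs."""
    income_below_threshold: bool
    born_after_cutoff: bool


def build_application(income: float, date_of_birth: date,
                      income_threshold: float, cutoff: date) -> DiscountApplication:
    # The raw values are used once to derive the answers and are not stored.
    return DiscountApplication(
        income_below_threshold=income < income_threshold,
        born_after_cutoff=date_of_birth > cutoff,
    )


# Hypothetical example values.
application = build_application(
    income=28_500.0,
    date_of_birth=date(2012, 5, 17),
    income_threshold=30_000.0,
    cutoff=date(2010, 1, 1),
)
print(application)
```

The stored record contains two booleans and neither the income nor the date of birth.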
In our view, the timely and structured removal of personal data also fits within the principle of collection limitation. The simple starting point is: if (personal) data are no longer required and do not have to be retained by law, they must be deleted. At the same time, practice shows that this is not easy.
With the digitization of archives and the low cost of storage, timely deletion has never been taken into account structurally. Many organizations struggle to reconcile compliance with archival legislation or retention obligations with privacy.
Translation into practice: the exact frameworks for the retention obligation are often unclear (how long, which part of the information).
The result is that often little thought is given to what should be stored and why. Selective storage of core documents from a file (instead of the complete file) or making sensitive personal data illegible (for example, removing BSNs from contracts after x years and / or keeping the documents) is often not built into the file management and archiving processes.
In addition, many documents have been digitized in such a way that they can no longer be searched for specific characteristics such as personal data. Simply put, many documents were scanned as “images” and not as text files. This hinders searching and limits the possibilities for anonymization or for efficiently finding and redacting personal data in archives.
When designing a data management system, the following must therefore be considered:
1) Determining the retention period. How long must data be kept and what must be kept (for example, only a consent, or a complete file)?
2) Destruction, in such a way that nothing leaks or remains unintentionally during destruction.
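The first consideration can be illustrated with a minimal sketch. The categories and retention periods below are hypothetical placeholders, not legal advice; the actual periods follow from the applicable legislation and have to be determined per organization.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class StoredDocument:
    doc_id: str
    category: str    # hypothetical category label
    closed_on: date  # date on which the file was closed


# Hypothetical retention periods in days per category (assumed values).
RETENTION_DAYS = {"application_letter": 28, "payroll_record": 7 * 365}


def due_for_destruction(documents, today):
    """Return the documents whose retention period has expired."""
    due = []
    for doc in documents:
        period = RETENTION_DAYS.get(doc.category)
        if period is None:
            continue  # unknown category: decide manually, never delete blindly
        if doc.closed_on + timedelta(days=period) < today:
            due.append(doc)
    return due


documents = [
    StoredDocument("a1", "application_letter", date(2016, 1, 4)),
    StoredDocument("p1", "payroll_record", date(2015, 12, 31)),
]
due = due_for_destruction(documents, today=date(2016, 3, 1))
print([doc.doc_id for doc in due])
```

A check like this only selects what is due; the destruction itself (the second consideration) still has to be carried out in such a way that no copies remain.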
Data Quality Principle
The Data Quality principle is not specific to privacy. Data Quality is a principle that has mainly been developed in the Business Intelligence (BI) field. The concept of Data Quality for BI consists of the following aspects:
Completeness – This indicates that the full expected dataset must be present. This can be a challenge when consolidating data from different sources. The completeness aspect does have a possible relationship with privacy: a lack of information may lead to a wrong conclusion, and with more contextual information, data can be interpreted differently.
Consistency – This aspect concerns data being recorded consistently throughout the system. For example, no bill can be paid by someone who has not ordered anything, and no salary slip can be created for someone who is no longer employed. For this aspect, we do not see any specific relationship with privacy, except that inconsistency will be related to the accuracy or completeness of the data.
Compliance – This aspect concerns the same coding and references being used throughout the system (for example, dd/mm/yyyy as the default notation for a date). For this aspect, too, we do not see a specific relationship with privacy, except that inconsistency in notation will be related to the accuracy or completeness of the data.
Accuracy – This indicates to what extent the data reflect reality. This aspect is also related to privacy: incorrect information can lead to a wrong conclusion.
For example, if it is recorded somewhere that the resident of 2nd street 21 has a criminal record for a sexual offense, when it should have been 2nd street 31, this could lead to undesirable effects.
Integrity – The aspect of integrity has its own, somewhat more technical meaning within the field of Business Intelligence. This is not about personal integrity, but about the validity of the data and the correctness of the relationships between the data. For example, there is a fixed relationship between a zip code and a street name / house number. If these two do not agree, this is an integrity issue.
Timeliness – This is the degree to which data are available on time. For example: in a data warehouse environment in which the invoiced turnover is loaded every night, these data must be fully available every morning. Privacy is often thought of first as confidentiality of personal data, followed by its accuracy. However, the (timely) availability of personal data is also part of privacy. That is where the relationship with this aspect lies.
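Some of these aspects lend themselves to simple automated checks. The sketch below is an illustration under our own assumptions: the function names, the record layout and the reference table `ZIP_TO_STREET` are hypothetical, and a real integrity check would use an authoritative address register.

```python
from datetime import datetime

# Hypothetical reference table mapping a zip code to its street name.
ZIP_TO_STREET = {"1234 AB": "2nd street"}


def missing_fields(record: dict, required: tuple) -> list:
    """Completeness: list the required fields that are absent or empty."""
    return [field for field in required if not record.get(field)]


def conforms_to_date_format(value: str) -> bool:
    """Compliance: the date is a valid date in dd/mm/yyyy notation."""
    if len(value) != 10:
        return False
    try:
        datetime.strptime(value, "%d/%m/%Y")
        return True
    except ValueError:
        return False


def address_is_consistent(zip_code: str, street: str) -> bool:
    """Integrity: the zip code and the street name belong together."""
    return ZIP_TO_STREET.get(zip_code) == street


record = {"name": "John Doe", "birth_date": "17/05/1980",
          "zip_code": "1234 AB", "street": "2nd street"}
print(missing_fields(record, ("name", "birth_date", "zip_code", "street")))
print(conforms_to_date_format(record["birth_date"]))
print(address_is_consistent(record["zip_code"], record["street"]))
```

Accuracy cannot be checked this way: whether house number 21 should have been 31 can only be established against reality or an authentic source.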
How can the Data Quality principle best be translated into practice? Discussions about which information is reliable can be avoided by providing a “single point of truth” or “authentic source register”. This means that, as much as possible, use is made of one well-organized source for the data and that copies and derivatives (which are often maintained less well and less frequently) are avoided. In other words: as few copies of databases or exports to Excel as possible. In addition to increasing the quality of the personal data, this improves transparency and simplifies security.
Use Limitation Principle
The Use Limitation principle means that personal data may not be used for anything other than the purpose originally established. Additional use is only permitted if there is a legal basis for it, or if the person concerned agrees to it. If an organization uses data differently than originally intended, there is a good chance that this will require the additional (explicit) approval of the person concerned. What does this mean in practice? When designing a business process, it must be considered where the data used come from and what the basis for their use is.
Translation into practice: organizations have to deal with this, for example, when using core registrations (such as the Personal Records Database) and when using the Citizen Service Number (BSN, Burgerservicenummer, comparable to a social security number in the USA). The fact that you have access to these personal data does not mean that you can use them for everything. An example of the latter: the BSN is included in a personnel system because of tax obligations. This citizen service number may then only be used for fiscal accountability, not on leave records, in HR reports or for the exchange of information regarding sickness reports.
There are two known challenges where there is a need to use personal data more widely. The first is that organizations want to combine data from source systems to perform analyses that can improve the quality or efficiency of the service. These types of initiatives are often referred to as Big Data or Business Intelligence. In these applications, it is important to offer data to such environments stripped of personal data (anonymization) or in pseudonymized form. Pseudonymization is the replacement of identifying data by a pseudonym, so that the data can no longer be traced back directly to the person. However, a residual risk remains for both: by combining more data, so many attributes of persons may ultimately become known that their identity can easily be derived from them.
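One common way to pseudonymize an identifier is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; it is our own illustration, not a method prescribed by the source. The field names, the key and the identifier value are fictitious, and the key is assumed to be kept outside the analysis environment.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier by a keyed hash (the pseudonym)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_analysis(record: dict, secret_key: bytes) -> dict:
    """Build the record that is offered to the BI environment."""
    return {
        # Same person, same pseudonym: records can still be combined.
        "person": pseudonymize(record["bsn"], secret_key),
        # Coarser zip code to reduce the risk of re-identification.
        "zip_area": record["zip_code"][:4],
        "turnover": record["turnover"],
    }


# Fictitious key and record, for illustration only.
key = b"example-key-kept-outside-the-bi-environment"
source_record = {"bsn": "000000000", "name": "John Doe",
                 "zip_code": "1234 AB", "turnover": 125.50}
print(prepare_for_analysis(source_record, key))
```

Whoever holds the key can recompute the pseudonym for a known identifier, and the remaining attributes can still single out a person when combined, which is exactly the residual risk described above.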
The second challenge is that the ICT department likes to use representative test data for system development and problem analysis, and sometimes needs to. With such test data realistic situations can be simulated, and the common opinion is that the more truthful the data, the better the test. However, using “a copy of the production database” is undesirable and almost always unnecessary. For the development and execution of functional tests, a fictitious set of test data, for example “John Doe”, is sufficient. Technical solutions are now also available that can generate fictitious test data, or anonymize or pseudonymize existing data. Usually, production data (real data) are not needed until the very last functional test. If they really are needed, they can be provided on an incidental basis under controlled conditions. In our experience, the vast majority of testing activities turn out not to need personal data at all.
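Generating a fictitious test set does not require special tooling. The following is a minimal sketch with invented names and fields (all hypothetical); dedicated test data generators offer far more realistic distributions.

```python
import random

# Invented placeholder names; no relation to real persons.
FIRST_NAMES = ["John", "Jane", "Alex", "Sam"]
LAST_NAMES = ["Doe", "Roe", "Smith", "Jones"]


def generate_test_persons(count: int, seed: int = 0) -> list:
    """Generate fictitious person records for functional tests."""
    rng = random.Random(seed)  # fixed seed makes test runs reproducible
    persons = []
    for number in range(count):
        persons.append({
            "customer_id": f"TEST-{number:04d}",  # clearly marked as test data
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "birth_year": rng.randint(1940, 2000),
        })
    return persons


test_persons = generate_test_persons(3)
for person in test_persons:
    print(person)
```

Because the seed is fixed, a failing test can be reproduced with exactly the same data, which removes one common argument for copying production data.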
Purpose Specification Principle
These last examples also make clear the challenge for the previous principle, Purpose Specification. This principle requires that the purpose of the data collection is known when the collection is created and that the purpose is not adjusted afterwards. Including data in a BI environment or using it for testing is rarely envisaged at the outset; the need for it grows over time. Removing the specific personal characteristics prevents friction with this principle, as it does with Use Limitation. On the other hand, the “purpose specification” can be defined from the start to allow for a broader use of data, for example for BI analyses at the meta level.
Security Safeguards Principle
Determining the Security Safeguards is a field in itself. The choice of the right security measures is strongly related to the technology (hardware, applications) in which the personal data are processed. Moreover, the quality of the measures is related to the state of the art. With the current state of the art (in 2016), the following measures are generally considered appropriate:
1) Authentication at a trusted location, such as a workplace within a secure office and on a secure network, takes place at least on the basis of a knowledge characteristic (password).
2) Authentication at an untrusted location, such as a workplace at home or in a public place, or via an untrusted network, requires a possession characteristic in addition to the knowledge characteristic.
3) Personal data that is sent over the company’s own secured network is preferably encrypted; outside your own secured network, such as the internet, they are always encrypted. This also applies to portable media.
4) Services that process or offer personal data cannot be accessed without authorization and authentication, for example through the use of certificates.
5) Physical and logical measures shield the processing of personal data, for example by placing servers in closed rooms and ‘hardening’ systems / components.
6) Access to personal data by system administrators is recorded (time and user are logged).
7) Access and use are recorded (time, user, process, and result are logged).
In addition to the measures for ‘normal’ personal data, the following measures for special personal data are generally considered appropriate:
8) In addition to the knowledge characteristic, authentication always also takes place on the basis of a possession characteristic.
9) Special personal data are encrypted, even if they are sent over the company’s own secured network.
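The authentication and encryption rules above can be written down as a small decision function, which makes them easy to test. This is a sketch of the rules as we read them; the function and parameter names are our own assumptions.

```python
def required_factors(trusted_location: bool, special_data: bool) -> set:
    """Return the authentication characteristics that are required."""
    factors = {"knowledge"}  # for example a password
    if not trusted_location or special_data:
        factors.add("possession")  # required off-site and for special data
    return factors


def encryption_required(own_secured_network: bool, special_data: bool) -> bool:
    """Return True where encryption is mandatory rather than preferred."""
    # On the company's own secured network, encryption of 'normal' personal
    # data is only preferred; everywhere else and for special personal data
    # it is always required.
    return (not own_secured_network) or special_data


print(required_factors(trusted_location=True, special_data=False))
print(required_factors(trusted_location=False, special_data=False))
print(encryption_required(own_secured_network=True, special_data=True))
```

Capturing such rules as testable logic also helps when the state of the art changes: the rule is adjusted in one place.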
Openness Principle
The Openness principle requires organizations to be open and transparent when dealing with personal data. We do not see how the application of Privacy by Design can help improve openness. The reverse does hold: if organizations are effective in Privacy by Design, this can be used in communications (such as policy documents and presentations), so that they can distinguish themselves as responsible organizations and gain trust.
Individual Participation Principle
The Individual Participation principle requires that an organization is able to comply with requests for access, correction and change. An organization must be able to check within its own information systems which data about a person are held, and to adjust and delete them where necessary.
If data need to be modified, they must be modified not only in the authentic source (the aforementioned single point of truth), but also in its derived copies. For the right of erasure, the same applies: the data are erased in the authentic source and in derived copies. In addition, the deletion must be permanent. This in itself makes sense, were it not for the fact that some databases do not allow an entry to be deleted and at most allow it to be marked as “inactive”.
The requirement that individuals can exercise their rights in an information system must be translated into a functional requirement and (preferably) also tested upon delivery.
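As an illustration of such a functional requirement, the sketch below erases a person from an authentic source and from a derived copy in one transaction, using an in-memory SQLite database. The table names and the list of tables are hypothetical, and backups and log files are outside the scope of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE report_copy (person_id INTEGER, name TEXT);
    INSERT INTO persons VALUES (1, 'John Doe'), (2, 'Jane Roe');
    INSERT INTO report_copy VALUES (1, 'John Doe'), (2, 'Jane Roe');
""")

# Hypothetical register of every table that holds (copies of) personal data.
# The names are fixed constants, so formatting them into SQL is safe here.
TABLES_WITH_PERSONAL_DATA = ("persons", "report_copy")


def erase_person(connection, person_id):
    """Delete a person from the source and all derived copies."""
    with connection:  # one transaction: commit on success, roll back on error
        for table in TABLES_WITH_PERSONAL_DATA:
            # A real DELETE, not an 'inactive' flag.
            connection.execute(
                f"DELETE FROM {table} WHERE person_id = ?", (person_id,))


def find_person(connection, person_id):
    """Return the tables in which the person is still present."""
    found = []
    for table in TABLES_WITH_PERSONAL_DATA:
        row = connection.execute(
            f"SELECT 1 FROM {table} WHERE person_id = ?", (person_id,)).fetchone()
        if row:
            found.append(table)
    return found


erase_person(conn, 1)
print(find_person(conn, 1))
print(find_person(conn, 2))
```

The function `find_person` doubles as the basis for an access request and for the delivery test: after erasure it should return no tables for the person concerned.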
Accountability Principle
According to the Accountability principle, the controller must ensure that all principles are met. In the case of Privacy by Design, this should mean that a delivery test must establish that all principles have been adequately followed. This could, for example, take the form of a contribution by an involved privacy advisor to the acceptance test activities.
Privacy by Design is more than encryption or anonymization. Privacy by Design requires a structured approach, in which sensible use is made of personal information based on business process analysis. This makes Privacy by Design mainly a discipline for business experts, information analysts and process architects.