Personal Identifiability in the Icelandic
Health Sector Database
Professor Einar Arnason
Professor of Evolutionary Biology
and Population Genetics
Institute of Biology
University of Iceland
Abstract
Personal identifiability is a
fundamental question in the ongoing debate about the Icelandic Bill
and Act on the Health Sector Database (HSD). If the data are
personally identifiable, Iceland's international legal commitments
indicate that a priori consent must be obtained from patients for
the use of their personal medical information. The HSD Act presumes
that one-way coding of personal identifiers renders the data
non-personally identifiable and that therefore a priori consent is
not required.
The history of the debate on the HSD
shows that the concept of personal identifiability was initially
based on a notion of 'considerable amount of time and manpower' as
a criterion for defining personal identifiability. This definition
comes from Recommendation R(97)5 of the Committee of Ministers of
the Council of Europe on Medical Data. As a result of the Icelandic
Data Protection Commission's opinion on the HSD, that concept was
rejected and the resulting Bill and HSD Act adopted a definition
from the European Data Protection Directive (95/46). The rejected
concept, however, reentered with the idea that one-way coding of
personal identifiers means there is no key that can be used to
trace the identity of a person in the database.
The question of what constitutes a
key in this context is of fundamental importance. The database will
collect and link data from different sources on individuals over
time and therefore the method of coding must remain stable. It is
possible therefore to construct a look-up table, which constitutes
a key. Keys can also be built from comparisons of patterns of
family trees as well as by putting generally available information
into context.
The information in the Health Sector
Database is personal information. Therefore reason and justice
require that a priori consent be obtained from patients for the
transfer of their health data to the database as Iceland's
international legal obligations stipulate. Anything less is
unreasonable and unjust.
Keywords: Personal
identification, Icelandic Health Sector Database, EU Data
Protection Directive, health records data, one-way coding, keys,
genealogy, information context, privacy.
This is a Refereed
article published on 4 September 2002.
Citation: Arnason E, 'Personal
Identifiability in the Icelandic Health Sector Database', Refereed
Article, The Journal of Information, Law and Technology
(JILT) 2002
(2)<http://elj.warwick.ac.uk/jilt/02-2/arnason.html>. New
citation as at 1/1/04:
<http://www2.warwick.ac.uk/fac/soc/law/elj/jilt/2002_2/arnason/>.
1. Introduction
The debate continues on the merits
of the Icelandic Act on a Health Sector Database (HSD) and the
plans for its construction. Lawsuits have already been filed that
challenge both the constitutionality of the Act and whether
Iceland's commitments under international law are violated[ 1 ]. The exclusive license to establish and operate
the Icelandic Health Sector Database has been given to a private
American company, deCODE genetics. The Health Sector Database will
contain the medical records of the whole population of Iceland but
it also will be a structure through which a genealogical database
and a DNA database can be linked to the medical records. The
intention of the company is to exploit the information for
commercial profit by selling access to the database, which can be
used as a tool for research in epidemiology and genetics
as well as in studies of how to maintain health
systems.
A fundamental assumption of the
Health Sector Database Act is that the data on individuals are not
personally identifiable because the personal identifiers will be
coded with one-way methods. The Act presumes that one-way coding
effectively renders the data anonymous. If, however, one-way coding
is found not to be qualitatively different from coding with a key,
the data would be personally identifiable. In that case Iceland is
bound by its international commitments to obtain a priori consent
of the patients for the use of their data for a purpose other than
that for which they were originally gathered. In most circumstances
the physicians receive or obtain the information from the patients
under an ethical and legal duty of confidentiality that can only be
lifted with the consent of the patients or by a legal obligation
(such as specific legislation or a court order).
In this paper I trace the history of
the concepts of personal identifiability and keys during the debate
on the Icelandic Health Sector Database. Originally the definitions
used were derived from the Recommendation No R(97)5 of the
Committee of Ministers to Member States of the Council of Europe on
the Protection of Medical Data[ 2 ]. In response
to criticism they were replaced with definitions from the Directive
95/46/EC of the European Parliament and of the Council of 24
October 1995[ 3 ] on the Protection of Individuals
with Regard to the Processing of Personal Data and on the Free
Movement of Such Data. The question of the existence of keys is
fundamental and I ask what constitutes a key and describe ways of
making keys to open up the database with a look-up table, by
comparisons with genealogies, and from the context of general
information. I will argue that the sort of 'one-way encryption'
called for by the Health Sector Database Act will not render the
data 'anonymous'. I will also argue that the de facto existence of
a coding key to link new information on individuals to their
previous information in the Health Sector Database and to link
information in the DNA and genealogical databases to information in
the Health Sector Database, as well as the fact that the data
themselves allow the identification of individuals means that the
data are identifiable. They, therefore, come under the provisions
of the Directive 95/46/EC that is now legally binding on Iceland.
In contrast, the Recommendation is merely meant as an interpretive
aid to the Directive and it has no legal force. Therefore, the
consent of individuals in the Icelandic population should be sought
before personal medical data are entered into the Health Sector
Database.
2. Overview of Database Plan
Based on the Act on a Health Sector
Database, no. 139/1998[ 4 ] the government of
Iceland has given a license to a private for-profit corporation,
deCODE genetics of Delaware USA, to create and operate a database
of the medical records on the entire population of Iceland. The
information in the medical records database will be
cross-referenced with a genealogical database of the entire nation
and with a genetic database covering a large number of individuals,
both of which are in the possession of the licensee, to make one
interactive database ( Figure 1 ) - referred
to as the GGPR database of Genotypes, Genealogy, health and disease
Phenotypes, and Resource use in the Icelandic health care system.
DeCODE is permitted to operate the database for commercial profit.
The database will allow subscribers to perform in silico disease
gene mapping, to follow the pathogenesis of disease, its
complications and the response to treatment, and to obtain
information for the management of health, disease and health care
resources. The prospective customers are pharmaceutical and
biotechnology companies, HMOs, insurance companies, public health
organisations and deCODE itself. The Act stipulates in addition that the Ministry of Health
and the Director General of Public Health shall have free access to
statistical data from the database for compiling health reports,
planning, policy-making and other projects that they
specify.
The database is not in operation
yet. DeCODE has been in operation since 1996 and in the past
Icelandic health authorities have been making health reports and
health policy without access to such a database. The database,
therefore, is not crucial for either deCODE business or public
policy. The assumption, however, is that the database would become
a profitable venture for deCODE and that access to it would
facilitate public health policy (see also the accounts in [ 5 , 6 ]).
On ownership the license states that
all information transferred to the database is and shall remain the
'common property of the Icelandic nation' under the protection and
rule of the Minister of Health and Social Security. The license,
issued for 12 years at a time, authorises deCODE to create and
operate the database for financial profit. At the termination of
the license or if the license is revoked deCODE shall hand over the
database and all software and software rights for its operation to
the ministry of health. If, on the termination of the license, the
ministry operates the database for profit it must pay deCODE a fee
for software and intellectual property rights. However, if the
ministry operates the database not for profit solely in the
interest of the public health system deCODE will not receive
payments for software or intellectual property rights. Thus, deCODE
seemingly retains some commercial rights. In a contract made in
connection with the issuing of the license deCODE agreed to
indemnify the state of Iceland against any and all claims that
could be made if the Act and regulations are found not to be in
compliance with rules of the European Economic Area or other
international rules and agreements that Iceland is or will become
party to. DeCODE also agrees to pay all fines and financial costs
levied against the state of Iceland due to such non-compliance. The
private corporate interests of deCODE genetics and the public
interests of Iceland and Icelanders thus are mingled, and it is
difficult to discern where public interests end and private
corporate interests start.
DeCODE shall pay a fee for the
issuing of the license and the costs incurred by the various public
regulatory bodies monitoring the operation of the database. DeCODE
also pays an annual license fee to Iceland of 70 million Icelandic
kronur (IKR; approximately USD 820,000). If deCODE turns a profit
in operating the database Iceland gets a share in the profits up to
a maximum of 70 million IKR.
Large amounts of information from
medical records on each individual will be transferred to the
database ( Table 1 ). They exist in two forms: as hand- or
typewritten information that will be digitised, and as computerised
information, either already existing or held in a planned
countrywide fully standardised electronic medical records system.
More detailed information will be available for transfer from the
latter system with a long list of items to be transferred ( Table 1 ).
The information from medical records
will be transferred to the database and used under presumed
consent. Persons will not be asked to give their prior affirmative
consent to participation. Instead, people are given the opportunity
to opt-out of the database by registering their intention with the
Director General of Public Health. Those who opt-out before
information actually starts flowing to the database will have all
their medical records data excluded. Those who opt-out after the
database is already in operation can only exclude medical records
data that is generated subsequent to their opt-out.
The procedure for transfer of
information to the database is as follows ( 4 , 7 ). A health institution will one-way code a
person's identity number or social security number[ i ] into an encrypted personal number (PN). Workers
at a health institute gather the medical records information on an
individual that is permitted for transfer ( Table
1 ) into a package. To protect the data during transfer they
encrypt the package using a public/private key issued by deCODE,
the licensee. This package is then attached to the encrypted
personal number instead of to the identity number. This is then
sent to the Identity Encryption Service, a department of the Data
Protection Commission. The Director General of Public Health
maintains an opt-out database of those who have opted out of the
Health Sector Database, with either all their health records data
or specified parts of their health records. The Director General,
using the same function, will one-way encrypt the identity number of
individuals registered in the opt-out database and transmit that to
the IES. The IES uses the encrypted opt-out list to filter out data
on those who have opted out ( Figure 1 ). The
IES may re-encrypt the already one-way encrypted identity number
(SSN) and transmit this along with the respective data package to
the database where it forms the first sub-database of medical
records with encrypted personal identifiers. The encrypted SSN
becomes the final personal number (PN) that is associated with the
health data of an individual in the database. It is thus clear that
there will be many holders of the one-way encryption function,
including deCODE.
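The transfer procedure described above can be sketched in code. This is only an illustrative sketch under stated assumptions: the Act does not specify the coding function, so the keyed hash (HMAC-SHA-256), the two secrets and all function names below are hypothetical, and the encryption of the data package for transport is left out. The identity numbers are the invented ones used in Section 4.

```python
import hashlib
import hmac

# Hypothetical secrets. The real coding function and its parameters are not
# specified in the Act; HMAC-SHA-256 is an assumption made for illustration.
SHARED_SECRET = b"illustrative-shared-secret"  # held by every coding party
IES_SECRET = b"illustrative-ies-secret"        # held by the IES only


def one_way_code(identity_number: str, secret: bytes = SHARED_SECRET) -> str:
    """Deterministically code an identity number: same input, same output."""
    digest = hmac.new(secret, identity_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def ies_filter_and_recode(packages, coded_opt_outs):
    """Drop packages of people who opted out, then re-code the rest into PNs."""
    rows = []
    for coded_id, payload in packages:
        if coded_id in coded_opt_outs:
            continue  # this person is on the coded opt-out list
        personal_number = one_way_code(coded_id, IES_SECRET)
        rows.append((personal_number, payload))
    return rows


# Health institution: code the identity numbers and attach the data packages.
packages = [
    (one_way_code("010476-4878"), "record A"),
    (one_way_code("020587-5988"), "record B"),
]

# Director General: code the opt-out list with the same function.
coded_opt_outs = {one_way_code("020587-5988")}

# IES: filter, re-code and pass on to the database.
database_rows = ies_filter_and_recode(packages, coded_opt_outs)
```

The filter works only because the health institution and the Director General produce identical codes for the same identity number. Every party that can apply the function can therefore recompute the code for any identity number it knows.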
The licensee, deCODE, has built a
genealogical database of the entire Icelandic population and some
of the ancestors of most of the families. This database has about
six to seven hundred thousand individuals with names and identity
numbers. DeCODE will encrypt the identity numbers using the same
one-way function as above and transmit it via the IES to form a
second sub-database of genealogies with encrypted identity numbers
forming the same personal numbers (PN) as for the health data ( Figure 1 ). DeCODE and various physicians
collaborating on individual research projects on the genetics of
various diseases have collectively amassed a large amount of
information into a genotypic database. This database also may
contain some information from medical records that pertain to the
diseases involved as well as molecular genetic information. The
collaborating physicians know the identity (names and kennitala) of
the participants. The data from this database will be transferred
via the same mechanisms to form a third sub-database of genotypic
data ( Figure 1 ) associated with the
respective PN.
The three sub-databases of medical
records, genealogies and genotypes are cross-link-able by the
personal numbers (PN, which are the one-way encrypted and
re-encrypted identity numbers or kennitala of Icelanders) and
together form the GGPR database ( Figure 1 ).
They contain micro-data that are database records on individual
subjects associated with the personal number. End users, deCODE
employees, query the database via a query layer to produce
intermediate results. Customers query the intermediate results and
final results are delivered to them as macro-data. Macro-data refer
to statistical results calculated from micro-data, such as the mean
age of a group of individuals.
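The distinction between micro-data and macro-data can be illustrated with a small sketch. The records, field names and the function name are invented for illustration and are not taken from the database design.

```python
from statistics import mean

# Micro-data: one record per individual, keyed by personal number (PN).
# All values here are invented for illustration.
micro_data = [
    {"pn": "pn-001", "age": 34, "diagnosis": "asthma"},
    {"pn": "pn-002", "age": 46, "diagnosis": "asthma"},
    {"pn": "pn-003", "age": 61, "diagnosis": "diabetes"},
]


def macro_mean_age(rows, diagnosis):
    """Macro-data: a statistic computed over a group of micro-data records."""
    ages = [row["age"] for row in rows if row["diagnosis"] == diagnosis]
    return mean(ages) if ages else None


asthma_mean_age = macro_mean_age(micro_data, "asthma")
```

Here `asthma_mean_age` is a macro-data result: the customer sees the group statistic but not the individual records from which it was calculated.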
3. History of the Concept of Personal Identifiability During the HSD Debate
3.1 Definitions of the HSD Act
The Act on a Health Sector Database,
no. 139/1998[ 4 ] has these
definitions:
2. Personal data: all data on a
personally identified or personally identifiable individual. An
individual shall be counted as personally identifiable if he or she
can be identified, directly or indirectly, especially by reference
to an identity number, or one or more factors specific to his
physical, physiological, mental, economic, cultural or social
identity.
3. Non-personally identifiable data:
data on a person who is not personally identifiable as defined in
clause 2.
4. Coding: the transformation of
words or numbers into an incomprehensible series of
symbols.
5. One-way coding: the
transformation of words or series of digits into an
incomprehensible series of symbols which cannot be traced by means
of a decoding key.
According to these definitions
personal identifiability is very broad in scope and, conversely,
non-identifiability is very limited, being defined as the
complement of identifiability. One-way coding is defined as a
method that is supposed to eliminate the possibility of identifying
a person using a key. The definitions make clear that mere coding
is not enough. For even though it produces an incomprehensible
series of symbols it still could be 'comprehended' by the use of a
key. The essential issue here is that the coding is one-way.
Unidirectional one-way coding is supposed to be some kind of
technical method that eliminates the key.
These definitions are now law in
Iceland. Starting in 1997 from a draft Bill written by the current
license holder, five major steps can be identified in the debate on
the Health Sector Database ( Table 2 ). The
changes made to the definitions of the various terms are contrasted
in Table 2 . I now discuss these steps and
changes in definitions and concepts during the debate on the Health
Sector Database.
3.2 First Draft of a Bill in July 1997
Dr. Kári Stefánsson, CEO of deCODE
genetics, which is the license holder for creating and operating the
Health Sector Database, had the Lawyers at Skólavörðustígur 12
prepare a First Draft of a Bill on Health Sector Databases dated July
14, 1997. He presented the draft Bill to the Ministry of Health as
a fax on September 3, 1997[ 8 ]. The aim of the
authors of the draft was that the Bill be passed through Parliament
during the fall of 1997 and that the Act take effect on January 1,
1998. Article 2 of the draft had these definitions:
'3. Personal information:
Information on private matters, health matters, finances or other
matters of a named or nameable individual, which it is reasonable
and natural to treat as confidential. An individual shall not be
counted as nameable if a considerable amount of time and manpower
would be required in order to name him/her. When an individual is
not nameable the information about him/her shall not be considered
to be personal information' (EÁ translated from the
Icelandic).
3.3 Bill and Draft of a Bill in the Spring and Summer of 1998
When the Bill on Health Sector
Databases (notice the plural) was presented to the 122nd session of
Parliament in the Spring of 1998[ 9 ] it contained
the definition of the draft Bill. However, one addition was made.
It was stated that even if there exists a key to the data, an
individual shall not be considered personally identifiable if the
entity in possession of the data does not have access to the
key:
'2. Personal data: data regarding
personal matters, including health information, finance or other
items regarding a personally identified or identifiable individual,
which it is reasonable and natural to treat as confidential. A
person shall not be counted as personally identifiable if a
considerable amount of time and manpower would be required in order
to identify him/her. The same applies if the identification could
only take place through use of a decoding key, not available to the
person having the information. When an individual is not personally
identifiable information about him/her shall not be considered
personal information under the meaning of this law' (EÁ translated
from the Icelandic).
The argument that an individual
shall not be considered personally identifiable ' if a considerable
amount of time and manpower would be required in order to identify
him/her' comes from the Recommendation No R(97)5 of the Committee
of Ministers to Member States of the Council of Europe on the
Protection of Medical Data [2]. In explanatory notes the authors of
the Bill further state that provisions in the Bill regarding the
use of a key are based on ' procedural rules that the Data
Protection Commission, which operates by the Act on Processing and
Handling of Personal Information No. 121/1989, has recently made
for scientific research in the health sector. The rules specify
that data shall be coded with a key before they are handed over to
the researcher and that the key will then be kept by special
guardians appointed by the Data Protection Commission.' These
statements imply that the definitions in the Bill are in accordance
with the Act on Processing and Handling of Personal Information No.
121/1989[ 10 ]. They also imply that the
definitions conform to the procedures on the use of a key already
established by the Data Protection Commission.
The Bill on Health Sector Databases
met extensive opposition in Parliament and by the Icelandic public.
It was withdrawn in the late Spring of 1998, rewritten by a working
group in the Ministry of Health and a new draft version[ 11 ] sent out for comments to various bodies
including the National Bio-ethics Committee and the Data Protection
Commission in July 1998. This was the first time that these
regulatory bodies were formally asked to review the Bill. The draft
contained the same definition that ' a considerable amount of time
and manpower would be required in order to identify' a person from
the Recommendation No R(97)5 of the Committee of Ministers of the
Council of Europe[ 2 ].
3.4 Data Protection Commission's Opinions on Bill
In its letter to the Ministry of
Health dated September 4, 1998[ 12 ] the
Icelandic Data Protection Commission overturned the draft Bill's
definition of personal identifiability and the reliance on the
Recommendation No R(97)5 of the Committee of Ministers of the
Council of Europe.
The Data Protection Commission's
opinion[ 12 ] was that the HSD Act should be in
accordance with EU Directive 95/46/EC on the Protection of
Individuals with Regard to the Processing of Personal Data and on
the Free Movement of Such Data, which was to be ratified by Iceland
as part of its obligations as a member of the European Free Trade
Association (EFTA) and the agreement on the European Economic Area
(EEA) between EFTA and the EU. This means, in the Data Protection
Commission's opinion, that the EU Directive will have to be adopted
as law in Iceland and that both general and specialised
legislation, such as the HSD Act, must be consistent with the
Directive.
Furthermore,
'the Data Protection Commission
maintains that in the definition of the concept of personal data in
the database Bill, the definition of the above mentioned EU
Directive 95/46/EC appears to be totally disregarded; this states
in clause (a) Art. 2 that data on individuals are personal data, if
a decoding key exists for the coded data. The directive makes no
distinction as to whether the identification would require
considerable time and manpower'[ 12 ].
In fact the concept of considerable
time and manpower is not found at all in the EU Directive
95/46/EC[ 3 ] but instead is derived from the
Recommendation No R(97)5 of the Committee of Ministers of the Council of
Europe, as already mentioned.
The Data Protection Commission also
questioned the assertion that the Bill was in accordance with
Iceland's Data Protection Act[ ii ]. The
Commission also recommended that the 'definition of personal data
in the Bill not be ambiguous'[ 12 ] and in
particular that it follow the EU Directive that was to become
binding on Iceland[ iii ].
These crystal clear statements by the
Data Protection Commission overturned the definitions in the Bill
as to what would constitute personal data. The foremost experts of
the State of Iceland on personal identifiability and data
protection had spoken loud and clear. The response of the Bill's
authors in the working group of the Ministry of Health was to
eliminate all terms based on the Recommendation No. R(97)5 and the
terms about keys that would be in the possession of someone other
than the researchers. Instead they adopted a direct translation
into the Icelandic of the definition from the Directive (95/46/EC)
that the Data Protection Commission said would ' become binding
under international law on Iceland's behalf.' The Directive states
(Art. 2):
'For the purposes of this
Directive
(a) personal data shall mean any
information relating to an identified or identifiable natural
person (data subject); an identifiable person is one who can be
identified, directly or indirectly, in particular by reference to
an identification number or to one or more factors specific to his
physical, physiological, mental, economic, cultural or social
identity'[ 3 ];
When the Bill on a Health Sector
Database (notice the singular) was submitted to the 123rd session
of Parliament in October 1998 [ 13 ] the
definition had been changed and now was based on the definition
from the Directive.
'2. Personal data: all data on a
personally identified or personally identifiable individual. An
individual shall be counted as personally identifiable if he can be
identified, directly or indirectly, especially by reference to an
identity number, or one or more factors specific to his physical,
physiological, mental, economic, cultural or social identity'[ 13 ].
However, the translation of the
definition of the Directive to the Icelandic as part of the
definition of the Bill was imprecise. The Directive's
identification number was translated into kennitala, which is the
term used for the national identity number of everyone in Iceland.
In the translation of the Icelandic Bill back to the English in the
official version of the Bill[ 13 ] it became an
identity number. Under the Directive the term identification number
is a broad concept encompassing any kind of an identification or
personal number. The Directive is not limited to a specific
identity number such as the kennitala of Iceland. Thus, when the
Bill speaks of an identity number it is narrower than the
identification number of the Directive.
The Data Protection Commission
reiterated its position, this time with its comments to the
permanent Health and Insurance Committee of the Parliament dated
October 26, 1998[ 14 ]. The Commission tried to
explain the difference between disconnecting personal identifiers
from the health data (de-identified data) and the method of coding
the personal identifiers with some encryption function. Coding
produces a new personal number (PN) but the health data are still
link-able to a particular person and thus they remain personal
data. De-identified data (disconnected from personal identifiers)
are regarded as anonymous unless the data were of such a nature or
quantity that the individual can be identified without access to a
personal identifier by reference to certain factors specific to the
data subject's physical, physiological, mental, economic, cultural
or social identity. If that was possible the data would not be
regarded anonymous[ iv ]. In conclusion the Data
Protection Commission said that 'the Bill's assertion that the
database will contain non-personally identifiable health data, does
not hold'. The Commission recommended that the definition be
dropped from the Bill.
The Health and Insurance Committee
did not heed the recommendation to drop the word ' non-personally
identifiable.' The definition thus based on the Directive became
law with the passage of the Act in December 1998. Because, as the
Data Protection Commission pointed out, it is not workable to have
a key for a database that is supposed to be anonymous, no matter
who holds the key, one-way coding was adopted. Following that it
was claimed that a key does not exist because one-way coding of
names or identity numbers cannot be traced back directly.
One-way coding of personal identifiers is thus the essential
feature that is meant to ensure that the Bill and Act abide by the
Directive [ 15 , 16 ].
3.5 Admissions That Keys Exist
Both Dr. Kári Stefánsson and
deCODE's department of database have recently admitted that keys
exist. In an interview in the New Scientist July 15, 2000
Stefánsson states regarding the interconnection of health
information and genetic information:
Stefánsson: 'Once we have identified
a family with one of these diseases, what we will do is to go to
those people and ask them to give us blood so that we can isolate
DNA. ...When we do this, we will ask for their permission to
cross-reference their names with the help of the health-care
database. But in order to do this, we will have to get their
explicit, signed consent'.
New Scientist: 'Does this mean that
you can identify individuals from the database?'
Stefánsson: 'No. The information in
the database will be encrypted and the keys will be kept by the
Data Protection Commission of Iceland'.
The fact that there are keys means
that under the Directive and in the opinion of the Data Protection
Commission, the data are personally identifiable and not anonymous.
According to the Commission it does not make ' any difference
whether the person having the information has access to the
decoding key or not.' This had been accepted by the Ministry of
Health and the Parliament when changes were made to the Bill[ 13 ] both in response to the Data Protection
Commission's opinion to the Ministry[ 12 ] and
its opinion to the permanent Health and Social Security Committee
of Parliament (see above).
In an article in the Icelandic
newspaper Morgunblaðið on February 27, 2001[ 17 ] the
deCODE department of database stated that information will be
rendered non-personally identifiable using a special encryption key
that fulfils very strict technical security measures[ v ]. DeCODE's database department admitted that keys
exist and that it is possible to personally identify individuals by
applying the keys. To say that the keys fulfil 'strict technical
security measures' presumably means that 'considerable time and
manpower' would be required in order to break the keys. Be that as
it may, it is irrelevant in this context. The Data Protection
Commission already pointed out that that arrangement is not
mentioned by the Directive and in response that language had been
removed from the Bill presented to Parliament in the Fall of
1998[ 13 ].
3.6 Genealogy and Genetics Databases
In the third and final round of
Parliamentary discussion on the HSD Bill a change was introduced
(Art.10) permitting the interconnection of medical records in the
HSD database with a database of genealogical information and with a
database of genetic information. Similarly, during the debate the
definitions of what constitutes genetic data were also changed ( Table 2 ). The Bill introduced to Parliament in
the Spring of 1998[ 9 ] defined genetic
information as information on individuals as well as information on
groups of related individuals and information both on health and
disease. This definition was removed in the Draft Bill circulated
in the Summer of 1998[ 11 ] as well as in the
Bill introduced to Parliament in the Fall of 1998[ 13 ]. During this time, which was the major period
for debate on the Health Sector Database both in society at large
and in Parliament, the definition of genetic data referred only to
information about individuals ( Table 2 ). In
early December 1998, late in the Parliamentary debate, the
definition from the Bill of Spring 1998[ 9 ] was
reintroduced verbatim in a motion to change the Bill[ 13 ] and this definition became law.
These changes and the resulting
definition mean that genetic information covers a wide field
including information on inheritance of traits in groups of related
individuals. The definition also means that it is easier to
recognise individuals based on genetic information than if the more
limited definition had been kept because the information in the
database refers to inheritable traits of individuals as well as of
groups of related individuals.
3.7 Opt-out Database
Another change made to the Bill in
the fall of 1998 was an introduction of Art. 8 on the Rights of
Patients. This specified that a patient could at any time request
that his/her information not be entered into the Health Sector
Database by filling out a form and filing it with the Director
General of Public Health. The Director General would enter those
individuals on a coded registry or onto the opt-out
database.
The opt-out database must be kept up
to date and is required for the day to day transfer of data to the
Health Sector Database. The opt-out database will provide the means
for filtering out the medical information on those who have opted
out from the stream of data being transferred to the Health Sector
Database[ 7 ]. These individuals, however, will
not be filtered out from the genealogical database that exists at
the licensee and that will be transferred via the same transfer layer as
medical information to the Health Sector Database[ 7 ]. The opt-out database makes it more likely that
individuals can be identified under the Health Sector Database
scheme.
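As a rough illustration of the filtering step, the following Python sketch drops records belonging to people on a coded opt-out registry. It is not the mechanism specified in the Security Target[ 7 ]: the function name code_identifier, the use of MD5 as the coding function and the sample identity numbers are assumptions made only for this example.

```python
import hashlib

def code_identifier(identity_number: str) -> str:
    """Hypothetical stand-in for the one-way coding step (MD5 is assumed)."""
    return hashlib.md5(identity_number.encode("utf-8")).hexdigest()

def filter_opted_out(records, opt_out_registry):
    """Drop records whose coded identifier is on the coded opt-out registry."""
    return [r for r in records
            if code_identifier(r["id"]) not in opt_out_registry]

# Coded opt-out registry (hypothetical content).
opt_out_registry = {code_identifier("020587-5988")}

# Stream of records on their way to the database (hypothetical content).
records = [
    {"id": "010476-4878", "diagnosis": "appendicitis"},
    {"id": "020587-5988", "diagnosis": "diabetes"},
]

transferred = filter_opted_out(records, opt_out_registry)
```

In this sketch the registry can only do its job if it is coded with the same stable method as the data stream, so every transfer passes through a step that recognises who has opted out.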
4. Building a Key with a Look-up Table
The claim that one-way coding means
that it is impossible to trace back with a key only holds in a
narrow technical sense. If a personal identification such as the
name John Doe (or his identity number 010476-4878) is sent through
coding, using for example a one-way hash function[ 18 , 19 ], the outcome would be
'6cad0ac09e9c602a6477db4247bdeed1', a new invented and unique
personal number (PN). Similarly if the name Jane Doe (or
020587-5988) was one-way coded using the same method the outcome
would be the new invented personal number
'73c01bf88feb18695bd65e611ef1cf26'. If we only had access to the
invented numbers '6cad0ac09e9c602a6477db4247bdeed1' or
'73c01bf88feb18695bd65e611ef1cf26', it would be very difficult to
find out that one of them represented the name John Doe and the
other Jane Doe. If this was all, the individuals could be
considered to be non-personally identifiable, because it would not
be reasonably possible to go from the one-way encrypted personal
number directly back to the name. The individuals, however, would
only be non-personally identifiable in the narrow sense of going
directly back from the code to the name.
During the operation of the HSD,
however, there will be a key in operation. The HSD
is a long-term, longitudinal collection and
interconnection of past, current and future data on each
individual[ 20 , 21 ]. The
database will be updated regularly and when new data are generated
(for example during a person's visit to a physician) they must find
their way to the right place in the database and be connected to
other data on that particular individual. (The same applies to
updating of the genealogy and genetic databases. For the genealogy
database for example, children are born and linkages among families
are formed and broken with marriage and divorce). Therefore, there
exists 'knowledge' of who the individual is and where he/she can be
found in the Health Sector Database or for that matter in any of
the three databases that will be interconnected ( Figure 1 ). That knowledge resides in the method
used for coding. In order to update the database the method must
remain the same, stable in time. The method, therefore, is a key
because with access to the method a look-up table connecting the
names or identity numbers with the encrypted personal numbers or
vice versa can be made effortlessly.
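The need for a stable method can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the coding is an MD5 hash of the identity number; the parameter method_version is a hypothetical device for modelling a change of method and does not come from any of the database documents.

```python
import hashlib
from collections import defaultdict

def code_identifier(identity_number: str, method_version: str = "v1") -> str:
    """Hypothetical one-way coding; method_version models a change of method."""
    data = (method_version + ":" + identity_number).encode("utf-8")
    return hashlib.md5(data).hexdigest()

# The database: personal number (PN) -> list of records for that person.
database = defaultdict(list)

def add_record(identity_number: str, record: str, method_version: str = "v1") -> str:
    """Code the identity number and file the record under the resulting PN."""
    pn = code_identifier(identity_number, method_version)
    database[pn].append(record)
    return pn

# Two visits by the same person, years apart, coded with the same method.
pn_1998 = add_record("010476-4878", "visit 1998: appendicitis")
pn_2001 = add_record("010476-4878", "visit 2001: diabetes")

# The same person coded with a changed method gets a different PN.
pn_changed = code_identifier("010476-4878", method_version="v2")
```

Because the method is unchanged, both visits land under the same personal number. Had the method changed between visits, the second record could not have been connected to the first, which is why the method must be kept and why it acts as a key.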
4.1 Coding, a Transformation of Names
Coding is no more than a
transformation of a name or identity number to another form. With
one-way coding an individual gets a new and invented personal
identity instead of his/her identity number - a so-called personal
number or PN number[ 7 ]. Several documents on the
database refer to a hash function as a method for such a one-way
transformation [e.g. 7 , 19 ]. A
hash function is a transformation of an input m into an output string
of fixed length, the hash value h, written H(m) = h.
Cryptology basically requires a hash
function i) to accept an input of any length, ii) to give an output
of fixed length, iii) that it be easy to calculate H(x) for a given
input x, iv) that H(x) be one-way , and v) that H(x) be collision
free[ 18 ].
A hash function is one-way if the
function is hard to invert, which means that given some hash value
h it is very difficult to find some input x that will yield that
hash value, H(x) = h. If, given some input x, it is
computationally very difficult to find some other input y, not
the same as x, such that H(x) = H(y) (i.e. two different inputs
that yield the same hash value), then the hash function is said to
be collision free[ 18 ]. Sometimes hash functions
may allow collisions that have to be dealt with in a special
manner.
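Some of these properties can be observed directly. The sketch below uses MD5 from Python's standard hashlib module as an assumed example of a hash function H. It shows only the fixed output length and the repeatability of the function. It does not demonstrate one-wayness or collision freedom, which cannot be shown by running a few inputs, and practical collisions for MD5 have in fact been demonstrated since this kind of function was first proposed for such uses.

```python
import hashlib

def H(m: str) -> str:
    """Example hash function: MD5 of the input, as 32 hexadecimal characters."""
    return hashlib.md5(m.encode("utf-8")).hexdigest()

short_input = "John Doe"
long_input = "John Doe " * 1000   # an input of very different length

h_short = H(short_input)
h_long = H(long_input)

# Property ii: the output length is fixed (128 bits = 32 hex characters).
same_length = len(h_short) == len(h_long) == 32

# The function is deterministic: the same input always gives the same output.
repeatable = H(short_input) == h_short
```

The repeatability is what matters for the argument here: the same name or identity number always yields the same hash value.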
When the Act on a Health Sector
Database refers to one-way coding as the transformation of words or
series of digits into an incomprehensible series of symbols which
cannot be traced by means of a decoding key, it seems to be based
on a protocol such as this hash function. A repeated one-way coding
would take the output of the first hash function as an input for
the second and so on. One can take MD5 (Message Digest 5) as an
example of a hash function for such one-way coding. MD5 will take a
message of any length and 'digest' it to produce a 128-bit
'fingerprint'. Functions such as MD5 are generally used for
electronic signatures of documents. I shall use it here to make an
example look-up table, a key made with one-way coding.
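A repeated one-way coding of this kind can be sketched as follows. The number of rounds and the choice to feed the hexadecimal digest (rather than the raw bytes) into the next round are assumptions made for the example; the database documents do not specify them.

```python
import hashlib

def md5_hex(text: str) -> str:
    """One round of one-way coding: the MD5 digest as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def repeated_coding(identifier: str, rounds: int = 3) -> str:
    """Apply the hash repeatedly, each output being the next input."""
    value = identifier
    for _ in range(rounds):
        value = md5_hex(value)
    return value

pn_john = repeated_coding("010476-4878")
pn_jane = repeated_coding("020587-5988")
```

However many rounds are applied, the composed function is still a fixed function of the identity number, so the same input always gives the same personal number.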
4.2 Look-up Table
Even though one cannot directly
break the key (e.g. through factoring;[ 18 ]) the
function used for the database must remain stable in order to
update the database. Therefore, anyone with access to the function
(or functions) can easily make a table that contains side by side
the input and the output of the function[ 20 ].
A look-up table of names or
identifying numbers and coded (encrypted) names or personal numbers
is a table ( Table 3 ) that contains side by
side the names and the coded names. One can look up in the table to
find the encrypted name corresponding to a real name or to find the
real name corresponding to an encrypted name. Such a look-up table
is a key [ 22 , 20 , 23 ]. This was known during the debate on the
Health Sector Database because the method is described in Appendix
VI to the Bill[ 19 ]: feed the Icelandic National
Registry of names or identity numbers through the function and make
a dictionary or a table of the input and output. A table can also
be made for a more limited group. If a decision was made to go back
and open the database, for example if the Parliament passed a law
to that effect or if a court of law ordered the opening up of the
database, it would only take a moment of computer time to make a
look-up table and open the database with a key. One would only have
to bring together the holders of the function or functions and feed
the National Registry through. Similarly, anyone who knows that a
particular name or identity number is being transferred from a
health institution to the database and can observe its encrypted
personal number appear at the database can make a similar
inference[ 21 ].
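The construction of such a look-up table takes only a few lines. The following is a minimal sketch, assuming MD5 as the coding function and a two-entry stand-in for the National Registry; the names national_registry, one_way_code and lookup_table are hypothetical.

```python
import hashlib

def one_way_code(identifier: str) -> str:
    """Assumed coding function (MD5); the real function is not public."""
    return hashlib.md5(identifier.encode("utf-8")).hexdigest()

# Hypothetical extract of a registry: identity number -> name.
national_registry = {
    "010476-4878": "John Doe",
    "020587-5988": "Jane Doe",
}

# Feed the registry through the function and keep input and output side by side.
lookup_table = {
    one_way_code(identity_number): (identity_number, name)
    for identity_number, name in national_registry.items()
}

# A coded record as it might appear in the database.
coded_record = {"pn": one_way_code("020587-5988"), "diagnosis": "diabetes"}

# Reading the table backwards recovers the person behind the personal number.
identity_number, name = lookup_table[coded_record["pn"]]
```

The table can equally be built for a more limited group by feeding only that group's identity numbers through the function.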
4.3 Personal Identifiability During Preparation for Transfer
In order to transfer data to the
Health Sector Database health records must be opened, read and
digitised. At this stage the data are fully personally
identifiable. This is true for all current data that are destined
to be included in the database. It is also true for the data of the
more than 20,000 people who have already rejected participation by
sending an opt-out form to the Director General of Public Health
because the people who prepare the data for transfer are not
supposed to know who has opted out. Data on everyone will be read,
digitised and sent towards the database. This is also true for all
deceased people. Their records will be opened, read and digitised.
This examination of all health records is done for a purpose other
than that for which they were gathered. Contrary to the wishes
of those who have opted out, their data too will be examined for a
purpose other than that for which they were gathered, prepared for
transfer and sent off in the direction of the database. If the
Identity Encryption Service makes mistakes these data may end up in
the database even if there is a specific ban against their use. The
Data Protection Commission operates the Identity Encryption Service
and oversees its work, thus in effect overseeing itself. Thus
issues of privacy are raised for the preparation and transfer as
well as for data already stored on the database.
5. Building a Key from the Shapes of Genealogies
The Act on a Health Sector Database
permits the interconnection of the Health Sector Database of
medical records with a genealogical database. According to the
Security Target for an Icelandic Health Database made for the Data
Protection Commission by Admiral Management Services Limited[ 7 , 24 ] the genealogical database
of the licensee (deCODE genetics) will be one-way coded in the same
way as the Health Sector Database. The same also applies to a
database of genetic information that the licensee has made through
collaborative research on various diseases. The three databases
must use the same encrypted personal numbers (or be related in a
unique manner) in order for the interconnection of these three
databases to be possible.
The genealogical database, however,
also exists at the licensee using names and/or identity numbers as
the personal identifiers. The licensee has announced a gift to the
Icelandic nation in the form of open web access to its un-encrypted
genealogical database[ 25 ]. The genealogical
data also exists elsewhere in the society. Since the same database
and same genealogical information exists using both encrypted and
un-encrypted names anyone with access to both databases can build a
key by comparing and matching the shapes of the patterns of family
relationships in the two database versions[ vi ].
Theoretically there exists an
enormous number of possible family trees connecting individuals in
some group (the number is a power function of the number of
individuals). The real family tree of a particular group of
individuals, therefore, is likely to be unique and different in
shape from the family tree of another group of the same number of
individuals. The number of children and their gender and the
connections of one family to another through marriage and
childbirth form a pattern that can in most cases be used to
recognise families. There are about 2,500 families with six
children, and somewhat less than 20,000 families with two children
in Iceland; the number of other common family patterns lies in
between these numbers. It is easier to recognise a particular
six-children family than a particular two-children one. The former
are fewer and their potential theoretical patterns are more
numerous. However, the interconnection of families makes them
unique and thus recognisable. The families of John and Jane Doe
have a unique pattern, as do all other families in the country. They
are recognisable by the unique shape of the family tree whether the
individuals are referred to by name or by an encrypted personal
number.
Figure 2 shows
an example of two families and their interconnections. The first
pattern is from a genealogical database that identifies individuals
by name or by identity number. The second pattern is from the
genealogical database that identifies individuals by their
encrypted personal number. The method used for encryption is a very
safe triple encryption that is supposed to be very difficult to
break. Nevertheless personal identification is possible, and
relatively easy, because the family patterns are unique. The
observed patterns ( Figure 2 ) are the only
family patterns in the two databases that match. One can thus make
a key by reading directly from the figure.
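The matching of family patterns can also be automated. The sketch below is one simple way of doing it, offered as an illustration and not as a description of any actual system: each person is given a label-free 'shape signature' made of his or her sex and the signatures of his or her children, and persons whose signature is unique in both versions of the genealogy are paired. The data, the personal numbers and all function names are hypothetical. A fuller version would also use ancestors, marriages and birth order, which would separate the individuals that this simple signature leaves ambiguous.

```python
from collections import defaultdict

def shape_signature(person, db):
    """Sex of the person plus the sorted signatures of the person's children."""
    children = tuple(sorted(shape_signature(c, db) for c in db[person]["children"]))
    return (db[person]["sex"], children)

def unique_signatures(db):
    """Map each signature held by exactly one person to that person's identifier."""
    by_signature = defaultdict(list)
    for person in db:
        by_signature[shape_signature(person, db)].append(person)
    return {sig: ids[0] for sig, ids in by_signature.items() if len(ids) == 1}

def build_key(coded_db, named_db):
    """Pair coded and named individuals whose tree shape is unique in both."""
    coded = unique_signatures(coded_db)
    named = unique_signatures(named_db)
    return {coded[sig]: named[sig] for sig in coded if sig in named}

# Genealogy identified by name (hypothetical).
named_db = {
    "Anna":   {"sex": "F", "children": ["Bjorn", "Dora", "Eva"]},
    "Bjorn":  {"sex": "M", "children": ["Finnur"]},
    "Dora":   {"sex": "F", "children": []},
    "Eva":    {"sex": "F", "children": []},
    "Finnur": {"sex": "M", "children": []},
    "Gunnar": {"sex": "M", "children": ["Hanna", "Ingi"]},
    "Hanna":  {"sex": "F", "children": []},
    "Ingi":   {"sex": "M", "children": []},
}

# The same genealogy identified by encrypted personal numbers (hypothetical).
coded_db = {
    "pn7": {"sex": "F", "children": ["pn2", "pn5", "pn3"]},
    "pn2": {"sex": "M", "children": ["pn8"]},
    "pn5": {"sex": "F", "children": []},
    "pn3": {"sex": "F", "children": []},
    "pn8": {"sex": "M", "children": []},
    "pn1": {"sex": "M", "children": ["pn4", "pn6"]},
    "pn4": {"sex": "F", "children": []},
    "pn6": {"sex": "M", "children": []},
}

key = build_key(coded_db, named_db)
```

No knowledge of the encryption method is used at any point; the key is read off from the shapes alone.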
6. Building a Key from the Context
In the familiar radio or TV game
'Name that person', someone appears in disguise and changes his/her
voice in replying. The participants can ask: 'Are you a man (or a
woman?'); 'Do you play the piano or do you play football?', and so
on. The person in disguise replies truthfully yes or no. Finally
the participants figure out from the context of the questions and
answers who that person is and name him or her. This is an example
of building a key from the context.
Even if one did not build a key
using a look-up table or from comparing genealogies it would
nevertheless be possible to build a key by putting general
information in similar context as is done in the game[ 26 ]. If personal identifiers, such as name,
identity number or cell-phone number, have been irreversibly
stripped and replaced with a one-time-only (disposable) encrypted
personal number one can speak of dis-connected[ 15 , 16 , 27 ]
or de-identified data[ 26 ]. General information,
demographic or health data can be attached to such an encrypted
personal number. As the number of information bits thus attached to
the encrypted personal number are increased the circle is narrowed
until the combination of information bits becomes unique. Such a
combination can be used to point to the individual as if it was a
fingerprint. This amounts to making a key even if the personal
identifiers have been stripped from the data.
This is called re-identifying[ 26 ] the individual based on information that is
generally available. This is much easier in a small nation such as
Iceland than among a more populous nation. Technology also has
changed everything in this respect. With internet access in the
current age of information there is more and more general
information on a person available to almost anyone[ 26 ]. Such general information can be used to form
a combination that uniquely identifies an individual. With that,
one can put into context other information that accompanies the
general information and thus pinpoint to whom sensitive personal
information belongs.
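How quickly the circle narrows can be seen by counting, for a growing set of attributes, how many records carry a combination that no other record shares. The following sketch uses a small invented data set; the records and the function name unique_count are hypothetical.

```python
from collections import Counter

# Invented records with the personal identifiers already stripped.
people = [
    {"sex": "M", "birth": "1985-06-01", "town": "Akureyri"},
    {"sex": "M", "birth": "1985-06-01", "town": "Selfoss"},
    {"sex": "F", "birth": "1985-06-01", "town": "Akureyri"},
    {"sex": "M", "birth": "1985-06-02", "town": "Akureyri"},
    {"sex": "F", "birth": "1985-06-02", "town": "Selfoss"},
]

def unique_count(records, attributes):
    """Number of records whose combination of the given attributes is unique."""
    combos = Counter(tuple(r[a] for a in attributes) for r in records)
    return sum(1 for r in records
               if combos[tuple(r[a] for a in attributes)] == 1)

steps = [["birth"], ["birth", "sex"], ["birth", "sex", "town"]]
unique_per_step = [unique_count(people, attrs) for attrs in steps]
# unique_per_step is [0, 3, 5]: each added attribute singles out more people.
```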
As an example take the identity
numbers of individuals that have been coded either with a
one-time-only encryption function or a one-way hash function as
described above. Attached to this personal number is general
information such as gender, birth date, year of birth, height, town
of residence as well as health information of varying sensitivity.
Examples could be operation for appendicitis, cancer of the colon
or cancer of the breast, or diabetes ( Table
4 ). More sensitive information, such as on venereal or mental
disease, also might be included.
The yearly average number of births
in Iceland is about 4,200, or 11-12 births per day. Few days have
more than 20 births. Having information about birth date and year
thus narrows the circle down to about 20 people at most[ 22 ]. By including information on gender the number
is halved: on average about six girls and six boys are born per day,
and very seldom are more than ten boys or ten girls born on the same day. By
adding height, township or eye colour one can without doubt
recognise most if not all people. Therefore general information
comparable to that required for a passport application is
sufficient to recognise an individual[ 22 ]
without a name or identity number. From that one can identify which
individual has which disease if given information such as that in Table 4 . Individuals who have an even more
'sensitive' disease are recognisable in a similar manner. A male
born February 2, 1979 is one of (on average) six males born that
day. He is 176 cm tall and lives in Dalvík. That must be Helgi. He
has diabetes. One does not need a key, a family tree or personal
identifier for that.
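The inference in this example amounts to joining two lists on the general information they share. The sketch below is an illustration under invented data: the names, personal numbers and the function reidentify are hypothetical, and the records stand in for information of the kind shown in Table 4.

```python
QUASI_IDENTIFIERS = ("sex", "birth", "height_cm", "town")

# Generally available, passport-type information (invented).
public_info = [
    {"name": "Helgi", "sex": "M", "birth": "1979-02-02", "height_cm": 176, "town": "Dalvík"},
    {"name": "Jon",   "sex": "M", "birth": "1979-02-02", "height_cm": 183, "town": "Reykjavík"},
    {"name": "Sigga", "sex": "F", "birth": "1979-02-02", "height_cm": 168, "town": "Dalvík"},
]

# Coded health records without name or identity number (invented).
coded_health_records = [
    {"pn": "pn-0001", "sex": "M", "birth": "1979-02-02", "height_cm": 176,
     "town": "Dalvík", "condition": "diabetes"},
    {"pn": "pn-0002", "sex": "F", "birth": "1979-02-02", "height_cm": 168,
     "town": "Dalvík", "condition": "appendicitis"},
]

def reidentify(coded_records, public_records):
    """Attach a name to every coded record whose general information is unique."""
    index = {}
    for person in public_records:
        combo = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(combo, []).append(person["name"])
    found = {}
    for record in coded_records:
        combo = tuple(record[q] for q in QUASI_IDENTIFIERS)
        candidates = index.get(combo, [])
        if len(candidates) == 1:          # exactly one person fits
            found[candidates[0]] = record["condition"]
    return found

result = reidentify(coded_health_records, public_info)
```

Neither the coding function nor a genealogy is consulted; the personal number plays no part in the match.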
The various bits of information that
will be transferred to the database ( Table 1 )
are of a similar nature as in the above example. There are for
example many dates and times of visits and other bits of
information that are innocuous by themselves ( Table 1 ). They can be combined in a similar
manner to make a unique personal identifier without recourse to
genealogy or personal identity number.
7. Discussion
The concept of
personal identifiability in the Bill and the Act on a Health Sector
Database[ 4 ] was initially based on the premise
that individuals would not be regarded as personally identifiable if
considerable time and manpower were required for identification. The
criticisms levelled at the Bill, as well as the changes made to it
in response to criticisms, show that the initial plan was based on
false premises. The Data Protection Commission made this evident in
its opinion and during the debate on the Bill. In spite of this,
that claim is still being promoted by the main proponents of the
database. This basic premise of the Bill was partly reintroduced
and became law with the definition in the Act that one-way coding of
personal identifiers renders the individual's health data
non-personally identifiable because there is no key. The question
therefore arises of what a key is, what personally identifiable
means, and how keys can be built to open up the
database.
7.1 Personal Data
During the debate on the Bill on a
Health Sector Database it was claimed that the Bill might fulfil
the requirements of international legal instruments[ 15 ] if the technical premise of the Bill was
correct that one-way coding entails a final and complete unlinking
of data and personal identifiers. However, the authors[ 15 ] also acknowledged the possibility that coding
and one-way coding would not be considered qualitatively different
and that therefore one-way coding would not be considered to be a
final and complete unlinking of data and personal identifiers[ 16 ].
The premise of the Act does not hold
up under close scrutiny. One-way coding does not mean that a key
does not exist. One-way coding only means that it is difficult or
computationally intensive to trace back directly from the encrypted
personal number to the identity number or the name. By adopting
this definition the Bill reintroduced the concept of 'considerable
amount of time and manpower' that the Data Protection Commission
had already rejected as it is not part of the Directive. Contrary
to this, according to the Directive account should be taken of all
the means likely reasonably to be used either by the controller or
by any other person to identify the said person[ 3 ]. The data are therefore personal data both under
the Health Sector Database Act[ 4 ] and the Data
Protection Act[ 28 ], both of which are based on
Directive 95/46/EC[ 3 ]. Having ratified the
Directive, Iceland is now bound by it. If Iceland is going to fulfil
its international commitments it is necessary to get consent of the
individual before transfer of data to the Health Sector Database.
Multiple one-way coding makes no difference.
Some of the officials of the State
of Iceland who are supposed to enforce this Act claim that within
the meaning of this law the data are non-personally identifiable.
Some of their critics have called this the
flat-earth-theory-of-law: if a legal text asserts that the earth is
flat then it is flat in the meaning of that law even if it flies in
the face of physical reality. Because the database Act asserts that
one-way coding is the transformation of words or series of digits
into an incomprehensible series of symbols which cannot be traced
by means of a decoding key, these Icelandic officials have argued,
it therefore means that a key does not exist in the meaning of that
law.
Equality before the law is a basic
rule of law[ 29 ]. The definition in the first
versions of the Bill on Health Sector Database was based on a
concept from the Recommendation R(97)5 of the Council of Europe
Committee of Ministers stating that not being personally
identifiable refers to methods that would require 'considerable
time and manpower' to break. In my view this cannot be a foundation
for legislation. If this concept is used as a basis for law it
means that those who have access to considerable time and manpower
are above the law, which is contrary to equality. In that case the
foundation of law would also depend on the state of
technology, which is also doubtful. For, as Recital 27 of the Preamble
of the Directive[ 3 ] states, the scope of protection 'must not in
effect depend on the techniques used, otherwise this would create a
serious risk of circumvention'.
There is a basic difference between
definitions in the Directive (95/46)[ 3 ] and the Recommendation
R(97)5[ 2 ]. The Directive, which is now legally binding on Iceland
and has been entered into Icelandic law with the Act on Personal
Protection (no. 77/2000)[ 28 ], gave rise to the
definition of personal data in the Act on the Health Sector
Database. The difference between the Directive and the
Recommendation is that the Directive defines in very broad terms
what personal data are but does not discuss or define what
non-personally identifiable data are. The Recommendation, however,
defines non-personally identifiable data. The effect of this is
that the Directive puts the burden of proof on anyone who claims
that he or she is working with non-personally identifiable data. In
addition the Recommendation is merely meant as an interpretive aid
to the Directive and it has no legal force.
7.2 Building Keys
One-way coding of personal
identifiers does not equal de-identifying data because the database
is a longitudinal collection and linkage of data on an individual.
Because the database is longitudinal the method of coding must
remain stable in time or else the database could not be updated.
Coding of the same identity number will therefore always produce
the same personal number. Anyone who can send an identity number
through the coding process and observe the outcome can thus make a
look-up table, which is a key[ 21 ].
Even if it was not possible to make
a look-up table, identifying a person is nevertheless possible by
inference. The data will also be interconnected with genealogical
data that also will be longitudinal data as are the health data.
The shapes of family trees will change with birth of new children
thus making it easier to recognise families with each updating of
the genealogical database. Family trees soon become unique when the
number of individuals in a group is increased. Comparisons of the
patterns of family trees from a genealogical database containing
one-way encrypted personal numbers as identifiers with the same
genealogical database containing names or identity numbers as
identifiers is a method for making a key.
As already discussed the Health
Sector Database Act [ 4 ] as well as the
Explanatory Notes to the Bill claim that one-way coding renders
information on a personally identifiable individual
non-identifiable because there is no key. The definitions also
claim that it is not possible to identify an individual with
reference to any factors specific to his/her physical,
physiological, mental, economic, cultural or social identity. This
is questionable. Enough general information (passport information)
is publicly available to identify most individuals from the
context[ 22 ]. Keys can be made this
way.
In this paper I have discussed
examples of methods that would be reasonably used to build keys to
open up the Icelandic Health Sector Database. Personal
identifiability is not a distant, theoretical possibility[ 22 ]. It is a real possibility and a problem easy
to solve. It is possible to build keys to open the database and
multiple one-way coding does not alter that in any way. If the
Parliament changed the law and permitted the opening of the
database or if a court of law ruled that it should be opened, there
would be no technical hindrance to doing so in an instant. The main premise
of the Icelandic Act on a Health Sector Database therefore does not
hold up to scrutiny. Also various entities involved in the
preparation of data for and in the operation of the Health Sector
Database could use methods of this nature to identify individuals
in the database. These entities are identified as threats to the
security of the database[ 7 , 23 , 22 ].
7.3 Lessons and
Ramifications
The Icelandic case has legal and
ethical ramifications and various lessons can be learned from it
[e.g. see 30 , 5 , 6 , 31 , 32 , 33 ].
The EU Data Protection Directive
applies to personal data but does not apply to data that are truly
anonymous. The Icelandic HSD Act argues that the technical solution
of one-way coding of personal identifiers renders the data
anonymous. Some have argued that the HSD Act is legal under the
Icelandic constitution and Iceland's international commitments[ 16 , 34 ] while others have
reached the opposite conclusion[ 35 , 5 ]. The view that one-way coding can achieve anonymity
has been expressed[ 36 ], although doubts have
also been expressed about whether one-way coding really is
different from coding in this respect[ 16 ]. I
have argued here that coding, be it one-way or multiple, is largely
irrelevant. In reality anonymity does not exist in databases, such
as the Icelandic HSD, that have large amounts of information from
which contextual inferences about personal identity can be
drawn.
The opt-out clause is another issue
that has been reviewed both favourably[ 37 ] and
unfavourably[ 5 ]. I have previously argued that it represents a
totalitarian aspect of the Act because in reality an Icelandic
citizen has no choice[ 38 ]. The individual is
given two alternatives: to either belong to the health sector
database via presumed consent (and thus directly be part of the
business plan of a private corporation) or to register with the
government and enter the opt-out database. However, the opt-out
database, kept up to date, is required for the normal transfer of
data to the health sector database. Thus one way or another the
database plan involves everyone; no one is left alone.
The exclusive license issued to
deCODE has ramifications for scientific freedom as well as European
competition rules[ 16 , 5 ].
Also Iceland has adopted European Directive 96/9/EC on Protection
of Databases[ 39 ], which applies sui generis
rights to databases. This may have ramifications when the license
expires and the state takes over the database. There are provisions
in the act that are meant to ensure these rights for the state[ 16 ].
8. Conclusions
One can reasonably expect that
methods such as the ones described in this paper would be used to
identify persons in the Icelandic Health Sector Database. The
individuals are personally identifiable in the preparation of
data for transfer, in the opt-out database and in the Health Sector
Database. Therefore, it is both right and reasonable to require the
a priori consent of the individuals for inclusion of their data on
the database and for their use for a purpose other than that for
which they were gathered, as Iceland's constitution and international
commitments dictate[ 12 , 3 ].
Anything less is unreasonable and unjust.
Notes and References
1.
Mannvernd (2002), Status of Lawsuits against the Icelandic Health
Sector Database Act and related matters < http://www.mannvernd.is/english/lawsuit.html >.
2.
Council of Europe, Committee of Ministers (1997), Recommendation
No. R (97) 5 of the Committee of Ministers to Member States on the
Protection of Medical Data < http://www.coe.fr/cm/ta/rec/1997/97r5.html >
3.
European Parliament and the Council (1995), Directive 95/46/EC of
the European Parliament and of the Council of 24 October 1995 on
the protection of individuals with regard to the processing of
personal data and on the free movement of such data <
http://www.privacy.org/pi/intl_orgs/ec/final_EU_Data_Protection.html >.
4.
Alþingi, Icelandic Parliament (1998), Act on a Health Sector
Database no. 139/1998. Passed by Alþingi December 17, 1998 <
http://brunnur.stjr.is/interpro/htr/htr.nsf/pages/gagnagr-ensk >.
5. Greely
H T (2000), Iceland's Plan for Genomics Research: Facts and
Implications, Jurimetrics, 40, 153-191
<
http://www.mannvernd.is/english/articles/HTGreely_Jurimetrics_2000.html >.
6. Rose H
(2001), The Commodification of Bioinformation: The Icelandic Health
Sector Database, Tech. rep., The Wellcome Trust, London
<http://www.mannvernd.is/greinar/hilaryrose1_3975.PDF>.
7.
Admiral Management Services Limited (2001), Security Target for an
Icelandic Health Database, Admiral Management Services Limited.
Made for Icelandic Data Protection Authority
<
http://www.personuvernd.is/tolvunefnd.nsf/Files/SecurityTarget/$file/SecurityTarget.pdf >.
8.
Ministry of Health and Social Security, Iceland (1997), First draft
of Bill on Health Sector Databases. Presented to Icelandic Ministry
of Health and Social Security by K. Stefánsson of deCODE genetics,
September 3, 1997
<
http://www.mannvernd.is/english/laws/HSDbill_english_firstdraft_140797.html >.
9.
Alþingi, Icelandic Parliament (1998), Bill on Health Sector
Databases. Submitted to
Alþingi, Icelandic Parliament, at 122nd session, Spring of 1998
<
http://www.mannvernd.is/english/laws/HSDbill_english_122session1998.html >
10.
Alþingi, Icelandic Parliament (1989), Act on the Recording and
Handling of Personal Information No. 121/1989. Superseded by Act on
the Protection of Individuals with Regard to the Processing of
Personal Data No. 77/2000.
11.
Ministry of Health and Social Security, Iceland (1998), Draft -
Bill on a Health Sector Database. Circulated for comments in the
Summer of 1998 <
http://www.mannvernd.is/english/laws/HSDbill_english_summer1998.html >.
12.
Data Protection Commission (1998), Data Protection Commission's
opinion on the draft Bill on a health-sector database. Letter from
Data Protection Commission to Minister of Health, Ingibjörg
Pálmadóttir, September 4, 1998
<
http://www.mannvernd.is/english/news/Data_Protection_Commission_040998.html >.
13.
Alþingi, Icelandic Parliament (1998), Bill on a Health Sector
Database. Submitted to Alþingi, Icelandic Parliament, at 123rd
session, Fall of 1998 < http://www.mannvernd.is/english/laws/HSD.bill.html >.
14.
Tölvunefnd (1998), Umsögn tölvunefndar um frumvarp til laga um
gagnagrunn á heilbrigðissviði. Beint til heilbrigðis- og
trygginganefndar Alþingis (in Icelandic).
15.
Björgvinsson D Þ, Arnardóttir O M and Matthíasson V M (1998),
Álitsgerð um ýmis lögfræðileg efni í frumvarpi til laga um
gagnagrunn á heilbrigðissviði. Institute of Law, University of
Iceland, opinion on legal aspects of Bill on a Health Sector
Database. Requested and paid for by deCODE genetics which presented
the opinion to Members of Alþingi, Icelandic Parliament, on October
28, 1998 (in Icelandic).
16.
Arnardóttir O M, Björgvinsson D Þ and Matthíasson V M (1999), The
Icelandic Health Sector Database, European Journal of Health Law, 6
(307-362).
17.
Heilbrigðishópur gagnagrunnsdeildar Íslenskrar erfðagreiningar
(2001), Ópersónugreinanleg gagnasöfnun til ábyrgra
vísindarannsókna, Morgunblaðið, February 21. Article by deCODE's
database department in the Icelandic newspaper Morgunblaðið (in
Icelandic), < http://www.mbl.is >.
18. RSA
Laboratories (2000), RSA Laboratories' Frequently Asked Questions
About Today's Cryptography, Version 4.1, RSA Security Inc < http://204.167.114.22/rsalabs/faq/index.html >.
19.
Sigurðsson G, Björnsdóttir S H and Björnsson B r (1998), Fylgiskjal
VI. með frumvarpi til laga um gagnagrunn á heilbrigðissviði, þskj.
109. Stiki ehf.: Minnisblað um feril heilsufarsupplýsinga frá
heilbrigðisstofnun í miðlægan gagnagrunn. Bill on a Health Sector
Database, Appendix VI. Memorandum from Stiki ehf. on process of
health data from a health institution to a centralised database.
(in Icelandic)
< http://www.mannvernd.is/english/laws/HSD.bill.html >.
20.
Anderson R (1998), The DeCODE Proposal for an Icelandic Health
Database. Evaluation of the privacy aspects of DeCODE's proposal
for a central database of Icelanders' medical records at the
invitation of the Icelandic Medical Association < http://www.cl.cam.ac.uk/users/rja14/iceland/iceland.html >.
21.
Anderson R (1999), Iceland's Medical Database is Insecure, British
Medical Journal, 319, 59, < http://bmj.com/cgi/content/full/319/7201/59/b >.
22.
Benediktsson O (2000), Persónugreinanleiki í gagnagrunni á
heilbrigðissviði. Personal identifiability in the Icelandic Health
Sector Database. Made at the request of Ragnar Aðalsteinsson,
September 13, 2001 (in Icelandic)
< http://www.mannvernd.is/greinar/OBgreinanleikiMV.html >.
23.
Anderson R (1999), Comments on the Security Targets for the
Icelandic Health Database. Comments requested by Icelandic Medical
Association on two documents written by Admiral Management Services
Ltd. for the Data Protection Authority < http://www.cl.cam.ac.uk/ftp/users/rja14/iceland-admiral.pdf >.
24.
Admiral Management Services Limited (2000), Approval Process
Methodology. Icelandic Health Database, Admiral Management Services
Limited < http://www.personuvernd.is/tolvunefnd.nsf/Files/7163.method.../$file/7163.method.pdf >.
25.
DeCODE genetics (2000), deCODE genetics and Frisk Software donate
access to the genealogy database to the Icelandic nation < http://www.decode.com/news/releases/older/item.ehtm?id=1382 >.
26.
Sweeney L (1998), Re-identification of de-identified medical data,
National Committee on Vital and Health Statistics Subcommittee on
Privacy and Confidentiality. < http://ncvhs.hhs.gov/980128tr.htm >.
27.
Tölvunefnd (1998), Umsögn tölvunefndar um drög að frumvarpi til
laga um gagnagrunn á heilbrigðissviði. Beint til Ingibjargar
Pálmadóttur, heilbrigðisráðherra. Data Protection Commission
opinion on Bill on a Health Sector Database presented to the Health
and Social Security Committee of Alþingi, the Icelandic Parliament,
on October 26, 1998 (in Icelandic). < http://www.mannvernd.is/login/ums_tolvunefnd_261098.html >.
28.
Alþingi, Icelandic Parliament (2000), Act on the Protection of
Individuals with Regard to the Processing of Personal Data No.
77/2000 < http://www.personuvernd.is/tolvunefnd.nsf/pages/1E685B166D04084D00256922004744AE >.
29.
Aðalsteinsson R (2000), '...einungis eftir lögunum', Úlfljótur,
2000 (4), 1-32. '...only according to the law' (in
Icelandic).
30.
Annas G (2000), Rules for Research on Human Genetic Variation -
Lessons from Iceland, The New England Journal of Medicine, 342,
1830-1833.
31.
Winickoff D E (2000), Rhetoric equals cold cash in Iceland: The
Biobank Act and deCODE genetics, GeneWatch, 13, 5-6 < http://www.gene-watch.org/magazine/vol13/13-5decode.html >.
32.
Winickoff D E (2000), Biosamples, Genomics and Human Rights:
Context and Content of Iceland's Biobanks Act, Journal of Biolaw
and Business, 4, 11-17.
33.
Sigurdsson S (2001), Bibliography/Self-help Kit for Studying the
HSD deCODE Controversy < http://www.raunvis.hi.is/~sksi/kit.html >.
34.
Jónatansson H (2000), Iceland's Health Sector Database: A
Significant Head Start in the Search for the Biological Holy Grail
or an Irreversible Error, American Journal of Law and Medicine, 26,
31-67.
35.
Roscam Abbing H D C (1999), Central Health Database in Iceland and
Patients' Rights, European Journal of Health Law, 6,
363-371.
36.
Nielsen K K and Waaben H (2001), Lov om Behandling af
Personoplysninger (Copenhagen, Denmark: Jurist- og Økonomforbundets
Forlag).
37.
Laurie G (2002), Genetic Privacy (Cambridge: Cambridge University
Press).
38.
Árnason E (2000), The Icelandic Healthcare Database, The New
England Journal of Medicine, 343, 1734
< http://content.nejm.org/cgi/content/short/343/23/1734 >.
39.
European Parliament and the Council (1996), Directive 96/9/EC of
the European Parliament and of the Council of 11 March 1996 on the
legal protection of databases < http://europa.eu.int/ISPO/infosoc/legreg/docs/969ec.html >.
Appendix
Table 1
I. From already existing medical records

A. Information from National Registry: Identity number or SSN, one-way encrypted. Gender and age, residence (county and postal code) and marital status at the time of the recording of the information.
B. Coded and quantitative information: Disease diagnosis according to ICD-9/ICD-10 system. Operation number. Date of arrival and discharge. X-ray, CT, MR analysis. Research results. Physiological measurements. Coded drug treatment.

II. From standardised electronic system

1. Health institute: The institute's identity number. Department. Medical speciality.
2. Patient identification: Type of patient. Identity number or SSN, one-way encrypted. Gender. Marital status. County of residence. Employment. Education.
3. Arrival at health institute: Date that a patient enters a waiting list. Date, time and method of arrival. Where the patient comes from. Reason for hospitalisation.
4. Discharge from health institute: Date and time of discharge. Date of termination of active treatment. Date of arrival at walk-in clinic. Repeated visits to walk-in clinic. Where the patient goes after treatment.
5. Reason for arrival.
6. Physician's examination at arrival: Date and time of examination.
7. Drugs given at arrival: Drug type, unit, number, concentration, quantity and frequency of administering.
8. Allergy: Date of recording. Drug allergy. Other allergy.
9. Specialist's treatment plan.
10. Informed decision on treatment.
11. Physician's instructions.
12. Instructions for drug treatment: Date of instructions. Type and number of drug. Type of treatment. Concentration, unit, amount, frequency, how often, method of administering (subcutaneous, etc). Date of termination.
13. Administering of drug according to instructions: Date and time. Drug type and number. Method of administering. Effects. Side effects.
14. Specialist's evaluation of treatment.
15. Drugs at discharge. Date of instructions. Drug type and number. Type of drug use. Concentration. Unit. Amount. Frequency. Method of administering. Date of termination.
16. Diaries.
17. Consultations. Date and time of request. Reason for request. Date and time of reply. Result/analysis.
18. Notes of physician at walk-in clinic. Date and time of notes. Diagnosis. Procedure number. Procedure code of physician. Treatment.
19. Information gathering by nurses. Date and time. Examination and measurements at arrival (e.g. temperature, pulse, breathing). Gordon's health keys. Nourishment, metabolism and skin. Excretion. ADL. Movement and activity. Cognitive status, sensation.
20. Nursing process. Dates. Goal. Plan. Progress and evaluation. Nurse's diagnoses.
21. Disease diagnosis. Date of diagnosis. Disease diagnosis by coding table. Disease diagnosis, physician's text.
22. Operations. Operation number. Physician's operation title. Date of operation.
23. Reports of vital signs. Date and time. Blood pressure. Pulse. Breathing.
24. Notes of walk-in clinic nurse. Date and time. Reason for arrival. Analysis and treatment (coded).
25. Immunisation. Date. Vaccination ICD-10 code. Vaccine. Side effects.
26. Reporting by other health workers. Date and analysis made by occupational therapists, physiotherapists, social workers, speech therapists, psychologists, neuropsychologists, pastors and deacons.
27. Scientific research connected to medical records.
28. Requests for tests and results.
29. Lifestyle. Smoking.
30. Coded social information.
31. Genetic information. Disease diagnosis obtained by examination of genetic material (e.g. analysis of inherited disease) and diagnosis based on chromosomal analysis, e.g. on inborn disease or malignant disease.

Table 1: Information from medical
records that can be transferred to the Health Sector Database from
a health institute. Based on Appendix B of the license.
Table 2

Versions compared
July 1997: First Draft of Bill [ 8 ] written by deCODE in July 1997 and presented to Ministry of Health on September 3, 1997.
April 1998: Bill on Health Sector Databases [ 9 ] submitted to the 122nd session of Parliament in April 1998.
July 1998: Draft: Bill on a Health Sector Database [ 11 ]. New version of the Bill rewritten by an ad hoc committee in the Ministry of Health and circulated for comments in July 1998.
October 1998: Bill on a Health Sector Database [ 13 ] submitted to the 123rd session of Parliament in October 1998.
December 1998: Act on a Health Sector Database [ 4 ] voted into law on December 17, 1998.

Definition of Health Sector Database
July 1997: 1. Health sector database: A collection of independent work, data, or other material containing information on health, arranged in an organised and systematic fashion and that can be accessed electronically or in other ways.
April 1998: 1. Health sector database: A collection of independent work, data, or other material containing information on health and other related information, arranged in an organised and systematic fashion and that can be accessed electronically or in other ways. Health records that are kept according to law, other records that individual health institutions or research institutions keep on the individuals that they provide health service to, and records that official government health and insurance bodies keep on the users of the health service and on the operation of the health service are not considered a health sector database in the meaning of this law.
July 1998: 1. Health sector database: A collection of data containing information on health and other related information, recorded in a standardised systematic fashion in a single centralised database, intended to be a source of information.
October 1998: 1. Health sector database: A collection of data containing information on health and other related information, recorded in a standardised systematic fashion on a single centralised database, intended for processing and as a source of information.
December 1998: 1. Health sector database: A collection of data containing information on health and other related information, recorded in a standardised systematic fashion on a single centralised database, intended for processing and as a source of information.

Definition of health information
July 1997: 2. Health information: information on the health of individuals, other information regarding health and genetic information.
April 1998: 3. Health information: information on the health of individuals, other information regarding health and genetic information.
July 1998: 3. Health information: information on the health of individuals and groups, including genetic information.
October 1998: 6. Health data: information on the health of individuals, including genetic information.
December 1998: 6. Health data: information on the health of individuals, including genetic information.

Definition of personal information
July 1997: 3. Personal information: Information on private matters, health matters, finances or other matters of a named or nameable individual, which it is reasonable and natural to treat as confidential.
April 1998: 2. Personal data: data regarding personal matters, including health information, finance or other items regarding a personally identified or identifiable individual, which it is reasonable and natural to treat as confidential.
July 1998: 2. Personal data: data regarding personal matters, including health information, finance or other items regarding a personally identified or identifiable individual, which it is reasonable and natural to treat as confidential.
October 1998: 2. Personal data: all data on a personally identified or personally identifiable individual. An individual shall be counted as personally identifiable if he can be identified, directly or indirectly, especially by reference to an identity number, or one or more factors specific to his physical, physiological, mental, economic, cultural or social identity.
December 1998: 2. Personal data: all data on a personally identified or personally identifiable individual. An individual shall be counted as personally identifiable if he or she can be identified, directly or indirectly, especially by reference to an identity number, or one or more factors specific to his physical, physiological, mental, economic, cultural or social identity.

Definition of non-personal information
July 1997: 3. An individual shall not be counted as nameable if a considerable amount of time and manpower would be required in order to name him/her. When an individual is not nameable the information about him/her shall not be considered to be personal information.
April 1998: 2. A person shall not be counted as personally identifiable if a considerable amount of time and manpower would be required in order to identify him/her. The same applies if the identification could only take place through use of a decoding key, not available to the person having the information. When an individual is not personally identifiable information about him/her shall not be considered personal information under the meaning of this law.
July 1998: 2. A person shall not be counted as personally identifiable if a considerable amount of time and manpower would be required in order to identify him/her. The same applies if the identification could only take place through use of a decoding key, not available to the person having the information.
October 1998: 3. Non-personally identifiable data: data on a person who is not personally identifiable as defined in clause 2.
December 1998: 3. Non-personally identifiable data: data on a person who is not personally identifiable as defined in clause 2.

Definition of coding
July 1997: (no definition)
April 1998: (no definition)
July 1998: (no definition)
October 1998: 4. Coding: the transformation of words or numbers into an incomprehensible series of symbols.
December 1998: 4. Coding: the transformation of words or numbers into an incomprehensible series of symbols.

Definition of one-way coding
July 1997: (no definition)
April 1998: (no definition)
July 1998: (no definition)
October 1998: 5. One-way coding: the transformation of words or series of digits into an incomprehensible series of symbols which cannot be traced by means of a decoding key.
December 1998: 5. One-way coding: the transformation of words or series of digits into an incomprehensible series of symbols which cannot be traced by means of a decoding key.

Definition of genetic information
July 1997: 4. Genetic information: any kind of information regarding the inheritable characteristics of an individual or information that concerns the pattern of inheritance of such characteristics within a group of related individuals.
April 1998: 5. Genetic information: any kind of information regarding the inheritable characteristics of an individual or information that concerns the pattern of inheritance of such characteristics within a group of related individuals, furthermore all information that concern the transfer of genetic information (genes) that relate to characteristics of disease or health of individuals or groups of related individuals irrespective of whether it is possible to diagnose these characteristics or not.
July 1998: 4. Genetic information: any kind of information regarding the inheritable characteristics of an individual.
October 1998: 7. Genetic data: any kind of information regarding the inheritable characteristics of an individual.
December 1998: 7. Genetic information: any kind of information regarding the inheritable characteristics of an individual or information that concerns the pattern of inheritance of such characteristics within a group of related individuals, furthermore all information that concern the transfer of genetic information (genes) that relate to characteristics of disease or health of individuals or groups of related individuals irrespective of whether it is possible to diagnose these characteristics or not.

Table 2: Chronology of HSD Bills and Act and definitions and
changes of definitions of various terms during the
debate.
Table 3
Table 3: A look-up table of names of
individuals and their one-way coded personal numbers made with the
MD5 one-way hash function.
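How a look-up table of this kind can be assembled may be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names and identity numbers are invented, the helper names one_way_code and build_lookup are hypothetical, and the code assumes that the hash is applied directly to the identity number string with no added secret.

```python
import hashlib


def one_way_code(identity_number: str) -> str:
    """Return the MD5 hex digest of an identity number (assumed unsalted)."""
    return hashlib.md5(identity_number.encode("ascii")).hexdigest()


def build_lookup(registry: dict[str, str]) -> dict[str, str]:
    """Map each coded personal number back to the name it was derived from."""
    return {one_way_code(number): name for name, number in registry.items()}


# Hypothetical registry: both the names and the identity numbers are invented.
registry = {
    "Person A": "0101701239",
    "Person B": "1502654569",
}

lookup = build_lookup(registry)
coded = one_way_code("1502654569")
# Only the first eight characters are printed, as in the table.
print(coded[:8], "->", lookup[coded])
```

The hash is never inverted. Because the coding must give the same output for the same person every time, running the known identity numbers through it once is enough to produce the key.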
Table 4
Table 4: Personal identification
from the context of general information.
Gender, date and year of birth, height
and township of residence are general data sufficient for
identifying the individual behind an encrypted personal number
without recourse to a key or family tree. Sensitive health
information that accompanies the encrypted personal number in the
table can thus be assigned to a named individual. Only the first
eight characters of the personal number
are given to save space.
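The contextual identification described in this caption can be sketched as a simple join on the general attributes. The records, field names and the function reidentify below are all hypothetical and serve only to illustrate the principle: a coded record is assigned to a name whenever its combination of gender, birth date, height and township is unique in a generally available source.

```python
from collections import defaultdict

# Attributes assumed to be present in both the coded data and public sources.
QUASI = ("gender", "birth_date", "height_cm", "township")


def reidentify(coded_records, public_records):
    """Return {coded personal number: name} where the attribute combination
    matches exactly one person in the public records."""
    by_key = defaultdict(list)
    for person in public_records:
        by_key[tuple(person[q] for q in QUASI)].append(person["name"])
    matches = {}
    for rec in coded_records:
        names = by_key.get(tuple(rec[q] for q in QUASI), [])
        if len(names) == 1:  # a unique combination identifies the person
            matches[rec["pn"]] = names[0]
    return matches


# Invented example data.
coded_records = [
    {"pn": "3f2a9c1e", "gender": "F", "birth_date": "1970-01-01",
     "height_cm": 168, "township": "Akureyri", "diagnosis": "ICD-10 E11"},
    {"pn": "b71d04aa", "gender": "M", "birth_date": "1965-02-15",
     "height_cm": 181, "township": "Reykjavik", "diagnosis": "ICD-10 I10"},
]
public_records = [
    {"name": "Person A", "gender": "F", "birth_date": "1970-01-01",
     "height_cm": 168, "township": "Akureyri"},
    {"name": "Person B", "gender": "M", "birth_date": "1965-02-15",
     "height_cm": 181, "township": "Reykjavik"},
    {"name": "Person C", "gender": "M", "birth_date": "1965-02-15",
     "height_cm": 181, "township": "Reykjavik"},
]

print(reidentify(coded_records, public_records))
```

In this invented example the first record is identified, while the second remains ambiguous because two people share the same attributes. In a small population such ambiguity is likely to be rare for combinations of this kind.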
Figure 1
Figure 1: Organisation and flow of
health data, genealogical and genetic information and opt-out list
in the making of the GGPR database. Based partly on Figure 3.1 in
Security Target [ 7 ].
Figure 2
Figure 2: Comparison and pattern
matching of a family tree from a genealogical database containing
encrypted personal numbers and a database containing names. Only
first names are given to conserve space; see Table 3 for family
names. Broken lines are connections to close relatives and from
there on to more distant relationships of the entire genealogy.
Only the first eight characters of the personal number are given to
save space.
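The pattern matching of family trees can be illustrated with a deliberately simplified sketch. It assumes that each genealogy is reduced to a single line of descent (a mapping from a person to that person's children), that one starting pair is already known, and that all names and codes are invented; the functions shape and match are hypothetical. Real genealogies have two parents per person and carry further attributes such as gender and birth year, which would resolve most of the ambiguities left open here.

```python
def shape(tree, person):
    """Canonical string describing the descendant subtree below `person`."""
    return "(" + "".join(sorted(shape(tree, c) for c in tree.get(person, []))) + ")"


def group_by_shape(tree, parent):
    """Group the children of `parent` by the shape of their subtrees."""
    groups = {}
    for child in tree.get(parent, []):
        groups.setdefault(shape(tree, child), []).append(child)
    return groups


def match(coded_tree, coded_root, named_tree, named_root, out=None):
    """Pair coded numbers with names by walking both trees in parallel."""
    out = {} if out is None else out
    if shape(coded_tree, coded_root) != shape(named_tree, named_root):
        return out  # the two subtrees do not have the same pattern
    out[coded_root] = named_root
    coded_groups = group_by_shape(coded_tree, coded_root)
    named_groups = group_by_shape(named_tree, named_root)
    for s, coded_children in coded_groups.items():
        named_children = named_groups.get(s, [])
        # Only children whose subtree shape is unique among siblings are paired.
        if len(coded_children) == 1 and len(named_children) == 1:
            match(coded_tree, coded_children[0], named_tree, named_children[0], out)
    return out


# Invented genealogies: same family, once with names and once with codes.
named_tree = {"Anna": ["Bjorn", "Clara"], "Bjorn": ["Dora", "Egill"],
              "Clara": ["Finnur"], "Dora": ["Gudrun"]}
coded_tree = {"a1": ["c3", "b2"], "b2": ["e5", "d4"],
              "c3": ["f6"], "d4": ["g7"]}

print(match(coded_tree, "a1", named_tree, "Anna"))
```

The sketch leaves siblings with identical subtree shapes unmatched instead of guessing.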
i. The
Icelandic identity number or Social Security Number (SSN), or
kennitala as it is called in Icelandic, is a person's birth date
with the addition of three random digits and a fourth digit
indicating the century. Thus for each birth date there exist 1000
potential SSNs.
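The small size of this number space can be made concrete with a short Python sketch. It follows the simplified layout given in this note (six digits of birth date, three further digits, one century digit) and assumes, purely for illustration, that the one-way coding is an unsalted MD5 hash of the number; the function names candidates and recover are hypothetical and the identity number is invented.

```python
import hashlib
from datetime import date


def candidates(birth: date):
    """Yield the 1000 candidate identity numbers for one birth date,
    using the simplified layout DDMMYY + three digits + century digit."""
    century_digit = str(birth.year // 100 % 10)  # 1900s -> '9', 2000s -> '0'
    prefix = birth.strftime("%d%m%y")
    for n in range(1000):
        yield f"{prefix}{n:03d}{century_digit}"


def recover(coded: str, birth: date):
    """Return the identity number whose MD5 digest equals `coded`, if any."""
    for cand in candidates(birth):
        if hashlib.md5(cand.encode("ascii")).hexdigest() == coded:
            return cand
    return None


# Invented identity number for a person born on 1 January 1970.
secret = "0101701239"
coded = hashlib.md5(secret.encode("ascii")).hexdigest()
print(recover(coded, date(1970, 1, 1)))
```

With 1000 candidates per date and about 36,525 dates per century, the whole space holds roughly 36.5 million numbers, so an exhaustive table for every possible identity number is well within reach of an ordinary computer.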
ii.
The Data Protection Commission stated: 'it is questionable to
maintain that the Bill's definition is based upon the definition of
Act no. 121/1989. The terms of paras. 3 and 4 of Art. 1 of this Act
entail that data on individuals are personal data within the
meaning of the Act, even if the individuals in question are not
identified by name, ID number or other form of identification,
which can be linked to a person with or without a decoding key. By
the terms of Act no. 121/1989, data are thus normally personal
data, if a decoding key exists for coded data ...and the Commission
does not believe that it makes any difference whether the person
having the information has access to the decoding key or not'[ 12 ].
iii.
The Data Protection Commission further stated that it 'wishes to emphasise
that it is necessary for both the terms of general legislation on
registration here in Iceland (now Act no. 121/1989) and the terms
of special legislation (e.g. the prospective Act on a health-sector
database) to fulfil the conditions of the EU Directive, after it
has become binding under international law on Iceland's behalf. It
is also important that legislation in Iceland should be consistent
regarding such important factors as the definition of the concept
of personal data'[ 12 ].
iv.
The Data Protection Commission stated: 'Under the Directive of the
European Community the concept of personal information is broad and
encompasses all information, opinions, or comments that can be
connected directly or indirectly to a particular individual, i.e.
it refers to all information that is personally identified or
personally identifiable. It follows from clause (a) in Art. 2 of
the Directive that information is considered personally
identifiable if the information can be personally identified on the
basis of any characteristic, directly or indirectly, by reference
to an identity number or other characteristic, with or without an
identifying key. Article 26 of the Directive's Preamble states that
the main principles of protection must apply to any information
concerning a personally identified or identifiable person and that
in order to determine whether a person is identifiable (traceable),
account should be taken of all the means likely reasonably to be
used either by the controller or by any other person to identify
the said person. From this it follows that the main principles of
protection shall not apply to data that have been completely
disconnected from the individual such that it is impossible to
trace the information to particular persons.
In principle there are two major
ways for ensuring personal protection in such a database. One is to
'disconnect' the personal information from the identity of the
person and the other is to 'code' the information as it is called.
The Bill on a Health Sector Database states that information on
individuals will be coded before transfer to the database. It
assumes that the information in the database will be updated on a
regular basis when new information is added. In order to do that it
is necessary to be able to find older information on that same
individual in the database and therefore the information in the
database will only be coded and not disconnected. The difference
between these two methods, coding and disconnection, is mainly the
following. When personal data are coded, the individual is assigned
a new, invented registration or personal code while a decoding key
exists, by which individuals may be identified. On the other hand,
when data are disconnected from personal identifiers, the
individual is assigned an invented registration or personal code,
as before, but this code has no decoding key. In this case the
information is considered not to be personally identifiable unless
the information can be personally identified by resorting to other
means, such as by reference to certain factors specific to the data
subject's physical, physiological, mental, economic, cultural or
social identity as per clause (a) of Art. 2 of the
Directive.
By reference to all of this, the
Data Protection Commission considers that the Bill's assertion that
the database will contain non-personally identifiable health data,
does not hold. The Data Protection Commission therefore recommends
that that word [non-personally identifiable] be dropped from Art. 1
of the Bill' (EÁ translated from the Icelandic).
v. deCODE's
database department stated: 'Highly advanced technical
solutions have been designed and will be used to three-times
one-way code the individual's identity number. Each one-way coding
will be done with a special encrypting key that fulfils very strict
technical security measures. In order to decode the identity number
and thus to personally identify the health information one would
have to use the three keys in the right order. To ensure that that
does not happen it is assumed that the three keys will be held by
three separate bodies (the health institutions themselves, The Data
Protection Commission, and by deCODE). This automatic triple coding
will scramble the identity number in such a way that it will be
possible to update the information on a particular individual when
new data arrive at the HSD without the data ever becoming
personally identifiable after they have been copied from the health
records. In spite of such security measures, which we assert are
unique in the history of Icelandic scientific research, even
persons who are familiar with scientific research of this nature
doubt that the data will be truly non-personally identifiable' (EÁ
translated from the Icelandic).
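Why triple coding does not remove the look-up table problem can be sketched as follows. The sketch is an assumption-laden illustration and not a description of the actual system: the source does not specify the algorithm, so a keyed hash (HMAC with SHA-256) stands in for each coding step, and the three keys, the function names and the identity numbers are invented.

```python
import hashlib
import hmac


def keyed_code(key: bytes, value: str) -> str:
    """One keyed one-way coding step (HMAC-SHA256 is an assumed stand-in)."""
    return hmac.new(key, value.encode("ascii"), hashlib.sha256).hexdigest()


def triple_code(identity_number: str, keys) -> str:
    """Apply the three coding steps in order, each with its own key."""
    out = identity_number
    for key in keys:
        out = keyed_code(key, out)
    return out


# Hypothetical keys held by three separate bodies.
KEYS = (b"health-institution", b"data-protection-commission", b"licensee")

first_visit = triple_code("0101701239", KEYS)
later_visit = triple_code("0101701239", KEYS)
# Updating a person's record requires that both visits yield the same code.
print(first_visit == later_visit)

# Anyone able to pass known identity numbers through the same pipeline
# obtains a table without ever decoding anything.
table = {triple_code(n, KEYS): n for n in ("0101701239", "1502654569")}
print(table[later_visit])
```

The three keys prevent reversal of the coding, but the stability that allows updating is the same property that makes a look-up table possible.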
vi.
One can argue that deCODE could make a look-up table or
dictionary of the encoded identifiers of all Icelanders directly
using the genealogical database. As part of the process of building
the database, as described in Figure 1, deCODE will submit the
genealogical database with identity numbers to the IES and then
receive it back with the encoded personal numbers (PN). If the
order of the submitted records is the same as that of the records
received back, a look-up table can be built directly. If not, the
matching of family patterns could be used.
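The first case, in which the order of records is preserved, needs nothing more than pairing the two lists position by position. The following sketch uses invented records and a hypothetical function name, and assumes that the submitted and returned files contain the same number of records.

```python
def lookup_from_order(submitted, returned):
    """Pair submitted and returned records position by position.
    Assumes the returned file preserves the order of the submitted one."""
    if len(submitted) != len(returned):
        raise ValueError("record counts differ; order cannot be assumed preserved")
    return {out["pn"]: sent["identity_number"]
            for sent, out in zip(submitted, returned)}


# Invented example: the genealogy as submitted and as received back.
submitted = [
    {"identity_number": "0101701239", "parents": ("p1", "p2")},
    {"identity_number": "1502654569", "parents": ("p3", "p4")},
]
returned = [
    {"pn": "3f2a9c1e", "parents": ("x1", "x2")},
    {"pn": "b71d04aa", "parents": ("x3", "x4")},
]

print(lookup_from_order(submitted, returned))
```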