Versions of this presentation were given in March 2019 at the International Studies Association 2019 and the PREA Ethics and Humanitarian Research Conference. It briefly presents the techniques and technologies used to understand people affected by humanitarian emergencies. It then introduces ongoing work to demonstrate the feasibility of deploying Natural Language Processing to scale up the use of qualitative data in emergencies. Finally, it discusses the ethical implications of this work and what rules, principles, and other ethical guidance are needed before AI can be used in humanitarian response.
Establishing Global Rules for the Ethical Use of Artificial Intelligence in Humanitarian Emergencies
1. Avoiding Catastrophe: Establishing Global Rules for the Ethical Use of Artificial Intelligence in Humanitarian Emergencies
Tino Kreutzer | @tinokreutzer | kreutzer@yorku.ca
Dahdaleh Institute for Global Health Research | York University | PhD Candidate
Harvard Humanitarian Initiative | Affiliated Staff
NetHope | Advisor
2. This presentation
1. Techniques and technologies used for understanding affected populations in humanitarian emergencies
2. Natural Language Processing for qualitative data
3. Humanitarian and AI principles for ethical use of data and technology
3. From paper to mobile data collection to call centers
[Chart: survey submissions to OCHA and HHI KoBoToolbox servers (in millions), July 2014 to January 2019]
7. “In the morning it can take a long time because I usually go to the well close to our house. But then the person who has the key to the lock may not be there so I need to wait a long time. Then I may need to come back in the afternoon. So quite often we go to the river which even though it’s very far and we know that water can make us sick. Our ward chief has a water pump and the quality is very good, but he charges us a lot.”
X
How long is the walk to the chief?
Oh, not far, only 10 minutes.
8. Primary data collection to understand and involve affected populations

Quantitative: surveys using mobile data collection, call centers, etc. Less detailed, but fast and easy to scale.

Qualitative: focus group discussions, qualitative interviews, feedback hotlines, etc. Requires transcription, translation, and analysis. Best insights, but hard to scale due to cost and lack of staff.
10. Understanding affected populations with NLP

Quantitative & Qualitative: population surveys, key informant interviews, focus group discussions, qualitative interviews, feedback hotlines, etc.
Pipeline: transcription, translation → quality control → analysis.
11. NLP example for response analysis

Question: On a typical day, where do you usually get your water from, and are there other alternatives? Are there any issues with these sources?

Response: “In the morning it can take a long time because I usually go to the well close to our house. But then the person who has the key to the lock may not be there so I need to wait a long time. Then I may need to come back in the afternoon. So quite often we go to the river which even though it’s very far and we know that water can make us sick. Our ward chief has a water pump and the quality is very good, but he charges us a lot.”

Extracted codes:
• water_source_primary: well
• water_sources: well, river, pump
• water_source_distance: close
• water_issues: long wait time, long distance, access fees, bad water quality
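To make the coding step concrete, here is a minimal sketch of how a response could be mapped to codes like those above. It uses a naive keyword lookup in plain Python; the codebook and its trigger phrases are hypothetical illustrations of mine, and a real deployment would use trained classification models rather than keyword matching.

```python
# Naive keyword-based coding sketch for the water-access example above.
# The codebook triggers are hypothetical; production systems would use
# trained NLP classifiers instead of substring matching.

RESPONSE = (
    "In the morning it can take a long time because I usually go to the well "
    "close to our house. But then the person who has the key to the lock may "
    "not be there so I need to wait a long time. Then I may need to come back "
    "in the afternoon. So quite often we go to the river which even though it's "
    "very far and we know that water can make us sick. Our ward chief has a "
    "water pump and the quality is very good, but he charges us a lot."
)

# Each code maps to trigger phrases an analyst might choose (hypothetical).
CODEBOOK = {
    "water_sources": {"well": ["well"], "river": ["river"], "pump": ["pump"]},
    "water_issues": {
        "long wait time": ["wait a long time"],
        "long distance": ["very far"],
        "access fees": ["charges us"],
        "bad water quality": ["make us sick"],
    },
}

def code_response(text, codebook):
    """Return {category: [codes]} for every code whose trigger phrase appears."""
    text = text.lower()
    coded = {}
    for category, codes in codebook.items():
        hits = [code for code, triggers in codes.items()
                if any(trigger in text for trigger in triggers)]
        if hits:
            coded[category] = hits
    return coded

print(code_response(RESPONSE, CODEBOOK))
# {'water_sources': ['well', 'river', 'pump'],
#  'water_issues': ['long wait time', 'long distance', 'access fees', 'bad water quality']}
```

Keyword matching like this breaks down quickly (negation, dialect, synonyms), which is exactly why trained, question-specific analysis models are needed.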
13. What could go wrong with bringing AI to humanitarian emergencies?

“intelligent machines can restrict the choices of individuals and groups, lower living standards, disrupt the organization of labor and the job market, influence politics, clash with fundamental rights, exacerbate social and economic inequalities, and affect ecosystems, the climate and the environment”
— Montreal Declaration 2018
14. Ethical risks of deploying “experimental technologies” in humanitarian response
• False sense of accuracy
• Increased remote management and ‘bunkerization’
• Risks to respondents and field staff due to collected data
• Testing and experimentation with the most vulnerable populations
• Exclusion of people without access to cell phones
• Exploitative data mining
• Technology can’t solve real social problems
• Uncritical involvement of large technology companies
17. International initiatives
• United Nations: UN Convention on Certain Conventional Weapons; International Telecommunication Union
• G7: Charlevoix Common Vision for the Future of Artificial Intelligence
• EU: European AI Alliance
• Nordic-Baltic Region: declaration of collaboration on AI
• UAE and India: Memorandum of Understanding to establish a partnership
• International Study Group of Artificial Intelligence, led by France and Canada
18. Countries with national AI strategies
• Mostly focused on investing in AI
• No national policies or regulations yet
19.
20. Data Protection & Humanitarian Principles
• IHL, Human Rights Law
• 1965 Humanitarian Principles: humanity, impartiality, neutrality, and independence
• 1994 Code of Conduct for the International Red Cross and Red Crescent Movement and NGOs in Disaster Relief
• 1998 Humanitarian Charter: “people are not put at risk as a result of the way that humanitarian actors record and share information”
• Protection Principles: “minimise any negative effects of humanitarian action,” such as rendering “civilians more vulnerable to attack”
• 2014 Core Humanitarian Standard: requires organizations to “safeguard any personal information collected from communities and people affected by crisis that could put them at risk,” and states that “clear and comprehensive policies on data protection” should be “aligned with international standards and local data protection laws”
• 2018 Professional Standards in Protection Work (ICRC)
21. Data protection guidelines
• Mapping and Comparing Responsible Data Approaches (Berens, Mans, Verhulst, 2016)
• The Signal Code: A Human Rights Approach to Information During Crisis (HHI / Signal, 2017)
• Professional Standards for Protection Work, 3rd ed. (ICRC, 2018)
• ICRC Rules on Personal Data Protection (ICRC, 2017)
• Handbook on Data Protection in Humanitarian Action (ICRC, 2017)
• Data preparedness: connecting data, decision-making and humanitarian response (Raymond and Al Achkar, 2016)
• Information and Data Management Training toolkit (Protection Cluster)
• Data Protection Starter Kit (Terre des Hommes / CartONG, 2017)
• Responsible Data Principles (The Engine Room, 2018)
• Big Data for Development and Humanitarian Action: Towards Responsible Governance (UN Global Pulse, 2016)
• Policy on the Protection of Personal Data of Persons of Concern to UNHCR (UNHCR, 2015)
• Conducting Mobile Surveys Responsibly: A Field Book for WFP Staff, May 2017 (WFP, 2017)
• Data Protection, Privacy and Security for Humanitarian & Development Programs (WVI, 2017)
• Compromised Connections: Overcoming Privacy Challenges of the Mobile Internet (Internews, 2016)
• The Humanitarian Metadata Problem: Doing No Harm in the Digital Era (Privacy International and ICRC, 2018)
• ICT4Refugees: A report on the emerging landscape of digital responses to the refugee crisis (Mason and Buchman, 2016)
• Technologies for monitoring in insecure environments (Dette, Steets, Sagmeister, 2016)
• Building data responsibility into humanitarian action (Raymond et al., 2016)
22.
23. A growing list of ethical AI guidelines

General principles:
• 2017 Asilomar Principles
• 2018 Google AI Principles
• 2018 Microsoft: Principles, Policies and Laws for the Responsible Use of AI
• Partnership on AI: Tenets

Detailed guidance:
• Association for Computing Machinery
• Institute of Electrical and Electronics Engineers (IEEE)
• Microsoft
• 2018 Montreal Declaration
26. IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems
Goal: “provide pragmatic and directional insights and recommendations, serving as a key reference for the work of technologists, educators and policymakers in the coming years.”
27. Montreal AI Declaration principles applicable to humanitarian response (examples)

3.1 Personal spaces in which people are not subjected to surveillance or digital evaluation must be protected from the intrusion of AIS and data acquisition and archiving systems (DAAS).
5.1 AIS processes that make decisions affecting a person’s life, quality of life, or reputation must be intelligible to their creators.
5.2 The decisions made by AIS affecting a person’s life, quality of life, or reputation should always be justifiable in a language that is understood by the people who use them or who are subjected to the consequences of their use. […]
6.1 AIS must be designed and trained so as not to create, reinforce, or reproduce discrimination based on — among other things — social, sexual, ethnic, cultural, or religious differences.
9.2 In all areas where a decision that affects a person’s life, quality of life, or reputation must be made, where time and circumstance permit, the final decision must be taken by a human being and that decision should be free and informed.
28. Proposed humanitarian AI impact risk assessment framework

Level I: Little to no impact; errors cause no harm. Example: transcription and classification of survey responses.
Level II: Moderate impact; errors can cause minimal to moderate harm. Example: automatic analysis and routing of feedback/complaints received.
Level III: High impact; errors can cause moderate to high harm. Example: automated decision-making that would decide whether an affected person or their household should receive particular assistance.
Level IV: Very high impact; errors can cause serious to catastrophic harm. Example: donor-level decisions on attributing funding for particular sectors or emergencies.

Inspired by Karlin and Corriveau (2018)
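The four-level framework lends itself to a simple programmatic gate. The sketch below is my own hypothetical encoding of the levels; the rule that levels III and IV require a human decision is an assumption echoing Montreal Declaration principle 9.2, not something the framework itself specifies.

```python
# Hypothetical encoding of the proposed impact risk assessment levels.
from enum import IntEnum

class ImpactLevel(IntEnum):
    """Proposed humanitarian AI impact levels (I-IV) from the slide above."""
    LOW = 1        # I:   errors cause no harm (e.g. transcription)
    MODERATE = 2   # II:  minimal to moderate harm (e.g. routing feedback)
    HIGH = 3       # III: moderate to high harm (e.g. assistance eligibility)
    VERY_HIGH = 4  # IV:  serious to catastrophic harm (e.g. funding decisions)

def requires_human_decision(level):
    # Assumed safeguard: from level III up, a human being takes the final
    # decision (cf. Montreal Declaration 9.2).
    return level >= ImpactLevel.HIGH
```

Using an ordered `IntEnum` keeps the gate trivially auditable: raising or lowering the threshold is a one-line change that can be reviewed like any other policy.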
29. Humanitarian AI ethics principles
• AI is different from other “experimental” technologies. The risks are greater.
• Imagine all the ways things could go wrong.
• Algorithmic decision-making in humanitarian response should require a new ethical assessment framework.
• There are many principles. Apply them. Teach them to your technology partners.
• Follow existing standards and guidance on data protection, data privacy, and the role of private vendors.
30. Our road ahead…
Partnerships, research, ethical safeguards

Pilot:
• Speech recognition for Yemeni Arabic
• Manually transcribe and translate sample responses to serve as text training data
• Collect speech recordings from volunteers as audio training data
• Response classification
• Quality control for quantitative surveys
• Question-specific analysis models
• Validation and iterative improvements

Roll out:
• Release of all language models, classification algorithms, and software API
• Provide documentation on how to create your own classification models
In my presentation, I will first talk about the techniques and technologies used to understand people affected by humanitarian emergencies
Second, I will introduce my ongoing work to demonstrate the feasibility of deploying Natural Language Processing in order to scale up the use of qualitative data in emergencies
Finally, I will go into the ethical implications of this work, and what we are doing to guide our project and other organizations planning to use AI in humanitarian response
Why am I interested in this? Why am I at an ethics conference?
For the last ten years I have helped UN organizations and NGOs move from paper-based to mobile electronic data collection. In CAR, DRC, West Africa during Ebola, Nepal after the earthquake.
I’ve also led representative population surveys with more than 40,000 respondents in conflict and post-conflict zones
Since 2013 I’ve been managing the development of KoBoToolbox, a free and open source data collection tool, which is now used very widely in all humanitarian emergencies.
In my research, I’m now addressing some of the many limitations I have witnessed—both technical and human—that prevent us from better understanding people in complex emergencies.
An effective response to international humanitarian emergencies requires us to understand the needs and preferences of the affected population.
…
This is usually done through quantitative surveys by interviewing a sample of the people affected, in face to face interviews …
https://www.youtube.com/watch?v=2dB034oSs-c
MSF OCB
..but also, increasingly, through remotely conducted phone surveys.
(Phone surveys are increasing due to the fact that more people have access to cell phones and network coverage; but also, because a rising number of attacks on aid workers in conflicts makes it increasingly dangerous for field staff to deploy. )
Quantitative surveys, such as for rapid needs assessments, require local survey staff to translate complex responses into categorical or numeric data.
Time pressure to conduct and analyze interviews usually means that these assessments rarely include any open-ended questions.
This means that while people might give our enumerators a lot of valuable information for a particular question, there is often no space and time for it in our questionnaires.
More profound insights continue to require the use of qualitative methods. But they require a lot of time and resources before analysis is available.
For example, ten unstructured interviews of 30 minutes each may take 40 hours to transcribe, while manual coding—even with the help of software like Nvivo—requires an additional 20 hours.
This makes these methods hard to conduct at scale in a deteriorating complex emergency.
While we all agree on the importance of qualitative methods for understanding affected people, they are often considered a luxury, especially during the initial response phase.
Natural language processing provides radical new opportunities to help us close the wide gap between qualitative and quantitative methods
NLP refers to a group of Artificial Intelligence tasks for processing and analyzing spoken or written language.
Such capabilities are advancing through commercial applications from Google, Amazon, and others, as well as open source alternatives.
For example, the technology to transcribe human speech into English and some other languages is already extremely advanced. But many of the languages spoken by affected people in humanitarian emergencies present few commercial incentives to be added to proprietary tools.
Our goal is to systematically transcribe, translate, and categorize interview responses to improve the efficiency and effectiveness of humanitarian response.
This will be a lot of work:
For our pilot in Yemen, this requires creating an entirely new transcription model for Yemeni Arabic and identifying the best NLP methods to analyze different kinds of responses.
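The overall shape of that pipeline can be sketched as three composed stages. Every function below is a placeholder stub of my own; none of them exists yet, and the real versions would wrap trained speech recognition, translation, and classification models.

```python
# Pipeline-shape sketch only: transcribe -> translate -> categorize.
# All three stages are hypothetical stubs standing in for models the
# project would still have to train (e.g. Yemeni Arabic speech-to-text).

def transcribe(audio_path):
    # Placeholder for a speech recognition model.
    return "transcript in Yemeni Arabic"

def translate(text):
    # Placeholder for machine translation into the analysis language.
    return "transcript in English"

def categorize(text):
    # Placeholder for a question-specific response classifier.
    return ["water_source_primary: well"]

def process_response(audio_path):
    """Run one recorded interview response through all three stages."""
    return categorize(translate(transcribe(audio_path)))
```

Keeping the stages as separate functions matters for the quality-control step described earlier: each intermediate output (transcript, translation) can be sampled and checked by a human before errors propagate downstream.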
The greatest potential benefit of this project is for affected people to be better understood during formal assessments
More qualitative interactions can make organizations more accountable as feedback and complaints are gathered more widely.
Here are some examples of how NLP techniques can be used to parse long form qualitative responses.
The complexity depends on the question being asked and the types of data that need to be coded
The languages we are focusing on include Yemeni Arabic, Rohingya, and Kanuri in northern Nigeria (also spoken in parts of Cameroon, Chad, Niger, and Sudan).
(We are not necessarily talking about small languages: Yemen has 28 million people; Kanuri has about 5 million native speakers.)
Even worse, the vast majority of these languages lack large-scale electronic audio and text databases that can be used to train open source tools as well.
READ quote
The Montreal Declaration was developed jointly with many leading AI researchers.
There has been considerable amount of research into the challenges of deploying new technologies and the use of big data in humanitarian emergencies.
Here is just a short sample from the last three years.
BUT AI IS DIFFERENT: These cautions should still apply, but we are now talking about a completely new set of risks for affected people:
One of the biggest challenges with developing AI is to avoid or minimize biases, which can have grave long-term consequences.
Biases may come from engineers’ own prejudices, but oftentimes it is the training data that leads to the worst biases in automated systems.
Biased decision making in AI is hard to measure and control.
Mitigation is challenging, so the emphasis should be on building systems with the goal of reducing existing biases and inequities.
Some of the consequences of these biases are already playing out in the real world. For example:
Biased algorithms were found to exacerbate racial inequalities in the justice sector in the US
Face recognition quality is nearly perfect… unless you are a woman, and especially if you are a dark-skinned woman.
The fatal Uber crash in Arizona last year shows why this matters: autonomous cars rely on identifying people, and if you have dark skin you may be less likely to be “seen” by the car.
Finnemore and Sikkink have created the very useful concept of the ‘Life cycle’ of norms
So far, we are still at the beginning stage of Norm Emergence
I expect the coming
In the last two years we have seen increasing activity at IOs and between nation states to create common policies.
For example, the ITU is leading efforts to study and manage the impacts of AI.
The UN Convention on Certain Conventional Weapons has a forum to discuss legal issues around the creation of autonomous weapons
European Union: There is a European AI Alliance that has established an overarching approach to AI and an agreement to cooperate among European countries.
Charlevoix Common Vision for the Future of Artificial Intelligence: Leaders of the G7 agreed to a shared set of commitments for AI in Charlevoix, Canada
Nordic-Baltic Region: Ministers from the Nordic-Baltic region issued a declaration of collaboration on AI.
AI Agreement Between UAE and India: The UAE Minister for AI and Invest India signed a Memorandum of Understanding to establish a partnership.
International Study Group of Artificial Intelligence: France and Canada are developing a task force to make recommendations on the scope and implementation of the international study group.
Government policies on artificial intelligence have largely focused on making its development a national priority for economic competitiveness, framing it as a national security issue, and providing increased research funding.
For most countries, there are no policies on AI to date.
Such concerns about the ethical implications of AI tools have only recently been recognized as research priorities
In many emergencies, humanitarian organizations often need to work without explicit laws and policies.
Instead, we rely on a tapestry of ethical codes and principles, which are re-analyzed and expanded whenever new challenges arise.
IHL, the Humanitarian Principles, and the Code of Conduct don't mention rights or responsibilities around information.
The 1998 Charter and Protection Principles mention risks from recording and sharing information for the first time.
The 2014 CHS expresses this as a right and makes organizations responsible for safeguarding personal information.
WHAT are Sphere's int'l standards and local data protection laws referring to?
No specific guidance on data protection / security/ privacy in Sphere. No other sector wide Standards.
Use ICRC Professional Standards in Protection Work as minimum standards?
Many different guidelines developed, many in parallel
…
Handbook on Data Protection in Humanitarian Action
Public concern around AI has quickly expanded from autonomous weapons and existential risks for humanity to wide-ranging implications for society, the future of labor, and so on
This has resulted in many policy initiatives.
There are several general principles
…as well as principles that are backed up by detailed guidance on their implementation.
Asilomar: endorsed by more than 3,800 AI researchers and practitioners
Released on March 25th
Goal: “provide pragmatic and directional insights and recommendations, serving as a key reference for the work of technologists, educators and policymakers in the coming years.”
10 principles, 62 detailed rules
How to safeguard against mistakes or manipulation of automated analysis that could lead to wrong decisions?
I believe, engineers and humanitarian organizations planning to deploy AI in humanitarian response should be keenly aware of the particularly complex ethical playing field.
This means understanding the exact ways in which the technology will be used, the type of training data needed, realistic plans to update the models, and ways to measure machine bias.
Humanitarian organizations should also analyze all the potentially harmful consequences that new AI technologies might bring with them.
Organizations should follow the most stringent guidance documents available, such as the ICRC Data Protection Handbook