Microsoft researchers create AI ethics checklist with ML practitioners from a dozen tech companies

While speaking on a panel recently, Landing AI founder and Google Brain cofounder Andrew Ng described a moment when he read the OECD’s AI ethics principles to an engineer, and the engineer told him the words give no instruction on how he should change how he does his job.

That’s why, Ng said, any code of conduct should be designed by and for ML practitioners. Well, Microsoft Research must’ve heard that, because it recently created an AI ethics checklist together with nearly 50 engineers from a dozen tech companies.

Authors said the checklist is intended to spark conversation and “good tension” within organizations. The list avoids yes or no questions, uses words like “scrutinize,” and asks teams to “define fairness criteria.”

Altogether, contributors to the checklist are working on 37 separate products, services, or consulting engagements in industries like government services, finance, health care, and education. Interview participants were not identified by name, but a large number work in AI subfields like computer vision, natural language processing, and predictive analytics.

Authors hope their work inspires future efforts to co-design guided support for practitioners working to address AI ethics issues. They say there is currently a disconnect between the focus of the AI ethics community today and the needs of ML practitioners.

“Despite their popularity, the abstract nature of AI ethics principles makes them difficult for practitioners to operationalize. As a result, and in spite of even the best intentions, AI ethics principles can fail to achieve their intended goal if they are not accompanied by other mechanisms for ensuring that practitioners make ethical decisions,” a paper describing the checklist reads. “Few of these checklists appear to have been designed with active participation from practitioners. Yet when checklists have been introduced in other domains without involving practitioners in their design and implementation, they have been misused or even ignored.”

Researchers point to examples of misused or ignored checklists in fields like structural engineering, aviation, and medicine. Work shared by Google AI researchers to build ethics into internal corporate projects also draws inspiration from practices in health care and aviation.

The results of the new checklist are detailed in “Co-Designing Checklists to Understand Organizational Challenges and Opportunities around Fairness in AI.” The work was compiled in conjunction with Microsoft’s Aether Working Group on Bias and Fairness. Authors include Microsoft Research staff and Women in Machine Learning cofounders Hanna Wallach and Jennifer Wortman Vaughan, Microsoft Research’s Luke Stark, and Carnegie Mellon University PhD candidate Michael Madaio.

Responding to insistence from practitioners that any checklist must align with existing workflows, the checklist included in the paper is based on six stages of the AI design and deployment lifecycle rather than on a standalone set of ethics principles.

The paper includes the checklist in its entirety.

The work also includes insights from interviews with machine learning practitioners in order to understand how an AI ethics checklist can effectively help them do their jobs and confront challenges they’ve encountered. Many of those interviewed said ethics today is an ad hoc process, and action typically happens when individual engineers raise concerns.

In interviews, ML practitioners said ethics initiatives within their companies were typically carried out on an informal, ad hoc basis by people who felt a desire to make fairness a “personal priority.”

Speaking up to suggest a pause to consider ethics issues sometimes exacted a social cost, because such activity may slow the pace of work and get in the way of a business imperative to meet deadlines.

“We heard that participants wanted to advocate more strongly for AI fairness issues, but were concerned that such advocacy would have adverse impacts on their career advancement,” the report reads. “The disconnect arising from rhetorical support for AI fairness efforts coupled with a lack of organizational incentives that support such efforts is a central challenge for practitioners.”

The co-design process follows an approach used to create a surgical checklist with input from surgeons and nurses, as well as co-design processes for tech design taken from housing, education, and public transit projects.

Practitioners also said building ethics checklists into product development processes requires leadership from the top, and that it can be difficult to bake collection of feedback from communities impacted by AI into the product design process.

To start, the authors created their first-draft checklist by drawing from existing checklists and research. Then they found out what practitioners want and don’t want from AI fairness checklists in 14 exploratory interviews with machine learning practitioners and dozens of 90-minute co-design workshops, using that feedback to add to, edit, or remove items from their checklist.

ML practitioners also reviewed the checklist to match it up with the six stages in the AI development and deployment lifecycle.

Those interviewed said ethics initiatives are typically the result of the work of passionate individual advocates rather than something that comes from the top.

Participants generally said they do not want AI fairness checklists to champion technosolutionism or overprescribe technical solutions to social problems; that solutions must be customizable to adapt to different circumstances; and that they must take into account things like operational processes and company culture.

Although the majority of participants reacted positively to the idea of using AI fairness checklists, some practitioners expressed concern that making ethics into a procedural checklist can lull some into the belief that it’s possible to guarantee fairness or avoid discussion of nuanced, complex subjects.

The Microsoft Research paper won the best paper award from the ACM CHI Conference on Human Factors in Computing Systems. The conference is scheduled to take place next month in Honolulu, Hawaii, and as of a March 6 update from ACM organizers, it is still set to go ahead as planned.

Updated 10:35 a.m. March 11

Correction: The initial version of this story misspelled the name of one of the paper’s authors, Microsoft Research senior principal researcher Jennifer Wortman Vaughan, as Jennifer Worthman Vaughan.
