Momentum is growing to make Montreal a centre for developing ground rules about artificial intelligence. Here’s a whole new glossary that puts words to those big ideas.
Montreal has already positioned itself as a global artificial intelligence hub. Now the push is on to make it the place where the ethical framework for the technology’s responsible development is shaped.
The use of AI is growing rapidly and the technology is becoming more sophisticated. As it is adopted into more and more areas, the risk that AI could cause harm is growing. Some of those risks, experts say, are already here.
The ethical AI movement is being fuelled by both the private sector and people affiliated with universities — although in the world of AI, those two spheres tend to overlap.
It’s partially an effort to encourage the people developing AI systems, and the companies commercializing them, to start thinking about the ethical implications of their work. It’s also a way to get out ahead of government in an effort to shape the regulatory framework that could eventually govern the use of AI.
One of the people behind the responsible AI push is Renjie Butalid, the co-founder of the Montreal AI Ethics Institute.
Launched last year, the institute organizes meet-ups and panel discussions. It’s in the process of launching a fellowship program — the first step in a long-term effort to build a global network for AI ethics. And it has formed partnerships with some of the largest companies in the AI sector as well as educational institutions. That includes Dawson College, where the institute is working with educators to develop an AI curriculum.
“We want to help define humanity’s place in a world that’s driven by algorithms,” Butalid said.
He admits that mission statement is a little grandiose, but AI is expected to have a massive global impact.
“Society as a whole is trying to figure out what this all means,” said Butalid, who is also the associate director of McGill’s Dobson Centre for Entrepreneurship, which helps students start businesses.
To help navigate the growing conversation around AI ethics, a new vocabulary has emerged to discuss the emerging technologies as well as the concerns around them.
G is for artificial general intelligence
Think of a science fiction robot: Data from Star Trek, Bender from Futurama, or even Rosie from the Jetsons. They’re able to learn new tasks, have real conversations and can recognize human emotions. The creators of those characters may not have known it at the time, but they were foreshadowing the arrival of artificial general intelligence.
General AI describes systems that can take on any intellectual task a human can — put it in a new situation and the system would be able to learn how to handle it just like a person would.
It doesn’t just process information; it can think.
The AI systems currently in use have “narrow” or “specialized” artificial intelligence. They can “learn” from new information but only within specific parameters.
For example, an AI system could be built to sort between pictures of dogs and cats. It would be trained by showing it thousands of pictures of dogs and cats. When it got the answer wrong, the system would be adjusted and each time, its answers would become more accurate.
But this system has no idea what a dog is. It’s learned what a dog looks like — and it can sort through pictures faster than a human — but even if that system were placed in the body of a robot, it could never pet a dog, unless it was specifically programmed to.
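The adjust-on-error training loop described above can be sketched in a few lines. This is a toy illustration, not how real image classifiers work: each “picture” here is reduced to two invented numeric features, whereas real systems learn features from raw pixels. The data, feature values and names are all hypothetical.

```python
# Toy stand-in for the dog/cat sorter: each "picture" is two made-up
# numeric features. The point is the training loop: guess, and when
# the guess is wrong, nudge the weights toward the right answer.
dogs = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.95)]   # label +1
cats = [(0.2, 0.1), (0.3, 0.2), (0.1, 0.25)]   # label -1
data = [(x, +1) for x in dogs] + [(x, -1) for x in cats]

w = [0.0, 0.0]   # weights, adjusted each time the system errs
b = 0.0

for _ in range(20):                      # repeated passes over the examples
    for (f1, f2), label in data:
        guess = 1 if w[0] * f1 + w[1] * f2 + b > 0 else -1
        if guess != label:               # wrong answer: adjust, or "tune"
            w[0] += label * f1
            w[1] += label * f2
            b += label

def predict(f1, f2):
    return "dog" if w[0] * f1 + w[1] * f2 + b > 0 else "cat"
```

After enough passes the sorter labels new feature pairs correctly, yet, as the article notes, it still has no idea what a dog is: it has only learned a boundary between two clusters of numbers.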
General AI is still a long way off — if it’s even possible — but there are already fears that people will be put out of work as their jobs are automated. An AI system can process data faster than a human, it never gets tired, and you don’t have to pay it.
But some of those jobs it could replace may be better left to machines.
For example, Facebook is kept largely free of violent imagery and pornography by a mixture of automated systems and “an army of people from the Philippines literally looking at obscene photos day in and day out,” Butalid said. However, looking at those sorts of images over and over leaves many of those workers with mental health issues.
“If you’re able to automate all of that, that leaves humans to do other tasks,” he said. “But there are questions around that. You can’t have AI systems that replace human empathy.”
Ethical concerns and technical issues are bound to collide.
In the first 24 hours after the recent attack on two mosques in Christchurch, New Zealand, a video of the attack — shot by the terrorist himself — was uploaded to Facebook 1.5 million times, according to Facebook. While automated systems prevented 1.2 million of those videos from being seen, that means at least 300,000 copies of the video were available on Facebook for at least some time.
Facebook says that when it first took down the video, it created a sort of digital fingerprint (a process called “hashing”) that would ensure any copies would be automatically blocked.
But people uploading the video altered it just enough so that it could slip through the automated system undetected. Facebook says it found at least 800 “visually distinct” versions of the video.
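Why slightly altered copies slip through can be shown with an exact cryptographic hash. Facebook’s actual media-matching technology is more sophisticated than this (perceptual hashes tolerate some changes), but the sketch below, with invented function names, illustrates the underlying weakness: a byte-level fingerprint changes completely if the file changes at all.

```python
import hashlib

banned_hashes = set()

def fingerprint(data: bytes) -> str:
    # An exact hash: any change to the input, however small,
    # yields a completely different digest.
    return hashlib.sha256(data).hexdigest()

def ban(data: bytes) -> None:
    banned_hashes.add(fingerprint(data))

def is_blocked(data: bytes) -> bool:
    return fingerprint(data) in banned_hashes

original = b"...video bytes..."
ban(original)

print(is_blocked(original))            # True: an exact copy is caught
print(is_blocked(original + b"\x00"))  # False: a tiny alteration evades the filter
```

This is why uploaders who re-encoded or cropped the video could produce hundreds of “visually distinct” versions that the automated system did not recognize.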
Underlying this all is the larger ethical question: where to draw the line between good and bad.
“How do we incentivize AI platforms for good? Well, take a step back, how do you define good? Good for who?” Butalid said. “I think society needs a forum to have those conversations and I think we can help facilitate that.”
Right now, he said, conversations about the right and wrong uses of AI are taking place in corporate boardrooms and universities — places that aren’t accessible to most people.
B is for bias and D is for datasets
Some of the ethical risks we already face are much subtler to spot than video footage of a mass shooting.
Data is key for contemporary AI. Most systems are used to process information in ways that weren’t possible with traditional software. But data is also essential to train AI systems — they “learn” by being shown vast amounts of data.
The digitization of commerce, the popularity of smartphones and the shrinking cost of storing large amounts of data have led many organizations to accumulate massive amounts of information about their customers.
This information is organized into datasets — collections of data. An Excel table, or the results of a survey, are examples of datasets.
A large company with a lot of information about its customers could have many datasets. A social media platform like Facebook might have even more.
It’s hard to make sense of all that information, though, and that’s part of the reason AI has gotten so much attention lately. Vast amounts of data are now available to train AI systems and they in turn can be used to make sense of all that data.
But that can create new problems.
As Abhishek Gupta, the founder of the Montreal AI Ethics Institute, explains, “we might be missing some of the biases that are already embedded in those consumer datasets.”
For example, a system might be designed with the goal of minimizing loan defaults, said Gupta, who also works as an AI ethics researcher at McGill University and a software engineer at Microsoft.
In order to do that, the system could be fed all the historical data a bank has on customers who took out loans.
While a traditional credit-rating system would look at things like an individual’s income and loan history, an AI system could look for patterns in a much broader set of attributes — age or address, for example, Gupta said.
Proponents of this sort of technology say it can make banking more accessible. People who don’t have the credit history to get a loan might be able to qualify because people with a similar profile are statistically more likely to pay it back.
But it’s not necessarily so cut and dried.
The AI system “might start making correlations between some of these attributes that we would normally not think of, like, let’s say, the zip code, and maybe not so much here in Canada, but especially in the United States, what you see is that zip codes are a very strong indication of ethnic backgrounds,” Gupta said. “What it starts to do then is to correlate the ethnic background or the race to the ability to repay loans.”
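The proxy effect Gupta describes can be made concrete with a toy sketch. All the records below are fabricated: the model is never shown the group attribute, only a postal code, yet because the two are correlated in the historical data, its decisions end up sorting applicants by group anyway.

```python
from collections import defaultdict

# Hypothetical historical loan records: (zip_code, group, repaid).
# The "model" never sees the group column -- but in this invented data,
# as with U.S. zip codes, the postal code is a strong proxy for it.
records = [
    ("10001", "A", True),  ("10001", "A", True),  ("10001", "A", True),
    ("10002", "B", False), ("10002", "B", False), ("10002", "B", True),
]

# "Train": estimate a historical repayment rate per zip code.
totals = defaultdict(lambda: [0, 0])
for zip_code, _group, repaid in records:
    totals[zip_code][0] += repaid
    totals[zip_code][1] += 1

def approve(zip_code):
    repaid, n = totals[zip_code]
    return repaid / n >= 0.5   # lend only where past repayment was high

# Every group-A applicant (zip 10001) is approved and every group-B
# applicant (zip 10002) is refused, though group was never an input.
```

Dropping the sensitive attribute from the dataset, in other words, does not remove the bias; the system reconstructs it from whatever correlates with it.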
That type of situation is already happening. In October, a Reuters report revealed that Amazon had stopped using an AI tool to help with hiring after discovering the system was biased against women. Because the company had traditionally received more applications from men — applications that were used to train the system — and hired more men, the system “learned” that the company preferred to hire men.
M is for mathwashing
It’s easy to assume the output from a machine is the result of cold robotic logic, untainted by human bias.
“You start to, in a sense, trust the system more, because you know that it’s working off of a large amount of data, so you would hope that it’s learning things that are more insightful than what a human could have. But the risk there is that some of these biases might creep in,” Gupta said.
Ignoring that risk is called “mathwashing.”
“People assign a higher degree of trust to numerical systems, to machine learning systems, or to any of these systems that are digital or numerical, compared to human-driven systems,” Gupta said. “That notion itself is flawed because where is the data generated from and who’s capturing the data and who’s deciding what to capture in those datasets? It’s humans — and there are inherent biases that get captured in those datasets, and they get propagated into the system.”
One method of “training” an AI system involves showing it inputs that are intended to generate a specific result. If the system doesn’t produce that result, it’s adjusted, or “tuned,” until it does. Then it can be given new information that it’s never seen before.
But it’s a human who’s using AI to test a hypothesis. They are deciding what data to use to train the machine and then interpreting the results.
But not knowing just what factors the machine is “considering” can lead to false conclusions.
For instance, in a study that looked at human trust of machine learning systems, researchers at the University of Washington developed an algorithm that they told study participants could sort between pictures of wolves and huskies.
In reality, the algorithm was a snow detector. When it was demonstrated to study participants, the majority of pictures of wolves it was shown had snow in them, while most of the pictures of huskies did not, so the algorithm’s results appeared to be relatively accurate. More than a third of study participants — graduate students who had taken a machine learning class — concluded that they trusted the system.
Another example is the “dodgy and problematic” use of AI for facial recognition in job interviews, said Gabriella Coleman, a McGill professor who holds the Wolfe Chair in Scientific and Technological Literacy. The technology claims to assess candidates’ facial expressions to detect if they’re being honest and to see if they have the right personality for the job.
“The troubling part is that the science isn’t very good. It’s just like old-school phrenology (a debunked theory linking mental traits to skull shape) in a kind of different package,” she said, describing it as “faux science, wrapped in this veneer of technology.”
And that veneer of technology makes people think it’s accurate, she said.
But the ability of AI technology to find patterns in large amounts of data also raises issues of privacy and anonymity.
For example, Coleman said, one researcher was able to identify software developers from their coding style using AI.
By combining data from multiple sources, it’s possible to learn much more about a person.
“If I buy a dataset that includes your cellphone usage, Uber usage, Loblaws loyalty card, Facebook profile, Google search — and Google search includes not only what you hit search for but what you started writing and decided not to search for, so your very deepest hesitations — and combined that to build a profile on a person, the amount of information you can glean out of that is incredible,” said Marc-Etienne Ouimette, the head of public policy and government relations at Element AI, a Montreal-headquartered AI company.
While many companies only sell anonymized data about their users, or only sell specific types of data about their customers, AI systems already exist that can take those pieces and put them together. For example, it’s possible to buy large datasets that show the credit card purchases made by thousands of people. This credit card data is anonymized — identifying information like consumers’ names and addresses is removed. But when it is used in conjunction with other data, perhaps anonymized information about people’s movements from a mapping application or social media posts, an AI system can be used to de-anonymize an individual.
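The linking step at the heart of de-anonymization is, in its simplest form, just a join between two datasets on shared attributes like time and place. The records and names below are invented for illustration; real re-identification works at far larger scale, but the mechanism is the same.

```python
# "Anonymized" purchase records: no names, but each keeps a time and place.
purchases = [  # (card_id, timestamp, place)
    ("card_7", "2019-03-01T12:03", "cafe_downtown"),
    ("card_9", "2019-03-01T12:05", "grocery_west"),
]

# A second dataset from a different source that does carry identities,
# e.g. location check-ins from a mapping app or social media posts.
checkins = [   # (name, timestamp, place)
    ("Alice", "2019-03-01T12:03", "cafe_downtown"),
    ("Bob",   "2019-03-01T12:05", "grocery_west"),
]

def deanonymize(purchases, checkins):
    identified = {}
    for card, t, place in purchases:
        for name, t2, place2 in checkins:
            if t == t2 and place == place2:  # time+place match re-links the card
                identified[card] = name
    return identified

# deanonymize(purchases, checkins) re-identifies both cardholders.
```

Each dataset was harmless on its own; combined, the “anonymized” cards are tied back to named individuals, which is what Ouimette means by AI being a steroid for an underlying dataset.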
“I often conceptualize AI as a steroid for what you can fundamentally do with an underlying dataset,” Ouimette said.
Much of this is already going on — enabled by the terms and conditions on websites, social media platforms and apps that most people don’t read.
P is for privacy paradox
While there is widespread concern about how large companies are using the data they gather from their users, there’s little sign that people are changing their behaviour to avoid being tracked. That has been described as the “privacy paradox,” Ouimette said.
It’s a term widely used by Internet researchers.
“People express that they’re worried about how their personal information is being used and yet they still sign on to things like Facebook, or applications on their phone that, as part of the terms and conditions, will track them and will then sell that information to third parties,” Ouimette said.
But Ouimette said he doesn’t think that’s a paradox at all.
“I thought that was a disingenuous way of putting it, frankly, because it’s not much of a paradox. The reality is you’re concerned but there’s nothing you can do about it right now, unless you’re just going to choose to be a Luddite,” he said.
Even though consumers click “I agree” when they start using these services, that consent isn’t meaningful.
“If you went about reading all of the terms and conditions of all the websites that you visit, all the applications that you use, that would take on average 244 hours a year to do for an individual,” Ouimette said.
That’s not only unrealistic, if an individual doesn’t agree with part of it, there’s nothing they can do except not use the service.
“It’s a take it or leave it situation,” Ouimette said.
And in a connected, increasingly online world, where many of the largest players have near-monopoly status, it’s almost impossible to say no to many of these services without going completely offline.
T is for data trusts
One solution, which Element AI put forward in a white paper released in mid-March, is the idea of “data trusts.”
Based on the idea of financial trusts, a data trust would be a legal entity intended to steward the data of its members under certain pre-established parameters.
“The point of a trust initially, from our perspective, is to reaffirm that the fact that these companies are collecting your personal information doesn’t mean that they own it,” Ouimette said.
It could also give individuals more negotiating power. One user might not be able to get the terms and conditions of a popular app changed, but a trust representing 500,000 users might have more success, he said.
A data trust would have a constitution — establishing what data would be shared for what purposes — and trustees with a fiduciary duty to the trust’s members.
The idea is that there would be multiple trusts, each with its own constitution, and individuals could use different trusts to share their data differently with different online services.
But trusts have to be real in order to be trusted.
S is for skepticism
In Toronto, where Google affiliate Sidewalk Labs is planning to redevelop a portion of the city’s waterfront and create a data-enabled “smart” neighbourhood, concerns about how data collected by sensors placed in the area would be used — and who would own that data — triggered a widespread backlash against the project.
Sidewalk Labs responded by proposing a data trust, according to the Element AI white paper.
But that did little to assuage public fears because the trust was seen as a company-led initiative.
The citizens providing their data have to be appointing the trustee, not the companies using the data, Ouimette explained.
Coleman isn’t convinced that initiatives aimed at encouraging responsible AI development will work.
“I’m skeptical of a lot of the initiatives,” she said. “The history of technology in the West is: someone will do it.”
While some companies may decide not to develop certain products for ethical reasons, others won’t hesitate, she said, and history shows that without strong regulation, someone will develop technologies that have harmful consequences.
For example, last fall, Google abandoned a bid for a $10-billion contract with the United States military after protests by employees who said the deal would violate the company’s AI ethics principles. However, Microsoft, Amazon, Oracle and Japanese firm REAN Cloud continued to bid. This week, Google announced it will establish an external advisory council to grapple with ethics in AI, on the heels of Facebook’s announcement that it will back the creation of an ethics research centre.
For Coleman, there are two issues at play: technologies that can cause harm because they work, such as programs that can de-anonymize people online; and technologies that cause harm because they don’t, such as a software that claims to be able to identify someone’s sexual orientation based on photos of their face.
Those are not hypotheticals. Technologies that their creators claim can do just that have already been developed.
“It’s horrifying,” Coleman said. “It’s like, wow, I can’t even believe that that’s being developed.”
Part of the solution is education, she said.
“My view is that we need this discussion to fundamentally be a part of computer science education and it’s not,” she said.
“If you’re being trained to create something that then has these ramifications for the world, at least you need to be exposed to what those ramifications might be,” Coleman said. “We have a lot of students being exposed to the technology but they don’t really, really fully understand the ramifications until it’s kind of too late.”
One thing that everyone seems to agree on, though, is that there will have to be some sort of government regulation of AI.
Just what that regulation will look like is still an open question and that’s part of what these responsible AI initiatives are intended to do. Their proponents hope to shape public policy and influence the regulatory frameworks that will be created.