In our latest podcast episode, we learn what researchers are discovering about how biases from human language seep into artificial intelligence.


Podcast Transcript

Heather Taylor

Hi, I'm Heather Taylor from the Simplicity 2.0 podcast. Today we're going to talk about chatbots, but not so much the simple ones many of us know today, which ask you a yes-or-no question ("May I help you today?") and then follow a pre-determined script. Today we're going to look at bots that learn language and, as of this summer, have apparently even invented a language of their own.

This June, Facebook announced that it had shut down an experiment which succeeded in showing that bots could develop their own linguistic system for negotiating transactions. The problem: the researchers never incentivized the bots to ensure that the language was one humans could understand too.

Today, this kind of AI is still a bit of an outlier, but data suggests broader adoption may lie on the horizon. McKinsey forecasts that 29 percent of service sector jobs can eventually be done by bots. And there's a lesson we can draw from Facebook's experiment: if businesses want to deploy this technology, they will need to think carefully about the marching orders they give to bots. And to do that, they need to see language through the eyes of a bot. So, let's dive in.

Simplicity 2.0 is brought to you by Laserfiche, the elite provider of global enterprise content management software, which manages and controls information so you can empower employees to work smarter, faster, and better.

[Music playing]

We're joined today by Zack Lipton, who is finishing up his Ph.D. in the Artificial Intelligence Group at the University of California, San Diego and will be joining the faculty of Carnegie Mellon University this January.

I want to start with the Facebook AI story, which the news has been spinning as a Terminator, rise-of-Skynet-type narrative, when the reality is a little drier. What does it actually tell us about how advanced AI both interprets directions and learns on its own?

Zack Lipton

I don’t really think that the Facebook story tells us a whole lot about AI. It doesn’t really reflect any kind of substantial departure from what we already know about machine learning. I think what it really tells us is something about the interaction between the research community and the news media.

The Facebook chatbot story originated as just a paper that came out of Facebook AI Research. They're looking at chatbots. They're doing interesting research, but I don't think it's necessarily any more groundbreaking than any of the probably 20 other interesting papers on dialog with machine learning that came out this year.

And then the pre-print was posted, and the usual outlets that might be interested in that sort of thing paid attention, like MIT Technology Review. I can't remember exactly whether they in particular wrote about it, but outlets like that. I remember someone who was curious about it even interviewed me and asked some questions about the paper. That was actually why I read it for the first time.

And then, basically, they had some agents that were trained to communicate, and they were given some goal. And communication literally just means choosing among a set of tokens, tokens in this case being words: that choice is the message passed to the next agent, and the next agent has to choose words, one at a time, to send a small message back. And, ultimately, these agents are supposed to be deciding upon some transaction in the setting they're casting it in.

It's a sort of cute toy problem where different agents have different values: can they find a way to send information to each other so they can make a trade?

And people talked about the paper because it was posted by a prominent AI research lab, and then the story kind of died. And then a few months later someone came up with a story that Facebook had to shut down an AI. It was just a preposterous statement. It's almost like every time you turn off your phone you're shutting off 10 or 20 AIs.

It makes this ridiculous assumption that something remarkable is happening. But really, that's what happens every day in every research lab: somebody runs an experiment and then it's done.

Heather Taylor

So, if we look at this situation, we're saying this is pretty much the norm. But how do you think the ability of these chatbots, of this machine learning, will evolve in the short to medium term? We're talking about a quite simple and kind of cute toy problem here, but what's next in the next few years?

Zack Lipton

Well, I'll tell you what the great hope is. Typically, chatbots are broken down into a number of components. Assume you're interacting with someone in text versus speech; if it's speech, obviously, you need some kind of speech-to-text, an automatic speech recognition component.

But even assuming that those are there, so you’re just getting free text and you’re going to communicate back in text, you get some input from the user. And the first thing you do is what’s called natural language understanding. So, this is where you take some kind of raw utterance and you need to convert it into some kind of structured form.

Most dialog systems in the world, things like Siri or Google Now, tend to work with what we call slot filling. Like you might say, hey Siri, I'm looking for a movie in San Diego tonight. What time is Transformers playing, or something? Assuming you have bad taste in movies.

And then what it's going to do is turn this into something structured: there's an act and then a number of slot-value pairs. The act in this case might be request, and the requested field might be time or location. And then you might say movie name equals Transformers, city equals San Diego. So, it turns the utterance into a structured form the system can recognize.

This is obviously a very rigid system, and it only works in a very specific domain like movie booking or travel booking, where you at least have to know what the fields are and have some sense of what possible values they might take.

But it turns out you can do pretty useful things within this. You could say send a message to someone, and it recognizes the request: the act is send-message, the recipient is whoever. So, when you work in these narrow domains you can actually get pretty far just developing a language understanding unit.
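The slot filling Zack describes can be sketched in a few lines. This is a hypothetical toy for a movie domain; the keyword lists, slot names, and the `understand` function are all invented for illustration, with simple string matching standing in for a real natural language understanding model.

```python
# Toy slot filling: map a raw utterance to an act plus slot-value pairs.
KNOWN_MOVIES = {"transformers", "dunkirk"}
KNOWN_CITIES = {"san diego", "new york"}

def understand(utterance):
    """Convert a raw utterance into a structured frame."""
    text = utterance.lower()
    frame = {"act": None, "requested": None, "slots": {}}
    if "what time" in text:
        frame["act"] = "request"
        frame["requested"] = "time"
    for movie in KNOWN_MOVIES:
        if movie in text:
            frame["slots"]["movie_name"] = movie
    for city in KNOWN_CITIES:
        if city in text:
            frame["slots"]["city"] = city
    return frame

print(understand("What time is Transformers playing in San Diego tonight?"))
```

Everything downstream of the utterance operates on this small structured frame, which is why the approach only works when the domain's fields and values are known in advance.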

Let me give you like a bird’s eye view. There’s the language understanding. Then once you have done that language understanding, you need what’s called a policy. And a policy is basically what is the behavior of the dialog system? What is it going to do next? Is it going to trigger some kind of app? Is it going to respond in some way? Is it going to retrieve some information for you?

So, you need some way of mapping from inputs, your observations, which include what the human has told you, onto what your next action should be. In control theory and in reinforcement learning, a policy is formally a mapping from states to actions.

And then once you choose an action, you may have to generate a reply; we call that step natural language generation. And that's the one you can often cheat on pretty effectively just by using some sort of templates.

That might have sounded like a mouthful. But basically, there are two main parts that you can't cheat on. One is doing some amount of natural language understanding, and the other is deriving some kind of policy that says what the behavior of the system should be.
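A minimal sketch of the rest of that pipeline: a hand-written policy (a mapping from simplified dialog states to actions) plus template-based generation. The state keys, action names, and templates here are all hypothetical, and the "state" is reduced to the act and requested field from an understood frame; real policies condition on much richer state.

```python
# A hand-crafted dialog policy: mapping from (act, requested field) to an action.
POLICY = {
    ("request", "time"): "lookup_showtime",
    ("request", "location"): "lookup_theater",
}

# Natural language generation by template: one reply pattern per action.
TEMPLATES = {
    "lookup_showtime": "{movie_name} is playing in {city} at {time}.",
    "lookup_theater": "{movie_name} is playing at {theater}.",
}

def respond(frame, backend_result):
    """Pick an action from the policy, then fill the reply template."""
    action = POLICY[(frame["act"], frame["requested"])]
    values = {**frame["slots"], **backend_result}
    return TEMPLATES[action].format(**values)

frame = {"act": "request", "requested": "time",
         "slots": {"movie_name": "Transformers", "city": "San Diego"}}
print(respond(frame, {"time": "7:30 pm"}))
# Transformers is playing in San Diego at 7:30 pm.
```

The research Zack mentions later is about learning `POLICY` from data rather than hand-writing it; the template trick for generation is the part you can usually get away with.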

Nearly all of the dialog systems that I know of that are actually in production in the world have used machine learning to great effect for natural language understanding. Going from speech to text is a task that has been revolutionized by deep learning, which is really good at this kind of recognition on raw signal data, like the vision systems Facebook uses to recognize faces in photos. This is the kind of thing machine learning does really well.

Heather Taylor


Zack Lipton

Because it's just a very crisp task, right? Here's raw audio; here's the text. You don't have to do any kind of sophisticated planning or reasoning about what might happen in the future.

The trickier task is the policy part. So, we’ve used machine learning really effectively for speech-to-text and for text-to-natural language understanding meaning slot filling, usually.

But the policy part is an area where machine learning hasn’t really broken into like the real commercial applications. And people like the group that I worked with at Microsoft Research Labs for the last year or these folks at Facebook, they’re trying to address the policy part of the puzzle. Can you make systems that are capable of deriving their own policy in deciding what to say?

Heather Taylor

All right. So, if we’re living in this narrow domain, you know, you have this narrow domain that you’re developing – the inputs, the action, creating the policy – these are humans doing this. So, what are researchers discovering about how biases from human language or just human thought patterns seep into this artificial intelligence?

Zack Lipton

I think generally that's a really important question that practitioners sometimes don't stop and think about. You usually have a problem statement, right? You have something you want to predict and you have some data set, like whether people will default on loans. And so, you have a bunch of data. For each applicant, you have a number of what we call features, some attributes.

You could think of a spreadsheet. These would be the columns in the spreadsheet and there would be a row corresponding to each applicant. And then you’re saying you want to predict, like are they going to default?

And there's a lot of places where, if you train a system to make this kind of prediction, you then take these predictions and strap on some kind of decision rule to make decisions based on them. Like if someone is more than a certain amount likely to default on their loan, then deny it, right?

If you take this kind of predictive model and attach some decision theory, there are a lot of ways that human biases or prejudiced attitudes can seep into the model.
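As a concrete (and entirely made-up) illustration of strapping a decision rule onto a predictive model: the scoring function below stands in for a trained classifier, its weights are invented, and the 0.5 threshold is arbitrary.

```python
def predicted_default_probability(applicant):
    """Stand-in for a trained model's output; the weights are invented."""
    score = 0.9 * applicant["debt_ratio"] - 0.2 * applicant["income"] / 100_000
    return min(max(score, 0.0), 1.0)  # clamp to a probability-like range

def decide(applicant, threshold=0.5):
    """The decision rule: deny the loan if predicted risk is too high."""
    return "deny" if predicted_default_probability(applicant) > threshold else "approve"

print(decide({"debt_ratio": 0.8, "income": 40_000}))   # high debt ratio
print(decide({"debt_ratio": 0.1, "income": 90_000}))   # low debt ratio
```

The point is that if the labels the model was trained on carried human bias, that bias flows straight through the prediction into every deny decision.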

One of them is – one of the classical ones is if the thing you’re trying to predict is sort of an annotation that comes from a human and the human has a bias, then your model is basically trying to imitate the biases of the humans, right? There’s a number of other crazy things that can happen. Like people train what’s called unsupervised models.

So, this is not where you have a well-formed prediction task but it’s where you try to say I want to come up with basically some vector, which means like some list of numbers that represent some like real-world symbolic object.

And people have done this recently for words and language. They say I want to come up with a vector that represents each word. And then once you come up with these vectors you can say things like which words are closest to each other? And they find that when they train this on corpora of natural language, taken from humans, I mean I guess that’s our only source of natural language, right?

But when we train this on things like say all the books or all of Wikipedia, we come up with weird patterns like – for example, like traditionally Black names tend to be closer to words associated with crime.

Or you can compute analogies in those spaces. You can say, roughly, what is the direction in this vector space between man and woman? And it tends to correspond to the direction between, say, professor and assistant professor, something like that, which reflects a bias that we know we have in society but would hope not to pass on to the models we build, especially when we anticipate using these models to make impactful decisions.
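The analogy arithmetic Zack mentions is literally vector addition and subtraction plus a nearest-neighbor search. The 2-d vectors below are invented toy numbers purely to show the mechanics; real embeddings such as word2vec or GloVe have hundreds of dimensions learned from corpora, and that learning step is where corpus biases show up.

```python
import math

# Invented 2-d toy vectors; real embeddings are learned from large corpora.
VECS = {
    "man":   [0.9, 0.1],
    "woman": [0.1, 0.9],
    "king":  [0.95, 0.2],
    "queen": [0.15, 0.95],
}

def cosine(a, b):
    """Cosine similarity: how closely two word vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# "man is to king as woman is to ?": compute king - man + woman,
# then find the nearest remaining word by cosine similarity.
target = [k - m + w for k, m, w in zip(VECS["king"], VECS["man"], VECS["woman"])]
best = max((word for word in VECS if word != "king"),
           key=lambda word: cosine(VECS[word], target))
print(best)  # queen, with these toy vectors
```

Swap in vectors trained on real text and the same arithmetic surfaces the professor/assistant-professor pattern Zack describes, which is exactly why it is used as a diagnostic for embedding bias.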

Heather Taylor

So, okay. With all of this, the idea that there could be bias, there's the question of how it's advancing and how it can still be very useful. Let's say you're an executive, which I think is going to be happening a lot more and become a bigger consideration, and you have your eye on chatbots.

And so, what steps can you take to ensure that the bots speak with your brand voice and meet your service objectives, and maybe eliminate some of the bias that could be a potential downfall of using them?

Zack Lipton

I guess I'd say that with machine learning systems, there are a number of different paradigms for how you can train an algorithm to do something. But they all involve being able to explicitly give it an objective, at least all the systems that we actually use.

And so, for some things it's really easy to say what the objective is, right? Like we have some raw audio and we have the transcription for it. You could get 20 people in a room and they're all going to agree about it, and we can collect large amounts of this data.
But we can only get the algorithm to do things where we know how to communicate to it whether or not it's done them correctly. That's very hard for some of these things, like does the chatbot speak with your brand voice? These aren't the most crisply formed machine learning problems. And my guess is that, at least on the one- to five-year time horizon of where we are in chatbot research, people are working on more advanced things that are categorically different from, say, what Siri has done.

They're trying to address more basic questions: not is it speaking with our brand voice, but can we get it to perform some functional job at all?

Heather Taylor


Zack Lipton

And so, my guess is that for that class of user, they're not going to be looking at the bleeding edge of machine learning and trying to use reinforcement learning, as I was describing before, to learn the policy part of the picture.

They’re probably going to try to get their hands on really good speech recognition, really good natural language understanding. And then they’re going to probably have domain experts who are very meticulously handcrafting the policies that these – that their chatbots execute.

I think, in the very short run, if you really want that kind of control, then you're probably limited to that sphere.

Heather Taylor

Perfect. So, I'm going to ask you one last question, on this same tangent of companies and executives bringing this into their businesses. As they seriously start to integrate bots into their business strategy, how do you think it changes the day-to-day work for both, say, the CEO at the top and the employees who will have to work with these bots or use them in their day-to-day work?

I’m wondering if CEOs need to think differently when setting direction for both humans and machines, and whether employees will need to learn new skills? What do you think about that in terms of the future of work and bots?

Zack Lipton

I don’t know if you remember like ten years ago when Twitter was kind of a new thing.

Heather Taylor

I do, in fact, remember that time.

Zack Lipton

And we were younger people. So, do you remember this weird feeling that companies were aggressively hiring directors of social media strategy? I was just getting out of college, I guess, ten years ago, so it was my first taste of the workforce. And I remember seeing that people were carving out careers for the first time in social media strategy and these kinds of things.

But there was this weird feeling of a mismatch: the frothiness of excitement about social media didn't necessarily correspond to people actually having any coherent ideas about what they were going to do with it.

Heather Taylor

Yes, I really remember that. There’s people I’m like wait, how do you have that job? What? You just tweet a lot?

Zack Lipton

Yeah, and I remember there were crazy things that would happen. Companies had the sense that it was important, but they didn't know precisely how. So, I remember I used to tweet at companies, and some companies, because they were aggressive about social media, were insanely responsive.

Like I once purchased a tea at Whole Foods and then I just left it at the counter and forgot about it. And I went uptown in New York. So, I tweeted Whole Foods, just on a lark, and said hey, I left my tea at the counter at the Whole Foods in Union Square. How are you going to make me whole? And a human actually responded to me and took personal care to make sure that I got my tea.

Heather Taylor


Zack Lipton

And I think maybe some of that turned out to be a good strategy for customer service, but I think to some extent, like people were just flailing for a while until they figured out what was actually a reasonable way that this could or couldn’t be useful in their business.

And so, I don't mean to totally throw cold water all over the entire enterprise. I'm actually quite excited about what you could do with dialog agents. I would even go as far as saying that I think it's one of the next really interesting research frontiers where we can make an impact. But seeing where the research is and what companies are talking about in their marketing materials, I think there are pervasive misunderstandings about what's possible right now.

And I think even that very idea, as you brought it up, right, that the CEO needs a bot strategy, if you’re at a bot company – like if you’re the VP at Google in charge of Android or something or you’re at Amazon in charge of Alexa or you’re at Apple in charge of Siri or something, you need to be thinking really seriously about bots.

But I think in most industries right now, most CEOs probably don't need to think that hard about a bot strategy; I don't see it as so much more pressing than one or two years ago. And I think you're going to see an initial wave of sort of naïve people building things that don't work, and maybe some other naïve people buying them and getting chastened.

And I think ultimately there’s going to be a huge amount of research progress in this direction and we will build real things that people interact with. But I think maybe some of the exuberance of corporate leadership right now is a bit misplaced.

Heather Taylor

Oh, fantastic. This was an interesting and very informative conversation. So, I’d like to thank you, Zack, for coming to speak with us today.

So, don’t forget to add Simplicity 2.0 to your favorite RSS feed or iTunes. Thanks to Laserfiche for sponsoring today’s episode. Learn more about Laserfiche at or follow on Twitter @Laserfiche. Until next time, this is Heather Taylor for Simplicity 2.0.
