Data-Driven Sales: Building AI that searches, learns, and sells

Anand Kulkarni, Chief Scientist, Co-Founder

LeadGenius

An Audacious Claim

In ten years, the job of salespeople will be replaced by artificial intelligence.

For those of us who aren’t in sales…

What do salespeople do all day?

They find people/companies who might buy something

old school new school

They analyze which companies want to buy what they’re selling

They engage those prospects in commercial conversations (“selling”)

What do salespeople do all day?

Salespeople Find Companies: The Search Problem

Salespeople Analyze Companies: The Intent Problem

Salespeople Talk to People: The Sales Problem

AI that Finds Companies: The Search Problem

AI that Understands Buying Behavior: The Intent Problem

AI that Talks to People: The Email Turing Test

Three Problems of Interest

Let’s talk about each of these problems in turn.

AI that Searches for Customers: The Search Problem

The Company Search Problem

At LeadGenius, we want to figure out every single company in the world who might buy somebody’s product.

We’ll start by solving the slightly more general problem of finding every company in the United States.

After that, we’ll talk about how to decide which ones of those companies want to buy something.

Grabbing data about companies

We crawled data from fifty-five sources, including:
• Social Media
• Online Directories
• Secretary of State Listings
• SEC filings
• IRS nonprofit database

What a company looks like

A problem: how do we tell if two companies might be the same?

Unrelated companies have very, very similar names.

Companies change names. A lot.

Entity resolution: The Fancy Way

A company p is a vector of ~30 properties that we know about it (name, address, revenue, industry, founding year, technologies used, …).

Two companies are the same if distance(p1, p2) < ε.

The distance between two companies reflects the probability that they are the same: the smaller the distance, the more likely they match.
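
The threshold rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Company` record keeps three of the ~30 properties, and the per-property distances, the weights, and the `EPSILON` value are all made up for the example, not taken from the LeadGenius system.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Company:
    # Hypothetical subset of the ~30 properties mentioned above.
    name: str
    address: str
    founding_year: int


def text_distance(a: str, b: str) -> float:
    """0.0 for identical strings, approaching 1.0 for very different ones."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance(p1: Company, p2: Company) -> float:
    """Weighted sum of per-property distances (weights are assumptions)."""
    year_gap = min(abs(p1.founding_year - p2.founding_year), 10) / 10
    return (
        0.5 * text_distance(p1.name, p2.name)
        + 0.3 * text_distance(p1.address, p2.address)
        + 0.2 * year_gap
    )


EPSILON = 0.15  # illustrative threshold, not a tuned value


def same_company(p1: Company, p2: Company) -> bool:
    """Apply the rule: same company if distance(p1, p2) < epsilon."""
    return distance(p1, p2) < EPSILON
```

Every candidate pair has to go through `distance`, which is where the cost discussed next comes from.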

This works, but…

Super slow!

Requires us to do pairwise comparisons… potentially across a huge number of data points and data sources.

Sometimes data falls out of date.

Quiz: What’s an easier way to solve this?

Let’s find a set of properties that are less likely to change often.

Entity Resolution: The Easy Way

Two companies are the same if and only if they have the same “official” physical address.
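
A minimal sketch of the address rule, assuming records arrive as hypothetical `(name, address)` pairs and that lowercasing plus stripping punctuation is enough normalization (real address matching would need more than this). The sample records are invented.

```python
from collections import defaultdict


def address_key(address: str) -> str:
    """Normalize case, punctuation, and whitespace (assumed rules)."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in address.lower()
    )
    return " ".join(cleaned.split())


def resolve_by_address(records):
    """Group (name, address) records that share one normalized address."""
    groups = defaultdict(list)
    for name, address in records:
        groups[address_key(address)].append(name)
    return dict(groups)


# Invented sample data for illustration only.
records = [
    ("Acme Corp", "1 Main St., Springfield"),
    ("ACME Corporation", "1 main st springfield"),
    ("Zebra Logistics", "900 Ocean Ave, Miami"),
]
groups = resolve_by_address(records)
```

Each record is touched once and dropped into a bucket, so no pairwise comparison is needed.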

So… how many businesses are in the US?

21,708,021 US businesses

6,049,655 US businesses have >1 person

• Yelp (~47M establishments, some of which are the same company)
• LinkedIn (~2M unique companies)
• CrunchBase (~650K unique companies)
• AngelList (~289K unique companies)

Some queries we can answer

• Which U.S. industries have the most distinct organizations listed in LinkedIn?

Industry                              Count
Construction                          157533
Real Estate                           114366
Information Technology and Services   113292
Hospital & Health Care                99552
Marketing and Advertising             87820

• Q: How many Fortune 500 companies have websites?
• A: 499!

Bonus Problems

• How long is information trustworthy after we retrieve it? (decay functions)

• What’s the optimal frequency to retrieve information? (expectation-maximization)

• How do we nab information from sites that don’t have cleanly-structured schemas? (watch humans do it)
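
For the first bonus problem, one possible decay function is exponential decay with a half-life. This is an assumption for illustration: the slides do not say which decay function is used, and the function names and the 90-day half-life in the usage note are hypothetical.

```python
import math


def trust(age_days: float, half_life_days: float) -> float:
    """Confidence that a fact retrieved age_days ago is still true."""
    return 0.5 ** (age_days / half_life_days)


def days_until_refresh(min_trust: float, half_life_days: float) -> float:
    """Age at which trust falls to min_trust, i.e. when to crawl again."""
    return half_life_days * math.log(min_trust, 0.5)
```

With a 90-day half-life, `trust(90, 90)` is 0.5, and a field that must stay above 25% trust would be refreshed after roughly 180 days. This simple rule only touches the first bullet; it is not the expectation-maximization approach named in the second.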

The Problem

Given a set of companies that have bought something from us in the past… which companies are interested in buying from us in the future?

This is a very hard problem.

Non-generalizable: Whether someone’s buying something depends heavily on the specific industry.

Time-dependent: Whether some company needs a product is always changing.

The Conventional Approach: Machine Learning

From our previous step, we already have a whole set of companies represented as mathematical vectors.

We just need to train up a solid classifier to separate which ones are going to buy from us and which ones aren’t.

How much data do we need?

- confidential -

How it Works

• We train a neural net by showing it a whole bunch (more than 10,000) of labeled examples of companies who have bought our products in the past.

How it Works

• Our system learns a function that separates the objects in space.

How it Works

• For new objects, our classifier can decide which type it is!
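
The train-then-classify loop can be sketched with a much smaller model. To keep the example dependency-free, it uses logistic regression (effectively a single-neuron network) trained by per-example gradient descent, not the neural net described in the slides. The two features and the six training rows are invented, and a real training set would be far larger.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, epochs=2000, lr=0.5):
    """Fit weights and a bias by per-example gradient descent on log loss."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of log loss with respect to the logit
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict_proba(model, x):
    """Probability that company vector x belongs to the 'bought' class."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Made-up features: [employees / 1000, uses_crm (0 or 1)]; label 1 = bought.
X = [[0.9, 1], [0.7, 1], [0.8, 1], [0.1, 0], [0.2, 0], [0.05, 0]]
y = [1, 1, 1, 0, 0, 0]
model = train(X, y)
```

Calling `predict_proba(model, [0.85, 1])` then gives a probability for a company the model has not seen, which is the quantity the next slides threshold on.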

There are some good ways we can use them here, too!

companies matching an ICP

companies not matching an ICP

ICP - “Ideal Customer Profile”

A better strategy: Human Computation

• Pull a probabilistic estimate from our classifier on whether a company is in-market for a product or not.

• If the probability is low – below 80% – we escalate it to a trained person in a 500-person crowd who can make a human-powered determination on whether the company is going to buy or not. They can even add a feature.

• After we make that call, add that data to the training set to make the classifier smarter

• Boosts likelihood of success to human levels… depending on the human.
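
A sketch of the escalation loop. The 80% cutoff comes from the bullets above; everything else is an assumption: `classifier` and `ask_human` are hypothetical callables, and "low probability" is read here as low confidence in either direction, so a company the classifier is confident will not buy is not escalated.

```python
CONFIDENCE_THRESHOLD = 0.80  # the 80% cutoff from the text


def decide(company, classifier, ask_human, training_set):
    """Return True if the company is judged to be in-market.

    classifier(company) -> probability of being in-market, 0..1
    ask_human(company)  -> bool, a stand-in for the crowd worker's call
    """
    p = classifier(company)
    confidence = max(p, 1.0 - p)  # assumption: uncertainty either way escalates
    if confidence >= CONFIDENCE_THRESHOLD:
        return p >= 0.5
    label = ask_human(company)
    # Feed the human label back so the classifier can be retrained on it.
    training_set.append((company, label))
    return label
```

Only the uncertain cases cost human time, and each one adds a labeled example to the training set.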

An even better strategy: monitor behaviors; cluster companies by stage

• Behaviors: For each one of the companies we’ve crawled, see whether they’ve:
- Viewed a website that sells that technology
- Opened a new job position relevant to the product
- Started hiring
- Tweeted about the problem
- Replied positively to an email someone sent

• Stage: Companies need different items at different stages in their lives:
- Figure out where they are in a “lifecycle”
- Match them against similar companies
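
One simple way to turn the behaviors into a ranking is a weighted count of the signals seen per company. This is a sketch only: the signal names mirror the list above, but the weights are made up and the slides do not describe how the signals are actually combined. The lifecycle clustering is not covered here.

```python
# Signal names mirror the behaviors listed above; the weights are made up.
SIGNAL_WEIGHTS = {
    "viewed_vendor_site": 1.0,
    "opened_relevant_job": 2.0,
    "started_hiring": 1.0,
    "tweeted_about_problem": 1.5,
    "replied_positively": 3.0,
}


def intent_score(observed: set) -> float:
    """Sum the weights of the behaviors observed for one company."""
    unknown = observed.difference(SIGNAL_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    return sum(SIGNAL_WEIGHTS[s] for s in observed)


def rank_by_intent(companies: dict) -> list:
    """Return company names ordered from strongest to weakest intent."""
    return sorted(
        companies, key=lambda name: intent_score(companies[name]), reverse=True
    )
```

Here `companies` is assumed to map a company name to the set of signals observed for it.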

AI that Talks to People: The Email Turing Test

How do we write (sales) messages by email that are as good as those sent by a human being?

Is this even plausible?

Recent results from deep learning say yes.

Xiaoice (Microsoft China), trained on real-world chatbot logs

Support chats trained on data (DeepMind, Google)

Example: Automated Support Interactions

Describe your problem: browser keeps crashing

Machine: hi
Human: hi
Machine: could you please let me know what is the operating system you are using ?
Human: linux
Machine: ok this is the issue with the system performance , nasser , tools .
Human: what is that ?
Machine: is it a windows machine ?
Machine: we have linux on your machine .
Machine: is it windows xp or windows 7 ?
Human: i am using linux
Machine: let me take a remote of your machine and check if issue has been pushed properly .
Human: ok
Machine: please open terminal and run the below commands in one computer :
Machine: sudo apt - get update

1) Writing a Message
2) Understanding Responses

Problem

Requirements

Given a topic, an author, an inbox, and a recipient, compose a message that:
1) has a maximum likelihood of yielding a positive reply
2) looks indistinguishable from an email that the author sent

This is a hard problem… the Turing Test.
It’s been around since 1950.
Restricting it to sales and email might make it easier.

Secret Weapons

• We can choose to ask a human being from our crowd of trained folks for help.
• We can mine the inbox for whatever examples we need or want.

Strategy

1) Craft a generalized template by analyzing the sender’s email inbox
2) Collect data at scale to populate that message
3) Change content based on what you discover about that person
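
Step 2 of the strategy, populating a template with collected data, can be sketched with Python's standard `string.Template`. The template text and field names here are hypothetical and written by hand; in the strategy above the template would be derived from the sender's inbox.

```python
from string import Template

# Hypothetical template; the field names are assumptions for this example.
TEMPLATE = Template(
    "Hi $first_name,\n\n"
    "I saw $company was $trigger. $pitch\n\n"
    "Cheers,\n$sender"
)


def compose(prospect: dict, sender: str, pitch: str) -> str:
    """Fill the template; raises KeyError if a required field is missing."""
    return TEMPLATE.substitute(prospect, sender=sender, pitch=pitch)
```

Using `substitute` instead of `safe_substitute` is deliberate: a message with an unfilled placeholder fails loudly and never goes out.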

Writing Messages

Going further…
How likely is someone to reply to us based on…
- Length?
- Tone?
- Subject complexity?
- Word choice?

Let’s show this to the user and then optimize based on that.

How likely is someone to open this email?

Predicting responses from length

How likely is someone to open this email?

Predicting responses from templatization

Humans in the “crowd” can radically improve our templates automatically

Optimizing Templates

“Wish”, AAAI Human Computation 2014

What did someone say about our email?

Understanding responses

The hard way: sentiment analysis

Positive sentiment corpus Negative sentiment corpus

Twitter as a Corpus for Sentiment Analysis and Opinion Mining (2011)

Question: What’s the easy way?

The easy way: human computation

Scripting responses

From: anand@leadgenius.com
To: sarah@hotlead.com
Subj: Quick Question, Sarah

Hi Sarah,

I saw you guys were hiring for SDRs. We know each other through Michael James and I wanted to see if we might be able to help you scale your SDR team. I have a few extra SDRs we can push your way.

Let me know if you’d like to chat further – we’re doing this for SoldLead8 already. BTW, congrats on your recent round!

Cheers!

Interested?

Here’s 3 times that work for

Here’s more information!

Check back later.

Specific question

Automatically schedule a follow-up mail
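
Once a person has categorized a reply, choosing the scripted next step is a table lookup. The category labels, action names, and the pairing between them are assumptions made for this sketch. In particular, sending a specific question (or anything unrecognized) to a human is a guess, not something the slide states.

```python
# Made-up labels for the reply types and scripted steps shown above.
SCRIPT = {
    "interested": "propose_meeting_times",
    "wants_info": "send_more_information",
    "check_back_later": "schedule_follow_up",
    "specific_question": "escalate_to_human",
}


def next_action(category: str) -> str:
    """Look up the scripted step for a reply a human has already categorized."""
    return SCRIPT.get(category, "escalate_to_human")
```

Chaining these lookups, one per incoming reply, is what turns individual scripted responses into a conversation.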

Scripting responses into a conversation

AI that Finds Companies: The Search Problem

AI that Talks to People: The Email Turing Test

Conclusions

• Company search can be attacked with large-scale crawling, human computation, entity resolution, and careful data updates

• Buying intent can be deduced automatically based on classifiers but is done better with human computation

• Email communication is complex, has a lot of interesting subproblems, and is solvable!

anand@leadgenius.com
@polybot, @leadgenius

www.leadgenius.com

(We’re hiring!)

That’s it!
