Audience Expansion For Online Social Network Advertising
Haishan Liu, David Pardoe, Kun Liu
LinkedIn Corporation, 605 W. Maude Avenue, Sunnyvale, CA 94085
haliu@linkedin.com, dpardoe@linkedin.com, kliu@linkedin.com
ABSTRACT
Online social network advertising platforms, such as that
provided by LinkedIn, generally allow marketers to specify
targeting options so that their ads appear to a desired demographic. Audience Expansion is a technique developed
at LinkedIn to simplify targeting and identify new audiences with similar attributes to the original target audience. We developed two methods to achieve Audience Expansion: campaign-agnostic expansion and campaign-aware
expansion. In this paper, we describe the details of these
methods, present in-depth analysis of their trade-offs, and
demonstrate a hybrid strategy that possesses the combined
strength of both methods. Through large-scale online experiments, we show the effectiveness of the proposed approach
and, as a result, the benefits it brings to the whole marketplace, including both LinkedIn and advertisers. The achieved
benefits can be characterized as: 1) a simplified targeting process and increased reach for advertisers, and 2) better utilization of LinkedIn's ads inventory and higher and more
efficient market participation.
1. INTRODUCTION
which has become the center of gravity for users' daily traffic. Text or display ads appear in ad slots available on many
other parts of the website, such as the top of the page or
right column. For both channels, advertisers are able to target specific audiences they want to reach. Unlike the well
known sponsored search model for advertising on search engines, where advertisers target a list of keywords that they
deem relevant to their campaign, advertisers on social networks are given comprehensive demographic targeting options to precisely define the desired audience. For example,
on LinkedIn an advertiser can reach all software engineers
having the Machine Learning and Java skills who work in a
company located in the US with fewer than 500 employees.
However, even with this pinpoint targeting ability, a similar
challenge to the coverage problem in search advertising remains, as advertisers usually cannot cover all relevant demographic attributes related to the desired product or service.
Indeed, the cardinality of most targeting attributes is well
beyond tens of thousands (e.g., titles and skills), and some
even rise into multiple millions (e.g., company and group).
This makes it costly for advertisers to identify the attributes
they wish to target with their campaigns.
To mitigate this problem, we have developed a system,
named Audience Expansion, to help advertisers on LinkedIn
increase the reach of their campaigns. This is achieved by
automatically enlarging the original target audience (the exact audience) to include similar, like-minded users. For example, if a campaign targets members with the skill Online Advertising, the campaign might also be expanded to
members who list the skill Interactive Marketing on their
profiles. This means advertisers can reach the desired target
audience with less effort setting up campaigns.
Some advertising platforms have provided similar functionality; however, most commonly the expansion is conversion-oriented. That is, advertisers must provide a list of known
users who have prior positive feedback, and this list is then
expanded. This requirement limits the usability of the feature to big, savvy advertisers who can afford the effort of
continuous marketing research and monitoring. Instead, we
decided to make LinkedIn's Audience Expansion feature extremely accessible: advertisers opt in by simply checking an option
when they specify the targeting criteria.
Audience Expansion aims to optimize for the joint benefit of advertisers and LinkedIn with minimal impact on the
user experience. It benefits advertisers and LinkedIn in the
following ways:
Advertisers by finding, with only limited advertiser
effort, a large audience segment that is likely to engage
2. RELATED WORK
Previous work on audience expansion for online advertising has typically fallen into one of two categories: broad
match in keyword advertising, and look-alike modeling in
user-targeted advertising. Keyword advertising includes sponsored search advertising, in which ads are placed alongside
search results for a relevant query, and contextual display advertising, in which advertisements are placed on a webpage
with relevant content. In each case, an advertiser targets
its ads by specifying a set of keywords for which it would
like to bid. If a keyword occurs in the appropriate context
(the search query or the page content), the advertiser then
enters an auction to show its ad. The keyword must match
exactly, or possibly with minor modification (e.g., different
spelling or verb tense). As a result, it can be difficult for an
advertiser to enumerate the entire set of keywords for which
it would like to bid. In response, many advertising platforms offer broad match, in which each keyword provided
by the advertiser is expanded to a larger set of keywords.
For example, the keyword "bike repair" might expand to related keywords such as "bicycle repair" and "where to fix my
bike."
There are several documented implementations of broad
match. Often, these involve doing offline processing and
storing the result in a lookup table to be used online. Broder
et al. [4] address the issue of broad matches for uncommon
tail queries that are not present in the table. A system
of extracting features from a query is presented; features
for common queries are extracted and stored in an inverted
index, and at run time the common queries most similar to
a given query can be identified using this index. Gupta et
al. [9] present a system that performs supervised learning on
past ad impressions to learn click probabilities for potential
broad matches. The online learning approach described uses
a form of max-margin voted perceptron with time decay to
rapidly adjust to changes in user and advertiser behavior.
While broad match helps advertisers with the problem of
choosing keywords, it does introduce new challenges relating to bid optimization and mechanism design. Even-Dar
et al. [8] explore the problem of optimal sponsored search
bidding in the presence of broad match, where the bid for
a single keyword may be applied to many queries of varying value to the advertiser. Amaldoss et al. [1] perform a
game-theoretic analysis of broad match for search using a
model that incorporates advertisers' bidding cost (i.e., effort expended choosing bids) and identifies the conditions
under which advertisers and search engines benefit. Dhangwatnotai [7] studies advertiser welfare under the generalized
second price (GSP) auction when broad matches are introduced, while Chen et al. [6] describe a probabilistic mechanism for multi-slot auctions with broad matches that generates larger welfare than the standard GSP.
In user-targeted advertising, advertisers target specific users
based on properties such as demographics or interests. These
properties may be explicitly specified by the user, or they
may be inferred from the user's behavior (this is referred
to as behavioral targeting [12][15]). As with keyword advertising, enumerating all possible target segments can be
difficult for an advertiser. In addition, ideal target segments
may contain too few users, while other segments may be too
broad. In response, some advertising platforms offer look-alike modeling, which identifies users similar to a given user
set. Look-alike modeling can also be used in cases where an
advertiser can provide a precise list of users to target, such
as in retargeting of website visitors.
Approaches to look-alike modeling include k-means clustering [13] and frequent pattern mining [3]. Mangalampalli
et al. [11] show that associative classification (a rule-based
form of frequent pattern mining) can be more effective than
other common classifiers when training look-alike models for
campaigns with few conversions. Bagherjeiran et al. [2] consider the problem of look-alike modeling when both a targeting segment and advertiser conversion data are available.
Here expansion is posed as an ensemble learning problem
where the goal is to complement the existing segment while
minimizing overlap.
Another approach that is related to look-alike modeling
is collaborative filtering. A key challenge in applying collaborative filtering to advertising is the extreme sparsity of
interaction between users and campaigns. Kanagal et al. [10]
address this challenge by using a product taxonomy to identify relationships between campaigns.
3. SYSTEM
3.1 Overview
LinkedIn offers multiple ad formats, including Text Ads
that may appear at the top or side of the page, and Sponsored Updates that appear as native content in a user's feed.
As ad targeting works the same regardless of format, our
Audience Expansion system does not make a distinction between formats. For each format there may be multiple ad
slots per page request; for instance, as a user scrolls through
their feed they may see several Sponsored Updates.
When an advertiser creates a campaign, they specify the
ad format, the ad content (also known as the creative), a
daily and/or lifetime budget, a bid, and the targeting to use.
Bids may either be per thousand impressions (cost per mille
or CPM) or per click (CPC). To specify targeting, the advertiser is given choices within a number of categories, such as
location, age, company name, and skills. Within each category, the advertiser is presented with a set of standardized
choices, and can select options to include and exclude. The
included selections in each category are ORed together, and
everything is then ANDed together, producing a targeting
string that represents a logical formula in conjunctive normal form. For example, a targeting string might be:
(location == USA OR location == Canada) AND
(location != California) AND (age == 18-24 OR
age == 25-34) AND (seniority != unpaid) AND
(seniority != training)
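To make the semantics of such a targeting string concrete, the following is a minimal sketch of how a member profile could be checked against it. The data layout (a dictionary of include and exclude sets per category, and a member carrying several location values such as country and state) is an assumption made for illustration, not a description of LinkedIn's serving system.

```python
from typing import Dict, Set

# Hypothetical layout: one clause per targeting category.
# "include" values are ORed together; every "exclude" value is a negated clause.
Clause = Dict[str, Set[str]]


def matches(profile: Dict[str, Set[str]], targeting: Dict[str, Clause]) -> bool:
    """Return True if the profile satisfies every clause (logical AND)."""
    for category, clause in targeting.items():
        values = profile.get(category, set())
        include = clause.get("include", set())
        exclude = clause.get("exclude", set())
        # OR over the included selections of this category.
        if include and not (values & include):
            return False
        # Each excluded selection must be absent.
        if values & exclude:
            return False
    return True


# The example targeting string from the text, in the assumed layout.
targeting = {
    "location": {"include": {"USA", "Canada"}, "exclude": {"California"}},
    "age": {"include": {"18-24", "25-34"}},
    "seniority": {"exclude": {"unpaid", "training"}},
}

member_a = {"location": {"USA", "New York"}, "age": {"25-34"}, "seniority": {"senior"}}
member_b = {"location": {"USA", "California"}, "age": {"25-34"}, "seniority": {"senior"}}

print(matches(member_a, targeting))
print(matches(member_b, targeting))
```

Under these assumptions the first member is targeted and the second is rejected by the California exclusion.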
Once the targeting has been specified, the estimated size
of the audience and a suggested bid are shown to assist the
advertiser. They are also given an option to enable Audience
Expansion, and may change this setting at any time.
The ads serving flow starts when a member visits a LinkedIn
webpage with available ad slots. Together with the page
view, an ad request is issued to the backend. The member's profile attributes are fetched, and then matched with
the targeting criteria of active ad campaigns to find those
that target this member. The matched campaigns then compete in a generalized second price auction, where their predicted CTR and bid jointly determine a rank order, and each
campaign's cost is determined by the next-ranked campaign.
The winning creatives are sent to the frontend to serve. This
workflow is illustrated in the online processes (colored in
dark grey) in Figure 1.
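The auction step can be sketched as follows. This is a textbook generalized second price rule under stated assumptions, not the production logic: the rank score is taken to be bid times predicted CTR, all bids are per click, and each winner pays the smallest price per click that keeps it ahead of the next-ranked campaign. The names `Candidate`, `run_gsp`, and `reserve` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    campaign_id: str
    cpc_bid: float  # bid per click
    pctr: float     # predicted click-through rate, assumed > 0


def run_gsp(candidates: List[Candidate], slots: int,
            reserve: float = 0.0) -> List[Tuple[str, float]]:
    """Rank by bid * pCTR; charge each winner based on the next-ranked score."""
    ranked = sorted(candidates, key=lambda c: c.cpc_bid * c.pctr, reverse=True)
    results = []
    for i, cand in enumerate(ranked[:slots]):
        if i + 1 < len(ranked):
            nxt = ranked[i + 1]
            # Lowest per-click price that still matches the next rank score.
            price = nxt.cpc_bid * nxt.pctr / cand.pctr
        else:
            # No competitor below: fall back to the (assumed) reserve price.
            price = reserve
        results.append((cand.campaign_id, price))
    return results


auction = run_gsp(
    [Candidate("A", 4.0, 0.02), Candidate("B", 5.0, 0.01), Candidate("C", 2.0, 0.02)],
    slots=2,
)
for campaign_id, price in auction:
    print(campaign_id, round(price, 4))
```

In this toy auction campaign A wins the first slot despite a lower bid than B, because its predicted CTR is higher.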
4. MODELING
We now present details of the Similar-X models with their
application in both the campaign-agnostic expansion and
the campaign-aware expansion. We discuss important factors that need to be considered to bring together the various
model components in the practical system.
$$\frac{V(f_s) \cdot V(f_t)}{|V(f_s)|\,|V(f_t)|},$$
and only if both $f_s$ and $f_t$ are of the same term types, i.e.:
$$E = \{(f_s, f_t) : (f_s, f_t) \in T \times T \;\lor\; (f_s, f_t) \in I \times I\}.$$
We use $\mathbf{s} = \{s(f_s, f_t) : (f_s, f_t) \in E\}$ to denote the field-based
similarities when $s(\cdot)$ is applied to all edges in $E$.
Given the field similarities, it is natural to characterize
the final entity similarity so that it matches the following
intuitions: 1) two entities are similar if there are a large
number of similar fields; and 2) different fields contribute
differently to the final entity similarity. It is then natural to
define the entity similarity as a weighted linear combination:
$$S(s, t) = \mathbf{w}^{T}\mathbf{s}. \qquad (1)$$
The coefficients w can be learned from the historical user-entity interaction log. For example, we take pairs of companies that have been historically co-targeted frequently in
ads as positive examples, and companies frequently ignored
when recommended to advertisers for inclusion given their
existing company targetings as negative examples. We then
fit a logistic regression model with elastic net regularization
to the training data.
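The scoring side of Equation 1 can be illustrated with a small sketch. It assumes each entity is a dictionary of fields, each field a sparse term-weight vector, and that field similarity is cosine similarity; the field names and the hand-set weights below are placeholders standing in for coefficients that would be learned by the regularized logistic regression described above.

```python
import math
from typing import Dict, Tuple

Vector = Dict[str, float]  # sparse vector: term -> weight


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def entity_similarity(src: Dict[str, Vector], tgt: Dict[str, Vector],
                      weights: Dict[Tuple[str, str], float]) -> float:
    """Weighted linear combination of field similarities over the given edges."""
    return sum(
        w * cosine(src.get(fs, {}), tgt.get(ft, {}))
        for (fs, ft), w in weights.items()
    )


# Hypothetical entities and weights, for illustration only.
company_s = {"skills": {"java": 1.0, "ml": 1.0}, "industry": {"internet": 1.0}}
company_t = {"skills": {"java": 1.0}, "industry": {"internet": 1.0}}
weights = {("skills", "skills"): 0.6, ("industry", "industry"): 0.4}

print(round(entity_similarity(company_s, company_t, weights), 4))
```

Only field pairs listed in `weights` contribute, which mirrors restricting the computation to the edge set E.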
4.1.3 Personalization
One potential downside of the naive application of Similar-X attribute expansion is the lack of personalization. For
example, if company A is deemed to be similar to company
B from Similar-Companies, then A is added to the enriched
profile of users who work at B, effectively making everyone from B eligible to see ads targeted at A, which may
not be optimal for either users or advertisers. Therefore,
we introduce a personalization scheme to rerank each potentially expandable entity with regard to a given user. We
achieve the personalization by employing a learned propensity model to score user-entity pairs. For each user and each
entity type $x \in X$, we select the top $k_x$ results from the
available Similar-X results.
Taking companies as an example, to build the user-company
propensity model we first extract features for users and companies in the same way as described in Section 4.1.1. We
gather training examples from historical user-company interactions; for example, a user following a company is a
positive example and a user unfollowing a company is a negative example. A logistic
regression model is then trained from these examples.
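The reranking step can be sketched as below. The feature construction (products of matching user and entity features) and the coefficient values are assumptions for illustration; the text only states that a logistic regression propensity model scores user-entity pairs and that the top k_x candidates are kept per entity type.

```python
import math
from typing import Dict, List

Features = Dict[str, float]


def propensity(user: Features, entity: Features,
               coef: Features, bias: float = 0.0) -> float:
    """Logistic score over assumed user-entity interaction features."""
    z = bias + sum(
        c * user.get(name, 0.0) * entity.get(name, 0.0)
        for name, c in coef.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def personalize(user: Features, candidates: Dict[str, Features],
                coef: Features, k: int) -> List[str]:
    """Keep the k candidate entities with the highest propensity for this user."""
    ranked = sorted(
        candidates,
        key=lambda entity_id: propensity(user, candidates[entity_id], coef),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical user, Similar-Companies candidates, and coefficients.
user = {"ml": 1.0, "sales": 0.0}
similar_companies = {
    "CompanyA": {"ml": 1.0},
    "CompanyB": {"sales": 1.0},
    "CompanyC": {"ml": 0.5},
}
coef = {"ml": 2.0, "sales": 2.0}

print(personalize(user, similar_companies, coef, k=2))
```

Two users at the same company can therefore receive different expanded attributes, which is the point of the personalization step.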
4.1.4 Similar-Profiles
Similar-Profiles is the pinnacle of the Similar-X family of
algorithms, with a problem size the square of the number of
LinkedIn users (more than 400 million and growing).
Given the large size of the problem, we employ a Locality Sensitive Hashing (LSH) technique, named Arcos [5],
to assist in finding members with high cosine similarity.
Each member is mapped to one of $2^n$ clusters, where $n$
is chosen to make our nearest neighbor search manageable.
This cluster is built into the member index; this speeds up
the subsequent nearest neighbor search because we can restrict our search to members in the same cluster. A member's cluster is specified by $n$ bits, where each bit is determined by the output of a particular hash function. To
obtain each hash function we first choose a random vector $r \in \mathbb{R}^{|F|}$ with each component drawn from a Gaussian
distribution $N(0, 1)$. The hash function corresponding to
the vector $r$ is defined as $h_r(u) = \mathrm{sign}(r \cdot u)$, which effectively partitions the space into two half-spaces by a ran-
Type            Field                  Term Values
n-gram/phrase   headline               Internet, professional, Social Network
                description            connection, productive, Talent Solution
standardized    industry               Internet
                type                   public company
                company size           5001-10,000 employees
derived         skills                 Software Engineering, Management, Marketing
                interests              professional identity, jobs, software development
proximities     view-browsemap         Facebook, Twitter, Pinterest
                occupation-browsemap   Google, Yahoo, Facebook
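The sign-based hashing described for Similar-Profiles can be sketched as follows. This is a generic random-projection LSH illustration, not the Arcos implementation cited in the text; the function names, the dense-vector representation, and the fixed seed are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def make_hash_vectors(n_bits: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Draw n_bits random vectors with N(0, 1) components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def cluster_id(u: Sequence[float], hash_vectors: List[List[float]]) -> int:
    """Pack the sign bits of r . u into an integer in [0, 2**n_bits)."""
    bits = 0
    for r in hash_vectors:
        dot = sum(ri * ui for ri, ui in zip(r, u))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def build_index(members: Dict[str, Sequence[float]],
                hash_vectors: List[List[float]]) -> Dict[int, List[str]]:
    """Group member ids by cluster so neighbor search stays within a cluster."""
    index: Dict[int, List[str]] = defaultdict(list)
    for member_id, vec in members.items():
        index[cluster_id(vec, hash_vectors)].append(member_id)
    return index


hash_vectors = make_hash_vectors(n_bits=4, dim=3, seed=42)
members = {
    "m1": [0.2, -1.0, 0.5],
    "m2": [0.4, -2.0, 1.0],   # same direction as m1, so the same cluster
    "m3": [-0.2, 1.0, -0.5],  # opposite direction
}
index = build_index(members, hash_vectors)
print(dict(index))
```

Because only the sign of each projection matters, vectors pointing in the same direction always share a cluster, and vectors with high cosine similarity share one with high probability.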
$$F(m, c) = \frac{\sum_{m' \in T(c)} S(m, m')}{\sqrt{|T(c)|}}, \qquad (2)$$
where $T(c)$ denotes the original targeted audience for the
campaign $c$, and $S(m, m')$ is the Similar-Profiles score (see
Equation 1). The intuition for this equation is that the
more similar members there are in the targeted audience to
an untargeted member, the more fit the untargeted member
is to be included in the expanded audience. To make the
fitness comparable across campaigns we have to account for
the individual campaign characteristics. The numerator in
the summation is positively correlated with the size of the
targeted audience, hence we normalize away the size in the
denominator. We empirically find that adding a square root
damping to make the normalization penalize large audiences
less works the best in practice.
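As an illustration of Equation 2, the sketch below computes the fitness of an untargeted member for a campaign. The similarity function is passed in as a callable and the lookup table of scores is made up for the example; how Similar-Profiles scores are stored and retrieved in practice is not specified here.

```python
import math
from typing import Callable, Hashable, Iterable


def fitness(member: Hashable, targeted: Iterable[Hashable],
            sim: Callable[[Hashable, Hashable], float]) -> float:
    """Sum of similarities to the targeted audience, damped by sqrt(|T(c)|)."""
    audience = list(targeted)
    if not audience:
        return 0.0
    total = sum(sim(member, other) for other in audience)
    return total / math.sqrt(len(audience))


# Hypothetical Similar-Profiles scores between an untargeted member "u"
# and four members of a campaign's exact audience.
scores = {("u", "t1"): 0.5, ("u", "t2"): 0.5, ("u", "t3"): 0.5, ("u", "t4"): 0.5}


def lookup(a: str, b: str) -> float:
    return scores.get((a, b), 0.0)


print(fitness("u", ["t1", "t2", "t3", "t4"], lookup))
```

Dividing by the square root rather than by the audience size itself means a member who is similar to many targeted members of a large campaign is penalized less than under a plain average.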
5. EXPERIMENTS
Component          Algorithm               Parameter              Note
campaign-agnostic  personalized Similar-X  {k_x : x ∈ X}          top-k_x elements to include from each Similar-X
campaign-aware     Similar-Profiles        k_p                    top-k_p elements to include from Similar-Profiles
                                           campaign selection     threshold size of campaign exact audience; ratio of paced requests
                                           post-expansion filter  percentage of most expensive members to remove
$$\frac{2|C|}{|M|(|M| - 1)},$$
where $|C|$ is the number of connections, and $|M|$ is the number of members in the audience.
In this experiment, we randomly sampled 200 campaigns
with campaign-aware expansion enabled. We are interested
in two quantities for each campaign: first, the expansion
ratio, i.e., the size of the expanded audience over that of the
exact audience; second, the density ratio, i.e., the density of
the expanded audience over that of the exact audience.
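The two quantities can be computed as in the following sketch, which treats connections as undirected pairs and counts only those with both endpoints inside the audience. The toy audiences and connection list are invented for illustration.

```python
from typing import Iterable, Set, Tuple


def density(members: Set[str], connections: Iterable[Tuple[str, str]]) -> float:
    """2|C| / (|M| (|M| - 1)), with C restricted to pairs inside the audience."""
    n = len(members)
    if n < 2:
        return 0.0
    inside = {
        frozenset((a, b))
        for a, b in connections
        if a != b and a in members and b in members
    }
    return 2 * len(inside) / (n * (n - 1))


connections = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
exact_audience = {"a", "b", "c"}
expanded_audience = {"a", "b", "c", "d"}

expansion_ratio = len(expanded_audience) / len(exact_audience)
density_ratio = density(expanded_audience, connections) / density(exact_audience, connections)
print(round(expansion_ratio, 3), round(density_ratio, 3))
```

A density ratio near 1 means the expanded audience is about as tightly connected as the exact audience; a ratio well below 1 indicates that expansion pulled in members who are less connected to each other.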
For comparison we also carry out a baseline test by expanding campaigns to a random subset of all members who
have actively interacted with ads in a one-month window. We
call this method Active Clickers-based expansion. We make
sure for each campaign that the size of the expanded audience generated by Active Clickers (AC-audience hereafter
for simplicity) is roughly the same as that by Similar-Profiles
(SP-audience).
To learn the distributions of density ratios for these two
methods and how they vary with regard to the expansion
ratio, we plot the density ratio against the expansion ratio
in Figure 2. From the figure we can see that the SP-audience
is on average as dense as the corresponding exact audience
[Figure 2: density ratio plotted against expansion ratio for Similar-Profiles and Active Clickers.]
Treatment          Impressions  Reaches  Matches  Revenue  Value    CTR     CPC     Dwell Time
campaign-agnostic  +1.31%       +1.35%   +1.27%   +3.08%   +3.09%   +1.07%  +0.67%  +1.30%
campaign-aware     +9.76%       +9.78%   +11.54%  +15.49%  +13.98%  -1.15%  +6.45%  +0.36%
hybrid             +10.36%      +10.40%  +12.86%  +17.47%  +15.84%  -0.44%  +6.92%  +0.11%
Table 3: Relative changes in metrics compared to control. Statistically significant differences are in bold.
Treatment          Impressions  Reaches  Revenue   Value     Adj. CTR  Adj. CPC
campaign-agnostic  +7.67%       +7.77%   +9.33%    +9.72%    -0.79%    +0.46%
campaign-aware     +93.85%      +94.29%  +105.97%  +100.80%  -8.56%    +6.00%
hybrid             +96.97%      +97.47%  +111.60%  +106.00%  -8.08%    +6.86%
Table 4: Relative changes in metrics when only considering impressions won by expansion-enabled campaigns.
more competitive auctions, and not simply because expansion causes advertisers to target more expensive users.
In addition to looking at global metrics, we can specifically
examine how campaigns that enable expansion are affected
by computing the same metrics using only the impressions
of these campaigns. Table 4 shows the relative changes in
metrics (compared to the control) when considering only the
impressions of campaigns that did not disable expansion at
any time during the two weeks of the experiment. Again,
statistically significant changes are shown in bold. Here we
use the Mantel-Haenszel adjustment for CTR, CPC, and
dwell time metrics, using campaigns as strata. These metrics can differ greatly from one campaign to another, and
this adjustment corrects for the different distributions over
campaigns in each treatment to give a clearer picture of the
change a typical campaign would see. Statistical significance
for these metrics is also measured with the appropriate stratified test (Mantel-Haenszel for CTR and stratified Wilcoxon
for CPC and dwell time). We omit the matches metric as it
is less meaningful when considering a subset of impressions.
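To show why stratifying by campaign matters, the sketch below computes a Mantel-Haenszel ratio of click-through rates with campaigns as strata. The text does not give the exact estimator used, so the standard Mantel-Haenszel risk ratio is assumed here, and the click and impression counts are invented.

```python
from typing import Iterable, Tuple

# One stratum per campaign:
# (treatment clicks, treatment impressions, control clicks, control impressions)
Stratum = Tuple[float, float, float, float]


def mh_ctr_ratio(strata: Iterable[Stratum]) -> float:
    """Mantel-Haenszel estimate of the treatment/control CTR ratio."""
    num = 0.0
    den = 0.0
    for t_clicks, t_imps, c_clicks, c_imps in strata:
        total = t_imps + c_imps
        if total == 0:
            continue
        num += t_clicks * c_imps / total
        den += c_clicks * t_imps / total
    if den == 0:
        raise ValueError("no control clicks in any stratum")
    return num / den


def pooled_ctr_ratio(strata: Iterable[Stratum]) -> float:
    """Unadjusted ratio obtained by pooling all campaigns together."""
    rows = list(strata)
    t_ctr = sum(r[0] for r in rows) / sum(r[1] for r in rows)
    c_ctr = sum(r[2] for r in rows) / sum(r[3] for r in rows)
    return t_ctr / c_ctr


# Each campaign has the same CTR in both arms (3% and 1%), but the
# low-CTR campaign receives far more impressions under treatment.
campaigns = [(30, 1000, 30, 1000), (10, 1000, 1, 100)]

print(round(pooled_ctr_ratio(campaigns), 3))
print(round(mh_ctr_ratio(campaigns), 3))
```

In this toy data the pooled ratio suggests a large CTR drop even though no individual campaign changed, while the stratified estimate reports no change.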
From Table 4 we can see that impressions for expansion-enabled campaigns nearly double under the campaign-aware
or hybrid treatment, with similarly large increases for reaches,
revenue, and value. As before, the change in value roughly
keeps up with the change in revenue. Dwell time again shows
no statistically significant change. As discussed previously,
the increase in CPC for expansion essentially matches the
global increase due to increased competition. The relative
drop in CTR for the campaign-aware and hybrid treatments
is around 8%. Although we would ideally not see any difference in CTR, this drop is not overly large, as users reached
through expanded targeting are still clicking on ads at a reasonable rate. Still, improving CTR is the one obvious area
for improvement, and we will later present some ideas as
part of our future work. It is interesting to note that CTR
remains nearly unchanged globally; the explanation is that
while a user shown an impression due to expansion may be
somewhat less likely to click than the campaign's originally targeted users, the user is not less likely to click on this impression than on the impression that would otherwise have
been shown without expansion.
Overall, our live traffic experiment confirms that our approaches to audience expansion (particularly the campaign-aware and hybrid approaches) are able to greatly increase
the reach of campaigns while maintaining reasonable costs
and user engagement levels.
Encouraged by the result, we gradually ramped the Audience Expansion system to full enablement, with the hybrid approach being the dominant strategy. The impact of
the system on the marketplace has been significant, as can
be seen from the change over time in desktop coverage in
Figure 3. Coverage is defined as the number of delivered
impressions as a percentage of the number of available impressions, which measures how effectively the ads inventory is
utilized. It is evident that the coverage increases in response
to the Audience Expansion ramp-up.
[Figure 3: desktop coverage over time during the Audience Expansion ramp-up.]
[Figure: plot with axis label "CTR Ratio"; caption and data not recoverable.]
6.
7.
REFERENCES