A Technical Introduction to RTBkit
DESCRIPTION
Datacratic is the leader in real-time machine learning and decisioning and the creator of the RTBkit Open-Source Project. Mark Weiss, head of client solutions at Datacratic, shares some of the challenges companies and developers face today as they move into Real Time Bidding. In this presentation he does a developer deep dive into design and implementation choices, technologies and plugins, and provides some real-world RTB customer use cases. You will also learn how you can join the RTBkit community and get support for your upcoming RTBkit initiatives.
TRANSCRIPT
A Technical Introduction to RTBkit
Open-Source RTB for Everyone
Overview
● The Project
● RTB Competitive Landscape
● The Problems With RTB
○ System
○ Selection
○ Value
● How RTBkit Addresses the Problems with RTB
● Demo
A Little Bit About RTBkit
A Little History
● Created by machine-learning and digital marketing company Datacratic
● Code base evolved from running RTB in production from 2011-2013
● Open sourced in Feb. 2013, with ongoing support from Datacratic
● Apache-style governance started Jan. 2014
Participation and Governance
● Apache-style governance
○ BDNFL - Benevolent Dictator Not for Life
○ Councillors
○ Committers
● Outside contributions welcome
● GitHub pull request workflow -- committers review and merge
● Contributor guidelines
● Users can become Contributors
● Contributors can become Committers -- currently two outside Committers
Support, Community, Adoption
● Free support from the community and Datacratic
● Community support from 100s of users in 25+ countries
● Datacratic provides engineering support for development, code review, governance and evolution
● Participation and contributions from Rubicon Project
● 230 active developers
● 35 committers, 11 outside of Datacratic
● 10 installations in prod: N. America, Germany, France, Russia, Argentina, China
Development Support
● Getting Started Guide
● Working test system:
○ mock Exchange configurable to run any bid requests
○ mock Ad Server
○ fixed-price Bidding Agent
● Example Code
● Documentation
● Packaging script and weekly tagged packages for download
● Ubuntu AMI (ami-31acd858)
● Google Group support
● Pull request review and support
User Profiles - Reason for Adopting
Data from ongoing survey, 50 responses
User Profiles - Expected Spend
Data from ongoing survey, 50 responses
User Profiles - Type of Inventory
Data from ongoing survey, 50 responses
User Profiles - Geographic Targets
Data from ongoing survey, 50 responses
The Problems With RTB
[Diagram: SYSTEM, SELECTION, VALUE on a spectrum from General/Technical (Provided by RTBkit) to Specific/Business (Customized by User)]
RTBkit Solves the RTB System Problem
[Diagram: SYSTEM (Scale, Speed, Distribution, Reliability), SELECTION, VALUE on a spectrum from General/Technical (Provided by RTBkit) to Specific/Business (Customized by User)]
RTBkit Addresses the RTB Selection Problem
[Diagram: SYSTEM (Scale, Speed, Distribution, Reliability), SELECTION (Show user an ad? What ad?), VALUE on a spectrum from General/Technical (Provided by RTBkit) to Specific/Business (Customized by User)]
RTBkit Addresses the RTB Value Problem
[Diagram: SYSTEM (Scale, Speed, Distribution, Reliability), SELECTION (Show user an ad? What ad?), VALUE (What is it worth? What should I pay?) on a spectrum from General/Technical (Provided by RTBkit) to Specific/Business (Customized by User)]
RTB Competitive Landscape
Exchange / DSP UI
● Pros: Easy to get started
● Cons: Manual, hard to scale; lack of control over bidding strategy and data
● Degree of Difficulty: Low
Intermediate Hosted Bidding
● Pros: More control over bidding strategy and use of data; don't have to do Ops
● Cons: Strategy and use of data mediated by vendor and product features
● Degree of Difficulty: Medium
Roll Your Own Bidder
● Pros: Full control of all aspects of the system
● Cons: Solely responsible for everything
● Degree of Difficulty: Hardest
RTBkit
● Pros: Benefit from core problems being solved; benefit from community; flexible customization
● Cons: Full control (optionally) but requires digging in; responsible for ops
● Degree of Difficulty: Hard
How RTBkit Addresses the System Problem
Architectural Overview
RTBkit Core
● Router
● Banker
● Post Auction Service
● Service Monitor
● Agent Configuration Service
Plugins
● Exchange Connectors
● AdServer Connector
● Bidding Agent
● Augmenter
● Logger
Bidder Core Responsibilities
● Core working bidder system
● High-performance real-time components
● Multiple data center support
● Reliable global banker updated once per second with guarantees against overspend
● Strongly typed currency support
● Guaranteed response time to exchanges
● Automatic load shedding
● Flexible high-performance filtering of bid requests
● High-performance parsing, routing, filtering, logging and monitoring
Inventory Integration
● Ships with 9 Exchange Connectors:
○ Rubicon Project
○ AdX
○ FBX
○ AppNexus
○ Nexage
○ MoPub
○ GumGum
○ BidSwitch
● OpenRTB 2.1 support
Router Responsibilities
● Gets bid requests from Exchange Connector
● Uses Filters to filter eligible campaigns
● Passes bid requests through Augmenter
● Passes bid requests to Bidding Agents to generate bid responses
● Communicates with Banker to guarantee no overspend
● Guarantees timely response
● Only runs if system components are available
Router Components
[Diagram labels: RTBkit, Router, Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]
Router Data Flows
● Controls the amount of data flowing through
● Dynamically directs the Exchange Connector to shed load to guarantee timely response
[Diagram labels: RTBkit, Router, Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]
Bid Request Lifecycle
[Diagram labels: RTBkit, Router, Exchange, Exchange Connector, Static Filters, Dynamic Filters, Augmenter, Post Auction Service, Ad Server, Conversion Source, Bidding Agents, Ad Server Connector]
RTBkit Data Flows
● Five asynchronous data flows pass through the Router:
○ Bid request processing
○ Banking updates
○ Event Matching
○ Notifying Bidding Agents of Events
○ Filtering and Bidding Agent configuration
[Diagram labels: RTBkit, Router, Exchange, Exchange Connector, Static Filters, Augmentation Loop, Dynamic Filters, Auction Loop, Slave Banker, Master Banker, Bidding Agents, Augmenter, Post Auction Service, Agent Config]
Ad Server and Conversion Integration
● Standard HTTP JSON connector for receiving Wins, Clicks and Conversions
● Event matching of Wins to bid response
● Event matching of Clicks and Conversions to Wins
● Logging of all campaign events and matched campaign events
Post Auction Service
● Clearinghouse for matching all bids to Wins, Clicks and Conversions
● Router sends Bid Request messages
● Ad Server Connector sends Wins, Clicks and Conversions
● Matched Clicks and Conversions similarly generate Matched messages
● Bids not matched to a Win within a 15-minute window are treated as Inferred Losses
● Match events sent to Logger and to Bidding Agents
● Shadow account spend bookkeeping
● Current bottleneck, can process events in the hundreds / sec
● Recently improved: sharded hash tables, one thread per core
● More improvements on the near roadmap
Post Auction Service
[Diagram labels: Router, Bids, Ad Server Connector, Events, Shadow Account, Bidding Agents, Logger, Matched Events, Wins and Inferred Losses]
Banker Responsibilities
● Single source of truth for budget available for each Campaign and for each Account
● Authorizes spending of Campaign budget by Bidding Agents for a Campaign
● Enforces that each Budget has one Account owner
● Caps per-Campaign and per-Account spending
● Guarantees won't overspend if wins are cheaper than bids
● Insulates banker state from "shadow account" bookkeeping in Router and Post Auction Service
Banker Design
● Totals always go up -- you can always reason about the relative timing of entries
● Double-entry bookkeeping
● Multiple increasing Currency Pools
● Atomic, idempotent persistence
● Designed for high-latency, low-bandwidth unreliable connections
● Updates global state once per minute
Banker Account Types
Budget Account
● All budget for Account tree set in Master Banker Budget Account
● Cannot bid from this account
● Cannot track spend directly
● Can transfer budget into child Spend Accounts
● Can have child Spend Accounts
● Only exists in the Master Banker
Spend Account
● Must have a parent Budget Account
● Can bid from this account
● Can track spend directly
● Cannot have children
● Can be shadowed into a separate process
[Diagram labels: Master Banker, Budget Account, Spend Account, Spend Account]
Account Hierarchies and Spend Tracking
Banker - Account Hierarchies
● Spend accumulates from Children to Parent Budget Account in Master Banker
● Temporary bookkeeping for bids happens in shadow accounts in separate processes
● Shadows sync once per second
○ Router shadow - tracks budget committed to pending bids
○ PAS shadow - tracks budget to debit on Wins and to credit on Losses
● Natural partitioning will allow for sharding
Spend Concepts
● Budget: amount allowed to spend
● Spent: amount actually spent
● inFlight: amount in live bids
● Allocated: amount allocated to sub-accounts
● Adjustments: sum of adjustments
[Diagram labels: Master Banker (Budget A, Spend A:B, Spend A:C), Router Slave Banker (Shadow Spend A:B, Shadow Spend A:C), PAS Slave Banker (Shadow Spend A:C, Shadow Spend A:B)]
Banker Currency Pools
● Currency Pools store entries as 64-bit integers
● Multiple Currency Pools per Account
● Each Account hierarchy mutated by a single process
● Strongly typed Currency, won't allow cross-currency conversions
● Automatic scaling conversions (e.g. CPM to micro-dollars)
● Debit and Credit Pools
● Credit operations -> increase to a credit Currency Pool in a hierarchy
● Debit operations -> increase to a debit Currency Pool in a hierarchy
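As a rough illustration of amounts stored as 64-bit integers, a scaling conversion from CPM to micro-dollars, and a refusal to mix currencies, consider the sketch below. The names `Currency`, `Amount`, `perImpressionFromCpm` and `add` are hypothetical and are not RTBkit's real currency types; the currency check here also happens at run time for brevity, whereas a strongly typed design could reject mixing at compile time.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical names for illustration only.
enum class Currency { USD, EUR };

// An amount stored as a 64-bit integer count of millionths of one currency unit.
struct Amount {
    Currency currency;
    std::int64_t micros;
};

// Scaling conversion: a CPM price (price per 1000 impressions), given in micros,
// becomes the per-impression price in micros.
inline Amount perImpressionFromCpm(Currency currency, std::int64_t cpmMicros) {
    assert(cpmMicros >= 0);
    return Amount{currency, cpmMicros / 1000};
}

// Addition is only defined within one currency; mixing currencies is an error.
inline Amount add(const Amount& a, const Amount& b) {
    if (a.currency != b.currency)
        throw std::invalid_argument("cross-currency arithmetic is not allowed");
    return Amount{a.currency, a.micros + b.micros};
}
```

For example, a 2.50 USD CPM bid is 2,500,000 micro-dollars per thousand impressions, which this sketch converts to 2,500 micro-dollars per impression.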
Account Currency Pools
Credit | Debit
budgetIncreases | budgetDecreases
allocatedIn | allocatedOut
recycledIn | recycledOut
commitmentsRetired | commitmentsMade
adjustmentsIn | adjustmentsOut
(none) | spent
Banker Account Tree Currency Pools
Name | Formula | Description
tree.budget | budgetIncreases - budgetDecreases | Tree max spendable amount
tree.inFlight | sum(commitmentsMade) - sum(commitmentsRetired) | Total outstanding bids
tree.spent | sum(spent) | Total spent
tree.adjustments | sum(adjustmentsIn) - sum(adjustmentsOut) | Total adjustments to spent
tree.effectiveBudget | sum(budgetIncreases) - sum(budgetDecreases) + sum(recycledIn) - sum(recycledOut) + sum(allocatedIn) - sum(allocatedOut) | Max spendable amount according to current internal state
tree.adjustedSpent | tree.spent - tree.adjustments | Total spent after adjustments
tree.available | tree.effectiveBudget - tree.adjustedSpent - tree.inFlight | Tree remaining spendable amount
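The derived quantities in this table translate directly into arithmetic over the pool totals. The sketch below assumes a hypothetical `TreeTotals` struct whose fields already hold each pool summed over the account tree, in micro-dollars; it is not RTBkit's actual Account class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct: each field is that pool summed over the account tree.
struct TreeTotals {
    std::int64_t budgetIncreases = 0, budgetDecreases = 0;
    std::int64_t recycledIn = 0, recycledOut = 0;
    std::int64_t allocatedIn = 0, allocatedOut = 0;
    std::int64_t commitmentsMade = 0, commitmentsRetired = 0;
    std::int64_t adjustmentsIn = 0, adjustmentsOut = 0;
    std::int64_t spent = 0;
};

// tree.inFlight: total outstanding bids.
inline std::int64_t inFlight(const TreeTotals& t) {
    assert(t.commitmentsMade >= t.commitmentsRetired);
    return t.commitmentsMade - t.commitmentsRetired;
}

// tree.adjustments: total adjustments to spent.
inline std::int64_t adjustments(const TreeTotals& t) {
    return t.adjustmentsIn - t.adjustmentsOut;
}

// tree.effectiveBudget: max spendable amount according to current state.
inline std::int64_t effectiveBudget(const TreeTotals& t) {
    return t.budgetIncreases - t.budgetDecreases
         + t.recycledIn - t.recycledOut
         + t.allocatedIn - t.allocatedOut;
}

// tree.adjustedSpent: total spent after adjustments.
inline std::int64_t adjustedSpent(const TreeTotals& t) {
    return t.spent - adjustments(t);
}

// tree.available: remaining spendable amount.
inline std::int64_t available(const TreeTotals& t) {
    return effectiveBudget(t) - adjustedSpent(t) - inFlight(t);
}
```

Because every input pool only ever increases, each derived value can be recomputed from the totals at any time without keeping a separate running balance.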
Banker Parent-Child Currency Operations
Name | Action | Description
child.setBalance (set balance lower) | increase child.recycledOut and increase parent.recycledIn | Set child account balance lower
child.setBalance (set balance higher) | increase child.recycledIn and increase parent.recycledOut | Set child account balance higher
child.recuperateTo | increase child.recycledOut and parent.recycledIn until child.balance == 0 |
Banker Debit and Credit Data Flow
Banker APIs
● REST API suitable for human reader and outside tool integration
● Also used by Router, Post Auction Service and Bidding Agents
● API presents a simple wrapper over the Account Type, Account Hierarchy and Currency Pool concepts
○ All Accounts in tree or subtree
○ Accounts in (sub)tree by name
○ Shadow Accounts in (sub)tree by name
○ Account children by name of parent
○ Account balance of (sub)tree by name
○ Account budget of tree by name
Banker Persistence
● Banker state stored in Redis
● Banker dumps its state each second
● Read-Modify-Write so only delta transmitted
● Can detect out of date or corrupt data
○ If a value goes down
○ If sum(credit) - sum(debit) != available
● On a banker crash and restart, it reads and reconciles state from shadow accounts and persistent store
● Maximum of one second of data lost
● If routers/post auction loops (with shadow accounts) stay up, no data lost
{
    "md": {"objectType": "Account", "version": 1},
    "type": account type ("budget" or "spend"),
    "budgetIncreases": amount (in USD/1M),
    "budgetDecreases": amount (in USD/1M),
    "spent": amount (in USD/1M),
    "recycledIn": amount (in USD/1M),
    "recycledOut": amount (in USD/1M),
    "allocatedIn": amount (in USD/1M),
    "allocatedOut": amount (in USD/1M),
    "commitmentsMade": amount (in USD/1M),
    "commitmentsRetired": amount (in USD/1M),
    "adjustmentsIn": amount (in USD/1M),
    "adjustmentsOut": amount (in USD/1M),
    "lineItems": additional keyed amounts,
    "adjustmentLineItems": additional keyed amounts
}
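Because pool totals only ever go up, the "a value goes down" check from the Banker Persistence slide can be sketched as a comparison of two snapshots. The `PoolSnapshot` alias and `hasRegressed` function below are hypothetical and only illustrate the idea; they are not part of the Banker's real persistence code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical snapshot type: pool name -> total in micro-dollars.
using PoolSnapshot = std::map<std::string, std::int64_t>;

// Returns true when `incoming` looks out of date or corrupt relative to
// `persisted`: a pool is missing or its total is lower than the stored one.
inline bool hasRegressed(const PoolSnapshot& persisted, const PoolSnapshot& incoming) {
    for (const auto& entry : persisted) {
        assert(entry.second >= 0);  // totals are never negative
        const auto it = incoming.find(entry.first);
        if (it == incoming.end() || it->second < entry.second) return true;
    }
    return false;
}
```

A writer following this sketch would refuse to overwrite the stored state whenever `hasRegressed` returns true and would reconcile from the store instead.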
Logger
● Logging occurs in a separate process that each component uses
● Automatically handles compression and log rotation
● Pub/sub model using the RTBkit service discovery mechanism (Zookeeper)
● Supports targeting multiple outputs (file system, S3) and routing messages to one or more outputs
● Supports combining multiple messages
● Supports callbacks
● Can be extended as needed
Monitoring and Operations Tools
● Extensive code instrumentation that logs to Carbon
● Lock-free, high-performance Carbon logging library, with tunable sampling rate, one-second granularity and various useful functions
○ labelled occurrence
○ counters
○ levels (min, max, mean)
○ values (min, max, mean)
● Can use library to add any custom metrics you desire
● Operational dashboard
● All standard and custom metrics charted in Graphite
● Launcher and real-time tmux shell
How RTBkit Addresses the Selection Problem
Filter Design and Features
● Router passes bid requests through Filtering pipeline
● Bids must pass all filters to reach Agents
● Thread safe
● Useful primitives driven by configuration and available as building blocks for custom filters
● Predefined Agent and Creative filters
● Designed to guarantee performance first, be flexible and powerful second
● Regex support
○ Example: Location filter supports regexes at Agent and Creative level to support dynamic filter by geo
Generic Filter Primitives
● Building blocks for included Predefined Filters and for user Custom Filters
● Encapsulate generic comparison logic
● IncludeExcludeFilter
○ True if any included and none excluded
● ListFilter
○ True if any match in List
● RegexFilter
○ True if any match regex
● IntervalFilter
○ True if any within interval
● DomainFilter
○ True if bid.domain in DomainList
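The include/exclude rule ("true if any included and none excluded") might look like the sketch below. `IncludeExcludeSketch` is a hypothetical stand-in, not RTBkit's IncludeExcludeFilter; in particular, treating an empty include list as "match everything" is an assumption made here so that configs with only an exclude list behave sensibly.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical include/exclude primitive over string values.
struct IncludeExcludeSketch {
    std::vector<std::string> include;
    std::vector<std::string> exclude;

    // True if at least one of `values` appears in `list`.
    static bool containsAny(const std::vector<std::string>& list,
                            const std::vector<std::string>& values) {
        return std::any_of(values.begin(), values.end(), [&](const std::string& v) {
            return std::find(list.begin(), list.end(), v) != list.end();
        });
    }

    // Passes when nothing is excluded and something is included.
    // Assumption: an empty include list places no restriction.
    bool passes(const std::vector<std::string>& values) const {
        if (containsAny(exclude, values)) return false;
        return include.empty() || containsAny(include, values);
    }
};
```

A linear scan keeps the sketch short; a production filter would more likely use precomputed lookup structures so that the check stays cheap on every bid request.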
Filter Levels
● Agent Filters
○ Control whether Agent bids on bid request
● Creative Filters
○ Control whether Creative is eligible to be the one returned in bid response
[Diagram: Creative filters (Format, Location, Exchange, Language); Agent filters (Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition)]
Filter Types
● Static Predefined Filters
○ Creative filters match bid and Agent creative sizes
○ Config filters match bid request attributes to filter attributes
● Static Segment Filters
○ Filter based on attributes set by Exchange Connector bid request parse
● Static Custom Filters
○ Creative or config filters
○ Simple wrapper class API
● Dynamic Predefined Filters
○ Based on system state
○ notEnoughTime, tooManyInFlight
● Augmenter Filters
○ Custom logic and data
[Diagram labels: Bid Request, Predefined, Segment, Static, Predefined, Dynamic, Augmenter, Custom]
Filter Priorities and Performance
● Prioritized execution order optimized for performance, not business logic
● Selective, inexpensive filters run earlier
● Expensive filters run later (or not at all!)
● Only fast filters run on the Exchange Connector thread, which must guarantee a response within response SLA time
● Static filters build bitfield lookup table from configs, batch process filters per bid request in 64-bit blocks
● Filter matching tests for a match and retrieves eligible creatives in one pass
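The idea of narrowing the set of eligible configs in 64-bit blocks can be illustrated with a small bitset. `ConfigBits` below is a hypothetical, simplified stand-in for the ConfigSet mentioned on these slides, not its real implementation: each bit is one agent config, and applying a filter is a word-by-word AND against that filter's precomputed mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bitset with one bit per agent config.
class ConfigBits {
public:
    explicit ConfigBits(std::size_t size, bool initial = false)
        : size_(size),
          words_((size + 63) / 64, initial ? ~std::uint64_t{0} : std::uint64_t{0}) {
        // Clear unused high bits of the last word so count() stays exact.
        if (size_ % 64 != 0)
            words_.back() &= (std::uint64_t{1} << (size_ % 64)) - 1;
    }

    void set(std::size_t i) {
        assert(i < size_);
        words_[i / 64] |= std::uint64_t{1} << (i % 64);
    }

    bool test(std::size_t i) const {
        assert(i < size_);
        return (words_[i / 64] >> (i % 64)) & 1u;
    }

    // Keep only configs also present in `mask`: 64 configs per AND.
    void narrow(const ConfigBits& mask) {
        assert(mask.size_ == size_);
        for (std::size_t w = 0; w < words_.size(); ++w) words_[w] &= mask.words_[w];
    }

    // Number of configs still eligible.
    std::size_t count() const {
        std::size_t n = 0;
        for (std::uint64_t w : words_)
            for (; w != 0; w &= w - 1) ++n;  // clear lowest set bit each step
        return n;
    }

private:
    std::size_t size_;
    std::vector<std::uint64_t> words_;
};
```

In this sketch a bid request starts with every config eligible and each filter narrows the set, so running the most selective cheap filters first empties the set early and lets later filters be skipped.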
[Diagram labels: Bid Request; Creative filters (Format, Location, Exchange, Language); Agent filters (Exchange, Location, Language, Host, URL, Segments, Hour of Week, Fold Position, User Partition); Agent]
Custom Filter Development
● (Creative)IterativeFilter<MyFilter>
○ Simple wrapper interface
○ Set priority, return bool per request
○ Less scale, no batch processing
● (Creative)FilterBaseT<MyFilter>
○ ConfigSet of filter configs
○ CreativeMatrix maps each creative to its filters
○ FilterState stores state of processing filters for current bid request
○ Filter batch process by intersecting ConfigSet and CreativeMatrix
○ Filter code uses ConfigSet bit operator-style interface and also sometimes raw bit operators
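// Agent-level filter keyed on the hour of the week (24 * 7 slots) of the bid request.
struct HourOfWeekFilter : public FilterBaseT<HourOfWeekFilter> {
    HourOfWeekFilter() { data.fill(ConfigSet()); }

    static constexpr const char* name = "HourOfWeek";

    // Position of this filter in the prioritized execution order.
    unsigned priority() const { return Priority::HourOfWeek; }

    // For each hour enabled in the agent's bitmap, set or clear this config's bit.
    void setConfig(unsigned configIndex, const AgentConfig& config, bool value) {
        const auto& bitmap = config.hourOfWeekFilter.hourBitmap;
        for (size_t i = 0; i < bitmap.size(); ++i) {
            if (!bitmap[i]) continue;
            data[i].set(configIndex, value);
        }
    }

    // Keep only the configs enabled for the hour of the request timestamp.
    void filter(FilterState& state) const {
        state.narrowConfigs(data[state.request.timestamp.hourOfWeek()]);
    }

private:
    // One ConfigSet per hour of the week.
    std::array<ConfigSet, 24 * 7> data;
};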
Augmenters: Your Logic and Data
● Bid Requests pass through Augmenter after Filtering, before Bidding
● Allows for custom filtering based on combinations of bid request fields, your data and business logic you code
● Filter based on user agent, device, geo, user data, etc.
Augmenter Implementation
● Provides thread pool of background threads to run augmenter calls
● Enforces 5ms timeout on router thread
● Sync and async versions. Use async with callback for calls to outside DBs.
● RTBkit ships with Redis Augmenter. Other stores such as Aerospike are in the wild.
● Separate config for each Bidding Agent
● Augmenter data is arbitrary JSON
● Can subscribe to other RTBkit data streams to write data
○ e.g. frequency cap Augmenter subscribes to PAS MATCHEDWINs
[Diagram labels: Router, Bid Request, Augmenter, Thread Pool, Augmenter Impl., Fast DB (TM), Post Auction Service, Data Sink Callback, Augmented Bid Request]
How RTBkit Addresses the Value Problem
Bidding Agent Configuration
● Bidding Agents configure the Core
● Agents register Agent Config with the Agent Configuration Service
● Router, PAS and Augmenter periodically pull updated Agent configs from the ACS
● Router registers
○ creatives per campaign
○ dynamic filters
○ augmenters and augmenter filters
● Router passes ad markup from config to Exchange Connector for bid response
● Router forwards bid requests passing filtering to eligible Agents
● PAS forwards Matched Events to Agents
● Augmenter adds augmented fields to Bid Request based on Agent Configuration
Bidding Agent Configuration (cont'd)
● account -- which Accounts in an Account tree the Agent bids for
○ Implement different bidding strategies within an account or "account group" by mapping Agents to named accounts in an account tree
● maxInFlight -- maximum number of outstanding bids
● bidProbability per Agent can be used for pacing and bidding strategy
● creatives, languageFilter and segmentFilter are supported Static Filters
● augmentations here configures an Augmenter filter
● providerConfig -- ad markup, not shown
{
    "account": ["parent", "child"],
    "bidProbability": 0.1,
    "creatives": [
        {
            "id": 1, "width": 300, "height": 250,
            "providerConfig": {
                "supplySourceX": {
                    "markup": "markup goes here",
                    "attributes": ["alcohol"]
                }
            }
        },
        ...
    ],
    "languageFilter": {"include": ["en"], "exclude": []},
    "segmentFilter": {
        "sample1": {"include": [], "exclude": ["bad"]},
        "colors": {"include": ["blue", "red"]}
    },
    "augmentations": {
        "freq-rec": {
            "required": true,
            "config": {"maxPerDay": 10},
            "filter": {"include": [], "exclude": ["too-many"]}
        }
    },
    "maxInFlight": 10
}
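One way to picture how `bidProbability` and `maxInFlight` could act as pacing controls is the sketch below. `BidGate` is a hypothetical class written for illustration; how the Router actually applies these two settings is not described on the slides, so the logic here (sample each request with the given probability, and stop once the in-flight cap is reached) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical pacing gate driven by two Agent config values.
class BidGate {
public:
    BidGate(double bidProbability, std::size_t maxInFlight, unsigned seed)
        : rng_(seed), coin_(bidProbability), maxInFlight_(maxInFlight) {
        assert(bidProbability >= 0.0 && bidProbability <= 1.0);
    }

    // Returns true if the agent should see this bid request.
    bool shouldBid(std::size_t inFlight) {
        if (inFlight >= maxInFlight_) return false;  // too many outstanding bids
        return coin_(rng_);                          // random sampling for pacing
    }

private:
    std::mt19937 rng_;
    std::bernoulli_distribution coin_;
    std::size_t maxInFlight_;
};
```

With the config values shown above (bidProbability 0.1, maxInFlight 10), this gate would pass roughly one request in ten and none at all while ten bids are outstanding.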
Custom Bidding Agent Implementation
● C++ or JavaScript
● Programmatic configuration
● Custom bidding logic based on
○ bid attributes
○ a custom Win Cost Model to adjust for desired margin and data costs
● Currency support allows bidding at different price granularities
● Pacing support
○ Custom pacing logic
○ Guaranteed communication between Bidding Agent and Banker
● Bid callback called on every bid request
● Router sends back bid status messages
● Post Auction Service sends back event status messages
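Custom Bidding Agent
[Diagram: Custom Bidding Agent components (Configuration, Bidding Logic, Win Cost Model, Pacing, Currency Helpers); message flow: (1) Bid Request from Router, (2) Bid Response, (3) Router callbacks onWin, onLoss, onNoBudget, onTooLate, onDroppedBid, onInvalidBid, (4) Post Auction Service callbacks onImpression, onClick, onVisit]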
Augmenters: Your Logic and Data
● Allows you to augment the bid request, adding any fields you want, based on combinations of bid request fields, your data and business logic you code
● Supports custom logic per agent
● Augmented fields are then available to the Bidding Agents
● So, you can influence your bidding logic by adding to the bid request
Future Directions
Near Term
● Scalability Improvements
○ PAS
○ Number of Agents
● Improved Packaging
● Decoupled Bidding Agent API
This Year
● Performance benchmarking tools
● Protocol versioning of messages
● Open plugin platform supporting 3p marketplace
RTBkit Resources
Links
● http://github.com/rtbkit
● http://rtbkit.org
● https://groups.google.com/a/rtbkit.org/forum/#!forum/discuss
Developers Getting Started Guide
● https://github.com/rtbkit/rtbkit/wiki/Getting-Started
About Us
Mark Weiss
● Head of Customer Solutions at Datacratic
● @marksweiss
Datacratic
● Machine-learning software provider
● Platform supports real-time decisioning
● Current products target digital marketing
○ Hosted RTB Optimization
○ Self-Serve and DMP Lookalike Modeling
● www.datacratic.com
● @datacratic
● We're hiring! http://datacratic.com/site/careers