Transcript
Page 1: Learning in environments with agents we don’t control

WRANE — November 2010 — Geoff Gordon

Page 2: Making models

The skill of building models that faithfully represent reality

‣ bringing in other disciplines (besides CS, AI, econ, stats, math, control, philosophy)

How do we describe intelligent agents?

What (approximate) equilibria (or other solution concepts) are relevant?
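For concreteness, one standard solution concept behind that last question (a textbook definition, not something specific to this talk): a strategy profile is an ε-Nash equilibrium if no player can gain more than ε by a unilateral deviation.

```latex
% epsilon-Nash equilibrium: for every player i and every alternative
% strategy \sigma_i', deviating gains at most \epsilon
\forall i,\ \forall \sigma_i' :\quad
  u_i(\sigma_i', \sigma_{-i}) \;\le\; u_i(\sigma_i, \sigma_{-i}) + \epsilon
```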

Page 3: Setting up the learning problem

How do we even measure success of learning? (one standard candidate, regret, is sketched below)

How do we express prior information?

What is visible? (actions, outcomes, payoffs—for self, other agents)
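One standard way to make “success of learning” measurable, sketched here for concreteness (a textbook notion, not taken from the slides), is external regret: how much worse we did than the best single action in hindsight. A learner counts as successful when R_T grows sublinearly in T.

```latex
% External regret after T rounds, where a_t is our action, b_t is the
% other agents' joint action, and u is our payoff function:
R_T \;=\; \max_{a \in A} \sum_{t=1}^{T} u(a, b_t)
        \;-\; \sum_{t=1}^{T} u(a_t, b_t)
```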

Page 4: How do we get the data?

Exploration / experimentation (vs. exploitation)

‣ the “driving off a cliff” problem: some exploratory actions have catastrophic, unrecoverable costs

‣ but more problems arise in games: e.g., exploratory play can accidentally reveal information to the other agents (see the sketch at the end of this page)

Want to avoid being taught by an opponent and then exploited

Want to present a “table image” (in the poker sense: managing what our play signals)
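A minimal sketch of the exploration/exploitation trade-off in a repeated game, assuming epsilon-greedy action choice and a toy stochastic opponent; all names and payoff values here are illustrative, not from the talk. The explore branch is exactly where both worries on this slide live: an exploratory action can be costly (the cliff) and, against a watching opponent, it also leaks information.

```python
import random

ACTIONS = [0, 1, 2]   # our action set (illustrative)
EPSILON = 0.1         # exploration rate (assumed value)

totals = {a: 0.0 for a in ACTIONS}  # cumulative payoff per action
counts = {a: 0 for a in ACTIONS}    # times each action was played

def mean_payoff(a):
    """Empirical mean payoff; untried actions rank first so they get sampled."""
    return totals[a] / counts[a] if counts[a] else float("inf")

def choose_action():
    """Epsilon-greedy: explore with probability EPSILON, else exploit."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)      # explore: costly, and leaks info
    return max(ACTIONS, key=mean_payoff)   # exploit current estimates

def opponent_payoff(action):
    """Toy stand-in for the other agent: fixed stochastic payoffs."""
    return random.gauss([0.0, 0.5, 0.2][action], 1.0)

for t in range(1000):
    a = choose_action()
    totals[a] += opponent_payoff(a)
    counts[a] += 1

print({a: round(mean_payoff(a), 2) for a in ACTIONS})
```

A real opponent, unlike this fixed toy one, could watch the exploratory moves, infer our estimates, and steer us, which is the “being taught and exploited” worry above.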

Page 5: Complexity

Can we get generalization bounds analogous to those from COLT and statistics? (a classical reference point is sketched below)

How do we measure the complexity of a model or model class? How do we choose the right complexity?

Are we doomed to model opponents as less complex than ourselves? Is this a problem?

What if the [game, opponent set] changes: how stable are our performance metrics and generalization bounds?
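As one concrete reference point for the kind of bound being asked about, here is the classical finite-class generalization bound from statistics/COLT (Hoeffding’s inequality plus a union bound), stated for i.i.d. data, which is exactly the assumption the multi-agent setting breaks:

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% simultaneously for all models h in a finite class H:
\bigl|\widehat{\mathrm{err}}(h) - \mathrm{err}(h)\bigr|
  \;\le\; \sqrt{\frac{\ln|H| + \ln(2/\delta)}{2n}}
```

The open question on this slide is what replaces the right-hand side when the data comes from an adversarial, changing game rather than a fixed distribution.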

Page 6: Complexity, cont’d

Ensembles

‣ work really well for the Netflix Prize and the KDD Cup; not as well in the Lemonade Stand game

‣ is there something about our [adversarial, dynamic, non-Markovian] setting that hurts them? (a contrasting ensemble with adversarial guarantees is sketched below)
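As a contrast to the stacked-regression ensembles that won the Netflix Prize and KDD Cup, here is a minimal sketch of the kind of ensemble that does carry adversarial guarantees: multiplicative weights (Hedge) over a pool of opponent models. This is a standard algorithm offered for illustration, not a method from the talk; the model pool and loss function are hypothetical.

```python
import math

def hedge(models, rounds, loss_fn, eta=0.5):
    """Multiplicative-weights ensemble: downweight models that predict badly.

    loss_fn(t, model) must return a loss in [0, 1] for round t.
    """
    weights = [1.0] * len(models)
    for t in range(rounds):
        for i, m in enumerate(models):
            weights[i] *= math.exp(-eta * loss_fn(t, m))  # multiplicative update
    total = sum(weights)
    return [w / total for w in weights]  # final mixture over the models

# Toy usage: three hypothetical opponent models ("predict 0", "predict 1",
# "alternate") scored against a fixed outcome sequence.
outcomes = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]

def loss_fn(t, model):
    prediction = t % 2 if model == "alternate" else model
    return float(prediction != outcomes[t % len(outcomes)])

print(hedge([0, 1, "alternate"], rounds=100, loss_fn=loss_fn))
```

Unlike averaging-style ensembles tuned on i.i.d. held-out data, Hedge’s regret bound holds against arbitrary, even adaptively chosen, loss sequences, which is one reason the adversarial setting may call for different ensemble machinery.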

