Opponent Modelling by Expectation-Maximisation and Sequence Prediction in Simplified Poker

Richard Mealing, Jonathan L Shapiro

    Research output: Contribution to journal › Article › peer-review

    Abstract

    We consider the problem of learning an effective strategy online in a hidden information game against an opponent with a changing strategy. We want to model and exploit the opponent, and make three proposals to do this: firstly, to infer its hidden information using an expectation-maximisation algorithm; secondly, to predict its actions using a sequence prediction method; and finally, to simulate games between our agent and our opponent model in between games against the opponent. Our approach does not require knowledge outside the rules of the game, and does not assume that the opponent’s strategy is stationary. Experiments in simplified poker games show that our approach increases the average payoff per game of a state-of-the-art no-regret learning algorithm.
    Original language: English
    Pages (from-to): 11-24
    Number of pages: 14
    Journal: IEEE Transactions on Computational Intelligence and AI in Games
    Volume: 9
    Issue number: 1
    Early online date: Oct 2015
    Publication status: Published - 1 Mar 2017

    Keywords

    • Bayes methods
    • Computational modeling
    • Games
    • Nash equilibrium
    • Predictive models
    • Expectation-Maximization
    • Counterfactual regret minimisation
