New GPT-based AI concept

I came to realize that the adventure AI could be implemented using the GPT (Generative Pre-trained Transformer) architecture. Simply put, a Transformer turns sequences of events (or actions) into a matrix representation, which is learned via deep learning and can therefore hold complex and vast knowledge. The Transformer can then be used as a generator that produces reasonable sequences of actions, or reacts to game events (especially enemy actions).
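To make the "generator" idea concrete, here is a toy sketch of the autoregressive loop: learn from recorded action sequences, then repeatedly predict the next action and feed it back in. Note that a plain bigram frequency table stands in for the trained Transformer here, and the action names are made up for illustration; only the loop structure carries over.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count how often each action follows another in the recordings."""
    table = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            table[prev][nxt] += 1
    return table

def generate(table, start, max_len=5):
    """Greedy autoregressive generation: each output becomes the next input."""
    out = [start]
    while len(out) < max_len:
        followers = table.get(out[-1])
        if not followers:
            break  # nothing was ever recorded after this action
        out.append(followers.most_common(1)[0][0])
    return out

# Hypothetical recordings; real ones would come from logged game events.
recorded = [
    ["START", "MOVE_HERO", "VISIT_MINE", "END_TURN"],
    ["START", "MOVE_HERO", "VISIT_MINE", "END_TURN"],
    ["START", "MOVE_HERO", "VISIT_MINE", "BUILD", "END_TURN"],
    ["START", "BUILD", "END_TURN"],
]
table = train_bigram(recorded)
plan = generate(table, "START")
print(plan)
```

A real model would condition on the whole history and the game state rather than just the previous action, which is exactly what the attention mechanism is for.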

We’ve already seen this approach work in a number of AI applications, so the general idea is proven.

This could also be run on a GPU for decent performance, hopefully.

We already have all the pieces:

  • We’ve got Goals, which we try to sequence in a reasonable order that leads to the best outcome
  • We’ve got some fuzzy logic, which can be interpreted as a single-layer neural network performing limited but still non-trivial multi-dimensional evaluation.
  • We’ve also got some arbitrary logic written by hand for arbitrary objects, but this can be generalized based on an object’s properties.
  • We can easily record all game events via NetPacks
  • The state of the game can also be represented as a finite vector of numbers
  • The desired behavior can be obtained by recording the actions of any number of players on any number of maps, with each recording processed independently. This is a virtually infinite source of training data.
  • Reacting to events can be extended to future scripted events, so the AI will be able to handle scripts reasonably.
  • Such a system is expected to have constant processing time.

Requirements:

  • State vector representing all (or most of) game state via numbers
  • Action vector representing any action a player might take at any time, or a random game event (not many of these, I guess)
  • The AI has to be able to perform every action that the game allows
  • Action sequences can be recorded from human players, but also from current AI in a self-improving loop
  • Some automated server where players could upload their training data
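A minimal sketch of what the state and action vectors could look like. Everything here is an assumption for illustration: the `PlayerState` fields, the `ACTION_TYPES` subset and the function names are hypothetical, and a real state vector would need far more fields.

```python
from dataclasses import dataclass

RESOURCES = ["wood", "mercury", "ore", "sulfur", "crystal", "gems", "gold"]
ACTION_TYPES = ["MOVE_HERO", "BUILD", "RECRUIT", "END_TURN"]  # hypothetical subset

@dataclass
class PlayerState:
    """Hypothetical, heavily reduced view of one player's state."""
    day: int
    resources: dict  # resource name -> amount
    hero_count: int
    town_count: int

def encode_state(s):
    """Flatten the state into a fixed-size list of numbers (length 10)."""
    vec = [float(s.day)]
    vec += [float(s.resources.get(r, 0)) for r in RESOURCES]
    vec += [float(s.hero_count), float(s.town_count)]
    return vec

def encode_action(action_type, x=0, y=0, param=0):
    """One-hot action type followed by three numeric parameters."""
    one_hot = [1.0 if a == action_type else 0.0 for a in ACTION_TYPES]
    return one_hot + [float(x), float(y), float(param)]

state = PlayerState(day=3, resources={"gold": 2500, "wood": 10},
                    hero_count=2, town_count=1)
state_vec = encode_state(state)
action_vec = encode_action("MOVE_HERO", x=0, y=3)
```

The point is only that both vectors keep the same length no matter what happens in the game, so they can be fed to a network with a fixed input size.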

Challenges:

  • Actually designing a neural network of a size and structure that will suit the task ¯_(ツ)_/¯
  • Contain the game state in a fixed-size vector of reasonable size, no matter the size of the map or the number of players.
    • Represent new objects added via mods in the same format
    • Foresee all the state vector and action parameters in advance (embedding)
  • Possibly a full game should be played so that:
    • We can determine the winner and make the AI pursue winning strategies
    • The AI can learn actions that lead to actual victory, and not just random exploration
  • Suitable amount of training data
  • Suitable processing power to pre-train the network
  • Final performance
  • Answers might not be exactly correct, like the AI trying to visit tile (1, 2) instead of (0, 3) or whatever. Still, we’ve got a server that stops the AI from taking illegal actions, and the AI can always correct itself with the following action.
    • This alone won’t stop AI from taking heavy losses in battles due to poor battle AI.
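One way to soften the "not exactly correct" problem is to let the model score several candidate actions and take the best one that is actually legal. The sketch below assumes a hypothetical local `is_legal` check and made-up scores; in the setup described above the server does the rejecting, so this would only be a client-side shortcut.

```python
def pick_legal_action(scores, is_legal):
    """Return the highest-scoring action that passes is_legal, or None."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for action, _score in ranked:
        if is_legal(action):
            return action
    return None

# Made-up model output: the top guess points at an unreachable tile.
scores = {
    ("MOVE", 1, 2): 0.9,
    ("MOVE", 0, 3): 0.7,
    ("END_TURN",): 0.1,
}
reachable = {(0, 3)}  # hypothetical set of tiles the hero can reach

def is_legal(action):
    # Non-move actions are always allowed in this toy example.
    return action[0] != "MOVE" or action[1:] in reachable

choice = pick_legal_action(scores, is_legal)
print(choice)
```

Here the AI's first guess (1, 2) is skipped and it falls back to (0, 3) without wasting a round trip to the server.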

I don’t have even the slightest estimate of the numbers required. However, if Stable Diffusion needs only a 4 GB model and can generate all kinds of images on an RTX 3050 from that, it can’t be too bad.

A bunch of misc ideas:

  • Nullkiller does a great job of reducing bank configs to simple numbers; we should do that for as many objects as possible.

  • Monsters should be represented by their bonuses and not just IDs. Same for Artifacts, town buildings and such.

  • The unlimited size of the map can be reduced to a limited number of “new object discovered” events. This will also make the AI explore naturally.

  • Do not save illegal actions which were denied by server.

  • We can fine-tune AI with human-in-the-loop training. That’s distant future though.

  • A similar approach can be adapted to Battle AI, which again is a sequence of actions, and the possibility space is very limited. In fact, it will be way easier.
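Two of these ideas are easy to sketch: describing a creature by its stats and bonuses instead of its ID, and dropping server-denied actions from a recording. The bonus names, stat keys, numbers and log format below are all hypothetical, picked only to show the shape of the data.

```python
BONUS_TYPES = ["FLYING", "SHOOTER", "DOUBLE_ATTACK", "NO_RETALIATION"]  # hypothetical subset

def creature_features(stats, bonuses):
    """Describe a creature by numbers and bonus flags, never by its ID."""
    base = [float(stats[k]) for k in ("attack", "defense", "hp", "speed")]
    flags = [1.0 if b in bonuses else 0.0 for b in BONUS_TYPES]
    return base + flags

def drop_denied(log):
    """Keep only actions the server accepted, so illegal moves are not learned."""
    return [entry["action"] for entry in log if entry["accepted"]]

# A native creature and a modded one with identical properties (made-up values).
native = creature_features(
    {"attack": 8, "defense": 8, "hp": 25, "speed": 6}, {"FLYING"})
modded = creature_features(
    {"attack": 8, "defense": 8, "hp": 25, "speed": 6}, {"FLYING"})

log = [
    {"action": "MOVE", "accepted": True},
    {"action": "BUILD", "accepted": False},  # denied by the server
    {"action": "END_TURN", "accepted": True},
]
clean = drop_denied(log)
```

Because the modded creature ends up with the same vector as the native one, the network would not need retraining just because a mod introduced a new ID.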

Thoughts? Opinions?

It’s not a bad idea. I like to make simple programs in Java or Lua for modding other games. ChatGPT was very useful for that. It’s not 100% great with logic, so you have to double-check everything it does. Sometimes it makes garbage that at first glance looks useful.

It seems that models that feed “action tokens” to transformers already exist in robotics, so this idea might actually be very realistic as I pitched it initially.