After finishing the first week, I've noticed a lot of things that could be improved. The most important one, it would seem, is the strange problem of AI slowness.
I have observed (using top) how the AI turn affects CPU usage on my machine, compared to my turn.
My turn: 33% constant. I guess that's because the game updates the screen at a constant rate - this is what I was expecting.
AI turn: 33-43%. Now this was a surprise. The AI is CPU-bound, so if it's single-threaded, the AI process itself should take 100% CPU, adding up to 133% on my multi-core system (assuming the AI runs in the server process and the UI in the client process). Maybe the problem is not slow algorithms, but the AI being synced to the frame rate somehow?
This could be an artifact of my local setup, of course.
There's also a segfault during the AI turn, but I don't think I know VCMI well enough to debug it.
I found a bug in the mechanics: the visited-objects status gets reset after load. I'm trying to fix it - it seems that CGHeroInstance::serialize forgets about visitedObjects.
Some cosmetic bugs:
The AI turn has its own music track that VCMI doesn't play.
There should be a sound effect after a chest is picked up (after the player chooses money or exp), but it's missing.
A lot of problems with overlapping bitmaps, especially when stacks move in front of an obstacle in battle.
I’m going to use my free time to fix any bugs I find, so pointers are welcome. I’ll pay more attention to bugs affecting mechanics though.
I tried saving/loading the game with visitedObjects inside serialize(), but it didn't help. When I visit the Stone of Knowledge (or whatever it's called in the EN version), I can save & load and its status is still "Not visited".
I’m not sure whether I’m looking at the right mechanism, so I’m going to try again maybe tomorrow.
More bugs: text is often bolded, making it difficult to read, and in the save game menu, clicking on the text edit field changes its contents to something-"autosave" each time.
Yes, those bugs will go into the bug tracker eventually - it’s just quicker to do it here.
On AI speed:
It is possible that the cause of current AI slowness is threading. Or to be precise - context switching:
AI thread sends network package to server
<OS switches context to server’s networking thread>
Server processes package and sends result to client
<OS switches context to client’s networking thread>
Client processes the net packs and sends them to the AI thread
AI thread processes result and proceeds
Not sure if we have the last context switch, but this context switching may cause huge delays without any visible CPU usage.
There was a discussion between me and Tow somewhere here, with this use case:
Player A recruits hero, visits Stone of Knowledge
Player A loses hero in battle.
Player B recruits same hero in tavern.
In this situation player B should receive knowledge of the objects visited by the hero. This means the list of visited objects must not be an implementation detail of the object class (e.g. CGHeroVisitable), because the player interface should not access the actual classes. Placing this data in the hero class seemed like the best approach here.
BTW -rhn, when fixing this bug make sure that PlayerState does not have the same problem - it also has such a field to store info on player-visitable objects. And there is always a chance that I forgot to serialize it.
By chance, I have recently worked on code that had >100 CPU-bound threads running simultaneously. They wreck CPU efficiency compared to 2 threads - context switches definitely do show up as CPU usage.
Now that you mention client<->server communication, where does the AI live? On the server, or on the client? If it's the client, does it have a "mirrored" copy of the game state to reduce round trips asking the server to update the game state?
Did you mean that the same hero can visit the same rewardable objects under different players and receive the rewards each time?
I've already fixed it - there's a pull request on git. Can you tell me what kinds of objects are player-visitable? Obelisks?
What's a good way to debug under Linux? I tried QtCreator, but not only can it not display variables passed by reference, it also crashes often. Should I build VCMI with debug flags (-g -O0) to fix that?
The AI is on the client side, like every player. It can only get info provided by the server, checked for correctness / accessibility. For example, it's not possible to check the state of an object that is behind the fog of war.
This was introduced by Tow to ensure maximum safety and cheat-proof multiplayer, but it looks questionable for the AI.
One possible workaround is to store a lot of info on the AI side and reuse it. In fact I tried this for some optimizations, but it's limited.
The AI lives on the client, and while the client does have its own copy of the game state, this is an insecure solution - a player could use a modified client to cheat in multiplayer. The idea was to remove this copy completely in the future. (Not sure why the AI runs on the client instead of the server, though.)
However, there is one situation where this is unavoidable - actions. For example, when the AI moves a hero, a pack must be sent to the server and processed, and the results sent back to the client. Not sure how fast the OS wakes idling networking threads.
At some point we had "quick" updates, where the client would apply changes without waiting for the server's reply (and abort if the server returned an error). But this turned out to be too buggy and was removed.
Not exactly. The server (which handles all this) will always have access to the full info, and a change of owner won't cause any issues. But the client should not know info about heroes it doesn't own.
The problem was the player interface - the information visible to the user. We had two possibilities:
a) Store the info on visited objects in the player interface. However, if a hero changes owner, the new player won't have the info on objects this hero visited before - he will see "Not visited" when hovering over objects that were visited under the hero's previous owner(s).
b) Store this info in the hero object. In this case a change of owner automatically transfers the info on visited objects to the new owner.
Obelisks and keymasters are the ones that give something to the player.
But there is another group - objects that reveal their state after a visit: e.g. the player will see the guards of creature banks, the skill from a witch hut, the requirements of a tree of knowledge, etc.
But this part looks like a false alarm - that field is already handled by the serializer.
I use QtCreator, sometimes gdb directly from the console, and valgrind to track memory-related issues.
Compiling with -O0 is a good idea; CMake has the -DCMAKE_BUILD_TYPE=Debug option for this.
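For reference, a typical out-of-source CMake workflow for a debug build (binary paths here are illustrative and may differ in your checkout):

```shell
# Configure a debug build (implies -g -O0 with the default toolchain)
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j"$(nproc)"

# Then run the client under gdb, or under valgrind for memory issues:
gdb --args ./client/vcmiclient
valgrind --leak-check=full ./client/vcmiclient
```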
This is what I had in mind. A shame it didn't work, because that's how I'd try to implement the AI.
After all, the AI doesn't cheat by design, so the only task it needs the server for is verifying, at the end of the turn, that all of its messages were legal. They should always be legal, because the game state used would be an exact copy of the server's.
This way, the AI wouldn't need to wait for server replies during its turn at all - only for the final reply: "all OK, your turn is over".
EDIT: come to think of it, the state would have to be synced with the server every time a piece of the map is discovered. Not so simple any more.
The next best thing would indeed be to run the AI on the server.
It's not only about cheating, but also correctness. We used to have some bizarre bugs in the AI - for example, it managed to get 8 or more creature stacks into an army. That's not possible through the GUI, but at that time the server didn't forbid it. To this day we don't know how it happened, but for sanity's sake the server now checks for such cases.
I see what you did there… anyway I think porting a product should result in some bugfixing.
Speaking of that… on the iOS build the AI runs very fast, but I can't be sure it's all working correctly since I haven't tested it much yet. The most important changes I've made there are removing the shared-object linkage and compiling the AI together with the main code. I need to do some more testing to make sure it all works… Tomorrow I'll try building it for OSX and see whether the same holds there, though I don't see why using shared objects would cause such a drastic slowdown.
EDIT: The adventure AI doesn't work properly - it must be something else I've messed up.
I really doubt that server communication is a significant factor in the AI slowdowns. First, it is not that slow (otherwise it would be visible in the GUI); second, there should not be that many message exchanges (only when the AI actually takes some action).
The AI is (or at least was, when I last touched it) written quite hastily and naively - the algorithms surely can be optimized. It also spent quite a lot of time on pathfinding, sometimes running it several times for the same heroes (when switching between them). I don't remember exactly how it is now, but it might be worth checking.
But the general suggestion is always to profile first - it is extremely hard to guess accurately where the chokepoint is.
I'm trying to profile the AI - so far my approach has been to use the prof tool on the single AI thread, but either I can't identify the blocking points, or they are spread evenly across many functions (unlikely).
Unless someone knows how to profile correctly under Linux, I'm going to try separating the AI into its own process and profiling that process instead. That will be a branch that I'll publish without the intention of merging.
That branch might still be interesting for checking whether AI crashes are as frequent as before.
Still, I don't see any obvious bottlenecks in the AI - CPU usage quickly spreads across ~10 different functions which don't have anything in common.
What IS weird is that, when running without an attached profiler, the CPU usage of vcmiclient never goes above ~30-40% during the AI turn. So I don't think the bottleneck is the CPU - it HAS to be something else.
UPDATE: anon, this Intel tool looks quite interesting. Yes, they do provide a free non-commercial version, and it works perfectly on my system. Now I only need to analyze the results.
Can we avoid that thread creation somehow? Some delays (although not as big as setSelection's) also come from this line.
Not related to bottlenecks, but how realistic would such a feature be? The graphs where all AIs execute one after another on a 4-core CPU are quite long.
Besides, the slowest systems we seem to support (Android) usually have 2-4 cores, so this could be a great speedup there.
Which file/line is this? There are several copies related to AI.
I'm trying to interpret those lines - do they mean that the AI keeps switching heroes all the time (perhaps to decide what action to take)?
I have no idea whether simultaneous turns are possible, but I'm sure they would break the existing rules about who goes after whom. Imagine CPU1 attacking CPU2's castle on a new week - the outcome would depend on whether CPU2 was "clicky enough" to buy all the new units first.