After finishing the first week, I've noticed a lot of things that could be improved. The most important one, it would seem, is the strange problem of AI slowness.
I have observed (using top) how the AI turn affects CPU usage on my machine, compared to my turn.
My turn: 33% constant. I guess that's because the game updates the screen at a constant rate - this is what I was expecting.
AI turn: 33-43%. Now this was a surprise. The AI is CPU-bound, so if it's single-threaded, the AI process itself should take 100% CPU, adding up to 133% on my multi-core system (assuming the AI runs in the server process and the UI in the client process). Maybe the problem is not slow algorithms, but the AI being synced to the frame rate somehow?
This could be an artifact of my local setup, of course.
There's also a segfault during the AI turn, but I don't think I know VCMI well enough to debug it.
I found a bug in the mechanics: the visited-objects status gets reset after load. I'm trying to fix it - it seems that CGHeroInstance::serialize forgets about visitedObjects.
Some cosmetic bugs:
The AI turn has its own music track that VCMI doesn't play.
There should be a sound effect after a chest is picked up (after the player chooses money or exp), but it's missing.
A lot of problems with overlapping bitmaps, especially when stacks move in front of an obstacle in battle.
I’m going to use my free time to fix any bugs I find, so pointers are welcome. I’ll pay more attention to bugs affecting mechanics though.
I tried saving/loading the game with visitedObjects inside serialize(), but it didn't help. When I visit the Stone of Knowledge (or whatever it's called in the EN version), I can save & load and its status is still "Not visited".
I’m not sure whether I’m looking at the right mechanism, so I’m going to try again maybe tomorrow.
More bugs: text is often bolded, making it difficult to read, and in the save game menu, clicking on the text edit field changes its contents to something-"autosave" each time.
Yes, those bugs will go into the bug tracker eventually - it’s just quicker to do it here.
On AI speed:
It is possible that the cause of current AI slowness is threading. Or to be precise - context switching:
AI thread sends network package to server
<OS switches context to server’s networking thread>
Server processes package and sends result to client
<OS switches context to client’s networking thread>
Client processes the net packs and sends them to the AI thread
AI thread processes result and proceeds
Not sure if we have the last context switch, but this context switching may cause huge delays without any visible CPU usage.
There was a discussion between me and Tow somewhere here, with this use case:
Player A recruits hero, visits Stone of Knowledge
Player A loses hero in battle.
Player B recruits same hero in tavern.
In this situation player B should receive knowledge of the objects visited by the hero. This means the list of visited objects must not be an implementation detail of the object class (e.g. CGHeroVisitable), because the player interface should not access the actual classes. Placing this data in the hero class seemed like the best approach here.
BTW -rhn, when fixing this bug make sure that PlayerState does not have the same problem - it also has such a field to store info on player-visitable objects. And there is always a chance that I forgot to serialize it.
By chance, I have recently worked on code that had >100 CPU-bound threads running simultaneously. They wreck CPU efficiency compared to 2 threads - context switches definitely do show up as CPU usage.
Now that you mention client<->server communication, where does the AI live? On the server, or on the client? If it's the client, does it have a "mirrored" copy of the game state to reduce round trips asking the server to update the game state?
Did you mean that the same hero can visit the same rewardable objects under different players and receive the rewards each time?
I've already fixed it - there's a pull request on git. Can you tell me what kinds of objects are player-visitable? Obelisks?
What's a good way to debug under Linux? I tried QtCreator, but not only can it not display variables passed by reference, it also crashes often. Should I build VCMI with debug flags (-g -O0) to fix that?
The AI is on the client side, like every player. It can only get info provided by the server, checked for correctness / accessibility. For example, it's not possible to check the state of an object that is behind the fog of war.
This was introduced by Tow to ensure maximum safety and cheat-proof multiplayer, but it looks questionable for the AI.
One possible workaround is to store a lot of info on the AI side and reuse it. In fact I tried this for some optimizations, but it's limited.
The AI lives on the client, and while the client does have its own copy of the game state, this is an insecure solution - a player could use a modified client to cheat in multiplayer. The idea was to remove this copy completely in the future. (Not sure why the AI runs on the client instead of the server, though.)
However, there is one situation where this is unavoidable - actions. For example, when the AI moves a hero, a pack must be sent to the server and processed, and the results sent back to the client. Not sure how fast the OS wakes idling networking threads.
At some point we had "quick" updates, where the client would apply changes without waiting for the server's reply (and abort if the server returned an error). But this turned out to be too buggy and was removed.
Not exactly. The server (which handles all this) will always have access to the full info, and a change of owner won't cause any issues. But the client should not know info about heroes it doesn't own.
The problem was the player interface - the information visible to the user. We had two possibilities:
a) Store the info on visited objects in the player interface. However, if a hero changes owner, the new player won't have the info on objects this hero visited before - he will see "Not visited" when hovering over objects that were visited under the hero's previous owner(s).
b) Store this info in the hero object. In this case a change of owner automatically transfers the info on visited objects to the new owner.
Obelisks and keymasters are the ones that give something to the player.
But there is another group - objects that reveal their state after a visit: e.g. the player will see the guards of creature banks, the skill from a witch hut, the requirements of a tree of knowledge, etc.
But this part looks like a false alarm - that field is already handled by the serializer.
I use QtCreator, sometimes gdb directly from the console, and valgrind to track memory-related issues.
Compiling with -O0 is a good idea; CMake has the -DCMAKE_BUILD_TYPE=Debug option for this.
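For reference, a typical out-of-source CMake workflow for a debug build (binary paths here are illustrative and may differ in your checkout):

```shell
# Configure a debug build (implies -g -O0 with the default toolchain)
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make -j"$(nproc)"

# Then run the client under gdb, or under valgrind for memory issues:
gdb --args ./client/vcmiclient
valgrind --leak-check=full ./client/vcmiclient
```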
This is what I had in mind. A shame it didn't work, because that's how I'd try to implement the AI.
After all, the AI doesn't cheat by design, so the only task it needs the server for is verifying, at the end of the turn, that all of its messages were legal. They should always be legal, because the game state used would be an exact copy of the server's.
This way, the AI wouldn't need to wait for server replies during its turn at all - only for the final reply: "all OK, your turn is over".
EDIT: come to think of it, the state would have to be synced with the server every time a piece of the map is discovered. Not so simple any more.
The next best thing would indeed be to run the AI on the server.
It's not only about cheating, but also correctness. We used to have some bizarre bugs in the AI - for example, it managed to get 8 or more creature stacks into an army. That's not possible through the GUI, but at that time the server didn't forbid it. To this day we don't know how it happened, but for sanity's sake the server now checks for such cases.
I see what you did there… anyway I think porting a product should result in some bugfixing.
Speaking of that… on the iOS build the AI runs very fast, but I can't be sure it's all working correctly since I haven't tested it much yet. The most important changes I've made there are removing the shared-object linkage and compiling the AI together with the main code. I need to do some more testing to make sure it all works… Tomorrow I'll try building it for OSX and see whether the same holds there, though I don't see why using shared objects would cause such a drastic slowdown.
EDIT: The adventure AI doesn't work properly - it must be something else I've messed up.
I really doubt that server communication is a significant factor in the AI slowdowns. First, it is not that slow (otherwise it would be visible in the GUI); second, there should not be that many message exchanges (only when the AI actually takes some action).
The AI is (or at least was, when I last touched it) written quite hastily and naively - the algorithms surely can be optimized. It also spent quite a lot of time on pathfinding, sometimes running it several times for the same heroes (when switching between them). I don't remember exactly how it is now, but it might be worth checking.
But the general suggestion is always to profile first - it is extremely hard to guess accurately where the chokepoint is.
I'm trying to profile the AI - so far my approach has been to use the prof tool on the single AI thread, but either I can't identify the blocking points, or they are spread evenly across many functions (unlikely).
Unless someone knows how to profile correctly under Linux, I'm going to try separating the AI into its own process and profiling that process instead. That will be a branch that I'll publish without the intention of merging.
That branch might still be interesting for checking whether AI crashes are as frequent as before.
Still, I don't see any obvious bottlenecks in the AI - CPU usage quickly spreads across ~10 different functions which don't have anything in common.
What IS weird is that, when running without an attached profiler, the CPU usage of vcmiclient never goes above ~30-40% during the AI turn. So I don't think the bottleneck is the CPU - it HAS to be something else.
UPDATE: anon, this Intel tool looks quite interesting. Yes, they do provide a free non-commercial version, and it works perfectly on my system. Now I only need to analyze the results.
Can we avoid that thread creation somehow? Some delays (although not as big as setSelection's) also come from this line.
Not related to bottlenecks, but how realistic would such a feature be? The graphs where all AIs execute one after another on a 4-core CPU are quite long.
Besides, the slowest systems we seem to support (Android) usually have 2-4 cores, so this could be a great speedup there.
Which file/line is this? There are several copies related to AI.
I'm trying to interpret those lines - do they mean that the AI keeps switching heroes all the time (perhaps to decide what action to take)?
I have no idea whether simultaneous turns are possible, but I'm sure they would break the existing rules about who goes after whom. Imagine CPU1 attacking CPU2's castle on a new week - the outcome would depend on whether CPU2 was "clicky enough" to buy all the new units first.