[AI] Some initial observations

Okay, AI should now be fixed, and the branch and pull request are already on GitHub. I still need to find a nice way to re-enable the obviously useful cheats, but other than that everything should be OK.

BTW - can somebody on Windows check my branch and see whether there are notable changes in speed on the Windows side?

I think I can do that, but can you tell me how to get all the commits you’ve made merged into one?
Edit: Ok, got it. Compiling it now.

This is true, but should AI react to all changes immediately? Let AI react on the fly only to enemy heroes (or maybe something more) and analyse anything else only after the full move (successful or not).

Ahh, I see your point. In general AI should move continuously until it stops. However, there may be unpredictable events on its way (now) and maybe some scripts in the future.
Not sure if that’s easy to implement, though, as hero movement caused uncountable number of bugs in the past.

Kantor, I kind of expected AVS or Warmonger to reply here but as long as OS is not Linux I’m fine :slight_smile:

This depends on the tools that you use. You need to find two commands (pull and checkout) and then recompile:

  • pull from git repo ( github.com/vcmi/vcmi )
  • checkout branch named “optimization/swiftAI”
  • recompile project

If you want to switch back to a “normal” build without these changes, switch to the main branch, named “develop”:

  • checkout “develop”
  • recompile

The idea was to execute this batch on the server, but as soon as something happens (event, battle, script, etc.) stop movement and let AI decide what to do: react or continue movement (send the remaining part of the queue again).

I’m also not sure how easy something like this would be to implement, but it may help with AI speed issues in the future.

Example:
AI wants to visit an object at tile (0, 20) with a hero at (0, 0), but there is a (hidden) event at (0, 10).
AI sends a batch of movement requests to the server: (0,0 -> 0,1), (0,1 -> 0,2) …

Server starts moving hero tile-by-tile, sending HeroMoved events on each move until it reaches (0, 10). Then hero movement stops, server processes event visitation and sends all necessary messages to clients.

AI processes “event visited” message and resumes movement or selects another action

Yes, this is it. AI should send a list of tiles it wants to visit (as one request) and wait for the result once. If the cb->moveHero() method is modified to accept a list, this should require no significant changes or workarounds.

Server should not always send HeroMoved events. Server should distinguish human and AI players and know the enemy speed setting for human players. Server should send separate HeroMoved events only to human players, for their own heroes and for visible enemy heroes, and only if enemy speed is not “fastest” aka “teleport”.
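To make the example above concrete, here is a rough sketch of how a batched move with interruption could behave. This is not VCMI code: Tile, BatchResult, moveHeroBatch and remainingPath are hypothetical names made up for illustration, and the "interrupts" predicate stands in for whatever the server would use to detect events, battles or scripts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical tile coordinate (VCMI itself uses int3).
struct Tile
{
	int x;
	int y;
	bool operator==(const Tile & other) const { return x == other.x && y == other.y; }
};

// Hypothetical result of one batched movement request.
struct BatchResult
{
	Tile finalPos;         // where the hero ended up
	std::size_t stepsDone; // how many steps of the batch were executed
	bool interrupted;      // true if something on the way stopped the hero
};

// Server side: walk the path tile by tile and stop right after the first
// tile for which 'interrupts' returns true (event, battle, script, ...).
// 'path' lists the tiles to step onto, in order, without the start tile.
BatchResult moveHeroBatch(Tile start, const std::vector<Tile> & path,
	const std::function<bool(const Tile &)> & interrupts)
{
	assert(static_cast<bool>(interrupts)); // a predicate must be supplied
	BatchResult result{start, 0, false};
	for(const Tile & next : path)
	{
		result.finalPos = next;
		++result.stepsDone;
		if(interrupts(next))
		{
			result.interrupted = true;
			break;
		}
	}
	return result;
}

// AI side: the part of the queue that can be sent again after an interruption.
std::vector<Tile> remainingPath(const std::vector<Tile> & path, const BatchResult & result)
{
	return std::vector<Tile>(path.begin() + static_cast<std::ptrdiff_t>(result.stepsDone), path.end());
}
```

With the path (0,1) … (0,20) and an event at (0,10), the first call would stop after 10 steps, and the AI could then either pick another action or resend remainingPath() as a new batch.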

Warmonger, biggest problem with batched movement would be server-side detection of interruptions - how to detect that movement must be aborted? Simplest approach would be to move as long as hero is not visiting anything but not sure if this would be 100% reliable especially with scripts.

AVS, IMO speed should be purely client-side property. Why gui-less server should care about speed of animation? For fastest/teleport speed it is client who should just skip the animation. And that’s how max speed works right now.

It may be a good idea to send batched HeroMoved to client at least in case of batched movement to create a bit smoother movement on client but I don’t think that we need any other changes than that.

Ok, I recompiled VCMI and will shortly post my observations.
Edit: Hmm, I’ve tested a few maps and I must say it’s very nice to see how rapidly AI acts now. So yes, AI is faster now (compared to the older one, the difference is huge). I also saw that AI battles with neutrals slow the turns down slightly, but as I said, only slightly. I’ll soon run a longer test to see how much time it takes to play through, let’s say, 4-5 months in game.

So these changes are not Linux-only, yay!

That looks pretty much like what I see on my side: almost no delays in AI actions except battles, and neutral heroes are always on the move. AI acts even faster when behind fog of war (or with AI movement visibility disabled).

Yes, and I can say even more: here are the logs I made today. Both runs lasted 6 game months. The log called ‘old’ shows that the game ran for about 4 hours (check the first and last lines of VCMI_server_log), though for part of that time I had visibility of AI moves turned on. With the new AI, 6 months passed smoothly within 1.5 hours; visibility of AI moves was of course disabled the whole time. You can check this in the log called ‘new’.
VCMI New AI Logs.rar (3.16 MB)
VCMI Old AI Logs.rar (8.94 MB)

Not sure if I should push AI optimization further, but running one or two analyses to check for possible bottlenecks won’t hurt.

AI bottleneck #1 - huge number of checks for tile visibility
Accessing fog of war seems to be too slow right now. Warmonger, can you clarify what the purpose of this code is? I think it may become too slow over time as AI explores the map (I could be wrong though).

int3 VCAI::explorationNewPoint(HeroPtr h)
{
	...
	for (int i = 1; i < radius; i++)
	{
		getVisibleNeighbours(tiles[i-1], tiles[i]); // <--- Profiler hates this line.
		removeDuplicates(tiles[i]);
		...

AI bottleneck #2 - huge number of accesses to player/team state maps (operator[])
Mostly caused by #1. Access to playerstate/teamstate seems to be too costly due to searches in std::map. Perhaps we should turn the PlayerState/TeamState maps into vectors.

Another option would be to reduce amount of accesses to player/team state, most notably - tile visibility.

AI bottleneck #3 - AI thread creation
Thread creation is fast, but not when it is performed about 100 times per minute (true for my autoskip tests). It should be replaced with a thread pool or a shared AI thread.
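For illustration, a shared AI thread could look roughly like the sketch below: one long-lived worker that pulls tasks from a queue instead of a new thread per request. SharedWorker is a hypothetical class written with plain standard library primitives, not something that exists in VCMI (which may prefer its own threading wrappers).

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical shared worker: one thread is created once and reused for
// every submitted task.
class SharedWorker
{
	std::mutex mx;
	std::condition_variable cv;
	std::queue<std::function<void()>> tasks;
	bool stopping = false;
	std::thread worker; // declared last so the other members exist before it starts

	void run()
	{
		for(;;)
		{
			std::function<void()> task;
			{
				std::unique_lock<std::mutex> lock(mx);
				cv.wait(lock, [this] { return stopping || !tasks.empty(); });
				if(tasks.empty())
					return; // stopping was requested and the queue is drained
				task = std::move(tasks.front());
				tasks.pop();
			}
			task(); // run outside the lock so submit() is never blocked by a task
		}
	}

public:
	SharedWorker() : worker([this] { run(); }) {}

	// Finishes all queued tasks, then joins the thread.
	~SharedWorker()
	{
		{
			std::lock_guard<std::mutex> lock(mx);
			stopping = true;
		}
		cv.notify_one();
		worker.join();
	}

	void submit(std::function<void()> task)
	{
		assert(static_cast<bool>(task)); // empty tasks are not allowed
		{
			std::lock_guard<std::mutex> lock(mx);
			tasks.push(std::move(task));
		}
		cv.notify_one();
	}
};
```

The thread creation cost is paid once in the constructor; each later request only costs a lock and a queue push.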

Networking bottleneck #4 - delay before client receives reply from server
We no longer have ~100 ms delays from TCP, but there is still a ~15 ms delay, half of which is caused by the serializer. A slightly beautified stack trace:

std::map<>::count - stl_map.h
CTypeList::castSequence+0x141 - Connection.cpp:487
CTypeList::castSequence+0xd6 - Connection.cpp:514
CTypeList::castSequence+0x8a - Connection.cpp:529
castHelper<&IPointerCaster::castRawPtr>+0x65 - Connection.h:206
CTypeList::castRaw+0x51 - Connection.h:250
CISer<CConnection>::loadPointerHlp<CPack*>+0xb7 - Connection.h:1273
CISer<CConnection>::loadPointer<CPack*>+0x261 - Connection.h:1255
LoadPointer<CConnection, CPack*>::invoke+0x22 - Connection.h:358

Map handler bottleneck #5 - moving objects on map (e.g. heroes) is too slow
Mostly comes from CMapHandler::hideObject(), which iterates over the entire map to remove all blocked/visitable positions of this object.
It should be optimized to iterate only over the object's size. However, the iteration must be done BEFORE the object is actually moved, or we must know its previous position. Not sure if this can be implemented easily.
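A small sketch of the idea, assuming the caller still knows the object's previous position at the time of the call. MapObject and TileGrid are hypothetical stand-ins, not the real CMapHandler structures, and the footprint is assumed to be a rectangle anchored at its top-left tile.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical map object with a rectangular footprint.
struct MapObject
{
	int id;
	int x, y;          // top-left tile of the footprint (assumption)
	int width, height; // footprint size in tiles
};

// Hypothetical grid that stores the ids of objects present on each tile.
class TileGrid
{
	int w, h;
	std::vector<std::vector<int>> cells;

	std::vector<int> & at(int x, int y) { return cells[static_cast<std::size_t>(y * w + x)]; }

public:
	TileGrid(int width, int height)
		: w(width), h(height), cells(static_cast<std::size_t>(width * height))
	{
		assert(width > 0 && height > 0);
	}

	// Register the object on every tile of its current footprint.
	void place(const MapObject & obj)
	{
		for(int y = std::max(0, obj.y); y < std::min(h, obj.y + obj.height); ++y)
			for(int x = std::max(0, obj.x); x < std::min(w, obj.x + obj.width); ++x)
				at(x, y).push_back(obj.id);
	}

	// Remove the object by touching only the footprint at (oldX, oldY)
	// instead of scanning the whole map.
	void hideObjectAt(const MapObject & obj, int oldX, int oldY)
	{
		for(int y = std::max(0, oldY); y < std::min(h, oldY + obj.height); ++y)
			for(int x = std::max(0, oldX); x < std::min(w, oldX + obj.width); ++x)
			{
				std::vector<int> & ids = at(x, y);
				ids.erase(std::remove(ids.begin(), ids.end(), obj.id), ids.end());
			}
	}

	// Hide at the old position first, then update the object and place it again.
	void moveObject(MapObject & obj, int newX, int newY)
	{
		hideObjectAt(obj, obj.x, obj.y);
		obj.x = newX;
		obj.y = newY;
		place(obj);
	}

	bool contains(int x, int y, int id) const
	{
		const std::vector<int> & ids = cells[static_cast<std::size_t>(y * w + x)];
		return std::find(ids.begin(), ids.end(), id) != ids.end();
	}
};
```

The cost of a move then depends on the object size rather than the map size, which matters most for heroes that move every turn.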

[quote]

getVisibleNeighbours(tiles[i-1], tiles[i]);

This code gets tiles at the edge of visibility border, from the very edge closer to the center. This way we can look for further-most tile for exploration. It was all made by Tow and looks like very simple solution to somewhat complex problem.

It doesn’t seem, however, that already checked tiles are removed from the list. So this code might be indeed inefficient.

Access to (small) map should be really lightweight.

Otherwise it may be troublesome to implement, as player vector is not continuous.[/quote]

Apparently not lightweight enough. Right now we have one access to the map for each access to FoW. This results in a huge number of accesses to player state, and that’s too much for VCMI to handle. One way to fix this would be to get a pointer to the FoW table once and use it instead of accessing team state each time.
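Roughly what I mean, as a sketch with made-up types: FogOfWar, TeamState and TeamMap below are simplified stand-ins, and the real VCMI layout may differ. The first function searches the map for every tile, the second looks the team up once and then reads the table directly.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical, reduced stand-ins for the real structures.
using FogOfWar = std::vector<std::vector<bool>>; // fow[x][y] == true means visible

struct TeamState
{
	FogOfWar fogOfWar;
};

using TeamMap = std::map<int, TeamState>;

// Current pattern: a std::map search for every single tile check.
std::size_t countVisiblePerTileLookup(const TeamMap & teams, int team)
{
	assert(teams.count(team) == 1);
	std::size_t visible = 0;
	const std::size_t width = teams.at(team).fogOfWar.size();
	for(std::size_t x = 0; x < width; ++x)
		for(std::size_t y = 0; y < teams.at(team).fogOfWar[x].size(); ++y)
			if(teams.at(team).fogOfWar[x][y]) // map search on every iteration
				++visible;
	return visible;
}

// Proposed pattern: look the team up once, then read the table directly.
std::size_t countVisibleCached(const TeamMap & teams, int team)
{
	assert(teams.count(team) == 1);
	const FogOfWar & fow = teams.at(team).fogOfWar; // single map search
	std::size_t visible = 0;
	for(const std::vector<bool> & column : fow)
		for(bool tile : column)
			if(tile)
				++visible;
	return visible;
}
```

Both functions return the same count; only the number of map searches differs.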

Yeah. One option would be to add NOT_PRESENT status in addition to existing loser/winner/ingame. But this means that all accesses to player state should be rechecked.
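A possible shape for this, as a sketch only: a fixed-size table indexed by player colour, where unused slots carry the extra NOT_PRESENT status. The enum, PLAYER_LIMIT = 8, PlayerState and PlayerTable here are assumptions for illustration and do not match the real VCMI definitions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical status enum with the extra NOT_PRESENT value.
enum class EPlayerStatus { NOT_PRESENT, INGAME, LOSER, WINNER };

constexpr std::size_t PLAYER_LIMIT = 8; // assumed maximum number of player colours

// Hypothetical, heavily reduced player state.
struct PlayerState
{
	EPlayerStatus status = EPlayerStatus::NOT_PRESENT;
	int heroCount = 0;
};

// Array indexed by player colour: lookup is a bounds check plus an index
// operation instead of a std::map search.
class PlayerTable
{
	std::array<PlayerState, PLAYER_LIMIT> slots{};

public:
	// Mark a colour as taking part in the game and return its state.
	PlayerState & add(std::size_t color)
	{
		assert(color < PLAYER_LIMIT);
		slots[color].status = EPlayerStatus::INGAME;
		return slots[color];
	}

	// Returns nullptr for colours that are out of range or not in this game,
	// so callers are forced to handle the gaps explicitly.
	const PlayerState * find(std::size_t color) const
	{
		if(color >= PLAYER_LIMIT || slots[color].status == EPlayerStatus::NOT_PRESENT)
			return nullptr;
		return &slots[color];
	}

	std::size_t presentCount() const
	{
		std::size_t count = 0;
		for(const PlayerState & slot : slots)
			if(slot.status != EPlayerStatus::NOT_PRESENT)
				++count;
		return count;
	}
};
```

The downside mentioned above stays: every place that iterates over players has to skip NOT_PRESENT slots, so all existing accesses would need a review.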

I noticed that with AI running for a whole day (35 months) it slowed down considerably, but after relaunching VCMI it is fast again. This may have something to do with a memory leak, as memory usage grows every turn.

Please tell me, what is the point of testing one map for so long? To catch errors? It is better to take a few different maps and run each for 12 months, or at least 2-3 months, than to run one map for a very long time. Ideally maps with 7 computer players and all sorts of objects.
I once tried to find a fault on a single map and ran it for 241 months without finding one. Of course there were crashes because of a memory leak, but after loading a save everything was in order. I did not find a single error in AI. There were plenty of errors on other maps, though.

Or are you pursuing some other purpose? I do not understand. Please explain.

EDIT:

I didn’t notice this thread was a month old. Please disregard.