Common format for configuration files

Hi. Currently every config file in VCMI uses its own format instead of something more or less unified. I’d like to have support for some common format like XML (not a big fan of this one), JSON (our settings.txt uses something similar) or something along those lines.

There is no sense in replacing everything we have so far. I just want to have one format that will be used from now on; for example, I currently need one for animations. Of course I can make one more format and one more parser with error checking (modders/artists will have to use it in the future), but why not do this once and for all?
Later we may need it for features like mod support.

I personally would like to use JSON: it has a simple syntax and nothing complicated inside (array, map (key-value) and basic types: string, float, bool are basically the whole format). I also have a tiny read-write parser that should work for VCMI (no need to look for or write one).

XML, on the other hand, is more popular, but it has an overly verbose syntax and is probably too powerful for simple config files.

Any suggestions or better proposals?

Well, most config files for VCMI can be opened with TextEdit and are just simple tab-separated tables editable with any Excel / OO Calc. Since they are not expected to be complex and full of options, simplicity is desirable so that an average user can edit them.

However, I see some issues with bonusnames.txt, which is a more complex topic as some of the descriptions need to be branched; the same is true for graphics for the Bonus System. Smarter solutions would be welcome.
Also, in some distant future I’m going to implement an expert system which involves simple, yet relatively diverse rules, and they will need a complex parser. Since I’m not really into compiler theory, it would be nice to have some reference, especially one already used in VCMI. I mean, having a dozen different file styles won’t be clear to anyone, and these files are going to be subject to modification.

It’s a good idea to unify our config/settings files. In the next few days I’ll create a new settings file for unit animation speed and projectile speed, which would have its own format.

I would choose XML because many people understand this format, according to Wikipedia it can even be shorter than JSON, and I often work with it. =)

JSON looks nice, definitely better than XML ( … uu_Xml.png ). Feel free to integrate JSON parser into VCMI and use it in new config files.

OK, after a further investigation and because of our usage I’ve changed my opinion. :slight_smile:

It seems that JSON has better support for data structures like arrays, booleans, … It’s perhaps also better suited for our config/settings files than XML, which feels more like a document format for delivering mostly text with semantic descriptions. JSON also distinguishes numbers, booleans and strings more clearly. Its strings are Unicode (UTF-8 by default). JSON looks somehow cleaner, with less unnecessary information.

I’ve fixed several identical bugs in the parsers in the past, so I’d like to see that move towards JSON happen.

For the parser, would you consider which seems quite complete, popular and supported ?

I saw some nice online JSON editors. Haven’t found anything that can be downloaded though…
Plain text works fine for 2D arrays or some lists. But for more complicated cases we’ll need something else. The current situation with buildings, for example (5 separate config files), is far from perfect.

Animation speed is one of the things I’d like to add to my animation description. Can you post your version of the format for this here, or PM it to me?

Great site: "You must logon to download zip files."
At least they have a browse code tab. It looks like the one I saw somewhere on github. I searched there for a JSON parser, looking for something small (2000 lines of code in 12 files to parse such a simple format?) with some simple error checking (just the line/position of an error with some description). Nothing. :confused:

After this search I took the JSON parser I made earlier this year and added error checking to it, without any 3x slowdown in error checking mode like in this one. It uses only STL. Just ~500 lines.
Of course it’s not perfect (no string escaping or Unicode support ATM; all input is std::string), but why can’t I find something similar?

We may also need some things not present in JSON (like comments and some basic schema checking), so using our own parser will be a more flexible solution.

JSON is all about parsing, using the data and exporting it back. It can be edited by hand in a text editor or in a program targeted at specific data structures, like generators, etc.

Well, comments are present in JSON :wink:

data: {
// one line comment
something: 0
/* multi

There are no comments in JSON according to RFC 4627. But of course it’s no problem to add them.

Well, strange I’d say, because most parsers support them. Every one that I’ve used, at least.
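One way to add comment support without touching the parser itself is to strip comments in a pre-pass. The following is only a minimal sketch under that assumption; `stripComments` is a hypothetical helper, not part of any parser mentioned in this thread. It removes `//` and `/* */` comments outside string literals and keeps newlines so that error line numbers stay meaningful.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: removes // and /* */ comments that appear outside
// string literals. Newlines are preserved so reported line numbers still match.
std::string stripComments(const std::string &in)
{
    std::string out;
    out.reserve(in.size());
    bool inString = false;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        const char c = in[i];
        if (inString)
        {
            out += c;
            if (c == '\\' && i + 1 < in.size())
                out += in[++i];              // copy the escaped character verbatim
            else if (c == '"')
                inString = false;            // closing quote
        }
        else if (c == '"')
        {
            inString = true;
            out += c;
        }
        else if (c == '/' && i + 1 < in.size() && in[i + 1] == '/')
        {
            while (i < in.size() && in[i] != '\n')
                ++i;                         // skip to the end of the line
            if (i < in.size())
                out += '\n';                 // keep the line break itself
        }
        else if (c == '/' && i + 1 < in.size() && in[i + 1] == '*')
        {
            i += 2;                          // step past the opening "/*"
            while (i + 1 < in.size() && !(in[i] == '*' && in[i + 1] == '/'))
            {
                if (in[i] == '\n')
                    out += '\n';             // keep line breaks inside the comment
                ++i;
            }
            ++i;                             // land on the closing '/'
        }
        else
        {
            out += c;
        }
    }
    return out;
}
```

The cleaned text can then be handed to a strict RFC 4627 parser unchanged.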

I’ve converted a few config files to json. I’m rather happy with the result. The source code is not smaller, but the config files are much more readable. I’ll convert more as time permits.

Hi Ivan,

I have a problem with rev 2627. Default values for the configuration blur the split between data and code, and I doubt the following is more readable:

Also, adding bogus values to vectors when there were none before is bad and may introduce bugs in the rest of the code.

I think this commit should be reverted altogether.

I did this mostly because of a lot of code that looks like this:

node = json["someValue"];
if (!node.isNull())
    someValue = node.Float();

Currently only the last line is required. Also, the lack of an “else” block may leave someValue undefined.

I’ll remove that parameter from all methods except Float() (here the default value can be -1, 0 or 1).

True. There is an error actually: that “!” should not be there. But compare old and new JSON code:
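To make the trade-off concrete, here is a rough sketch of an accessor with a default parameter. `Node` below is a hypothetical stand-in written for illustration, not the actual VCMI class; only the names isNull() and Float() are taken from the snippet above.

```cpp
#include <cassert>

// Hypothetical stand-in for a JSON node that holds at most one number.
struct Node
{
    bool null = true;      // true when the key was absent from the file
    double number = 0;

    bool isNull() const { return null; }

    // With a default parameter the caller needs no isNull() check:
    // a missing value simply yields `def`.
    double Float(double def = 0) const { return null ? def : number; }
};
```

With this, `double someValue = node.Float(-1);` always leaves someValue defined, at the price of moving the default into the code instead of the config file.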

"sex" : 1,
"female" : true,

The latter version is definitely more readable.
So correct code in this case will be:

That looks better, although I still don’t think default values are good because they hide details in the config files (i.e. you need to read both the code and the config file to understand what the config file defines). The object should have already been initialized with a default value.

And I still see an issue with that:

Before, we didn’t insert anything. After, we insert junk.

I’ll revert r2627, it causes crashes on startup for me.
Consider the following line:

const JsonVector &defs_vec = g["additionalDefs"].Vector();

When g["additionalDefs"] is null, the Vector() function returns a const ref to its default argument: a temporary empty vector constructed just before the call. That temporary is destructed shortly after and defs_vec becomes a dangling reference! Any attempt to use it (like foreach) is illegal.
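The pattern and one possible fix can be sketched as follows. `Node`, `VectorRisky` and `VectorSafe` are hypothetical names invented for this illustration (the real Vector() signature is not shown in this thread); the safe variant assumes an empty fallback is acceptable and returns a reference to a function-local static instead of a temporary.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the JSON node discussed above.
struct Node
{
    std::vector<int> items;
    bool isNull = true;

    // Risky: the default argument is a temporary that is destroyed at the end
    // of the full expression containing the call, so for a null node the
    // returned reference dangles as soon as the caller stores it.
    const std::vector<int> &VectorRisky(const std::vector<int> &def = std::vector<int>()) const
    {
        return isNull ? def : items;
    }

    // Safer: fall back to a single static empty vector that lives until
    // program exit, so the returned reference stays valid.
    const std::vector<int> &VectorSafe() const
    {
        static const std::vector<int> empty;
        return isNull ? empty : items;
    }
};
```

Storing the result of `VectorSafe()` in a const reference and iterating over it later is fine; doing the same with `VectorRisky()` on a null node is undefined behavior.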

As for the idea of default values, I generally like it because it saves us some repetition. However, Ubuntux also has good points about violating data/code separation and allowing ill-formed config files to silently feed the engine with junk data.

It would be best if we devised some way of specifying a default value in the config file itself. For example, we could add a machine-readable format specification (something like a DTD) at the beginning of each config file: what the file contains, which attributes are required and allowed, and what their types and default values are (if present). That would allow not only setting default values but also better file validation. A well designed format for such a specification could also serve as documentation, largely replacing the current comments in the file.
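As a very rough sketch of what such a specification could drive, assume it has already been parsed into a table of field descriptions. Everything here (`FieldSpec`, `Schema`, `applySchema`, and treating all values as strings) is a simplifying assumption for illustration, not an existing VCMI facility.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one allowed field.
struct FieldSpec
{
    bool required;
    std::string defaultValue; // only used when the field is optional
};

using Schema = std::map<std::string, FieldSpec>;
using Config = std::map<std::string, std::string>;

// Validates `config` against `schema`, fills in defaults for missing optional
// fields, and returns a list of human-readable problems (empty when valid).
std::vector<std::string> applySchema(const Schema &schema, Config &config)
{
    std::vector<std::string> errors;

    // Fields the specification does not know about are reported, not ignored.
    for (const auto &entry : config)
        if (schema.count(entry.first) == 0)
            errors.push_back("unknown field: " + entry.first);

    for (const auto &entry : schema)
    {
        if (config.count(entry.first) != 0)
            continue;                                     // value was provided
        if (entry.second.required)
            errors.push_back("missing required field: " + entry.first);
        else
            config[entry.first] = entry.second.defaultValue;
    }
    return errors;
}
```

The defaults then live next to the data rather than in the engine code, and an ill-formed file produces an error list instead of silently feeding junk to the engine.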

What do you think about that?

You’re right, but it’s quite counterintuitive behavior: binding a temporary directly to a const reference extends the lifetime of that temporary, but passing it on through another const reference does not…
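The difference can be observed without ever touching a dead object by counting destructor calls. This is only an illustrative sketch; `Tracker` and `passThrough` are made-up names.

```cpp
#include <cassert>

// Counts destructor calls so the lifetime of temporaries can be observed.
static int destroyed = 0;

struct Tracker
{
    ~Tracker() { ++destroyed; }
};

// Hypothetical pass-through, standing in for an accessor that returns its
// const-reference argument.
const Tracker &passThrough(const Tracker &t) { return t; }

// Temporary bound directly to a const reference: its lifetime is extended to
// the end of the enclosing scope, so it is still alive at the return statement.
int destroyedAfterDirectBinding()
{
    destroyed = 0;
    [[maybe_unused]] const Tracker &ref = Tracker{};
    return destroyed;
}

// Temporary passed through a function: it is destroyed at the end of the full
// expression, so `ref` dangles immediately and must never be used.
int destroyedAfterIndirectBinding()
{
    destroyed = 0;
    [[maybe_unused]] const Tracker &ref = passThrough(Tracker{});
    return destroyed;
}
```

The first function reports zero destructor calls while the reference is in scope, the second reports one. Some compilers warn about the second binding.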

The enhancements I’d like to see in the json code are:

  • default instantiators, e.g.:
int id = config["id"];
std::string name = config["name"];

I’m fed up typing Float(), String() and so on.

  • the input should come from a memory array + length, not a string, for speed. That way you just have to map the file into memory and point the parser at it instead of loading the file into a string (and reallocating memory many times in the process). The current parsing is (relatively) really slow.
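For the second point, a parser interface working on a pointer range might look roughly like this. `Cursor`, `skipWhitespace` and `readUnsigned` are hypothetical names for illustration; the sketch reads only unsigned integers and does not handle overflow.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical read position over a caller-owned buffer. Nothing is copied,
// so the buffer may just as well be a memory-mapped file.
struct Cursor
{
    const char *pos;
    const char *end;
};

void skipWhitespace(Cursor &c)
{
    while (c.pos != c.end && std::isspace(static_cast<unsigned char>(*c.pos)))
        ++c.pos;
}

// Reads an unsigned decimal integer at the cursor. Returns false and leaves
// `out` untouched if no digit is found. Overflow is not handled in this sketch.
bool readUnsigned(Cursor &c, unsigned long &out)
{
    assert(c.pos <= c.end);
    skipWhitespace(c);
    if (c.pos == c.end || !std::isdigit(static_cast<unsigned char>(*c.pos)))
        return false;
    out = 0;
    while (c.pos != c.end && std::isdigit(static_cast<unsigned char>(*c.pos)))
        out = out * 10 + static_cast<unsigned long>(*c.pos++ - '0');
    return true;
}
```

A caller constructs `Cursor{data, data + length}` from whatever owns the bytes; a std::string can still be parsed by passing `s.data()` and `s.data() + s.size()`.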

Otherwise I like the simplicity of that code.

ubuntux, your idea is against the c++ standard.
functions can have the “same” name when their parameters differ (in count or type), because the name is translated to include the input types. the return value is not taken into account, which means you cannot have two functions with the same “name” and input but different output. an operator is still a function in the low-level sense (of course range applies!).

that implies that your approach is not directly possible. however, you can use a two-parameter operator (e.g. << >> %) to get a similar approach, although it may feel more difficult to use. so I’ll either use the >> operator and make it similar to a stream


or leave it as is.
well, you can use a three-parameter operator ( ?: or the function operator) but i don’t know if it fulfils the readability goal.

the speed problems ubuntux mentions are fair to resolve the way he suggests, but it isn’t essential until we use json io more frequently at runtime.
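A stream-like `>>` extraction as mentioned in this post could be sketched like this. `Node` is a hypothetical stand-in, not the real class; the two overloads differ in their second parameter type, which is exactly what overload resolution is allowed to use.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a JSON node holding a number or a string.
struct Node
{
    double number = 0;
    std::string text;
};

// Overloads are selected by the type of the right-hand operand.
const Node &operator>>(const Node &node, int &out)
{
    out = static_cast<int>(node.number);
    return node;
}

const Node &operator>>(const Node &node, std::string &out)
{
    out = node.text;
    return node;
}
```

Usage would then read `config["id"] >> id;`, with the target variable declared beforehand.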


operator[] could return something implicitly convertible to float, bool, int, string etc.

That wouldn’t make any sense. How would the compiler know whether to destruct the object bound to a c-ref? How would we prevent it from being destructed multiple times (we can make several c-refs bound indirectly to the temporary)? That’d effectively require using some kind of GC.

You forgot about conversion operators. Node class can have operator int() and operator string() and they’ll be pretty different functions.
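A minimal sketch of that idea, with a hypothetical `Node` class and a plain std::map standing in for the config object (neither is the actual VCMI code):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical node type with one conversion operator per target type.
class Node
{
    double number = 0;
    std::string text;

public:
    Node() = default;
    explicit Node(double n) : number(n) {}
    explicit Node(std::string s) : text(std::move(s)) {}

    // Conversion operators are distinct functions, so the compiler picks one
    // based on the type of the variable being initialized.
    operator int() const { return static_cast<int>(number); }
    operator std::string() const { return text; }
};

// Stand-in for a parsed config file.
using Config = std::map<std::string, Node>;
```

With this, `int id = config.at("id");` and `std::string name = config.at("name");` compile without an explicit Float() or String() call. One caveat: assigning a Node to an already existing std::string can be ambiguous, because std::string also accepts assignment from a single char.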