Old 08-07-07, 08:45 PM   #61
skwasjer

As promised, and prior to the upcoming release, I wrote up a brief explanation of the mechanism I coded into S3D for property data type detection. This is actually the primary reason why the application will still be alpha. I need help from the community to make the parser foolproof. As my modding skills are absolute zero, I can't guarantee detection is accurate, and this is where you (should) step in. Read up on this if you are interested.

Warning: skip if you don't have some technical background...

PS: this doc is focused on the next release, so some things I discuss are not in the current public alpha!


A word on automatic property data type detection
Auto detection of property data types is not foolproof. Remember this. There are even scenarios (especially with string types, more on this later) where a property may be detected as one type today and as another tomorrow, because you changed its value. This is why I recommend 'predefining' as many properties as possible (at least the ones you change!). I have not done this yet, because I'm not 100% sure about certain data types, because there are so many of them, and because I'm lazy. You can help out by providing feedback on each value that gives problems, so I can predefine more data types over time.

So how does it work?
  • First, S3D checks the xml config file with predefined data types. If a property is predefined, it is used, and autodetection is skipped. You'll see this in the editor by the label 'predefined'.
  • Next, S3D checks if the property is possibly a string. My criteria (not perfect, but the best I can do for now without affecting performance) are:
    • Every byte (except the last) must be greater than or equal to 32. If even a single byte is less than 32 (a non-printable control character), the value is not considered a string. Silent Hunter uses the Windows-1252 encoding: http://en.wikipedia.org/wiki/Windows-1252
    • The last byte must be 0 (the null terminating character).
  • Next, S3D checks the data size of the property:
    • 1 byte = 'byte' type. This can also be a 'bool' type (true or false, 0 or 1) but in that case you have to predefine it.
    • 2 bytes = 'ushort' type. This can also be a 'short' but in that case you have to predefine it.
    • 4 bytes = 'float' type. This can also be an 'int' or 'uint' but in that case you have to predefine it.
    • 8 bytes = 'ulong' type (often they are IDs). This can also be a 'long', 'double', or 'vector2' but in that case you have to predefine it.
    • 12 bytes = vector3.
  • If all else fails, S3D assumes the property is a collection. It could however also be an 'array', a 'stringCollection', or another available type. Sometimes, one of the above autodetection rules can even be incorrect (for example, 12 bytes could also be a stringCollection!!!). We can never know for sure using this technique. This is when a data type must explicitly be defined in the xml config file, to either overrule wrong detection or overrule the final default 'collection' type.
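
The rules above can be sketched in a few lines of Python. This is only an illustration, not S3D's actual code: the names detect_type and looks_like_string and the 'predefined' argument (standing in for a lookup in the xml config file) are made up for the example. The sketch also assumes that a string needs at least one character before the null terminator, because a lone 0 byte ends up being detected as a 'byte'.

```python
from typing import Optional

# Size-based fallbacks, as listed in the rules above.
SIZE_TYPES = {1: "byte", 2: "ushort", 4: "float", 8: "ulong", 12: "vector3"}


def looks_like_string(data: bytes) -> bool:
    """String test: printable bytes followed by a null terminator."""
    # Assumption: at least one character must precede the terminator.
    if len(data) < 2 or data[-1] != 0:
        return False
    # Every byte except the last must be >= 32 (no control characters).
    return all(b >= 32 for b in data[:-1])


def detect_type(data: bytes, predefined: Optional[str] = None) -> str:
    """Return the type name the rules would pick for a raw property value."""
    # 1. A predefined type always wins; autodetection is skipped.
    if predefined is not None:
        return predefined
    # 2. String check.
    if looks_like_string(data):
        return "string"
    # 3. Size check, falling back to 'collection' when nothing matches.
    return SIZE_TYPES.get(len(data), "collection")


print(detect_type(b"Hello\x00"))           # string
print(detect_type(b"\x00\x00\x80\x3f"))    # float (4 bytes, fails the string test)
print(detect_type(b"\x01\x02\x03"))        # collection
print(detect_type(b"\x01", "bool"))        # bool (predefined)
```

Note that under these assumptions the two bytes 0x41 0x00 (the ushort 65) pass the string test, which is one more reason to predefine anything you intend to edit.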
So why do strings in particular pose a problem? Well, say you have a string of arbitrary length, and you decide to clear the value (an empty string). In the file structure, an empty string is written back as a single byte: the null terminating char, or 0. When the file is read again, and the property is not predefined as a string, my autodetection rules will fail to identify it as a string; they will in fact see it as a byte value this time (see the rules above again). More scenarios like this are possible, and they should be avoided at all costs.
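
The empty-string round trip can be reproduced with a small, self-contained Python sketch. Again, this is an illustration under assumptions and not S3D's code: write_string and autodetect are hypothetical names, and autodetect is a condensed version of the rules that assumes a string needs at least one character before the terminator.

```python
def write_string(value: str) -> bytes:
    """Store text as Windows-1252 bytes followed by the null terminator."""
    return value.encode("cp1252") + b"\x00"


def autodetect(data: bytes) -> str:
    """Condensed autodetection: string test first, then size, then collection."""
    # Assumption: a string needs at least one character before the terminator.
    if len(data) > 1 and data[-1] == 0 and all(b >= 32 for b in data[:-1]):
        return "string"
    sizes = {1: "byte", 2: "ushort", 4: "float", 8: "ulong", 12: "vector3"}
    return sizes.get(len(data), "collection")


for text in ("Torpedo", "abc", ""):
    raw = write_string(text)
    # 'Torpedo' -> 8 bytes, string
    # 'abc'     -> 4 bytes, string
    # ''        -> 1 byte,  byte   (the type silently changed)
    print(repr(text), len(raw), autodetect(raw))
```

With a predefined type the detection step is never reached, so the cleared value would still be read back as a string.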

The biggest benefit of this mechanism is that S3D is able to open most files for reading without problems, even though the detected types are not 100% accurate. I recommend that you never modify autodetected property values though, because of the problems described above. Before modifying, analyse the data type of each property you want to change. Is it correct? If not, what should it be? Next, ask me or try yourself to add definitions for them. Try to stay away from global definitions as much as possible, unless you are absolutely sure (they may cause other files to be read incorrectly!). Once they work, change the values as you wish, and share the definitions with the community, and also with me so I can include them in future updates. You are then helping to improve S3D's property parser and editor. Over the months to come, the parser should rely less and less on autodetection and more on definitions, resulting in fewer parse mistakes in the long run...

PS: a how-to on property data type definitions is described in the xml config file 'propertydefinitions.xml' in the installation folder.


Hope this clears up a few things. I guess it will raise questions too though, so ask away if you have any...

Bed time now.