HEX EDITING TUTORIAL - PART II By DI at
http://www.campaigncreations.org/development/hex.shtml
Well, now you know how to add, subtract, multiply, divide, convert, and generally deal with hexadecimal numbers, right? No?... Well, if you are still not totally clear about the hex number system or just can't seem to do calculations in hex, that's O.K. It's not ideal, but you can still probably manage. I do suggest that you try to understand it, though, as it makes everything else tremendously simpler. Anyway, you'll be happy to hear that Part I is probably the most difficult part of the tutorial, and if you understand the concept of hex reasonably well, the rest will be a snap. Nevertheless, I suggest you also enlist the aid of a calculator that can convert numbers between hex and dec, since you will not be able to compute the bigger values in your head even if you're good. [Windows 95 comes with such a calculator program; you can find it in the Programs/Accessories section of your Start menu. Switch it to Scientific view, input a number, and click the radio buttons to switch it between formats -- you can even do calculations in hex =]
Terminology
This section will deal primarily with the words people use when they talk about hex editing, like "byte" or "offset." All of it is really just fancy programming jargon that defines very simple things. And all you have to do is remember a few definitions, use the words a little bit yourself, and you'll be understanding everything in a jiffy.
When you open a file in a hex editor, you are likely to see something like this:
0000 2353 ABC3 0000 9000 B122 000A
ABC3 0000 9000 B122 000A 2353 ABC3
0000 2353 ABC3 0000 9000 B122 000A
9000 B122 000A 2353 ABC3 0000 2353
etc...
And then you're likely to think, "WTF?" Well, it's not as confusing as it seems. If you'll notice, the "blob" above is nothing more than a lot of numbers in hexadecimal. That's how "data" is stored in files (or at least how you see the data in a hex editor). Now, let's break it up and define the parts:
Byte - this is the basic unit in hex editing. It simply refers to a hex number which has two places, like 23, B1, 0A, and 00. You know when they say something is 100 kilobytes or something like that? All that means is roughly 100 thousand (kilo- means thousand) of these little two-digit numbers. Thankfully, you probably will never have to deal with something that big, but even if you do, there are easy and fast ways to find the bytes you are looking for. A byte can hold any value from 00 to FF, or 0 to 255 in decimal.
Integer / Word / Short Integer (Short) - in this tutorial, these three terms all refer to the same thing: 2 bytes, or a hex number with four places. For example, 2353, 000A, B122, and 0000 are all shorts. Usually people use the term "short," though don't expect it to be universal. A short can hold any value from 0000 to FFFF, or 0 to 65535 in decimal.
Long Integer (Long) / Dword / Double Word - Again, multiple words defining the same thing: four bytes, or a hex number with eight places. For example, 2353000A, 00009000, ABC30000, and 9000B122 are "long"s (which is the preferred abbreviation). A long can hold any value from 00000000 to FFFFFFFF, or 0 to 4294967295 in decimal.
Those three "variable" types are basically all you need to know about. Now, here's where the stuff from the previous section of this tutorial comes in. While the data appears in hex form, values are still normally used in decimal (normal) form. For example, if a game looks at a certain byte to determine, say, how much damage a unit will do, and it finds 0A, you will still probably see the decimal number "10" in the game. Thus, when looking at and editing a file, you are usually instructed to edit a certain byte, short, or long in terms of a decimal value, not a hexadecimal one. For example, if someone tells you to change the value of a certain byte to 12, they probably mean put a 0C there, not a literal 12, unless specifically stated. So you need to know how to switch between decimal and hexadecimal, or at least have a calculator handy to do that for you.
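If you have Python handy, it can stand in for the calculator. The following is only a rough sketch of the conversions described above; the function names dec_to_hex_byte and hex_to_dec are made up for this example and are not part of any tool mentioned in this tutorial.

```python
# Convert between the decimal value you are told to use and the hex you
# actually type into a hex editor. Function names are illustrative only.

def dec_to_hex_byte(value):
    """Return a decimal value (0 to 255) as the two-digit hex of one byte."""
    if not 0 <= value <= 255:
        raise ValueError("a byte only holds 0 to 255")
    return format(value, "02X")


def hex_to_dec(hex_text):
    """Return the decimal value of a hex number given as text."""
    return int(hex_text, 16)


print(dec_to_hex_byte(12))     # 0C
print(dec_to_hex_byte(10))     # 0A
print(hex_to_dec("FF"))        # 255, the largest byte
print(hex_to_dec("FFFF"))      # 65535, the largest short
print(hex_to_dec("FFFFFFFF"))  # 4294967295, the largest long
```

So "change that byte to 12" means typing 0C, exactly as in the example above.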
I'm going to digress for a moment to point out a very important fact. Most files you will edit have their data stored in a notation called Little Endian or Intel Notation. This means that the "numbers" are not really in the order you would expect. In fact, each value is stored with its bytes in reverse order. That is, the value of a long (four bytes) is actually read with the last byte first, then the third byte, then the second byte, then the first byte. For example, the short:
10 00
does not store the value of 4096 as you might think. You first have to reverse the order of the bytes to get the actual hex number:
00 10
So actually, that short stores the value 16. Note that each byte (each "pair" of hex digits) doesn't change; in other words, you don't reverse the order of all the digits, you only reverse the order of the bytes, while keeping the individual two-digit numbers the same. Let's do another example just so you're clear:
0000 2353
is read as:
53 23 00 00
That's 1394802688 in decimal if you're keeping track. =)
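To check your own byte-flipping, here is a small Python sketch that reads a value the same way. It assumes the bytes are typed exactly as the hex editor shows them; the function name read_little_endian is invented for this example.

```python
# Read a value stored in Little Endian (Intel) notation.
# read_little_endian is an illustrative name, not a standard function.

def read_little_endian(hex_text):
    """Turn hex as shown in the editor into the number it stores."""
    raw = bytes.fromhex(hex_text)  # spaces between bytes are ignored
    return int.from_bytes(raw, byteorder="little")


print(read_little_endian("10 00"))      # 16, not 4096
print(read_little_endian("0000 2353"))  # 1394802688

# For comparison, reading "10 00" without reversing the bytes:
print(int.from_bytes(bytes.fromhex("10 00"), byteorder="big"))  # 4096
```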
O.K., now you should be able to read values from a hex editor pretty well. I suggest you pick a few blocks at random from that big blob of numbers above and try to figure out their values (or just do the exercises at the end of this lesson). Here are some more hex terms you should know:
Offset - this just refers to the location of a byte in a file. In other words, it refers to the number of a byte. It is important to note that hex editors start numbering bytes from 0, not 1, so the first byte is at offset 0. For example, if I told you to go to offset 101 in a file, I mean go to the 102nd byte (assuming my 101 is in decimal). Let's do an example. Assume this is a file:
2353 ABC3 0000 9000 B122 000A
Offset 6 (decimal) in this file is where?...
23 53 AB C3 00 00 90 00 B1 22 00 0A
 0  1  2  3  4  5  6  7  8  9 10 11
Offset 6 holds the byte 90, which is the seventh byte in the file.
Get it? File offsets are sometimes given in hexadecimal. For example, you may be told to go to offset 2ACD, which means go to the 10958th byte (2ACD hex = 10957 dec, but we add 1 to find the actual byte number because numbering starts at 0). If an offset number is given to you with a 0x prepended to it (i.e., 0x2ACD), you know it's in hex. Of course, if it has letters in it, it must be in hex. =) Hex offsets are often given with a couple of zeros (00) prepended to the number just so all of them line up (i.e., 0012, 0123, 0001, etc.). These are normal hex numbers (in other words, you don't have to switch the order of the bytes, and the zeros in front don't mean anything). You should be comfortable working with offsets in decimal or hexadecimal, and a good hex editor will display and allow you to search for offsets in both.
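Offsets map directly onto zero-based positions in a list of bytes. As a quick illustration (a sketch only, using the example file from above), Python indexes bytes the same way a hex editor numbers them:

```python
# The example file from above, as raw bytes.
data = bytes.fromhex("2353 ABC3 0000 9000 B122 000A")

# Print every offset (in decimal and in hex) next to the byte stored there.
for offset, value in enumerate(data):
    print(f"offset {offset:2d} (0x{offset:04X}): {value:02X}")

# Offset 6 is the seventh byte because counting starts at 0.
byte_at_6 = format(data[6], "02X")
print(byte_at_6)  # 90

# An offset given in hex is an ordinary hex number; no byte swapping.
hex_offset = int("2ACD", 16)
print(hex_offset)      # 10957
print(hex_offset + 1)  # 10958, the position if you count bytes from 1
```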
Signed and Unsigned - Sometimes you will come across things like a "signed long" or an "unsigned short." A "signed" variable means it can hold both positive and negative values (i.e., it can have a + or - sign). "Unsigned" means it can only hold zero and positive values. I am going to omit the method of calculating the value of a signed variable because it is a little complicated and would take quite a bit of backtracking. Most variables you encounter (in data files) will be unsigned, so it will not matter. Just understand what the terms mean.
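If you are curious what difference the sign makes, you can let a program do the calculation for you. The sketch below assumes the file stores negative numbers in two's complement form, which is what Python's signed option uses; that is an assumption about your file, not something this tutorial covers. The function name read_value is made up for the example.

```python
# Read the same bytes as unsigned and as signed (two's complement assumed).
# read_value is an illustrative name, not a standard function.

def read_value(hex_text, signed):
    """Read Little Endian hex as a signed or unsigned number."""
    raw = bytes.fromhex(hex_text)
    return int.from_bytes(raw, byteorder="little", signed=signed)


print(read_value("FF", signed=False))     # 255
print(read_value("FF", signed=True))      # -1
print(read_value("FE FF", signed=False))  # 65534
print(read_value("FE FF", signed=True))   # -2
print(read_value("0A 00", signed=True))   # 10, small values read the same
```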
ASCII - This refers to your standard set of "characters." This includes all the letters a-z and A-Z, the numbers 0-9, and other miscellaneous symbols like !, @, #, ", etc. Each character or ASCII symbol is represented by one byte. For example, the byte 31 refers to the character 1. A hex editor will usually display the ASCII equivalent of the hex you are looking at off to the side, because sometimes the data is just plain text (as in a text file =). A lot of byte values don't correspond to any printable ASCII character, however, which is why you're likely to see lots of "period" . characters where there is just raw data.
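To see how that side column comes about, here is a rough Python sketch. It assumes the editor shows a period for anything outside the printable range 20 to 7E hex, which is a common convention but may differ in your editor; ascii_column is a name invented for this example.

```python
# Build the ASCII side column the way many hex editors do.
# ascii_column is an illustrative name; the printable range is an assumption.

def ascii_column(raw):
    """Show printable bytes as characters and everything else as a period."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)


print(bytes.fromhex("31").decode("ascii"))   # 1
print(ascii_column(b"dog"))                  # dog
print(ascii_column(bytes.fromhex("FF53 5452 20A5 8200")))
```

The last line prints a period for FF, A5, 82, and 00, with S, T, R and a space in between.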
String - A string is just a series of ASCII characters. "dog" is a string, and so is "you suck ****". =)
Tag - This is usually a string in a file that denotes the start of a section. For example, the string tag STR (in ASCII) in a chk file (uncompressed SCM/SCX) denotes the start of where the map's text (or strings) are stored. A hex editor will allow you to search for these tags in the ASCII of a file so you can easily find where certain parts of a file begin (assuming that the file has tags in it of course). Remember that the tag does not actually contain the actual data, it just shows where the data starts. And remember that the tag is not part of the data -- so you would not edit any of the bytes which make up the tag itself, only the bytes which follow after the tag. For example:
FF53 5452 20A5 8200
The tag STR corresponds to the bytes 53 54 52. So the data does not actually start until the 20 byte.
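Searching for a tag is the same thing a hex editor's ASCII search does. As a minimal sketch (using the eight example bytes above rather than a real chk file), you find the tag and then skip past it to reach the data:

```python
# Find a tag in the example bytes and work out where the data begins.
data = bytes.fromhex("FF53 5452 20A5 8200")
tag = b"STR"

tag_offset = data.find(tag)  # -1 would mean the tag is not in the file
if tag_offset == -1:
    raise SystemExit("tag not found")

# The data starts right after the last byte of the tag.
data_offset = tag_offset + len(tag)
first_data_byte = format(data[data_offset], "02X")

print(tag_offset)        # 1
print(data_offset)       # 4
print(first_data_byte)   # 20
```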
Exercises:
Name the variable: (byte, short, or long)
1) 0395 9275
2) 01
3) AB90
Calculate the decimal value: (all hex in Intel notation)
4) 0010
5) E803
6) C78A A900
Take the following hex file: (Intel notation)
0000 0000 9475 ABCD 0123 4444 0010 1000 8800
7) What hex value does the byte at offset 9 (dec) hold?
8) How many offsets are there in this file? (00 counts as one)
9) There is an unsigned short that begins at offset 0C (hex). What is its decimal value?
10) There is a three character (ASCII) tag that begins at offset 04 (hex). What is the hex value of the next byte (directly following the tag)?
11) There is an unsigned long that begins at offset 07 (hex). Change the value of that long to 16 (dec). Overwrite the current value and write out the entire modified file.
Answers: 1) long; 2) byte; 3) short; 4) 4096; 5) 1000; 6) 11111111; 7) 23; 8) 18 dec or 12 hex; 9) 4096; 10) CD; 11) 0000 0000 9475 AB10 0000 0044 0010 1000 8800
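If you want to check answers 9 and 11 by machine, here is a rough Python sketch using the standard struct module. The format codes "<H" and "<I" mean a Little Endian unsigned short and unsigned long; the helper name grouped is made up for this example.

```python
import struct

# The exercise file, loaded into an editable buffer.
data = bytearray.fromhex("0000 0000 9475 ABCD 0123 4444 0010 1000 8800")


def grouped(raw):
    """Format bytes in groups of four hex digits, like the listings above."""
    text = raw.hex().upper()
    return " ".join(text[i:i + 4] for i in range(0, len(text), 4))


# Exercise 9: read the unsigned short at offset 0C (hex).
short_value = struct.unpack_from("<H", data, 0x0C)[0]
print(short_value)  # 4096

# Exercise 11: overwrite the unsigned long at offset 07 with 16 (dec).
struct.pack_into("<I", data, 0x07, 16)
print(grouped(data))  # 0000 0000 9475 AB10 0000 0044 0010 1000 8800
```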
O.K. Now, you should already know the basics of hex editing. The first thing you should do now is get yourself a hex editor.
Additional Tutorials
In addition, here are links to a couple other hex editing tutorials:
http://www.flexhex.com/docs/howtos/hex-editing.phtml
http://www.tamriel-rebuilt.org/?p=mo...tuts/dblad/hex
On to mini tweaker