WAD ("Where's All the Data") files, used by DOOM and various
other games, are simple containers. They are similar to zip and
other archive formats, but without additional complexity (such
as compression), and are data-centric rather than file-centric.
This article describes how to read the WAD files used by DOOM,
DOOM II, Rise of the Triad and similar games of that era. Yes,
I'm talking DOS and 1993, not the more modern reboots.
The article only covers reading a WAD and extracting its
contents. It does not cover the format of the individual data
within, given that the data is application dependent. With that
said, I'll be covering the DOOM picture format in the next
article.
In 2018 I looked into the MIX format used by the Command &
Conquer games, which is very similar to WAD, but for reasons I
don't recall I didn't end up writing a post about the format.
Recently I finished reading Jimmy Maher's excellent series on
DOOM and that reminded me I had wanted to look into WAD and
other container formats for my own future use. As I have been
completely unable to finish a single draft blog post I currently
have, I decided something fresh and new (to me anyway!) was a
good idea.
Although I don't normally plug other sites, Jimmy's blog The
Digital Antiquarian is a fantastic blog about the games of
yesteryear and I wish I could write half as well as he does.
About WAD Formats
There are various formats of WAD file available, each building
on the previous. This initial series of articles only covers the
original version first introduced in DOOM. At the time of
writing, I haven't looked at the other versions, but I plan to
look at some of them in future articles.
I have tested the code presented in this article with WAD files
from Shareware DOOM, ULTIMATE DOOM, DOOM II and Rise of the
Triad: Dark War.
The Format
The format is simple enough. There is a 12-byte header which
details the WAD type, the number of lumps of data it contains,
and the offset where the directory index is located.
Range     Description
0 - 3     Either the string IWAD or PWAD
4 - 7     The number of entries in the directory
8 - 11    The location of the directory
The directory index consists of (16 * number of lumps) bytes
which describe the lumps. Each 16-byte entry details the
position of the lump in the file, its size and its name.
Range     Description
0 - 3     The location of the lump
4 - 7     The size of the lump
8 - 15    The name of the lump, padded with NUL bytes
As far as I know, the directory can be located anywhere in a WAD
file, or at least anywhere after the header. All of the WADs I
have examined have the directory at the end of the file which
makes perfect sense from a serialisation standpoint, but there's
no reason why it couldn't be elsewhere. The only rule is that
all elements in the directory index must be contiguous.
The first four bytes of the file header are either IWAD or
PWAD, and this denotes the type of the WAD. The I prefix
means this is an "internal" WAD, which is the main WAD for a
game. The P prefix denotes a "patch" WAD, which allows a WAD
to override the lumps from the main internal WAD, e.g. for
providing custom levels, skins or other data.
Reading the Header
Reading the header is quite straightforward. First, read the 12
bytes into a buffer and determine the WAD type from the first
byte. Next, extract a 32-bit integer from each of the two
remaining sets of 4 bytes: the first holds the number of data
lumps and the second the start of the directory listing.
Note: In the interests of clarity, parameter and data
validation have been omitted from the snippets in this
article.
You could use the BitConverter.ToInt32 method, but BitConverter
uses the byte order of the system it is running on. If this code
were run on a big-endian system, it would interpret the bytes in
big-endian order and return values that would be very wrong.
This set of articles therefore uses its own code, which ignores
the endianness of the system and always reads and writes as
little-endian.
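As a minimal sketch of the steps above, the header could be read
along these lines. GetInt32Le is the helper name used in this
article, but its signature here is a guess, and WadType,
ReadHeader and ReadExactly are illustrative names that are not
necessarily those used in the full project.

```csharp
using System;
using System.IO;

public enum WadType { Internal, Patch }

public static class WadHeaderSketch
{
    // Assembles a little-endian 32-bit integer from four bytes,
    // regardless of the byte order of the host system.
    public static int GetInt32Le(byte[] buffer, int offset)
    {
        return buffer[offset]
             | (buffer[offset + 1] << 8)
             | (buffer[offset + 2] << 16)
             | (buffer[offset + 3] << 24);
    }

    // Reads the 12-byte header from the current stream position.
    public static (WadType Type, int LumpCount, int DirectoryOffset) ReadHeader(Stream stream)
    {
        byte[] header = new byte[12];
        ReadExactly(stream, header);

        // Bytes 0-3 hold IWAD or PWAD; the first byte tells them apart.
        WadType type = header[0] == (byte)'I' ? WadType.Internal : WadType.Patch;
        int lumpCount = GetInt32Le(header, 4);
        int directoryOffset = GetInt32Le(header, 8);

        return (type, lumpCount, directoryOffset);
    }

    // Stream.Read may return fewer bytes than requested, so loop
    // until the buffer is full.
    private static void ReadExactly(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) throw new EndOfStreamException();
            total += read;
        }
    }
}
```

As with the other snippets, validation (for example checking
that the first four bytes really are IWAD or PWAD) is left out.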
Reading the Directory
Now that we know where the directory index is located in the
file, we can read out the individual lump details. As with the
WAD header, we declare a buffer big enough to hold a single
16-byte directory entry, then read in the bytes. Using the same
GetInt32Le method described earlier, we extract the position of
the lump in the file and its size.
Next, we find the real length of the lump name by starting at
the end of the name field and working back until we find a
non-zero value. Once we have this length, we call
Encoding.ASCII.GetString to extract the name. Unfortunately,
if we called this API without defining the true length, the
returned string would include any NUL padding bytes.
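The following is a minimal sketch of decoding one directory
entry in this way. WadLump and ParseEntry are illustrative names
I've assumed for the example, and the entry is assumed to have
already been read into a 16-byte buffer.

```csharp
using System;
using System.Text;

public sealed class WadLump
{
    public int Offset { get; set; }
    public int Size { get; set; }
    public string Name { get; set; }
}

public static class WadDirectorySketch
{
    private static int GetInt32Le(byte[] b, int i)
    {
        return b[i] | (b[i + 1] << 8) | (b[i + 2] << 16) | (b[i + 3] << 24);
    }

    // Decodes a single 16-byte directory entry.
    public static WadLump ParseEntry(byte[] entry)
    {
        // The name occupies bytes 8-15. Walk back from the end of
        // the field until a non-zero byte is found.
        int length = 8;
        while (length > 0 && entry[8 + length - 1] == 0)
        {
            length--;
        }

        return new WadLump
        {
            Offset = GetInt32Le(entry, 0),
            Size = GetInt32Le(entry, 4),
            Name = Encoding.ASCII.GetString(entry, 8, length)
        };
    }
}
```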
About Names and Empty Data
Lump names may not be unique and can appear multiple times. For
example, every DOOM map that I've looked at so far has a lump
named THINGS, another named LINEDEFS and several more.
As a result, DOOM seems to make use of uniquely named lumps
(e.g. E1M1) that serve no purpose other than to be a bookmark
to a contiguous set of lumps that make up a feature (and
sometimes another placeholder at the end if the lumps are
dynamic). For placeholders, the lump size is set to zero, and
the lump offset is set either to the offset of the next valid
lump or to zero. This also means that, depending on the
application using the WAD, lump order is important.
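To illustrate why order matters, here is a small sketch that
collects the lumps belonging to a placeholder such as E1M1. It
rests on a simplifying assumption of mine: that the group ends
at the next zero-length lump. A real WAD could contain a
genuinely empty data lump, so treat this as an illustration
rather than a robust rule. GetGroup is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

public static class WadMarkerSketch
{
    // Returns the names of the lumps that follow the given marker,
    // stopping at the next zero-length lump or the end of the
    // directory. The directory must be in file order.
    public static List<string> GetGroup(IList<(string Name, int Size)> directory, string marker)
    {
        var group = new List<string>();

        // Find the zero-length placeholder with the requested name.
        int start = -1;
        for (int i = 0; i < directory.Count; i++)
        {
            if (directory[i].Name == marker && directory[i].Size == 0)
            {
                start = i;
                break;
            }
        }

        if (start < 0)
        {
            return group; // marker not present
        }

        // Everything up to the next placeholder belongs to the marker.
        for (int i = start + 1; i < directory.Count && directory[i].Size > 0; i++)
        {
            group.Add(directory[i].Name);
        }

        return group;
    }
}
```

Looking up THINGS by name alone would be ambiguous, whereas
asking for the group that follows E1M1 is not.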
Reading Lump Data
To read the actual data for a given lump, we would set the
Position of our backing Stream to the lump offset and then
only read data up to the length of the lump.
This sounds error prone and means you have to know this
information up front instead of being able to pass a Stream to
another method. So for this case, I created an OffsetStream
class which basically acts as a window into another stream,
without being able to read data it shouldn't and without the
caller needing to explicitly know about source boundaries.
With this class in place, I can now get a Stream that only
provides access to a specific lump's data with a call similar
to the below.
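What follows is a minimal, read-only sketch of such a windowing
stream together with a call that uses it. It is written from the
description above and is not the OffsetStream from the full
project; the constructor parameters and the ReadLump helper are
assumptions made for the example.

```csharp
using System;
using System.IO;

// Read-only window onto a region of another stream. Disposing the
// window leaves the source stream open.
public sealed class OffsetStream : Stream
{
    private readonly Stream _source;
    private readonly long _start;
    private readonly long _length;
    private long _position;

    public OffsetStream(Stream source, long start, long length)
    {
        _source = source;
        _start = start;
        _length = length;
    }

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0 || value > _length) throw new ArgumentOutOfRangeException(nameof(value));
            _position = value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Never read past the end of the window.
        long remaining = _length - _position;
        if (remaining <= 0) return 0;
        if (count > remaining) count = (int)remaining;

        // Reposition the source on every read in case something
        // else has moved it in the meantime.
        _source.Position = _start + _position;
        int read = _source.Read(buffer, offset, count);
        _position += read;
        return read;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin == SeekOrigin.Begin ? offset
                    : origin == SeekOrigin.Current ? _position + offset
                    : _length + offset;
        Position = target;
        return _position;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class OffsetStreamExample
{
    // Copies a lump's bytes out of the WAD through the window.
    public static byte[] ReadLump(Stream wadStream, int lumpOffset, int lumpSize)
    {
        using (Stream lump = new OffsetStream(wadStream, lumpOffset, lumpSize))
        using (var copy = new MemoryStream())
        {
            lump.CopyTo(copy);
            return copy.ToArray();
        }
    }
}
```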
I can then dispose of this stream or pass it to another method
(for example ImageFile.FromStream) without that method needing
to know or care that the data is part of something bigger, and
without affecting the underlying stream.
Putting it all together
For this example, I created the WadReader class, which is a
forward-only reader for quickly enumerating the contents of a
WAD. I also added a WadFile class which will load all the lump
metadata into a collection for further use.
Using the WadReader
The WadReader is designed for quickly enumerating the contents
of a WAD. It maintains enough state to know where it is in the
WAD, but nothing else, expecting the consumer to take care of
storing whatever information is required. This would be useful,
for example, if you wanted to pull out one or more lumps for
load on demand.
The WadReader class exposes Type and Count properties, and
a GetNextLump method which can be used to enumerate.
GetNextLump will return a valid object as long as there are
items remaining, and null once it reaches the end of the
directory.
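To make that shape concrete, here is a cut-down sketch of a
forward-only reader with the same three members. It is my own
illustration, deliberately named WadReaderSketch, and is not the
WadReader from the full project; LumpInfo, ListNames and the use
of a char for Type are assumptions made to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public sealed class LumpInfo
{
    public string Name { get; set; }
    public int Offset { get; set; }
    public int Size { get; set; }
}

public sealed class WadReaderSketch
{
    private readonly Stream _stream;
    private readonly int _directoryOffset;
    private readonly byte[] _entry = new byte[16];
    private int _index;

    public char Type { get; }  // 'I' or 'P'
    public int Count { get; }

    public WadReaderSketch(Stream stream)
    {
        _stream = stream;
        _stream.Position = 0;

        byte[] header = new byte[12];
        Fill(header);
        Type = (char)header[0];
        Count = ToInt32Le(header, 4);
        _directoryOffset = ToInt32Le(header, 8);
    }

    // Returns the next directory entry, or null when none remain.
    public LumpInfo GetNextLump()
    {
        if (_index >= Count) return null;

        // Entries are contiguous, 16 bytes each.
        _stream.Position = _directoryOffset + (16L * _index);
        Fill(_entry);
        _index++;

        int length = 8;
        while (length > 0 && _entry[8 + length - 1] == 0) length--;

        return new LumpInfo
        {
            Offset = ToInt32Le(_entry, 0),
            Size = ToInt32Le(_entry, 4),
            Name = Encoding.ASCII.GetString(_entry, 8, length)
        };
    }

    // Typical consumer loop: keep asking until null comes back.
    public static List<string> ListNames(Stream stream)
    {
        var names = new List<string>();
        var reader = new WadReaderSketch(stream);
        LumpInfo lump;
        while ((lump = reader.GetNextLump()) != null)
        {
            names.Add(lump.Name);
        }
        return names;
    }

    private void Fill(byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = _stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) throw new EndOfStreamException();
            total += read;
        }
    }

    private static int ToInt32Le(byte[] b, int i)
    {
        return b[i] | (b[i + 1] << 8) | (b[i + 2] << 16) | (b[i + 3] << 24);
    }
}
```

The sketch seeks before each entry so that the caller is free to
move the stream (for example to read lump data) between calls.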
Using the WadFile class
The WadFile class loads all the lump details (but not the
actual data) into a collection so that they are always
available. You can then pull out lump data at any point without
having to re-read the directory, and the class provides
convenience methods for more easily pulling out WAD data. It
isn't as efficient as WadReader, but it is easier to use. It
also supports write operations, whilst WadReader does not.
Where's All The Source Code
There is no single download available for this sample. Rather
than the simple demo I do for most blog posts, it is a slightly
more complex solution covering reading, writing and various
other features. The full project is available from our GitHub
page.
Wrapping Up
The WAD format has no real features and so is simple to read and
write. The linked GitHub page includes a demonstration program
which allows WAD files to be opened and contents extracted.
The founder of Cyotek, Richard enjoys creating new blog content for the site. Much more though, he likes to develop programs, and can often be found writing reams of code. A long-term gamer, he has aspirations of one day creating an epic video game - but until that time comes, he is mostly content with adding new bugs to WebCopy and the other Cyotek products.