Storage of Public Information

Public information is information which we're all entitled to because we live in this society. Broad-based access to this information is under threat not only from funding restrictions, but also from ignorance and apathy on the part of those who are charged with providing and maintaining that information.

With the rise of the "Standard Microsoft Desktop", it becomes all too easy to assume that everyone has the same software and therefore to ignore issues of data portability. Many people spend their working lives cocooned in a homogeneous computing environment and aren't even aware that such issues exist!

Closed storage formats

Have you ever tried to read an Excel spreadsheet or a Word document without using a Microsoft product to do so? If you're not a "computer person", then I expect it probably hasn't occurred to you to try, but stay with me for a bit.

In effect, Microsoft owns any data stored in these formats, because you have to pay them money to be allowed to get at it -- that is, to buy the software necessary to access it. Sure, you can copy the file, but without the software your computer can't display it to you.

And unlike with the computer you run it on, you don't have a choice of where to get that software -- or even the option of writing it yourself: Microsoft's standard licence says that nobody is allowed even to look at the inner workings of Excel or Word to figure out how they store information.

There do exist bright and hardy souls who take this as a challenge and figure out the storage format just by looking at document files, but they work under the shadow of a lawsuit (which means that it's a game strictly for amateurs -- nobody would gamble a company on it), and they're always running behind the changes that Microsoft keep making.

And besides, they shouldn't have to waste so much effort over something that is only hard to find out because Microsoft have been purposely secretive about it. This isn't like a patent whereby the storage method is "a valuable discovery" and therefore worthy of recompense -- no, the difficulty arises because there are so many different ways of storing data, and Microsoft won't tell us which ones they have chosen.

Of course, Microsoft's licensing terms are hardly any different from those of most other software companies, but

  1. they have a stronger propensity than most to use opaque storage formats (i.e. formats for which they won't publish descriptions), and
  2. they're implicitly supported by all the people who claim that Word is "standard" because "everyone has it".

Well, not everyone has it, and so publishing in an opaque storage format forces people to pay money to Microsoft, or go without. When it comes to public information, that isn't acceptable.

Open storage formats

An open storage format is one which is publicly documented so that anyone can interpret the contents of the storage file, and/or write software to do likewise.
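
To make that concrete, here is a minimal sketch of what "write software to do likewise" can look like when the format is publicly documented. It assumes the data has been published as comma-separated values (CSV), a plain-text format whose rules are openly described; the data set, its column names, and the helper function read_records are all invented purely for illustration.

```python
import csv
import io

# Hypothetical public data set, published as plain CSV text.
# The suburbs and figures are made up for this example.
PUBLIC_DATA = """suburb,library_branches,population
Northside,2,18500
Riverside,1,9200
"""

def read_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

records = read_records(PUBLIC_DATA)

# Because the format is documented, any program can work with the data.
total_branches = sum(int(r["library_branches"]) for r in records)
print(total_branches)
```

Nothing here depends on a particular vendor: the same file could be read just as easily by a spreadsheet, a database, or a program written in another language.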

Even if you're not of a mind to write such software, it means you have a choice of where to get the software: you can avoid software that won't run, or crashes, or is too slow on your computer, and you can avoid software vendors who charge too much or who seem not to care about serious bugs that could damage your data.

Since when does our public information belong to Microsoft?