Back when I started using Macs in the early '90s, I worked in Microsoft Word version 5.1, and all my text stuff got saved as Word files, or more precisely, early Mac version Word files, which Microsoft thoughtfully obsoleted when they introduced Word 6 - AKA "Word for Windows for the Mac" - in 1994.
Word 6 could open (but not save) Word 5.1 files. The ability to save documents as Word 5.1 (also 4 and 5) documents was restored in Word 98 and Word 2004 for OS X, but the file format change was a wake-up call. I still have probably a thousand or so Word 5.1 documents archived, and for now, Word 5.1 still works pretty well in OS X Classic Mode, but that will be all over when I upgrade to Leopard and/or finally switch to an Intel Mac.
Fortunately, OS X TextEdit, Tex Edit Plus, and Bare Bones Software's excellent and free TextWrangler can open Word 5.1 documents, albeit as plain text with no intact formatting and a bunch of Microsoft proprietary gobbledygook code at the beginning and end. I can at least still get at my stuff, although any graphics content is lost, which is significant, because I used to use Word as a rudimentary page layout app, and a lot of those old documents have graphics. The other workaround would be to go through the tedious process of opening and saving all those Word 5.1 files in RTF format, and I may eventually do that.
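For anyone facing a large pile of such files, the get-at-the-text-and-skip-the-gobbledygook step can be scripted. The following is only a rough sketch in Python, assuming the prose sits in the file as runs of ordinary printable ASCII characters. It knows nothing about the actual Word 5.1 file layout, and the function name and the 20-character threshold are made up for illustration.

```python
import re
from typing import List


def extract_text_runs(data: bytes, min_length: int = 20) -> List[str]:
    """Return runs of printable text at least min_length bytes long.

    Assumption: the prose is stored as plain single-byte ASCII, so the
    proprietary headers and formatting tables show up as unprintable or
    very short fragments, which are skipped. Accented letters and curly
    quotes are not ASCII and will split or shorten a run.
    """
    # Printable ASCII plus tab, carriage return and line feed.
    pattern = re.compile(rb"[\x20-\x7e\t\r\n]{%d,}" % min_length)
    runs = []
    for match in pattern.finditer(data):
        text = match.group().decode("ascii")
        # Classic Mac files end lines with a bare carriage return.
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        runs.append(text.strip())
    # Drop runs that were nothing but whitespace.
    return [run for run in runs if run]
```

Reading a file in binary mode and passing its bytes to extract_text_runs gives back a list of text chunks to save as plain text. Graphics are still lost, just as when opening the file in a text editor, and a lower min_length keeps more short fragments at the cost of more junk.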
I also dabbled a bit with Apple's Claris MacWrite Pro word processor in the mid-'90s, not long before the program was terminated and folded into ClarisWorks 4. MacWrite Pro had a proprietary file format more inscrutable than Word's, and files I saved with MacWrite Pro are not conveniently available. Not cool.
Another word-crunching program I used for a while in the early '00s and liked a lot was my MacOpinion and Applelinks columnist colleague Marc Zeedar's Z-Write, which has the happy facility of a document window with dual panes that support multiple sections in the same document. On the left is a scrollable list of Section names. (New files have only one pre-defined Section called "Default." You can change the default name within Preferences.) The right-hand pane is a standard text editing window containing only the text related to whatever Section you have selected on the left (if no Section is selected, the editing window is blank). When you click on a Section name, the respective text appears on the right for editing, as you'd like.
The ability to sort various parts of an article and research resources respectively into instantly accessible sections greatly facilitates the organization and development of a piece of prose, especially a longer one that involves extensive research, like a magazine article or book. The downside is that the only application that can open Z-Write documents (that I've found, anyway) is Z-Write, so your Z-Write files are not portable to computers that don't have Z-Write installed, even to get at the raw text content.
One of the reasons (although my disgust with Word 6 was the big one) I switched to Nisus Writer around 1997 was the fact that its save format was a variant of plain text, and Nisus Writer Classic documents can be opened by any text editor. These days I no longer have Nisus installed, but my archived Nisus Writer files open in OS X TextEdit or Tex Edit Plus.
Lately, I've been using Apple's Pages word processor from the iWork '08 productivity software suite. Pages is the successor to AppleWorks and ClarisWorks, and by extension to MacWrite Pro, and its clean interface with lots of white space reminds me a bit of MacWrite Pro. It's a nice place to work, although more than a bit sluggish on my 1.33 GHz G4 machine. One interesting thing about Pages is that it is actually something of a word processor/page layout application hybrid, and uses an XML graphics-oriented save format rather than a conventional text-oriented word processor format. This makes for honking large document files and some other issues, but AppleWorks text documents and Microsoft Word documents can be imported, and files can be exported to RTF, PDF and Microsoft Word .doc formats.
Pages can also save and open RTF files, and opens and saves RTFDs, which may contain images, but it does not handle pictures embedded in RTF. It also doesn't handle HTML conversions, so it has plenty of limitations as a general purpose word processor, and those page layout/presentation format default Pages documents are anything but ideal for archival reference.
I've used Eudora as my principal email client since first going online back in the '90s, and while there are plenty of reasons why I love Eudora, an important one is that Eudora mailbox archives are another plain text variant that can be opened by text editors and word processors on both PCs and Macs.
In a recent column, Computerworld's John Webster asks rhetorically:
"How many word processing formats from the 1970s or 1980s are still usable?... We rarely if ever think of saving our digitized thoughts for the sake of posterity. But for the sake of historians, lawmakers, sociologists and scientists yet to be born, we should."
Or even for the sake of being able to access and retrieve our own electronically stored data.
Webster continues:
"I thought of the U.S.’s founding fathers writing the Constitution and wondered what that process would be like if they were all Microsoft Corp. customers. For sure, they’d print out the final version for all to see - on parchment maybe? But what about all the draft versions and e-mails back and forth - in short, all the supporting documentation that clue us in on their states of mind and tell us what they really intended? I dare say those files would be gone forever. And of those that remained, would any modern program still be able to read or use those formats?"
You can check out John Webster's full commentary here.
The Storage Networking Industry Association (SNIA) Data Management Forum’s 100 Year Archive Task Force, formed in 2004 and operated as a global, multi-agency group working to define best practices and storage standards for long-term digital information retention, was created by SNIA because of what the organization considers a pending crisis in long-term preservation of digital information in the IT datacenter.
The crisis has two principal axes: first, losing information that is stored digitally due to corruption, loss of access, loss of discoverability, or loss of readability, and second, losing control of the ability to keep up with migrating the overwhelming volume of information to new media and into new logical formats. Many standards and best practices exist, but even so, much of the remaining work to do in solving the long-term preservation problem lies in the storage domain. In response to this need, this Task Force was born. The 100 Year Archive Task Force’s objectives are as follows:
Goals
Produce with a multidisciplinary team a best practices reference model for long-term digital information retention that covers the information-storage domain, the technology domain unaddressed in all existing ‘archive’ standards and best practices such as ISO 14721:2002 - Open Archival Information Systems (OAIS) or the Sedona Conference.
Integrate ILM-based practices into the long-term digital information retention process so we can sustainably automate IT infrastructure in support of business and information requirements.
Define reference models and possible technical standards that solve and provide for scalable physical and logical migration, the two ‘big challenges’ of preservation.
For more information, visit:
http://www.snia-dmf.org/100year/
This is of course also an issue in the context of everyone's personal computer data, and not just over 100-year spans, but even over ten years.
Back in the late '90s it finally dawned on me that the vast majority of my output was either being marked up in HTML for the Web or submitted to editors in plain text email messages anyway, so using any full-featured word processor amounted to several orders of magnitude of overkill. I had long since discovered Tom Bender's delightful little styled text editor Tex Edit Plus, so around mid-1998 I switched to TE+ as my main text crunching, HTML markup, and document archiving application. Nine years later, I'm still happy with that decision, and the OS X versions of TE+ are better than ever, and happily support the built-in OS X spellchecker.
Tex Edit Plus files are plain text files, although TE+ does support a substantial degree of text styling and formatting and even embedded photos, movies, and music clips. The documents can be opened with pretty much any word processor or text editor, so you're not dependent on any proprietary software in order to access your data on virtually any computer, Mac or PC.
I do also use DEVONthink Pro Office, a superb information management application, which does store its data in a proprietary file format with everything in one big database file (or files - you can have multiple databases). This makes the whole thing very convenient for backups, and the DEVONthink database format also seems remarkably efficient, able to store a vast amount of information in a relatively compact document. DEVONthink is such a solid, stable, and robust application that I don't worry overmuch about lost data due to file corruption, but it does have the same disadvantage as Z-Write in that only DEVON applications can open DEVONthink databases.
Nevertheless, just importing everything into DEVONthink does have considerable appeal, given the program's powerful hierarchical filing structure and AI functions for sorting, searching content and finding documents, all of which makes it ideal for everything from keeping a simple notebook to organizing large information collections, and you can back the whole kit and caboodle up by simply copying the DEVONthink database file to your backup media.
DEVONthink happily can export documents as plain text or RTF files, and for good measure it also can open MS Word documents (although sadly not Word 5.1 files) and PDFs as editable text, but I still keep copies of anything I wouldn't ever want to lose access to as Tex Edit Plus files.
My general recommendation is to store your important stuff in at least two different file formats, one of them as universally accessible as possible, such as plain text or Rich Text Format (RTF).
For pure archival storage, especially when text formatting, graphics, or multimedia content are a factor, it's probably hard to beat PDF as a file format, although it leaves much to be desired for stuff that will need to be edited in the future. PDF has become the closest thing there is to a cross-platform lingua franca file format, and it has many virtues in its own right. Most Mac graphics apps and an increasing number of productivity applications, such as Papyrus Office and Yep!, can save text and word processing files directly as PDF documents, and just about any app can save in PDF format via the OS X Print command. For example, as an exercise, I just converted this column draft to a PDF by selecting Print from the File menu, then selecting "Save as PDF" from the "PDF" pull-down menu in the lower left of the Print dialog.
That instantly produces a PDF file archive of my article draft that I can open in Preview, Adobe Reader, DEVONthink, Papyrus, Yep!, and Skim, to name a few, all with the ability to select and copy text, and in Tex Edit Plus, ToyViewer, or virtually any OS X image editing program as a non-selectable graphic. It's not even that bulky, although at 52k it requires considerably more storage space than the original 16k text document.
Indeed, because PostScript-to-PDF conversion technology is integrated into OS X's Quartz 2D, any Mac OS X application can draw images on screen from high-resolution PostScript/EPS data instead of low-resolution bitmaps. This means you can print PostScript-quality documents on all printers, even on non-PostScript devices.
For more on getting the most out of PDFs in OS X, see:
http://www.applelinks.com/index.php/more/getting_the_most_out_of_pdf_in_os_x/
And most important of all, whatever file format you decide to use for archiving your valuable data, do frequent backups and backup updates.
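As one illustration of keeping backups fresh, here is a minimal sketch in Python of a "backup update": it copies only new or changed files into a backup folder and checks each copy against the original. It is an assumption-laden example, not a recommendation of any particular tool. The function names are hypothetical, the backup folder is assumed to sit outside the source folder, and nothing here has been checked against classic Mac resource forks.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir: Path, backup_dir: Path) -> list:
    """Copy new or changed files from source_dir into backup_dir.

    Returns the relative paths that were copied. Every copy is verified
    by comparing checksums, so a bad write is caught right away.
    Assumption: backup_dir is not located inside source_dir.
    """
    copied = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(source_dir)
        target = backup_dir / relative
        if target.exists() and sha256_of(target) == sha256_of(source):
            continue  # already backed up and unchanged
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        if sha256_of(target) != sha256_of(source):
            raise IOError(f"backup copy of {relative} does not match")
        copied.append(relative)
    return copied
```

Running back_up a second time with nothing changed copies nothing, which is what makes frequent runs cheap. Files deleted from the source are deliberately left alone in the backup.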
Charles W. Moore

I found this out a long time ago when I first switched to Linux in 1999. I have everything in open formats on my website and the source files backed up on CD-ROM. I am also printing out most of my writing in books through Lulu. I have copies of everything important on the web, CD-ROM, LaCie hard drive, and print in my fire safe onsite, as well as CD-ROM and print at my sister’s house in town, my parents’ house out of town, and my uncle’s in Europe. I also have my writings at several social networking sites. I upload immediately to my website, back up several times a day to CD-ROM, and back up once a day to my external hard drive. I also use Pages, mainly for flyers. I use HTML for my basic formatting for small documents, which I convert into PDFs with htmldoc. When I finish writing a book, I convert it to LaTeX and fix it up with TeXShop, and then I convert it into about 15 other open formats. I use Scribus for any graphic-based books. Then it goes off to print. I use both my Mac and Linux about the same amount. As far as editing PDFs, look at the 0.46 version of Inkscape for that.