SensusAccess Conversion Best Practices


This article is provided courtesy of Sean Keegan, SensusAccess Solutions. This is the original Conversion Best Practices PDF.

The quality of a conversion depends on the quality of the original document. The output format can also include navigation enhancements if the original file contains the appropriate semantic markup. For instance, a Microsoft Word document that uses heading styles for chapters (such as Heading 1, Heading 2 and so on) will convert into a more usable DAISY or EPUB file with the relevant chapter navigation elements. The following best practices describe simple ways to prepare a file before conversion in order to achieve high-quality output.

PDF and image-based files

Converting to Microsoft Word and text files

SensusAccess will convert image-based documents into Microsoft Word, RTF and text files. For some image-based documents, you may find it useful to convert to Tagged PDF first and then copy and paste the text from the Tagged PDF into Microsoft Word. This may result in a better reading experience and may remove non-essential content.

With the Microsoft Word version of the document, you can more accurately clean the content for conversion into MP3 audio or for use with assistive technologies. Most cleanup steps take just a few seconds in Microsoft Word and use the Find and Replace tools. For more information on using the Find and Replace tools, see Using the Find and Replace in Microsoft Word for removing special characters in a document.

NOTE: In the Find and Replace examples below, type a single space wherever <space> appears, and do not include the quotes.

Image-file to tagged PDF to Microsoft Word document

Image-file to Microsoft Word document

To clean up a Microsoft Word file for use with assistive technology or for creating MP3 files, use Find and Replace to remove optional hyphens and section breaks. Enter the special character you wish to remove in the Find: box and leave the Replace with: box empty. See Using the Find and Replace in Microsoft Word for additional information on removing special characters in a document.
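
If the document has already been saved as plain text, the same cleanup can be scripted. The following is a minimal sketch, not part of SensusAccess or Microsoft Word. It assumes that optional hyphens appear in the text as the Unicode soft hyphen (U+00AD) and that section breaks appear as form feed characters, which may not hold for every export. The function name clean_for_audio is hypothetical.

```python
# Assumption: optional hyphens are exported as the Unicode soft hyphen.
SOFT_HYPHEN = "\u00ad"
# Assumption: section breaks are exported as form feed characters.
FORM_FEED = "\x0c"


def clean_for_audio(text: str) -> str:
    """Remove soft hyphens and form feeds, mirroring a Find and Replace
    with an empty "Replace with:" box."""
    for special_character in (SOFT_HYPHEN, FORM_FEED):
        text = text.replace(special_character, "")
    return text


if __name__ == "__main__":
    sample = "Chapter one con\u00adtinues.\n\x0cChapter two begins."
    print(clean_for_audio(sample))
```

As in the manual approach, each special character is replaced with nothing, so the surrounding text is left unchanged.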

Authoring Microsoft Word, RTF, text files

Authoring HTML files
