Epub Formatting Task Phase
Step 1: Ready the Source Document
- Open the Word or Google Docs file. We’re going to clean up the source document for file conversion.
Step 2: Use Search and Replace to set fancy quotes and the like.
- First, open Search and Replace. If you’re unsure how, look up a refresher clip on YouTube, or just press Ctrl + H.
- Search and replace " with ". They look the same, but the html regards straight and curly quotes as different characters. Use the same key for both: put " in the Find box and another " in the Replace box, then click Replace All. This will replace straight quotes with fancy quotes, which makes the text look like a book instead of a Word document. Word works out the left and right quotes automatically, so we only do this once.
- Do the same for apostrophes: search and replace ' with '.
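If you’re curious what the quote conversion amounts to, here is a minimal Python sketch. It is not how Word does it. The rule used (a quote at the very start, or right after whitespace or an opening bracket, is an opening quote; anything else is a closing quote) and the name smarten_quotes are my own assumptions for illustration, and the rule will miss cases like nested quotes.

```python
# Characters that, when they come right before a quote, suggest it opens a phrase.
OPENERS_BEFORE = set(" \t\n([{")

def smarten_quotes(text):
    """Replace straight quotes with curly ones using a simple position rule."""
    out = []
    for i, ch in enumerate(text):
        if ch not in "\"'":
            out.append(ch)
            continue
        # Opening quote: first character, or preceded by whitespace/open bracket.
        opening = i == 0 or text[i - 1] in OPENERS_BEFORE
        if ch == '"':
            out.append("\u201c" if opening else "\u201d")
        else:
            out.append("\u2018" if opening else "\u2019")
    return "".join(out)

print(smarten_quotes('"Hi," she said. "It\'s fine."'))
```

Whichever tool does the conversion, it’s still worth scanning the result by eye, since apostrophes at the start of a word are easy to get wrong.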
Step 3: Dashes
- We will search and replace for dashes. This is more annoying because there might be several kinds of dashes in your document, from en dashes to em dashes.
- Search and replace double hyphens (--) with the em dash symbol —. Do the same for en dashes (–) if you have any. You’ll find the em dash symbol and en dash symbol in the symbol section of Word. A YouTube clip will help you if you’re having trouble.
- Check that Search and Replace did this right. Sometimes Word automatically changes the hyphen key (-) to an en dash (–) and sometimes not. For em dashes, typing -- and pressing Enter sometimes converts them, but it’s iffy there too. In my experience, it’s best to use Find on each kind of dash and scan the document manually to confirm that each one is the exact dash you want. There shouldn’t be that many, and doing this now will save time later on.
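If you saved a plain-text copy of the document, a quick count of each dash type can back up the manual Find check. This is only a sketch; the name count_dashes and the sample sentence are made up for illustration.

```python
from collections import Counter

# The three characters that are easy to confuse by eye.
DASHES = {"-": "hyphen", "\u2013": "en dash", "\u2014": "em dash"}

def count_dashes(text):
    """Count each dash character so leftovers are easy to spot."""
    counts = Counter(ch for ch in text if ch in DASHES)
    return {name: counts[ch] for ch, name in DASHES.items()}

sample = "well-known \u2013 or so \u2014 they say"
print(count_dashes(sample))
```

A count of zero for a dash you meant to remove tells you the replace worked; anything else tells you where to point Find next.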
Step 4: Use Search and Replace to get rid of “dangerous entities”
- Ebook formatter Paul Salvette calls these “dangerous entities” because they’ll screw up your html. Instead of something like &, the ebook will show something screwy like #$@$32.
- Search and replace &, <, and > (the latter two are optional). Note: I usually just do the ampersand (&) since I don’t usually have the pac-man symbols (<, >), but replace them if you have them. We will replace & with &amp; (it’s an html entity the computer will understand). Put & in the Find box and put &amp; in the Replace box. Click Replace All.
- If needed, do the same for the < and > symbols. Use &lt; and &gt; in the Replace box respectively.
- Always do the ampersand (&) first. The html entities themselves contain ampersands, so if you replace the other symbols first and the ampersand afterwards, the second pass will mess up the entities you just made: &lt; will turn into &amp;lt; and show up as literal text, because we’re searching and replacing all instances of &. The same goes for the entities added in the next few steps.
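To see why the order matters, here is a small Python sketch of the same three replacements done in the right order and in the wrong order. The function names are hypothetical, made up for this example.

```python
def escape_entities(text):
    """Escape &, <, > with the ampersand handled first."""
    text = text.replace("&", "&amp;")   # must come first
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text

def escape_wrong_order(text):
    """Same replacements with the ampersand last, shown only to illustrate the bug."""
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace("&", "&amp;")   # re-escapes the & inside &lt; and &gt;
    return text

print(escape_entities("Tom & Jerry <3"))     # Tom &amp; Jerry &lt;3
print(escape_wrong_order("Tom & Jerry <3"))  # Tom &amp; Jerry &amp;lt;3
```

The second line of output is the double-escaped mess you get when the ampersand goes last.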
Step 5: Ellipses …
- Paul Salvette, author of The Ebook Design and Development Guide, my go-to source for formatting, suggests searching and replacing your ellipses … with the html entity &hellip;.
- Use Search and Replace with your current ellipsis (…) in the Find box and the replacement entity (&hellip;) in the Replace box.
Step 6: Italics, Bold, and Underline
- Use Search and Replace for these three cases. This clip will show you how to add italics, bold, or underline to put in the Find box.
- Now that we’ve put italics, bold, or underline in the Find box, we need to turn to the Replace box.
- For italics, put <span class="italic">^&</span> in the Replace box and then click Replace All.
- For bold, put <span class="bold">^&</span> in the Replace box and then click Replace All.
- For underline, put <span class="underline">^&</span> in the Replace box and then click Replace All.
- As to what they mean, ^& stands for whatever text was found, so Search and Replace wraps the tags around the existing word or words and leaves the words themselves unchanged. The span class names (the ones in quotes, like "italic", "bold", or "underline") can be whatever you wish so long as you’re consistent and they correspond with your stylesheet.
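Step 7: Look over your document to make sure no </span> broke onto a separate line. Each pair of tags should stay together on one line, like the first example below and not like the second, where the closing </span> has dropped off the line:
<span class="italic">insert sentence here</span>
<span class="italic">insert sentence here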
- If a space is intended, always add a space after </span>. The text editor treats the tag like a word, so you need to give it a space if you want one there. If you don’t add a space after </span> when you actually wanted one, the Previewer will render the sentence like:
This is a sentence.This is the second sentence.
See that? There’s no space after the period in that example. If a space is not intended, like when a period comes right after </span>, then leave the space out. Examples:
Space is intended:
<span class="italic">This is a sentence.</span> This is the second sentence.
Note the period is included in the italics command and so, we’ll need a space to make the second sentence.
Space is not intended:
<span class="italic">This is a sentence</span>. This is the second sentence.
This time the period is not included in the italics command so we don’t need a space after </span> since we want the period and the last word to be together.
- As for styling the punctuation, it depends on your preferences. Personally, I prefer to put all punctuation inside the respective italics, bold, underline, etc. commands. Here’s what I would do in the previous example:
This is the <span class="italic">second sentence.</span> This is the second sentence.
Step 8: Awesome! We finished cleaning up the Source Document.
Step 9: Make a New Notepad++ file of your ebook project.
- Open Notepad++. Make a new document by clicking File > New. Save it with a name like insertnameofprojecthere.html. Remember, it’s important that you save it as an html file, since the default is a .txt file, so put .html at the end like “insertnameofprojecthere.html”. We’re going to move the source document to this file.
Step 10: We will now transfer from the source document into Notepad++.
- Head to your Word file source document, Select All (Ctrl+A) and copy (right click copy or Ctrl+C).
- Paste the document into the notepad file.
Step 11: Look over the notepad file and make sure your content is where it should be: headers have their own lines, paragraphs are all together instead of half a paragraph off on its own line, span tags are together, etc.
- Don’t skip this step. Word sometimes creates formatting issues with the html. Lines go astray so check to make sure they don’t.
- Check the color of <span class="insertclasshere">. If you see a peach color on those words, it needs to be changed. If you see the words in purple, it’s ok. These are Notepad++ colors. We need to do this because when we changed the quotes in Task Phase Step 2, we were in Word, and it put a different type of quotation mark (curly instead of straight) into the span html markup. The italics will not render if the tag is peach in color; instead you’ll have the words <span class=etc. etc.> literally in your document. To change it:
- Copy the peach colored span class into the Find box. Manually type <span class="insertclassnamehere"> in the Replace box. Click Replace All.
- If you did it right, your span classes should have a purple color. Again, these are Notepad++ colors. It may be different if you’re using a different text editor.
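The same fix can be sketched in Python for anyone who would rather script it than use the Replace box. This is an illustration under my own assumptions: the function name straighten_span_quotes is made up, and it only handles span tags with a single class attribute.

```python
import re

# Curly double quotes that Word may have put inside the tag.
CURLY = "\u201c\u201d"
SPAN_TAG = re.compile(
    '<span class=[' + CURLY + '"]([^' + CURLY + '">]+)[' + CURLY + '"]>'
)

def straighten_span_quotes(html):
    """Rewrite span tags so the class attribute uses straight quotes."""
    return SPAN_TAG.sub(r'<span class="\1">', html)

line = "<span class=\u201ditalic\u201d>Hello</span>"
print(straighten_span_quotes(line))
```

Tags that already use straight quotes pass through unchanged, so running it twice does no harm.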
Step 12: Whitespace nuking
- The file will most likely be incredibly untidy. Do not panic. We will “whitespace nuke” to get rid of all those unsightly white spaces.
- Head to bbebooksthailand.com. Scroll down to the bottom of the website and click on the Developers link. Then, under Design Pad, cut and paste your entire document into their box and click the Whitespace Nuke button (found in the lower middle of the Control Panel under “Whitespace Ops”). Paul Salvette does this in his video at 11:16.
- We’re cutting and pasting instead of copying and pasting because we want it to be replaced completely and that’s better with a blank document done with the Cut function. Your Notepad file should be left blank once you Cut the text out.
- Once we’ve done the Whitespace Nuke, you’ll see that it’s a lot tidier. Cut all the text out from the bbebooks Design Pad site.
- Pressing Ctrl+A is the easiest way to Select All and then do Cut.
- Paste the whitespace nuked text back into your Notepad++ file.
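I don’t know exactly what the Whitespace Nuke button does behind the scenes, so the Python below is only my guess at the general idea: trim each line, squeeze runs of spaces and tabs down to one space, and throw away blank lines. The name tidy_whitespace is hypothetical.

```python
import re

def tidy_whitespace(text):
    """Trim each line, collapse runs of spaces/tabs, and drop blank lines."""
    cleaned = []
    for line in text.splitlines():
        line = re.sub(r"[ \t]+", " ", line).strip()
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)

messy = "  Chapter 1  \n\n\n\tIt was   a dark night. \n   \n"
print(tidy_whitespace(messy))
```

Treat it as a way to understand the step, not as a replacement for the tool used in this guide.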
Step 13: We will now add html markup. First we’ll add paragraph tags for the whole document.
- We will use Search and Replace in Notepad++.
- Search and Replace can be found under “Search” in the Notepad++ Menu bar (right by Edit). Once you click on Search, a drop down menu will appear, click Find, and then click the Replace tab. Or you can do shortcut Ctrl+F. The Find Box will show up and then click the Replace tab.
- Under Find, put ^(.+)$ and set the Search Mode to “Regular expression”.
- Under Replace, put <p>\1</p> and then click Replace All.
- You should now have paragraph tags on every line of your notepad file.
- Paul Salvette does this step in his video at time 12:29.
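The same find-and-replace pattern works in Python’s re module, which can be handy for checking what the expression does before running it on the real file. A minimal sketch, with a made-up function name:

```python
import re

def add_paragraph_tags(text):
    """Wrap every non-empty line in paragraph tags; blank lines are left alone."""
    # ^(.+)$ matches one whole non-empty line; \1 puts that line back inside the tags.
    return re.sub(r"^(.+)$", r"<p>\1</p>", text, flags=re.MULTILINE)

print(add_paragraph_tags("Chapter One\nIt was a dark night."))
```

Because .+ needs at least one character, empty lines get no tags, which is one more reason to clean up the whitespace first.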
Step 14: Add Doctype declaration and the stylesheet into your notepad++ file.
- Now we need to copy and paste the doctype declaration. It’s a set of instructions that will render the text to an ebook. This is what should be on your notepad++ file (use Kindle for PC or Mac to easily cut and paste):
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
<link type="text/css" rel="stylesheet" href="../styles/stylesheet.css" />
<title>Title Here (of the html file not your book)</title>
</head>
<body>
</body>
</html>
- The doctype declaration was modified from bbebooksthailand.com (scroll down to the bottom of the website, click on the Developers link, and then click XHTML 1.1 Boilerplate). It’s the exact same thing except that where you see href= on the sixth pac-man symbol, I’ve already put the relative link to the stylesheet, assuming you named the folder “styles” and your stylesheet “stylesheet.css” (the relative link says to back out of this file’s folder, go to the styles folder, and open stylesheet.css). If not, you should use the bbebooks version and input your own correct relative link.
- Copy and paste the Doctype declaration into your Notepad++ file.
- Make sure your content is between <body> and </body>. Also, don’t forget to put in your title and, if you need to, modify your stylesheet name where it says href="../styles/stylesheet.css" /> using a relative link like: href="../styles/insertstylesheetname.css" />
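If you end up making many html files, the wrapping can be scripted. The Python below is a sketch under a few assumptions of mine: the head and body layout follows the usual XHTML 1.1 page structure, the stylesheet path is the one used in this guide, and the names PAGE_TEMPLATE and build_page are made up.

```python
from xml.sax.saxutils import escape

PAGE_TEMPLATE = """<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
<link type="text/css" rel="stylesheet" href="{stylesheet}" />
<title>{title}</title>
</head>
<body>
{body}
</body>
</html>
"""

def build_page(title, body, stylesheet="../styles/stylesheet.css"):
    """Drop already-marked-up body content into the page skeleton."""
    # The title is plain text, so escape it; the body is assumed to be html already.
    return PAGE_TEMPLATE.format(title=escape(title), body=body, stylesheet=stylesheet)

print(build_page("Chapter 1", "<p>It was a dark night.</p>"))
```

Pass a different stylesheet argument if your folder or file name differs from the ones assumed here.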
Step 15: Choose whether you want to split up the file now, before validation, or keep the whole file together until after validation. I prefer to split the file into its constituent parts first, as it’s easier to spot mistakes in shorter files. Paul Salvette in his tutorial video chooses to keep the file intact before validation, so head there if you prefer to keep it intact. Note that you’ll have to split up the files at some point; it’s just a matter of when. Either way, an ebook split into several files will load faster than an ebook in one massive file.
The rest of this guide assumes you already split up the files into their parts like individual chapters and such. Salvette’s video has a tutorial if you’d like to keep the file intact for now.
Step 16: We now need to prepare to split the main notepad++ file into separate html files.
- Make separate notepad++ files. Start off with names like chapter1.html, chapter2.html, titlepage.html, etc. until you have all the files you need. It’s important to name each file as one word, as the validator prefers it that way and will report errors if you don’t. Also use the .html file extension.
- On the preliminary notepad++ files, add the doctype declaration used in Step 14 into each of your html files.
Step 17: Split up the files into their own respective html (notepad++) files
- Begin separating your main document into their own notepad++ files so Chapter 1 in the main file goes to chapter1.html etc.
- This has the added benefit of creating natural page breaks in the ebook. If we had kept it as one big file, we would have to manually input page breaks into the big notepad++ html file.
- I find it easier to focus on the individual sections of the book. Recall the parts of the book being Front Matter, Main Content, and Back Matter.
- Split the Front Matter first. They may need their own styling if you haven’t done it yet. Check the Formatting Lessons section again for tips. Natural break points are the copyright, title page, dedication, etc.
- Next comes the Main Content. This should be ready to go, but check anyway. I find the natural break points to be chapters.
- Split the Back Matter such as About the author, other books by, etc.
- Keep pasting the individual parts until you’re done. My longest so far had over a hundred html pages but yours will depend on how many parts your ebook has. It should be shorter unless you have a ton of chapters.
Step 18: Create Cover HTML file
Step 19: Add Heading tags.
- Depending on your heading setup, put h1, h2, h3 etc. for your chapter headings (Chapter 1, Chapter 2, etc.), subheadings etc. For instance, sometimes I use the h1 tags for my Title Page and h2 for my chapter headings. It depends on your CSS setup and how you want each heading tag to look. For this book, h1 is reserved for the title page and h2 for the subtitle. h3 is reserved for each chapter heading.
- You can only go down to h6. I found this out by accident for one of my books and had to rework my headings because I had more than six headings.
- Unfortunately, this will have to be done manually since Search and Replace is too cumbersome.
Step 20: Preparing to Build your Book and Preparing your CSS
- We will now prepare to build your book. Mentally think about the individual pages of your book. How will you design your title page? Your copyright page? Any text in your book that needs special styling? For example, will your chapter headings be in serif, sans serif, native Kindle font, or an embedded font you picked out?
- It helps to sketch out what your book will look like. On pen and paper or in a design software, draw out the pages of your book. What will your book look like to the eye? Go through the ebooks and print books you own for design inspiration.
- I tend to split the book into the Front Matter, Main Content, and Back Matter and design the individual pages for the three sections. Obviously it won’t be for all the pages of your book or that would take forever. For a fiction book, for instance, sometimes it’s just having to design what the chapter page will look like (the spacing between chapter and text, first line modification, etc.) and a regular page unless you have section breaks or other stylistic touches you want to add.
- Depending on your content, sometimes you’ll need special class attributes for certain portions of the text, most likely a paragraph tag.
- Don’t forget to prepare your CSS too.
- You don’t have to get the CSS perfect right now. You might need to add some classes later on, but it should mostly be done for you if you use the CSS template.
Step 21: Building your Book
- We will now build your book and design the individual pages the way we want. Can’t really help you here since every book is different. If you need design inspiration, find ebooks you like, open them up, and peer inside them using Calibre. When you open them up, you’ll find the structure is the same or almost the same as what you’ll find here (Fonts, images, css, text, etc.). I’ve found books published by Amazon are good examples of well made ebooks.
Step 22: Check Table of Contents
- Make sure the internal links are correct in the table of contents. The Table of Contents is ripe for careless mistakes. I know I’ve made my fair share of them because they’re so easy to make. Your eyes can glance over typos and mistakes, so be extra cognizant of that.
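Since eyes glance over typos, a small script can double-check the links. The Python sketch below pulls every href out of the table of contents markup and reports any that don’t match a file name you give it. The function name and the sample file names are made up, and it assumes straight double quotes around each href.

```python
import re

# Capture the file part of each href, ignoring any #anchor that follows it.
HREF = re.compile(r'href="([^"#]+)(?:#[^"]*)?"')

def find_broken_links(toc_html, existing_files):
    """Return hrefs in the table of contents that match no known file name."""
    known = set(existing_files)
    return [target for target in HREF.findall(toc_html) if target not in known]

toc = (
    '<p><a href="chapter1.html">Chapter One</a></p>\n'
    '<p><a href="chpater2.html#start">Chapter Two</a></p>'
)
print(find_broken_links(toc, ["chapter1.html", "chapter2.html"]))
```

Feed it only the body of the table of contents, or add the stylesheet path to the list of known files, otherwise the stylesheet link will be reported too.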
Step 23: Now comes the most nerve wracking part. We’re going to validate the notepad file.
- Head to the W3C Validator.
- Validate each file by copying and pasting the notepad++ html file into the W3C Validator. You can also browse for your files and upload them into the Validator.
- Having to validate each file is more cumbersome than the other method, but it has the upside of finding mistakes easier since the files are smaller with less text to validate.
- If you pass, then great but if not, you’ll get these weird warning signs with words that sound like gibberish.
- It’s important to validate the notepad html file because if you don’t pass, then you most likely will have problems with the epub.
Step 24: If you want, you can take a break or a nap. We’re nearing the end now.
Step 25: Prepare the .opf file.
- On Notepad++ or in the OEBPS folder, open the blank .opf file we made earlier back in Prep Phase Step 8.
- This is the file Adobe Digital Editions will read to make the ebook.
- bbebooksthailand.com also has a template. To find it, go to the bbebooksthailand.com website, scroll to the bottom, click on Developers, and then on Epub 2.0.1 Boilerplate. Scroll down again until you find content.opf. If you want to use it, copy their content.opf template (press Ctrl+A, then right click and copy) and paste it into your opf file.
- You’ll notice the opf file has three sections: the manifest, the spine, and the guide. Like the word ‘manifest’ suggests, the opf file is like the listing of a ship’s cargo. It’ll tell the Kindle Previewer everything the ebook should have.
- You should also erase the green notes he put in there. You don’t have to but it looks neater and you can find errors easier with the distracting green text gone.
Step 26: The Metadata Section of the .opf file.
- I know this will be confusing to a beginner, so this Paul Salvette clip will help out a little bit, starting at time 26:52.
- The metadata is the first part of the opf file before the manifest. You should have something like this:
<?xml version="1.0" encoding="utf-8" ?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:identifier id="BookId" opf:scheme="uuid">urn:uuid: </dc:identifier>
<dc:creator opf:role="aut"> </dc:creator>
<dc:rights>All rights reserved</dc:rights>
<meta name="cover" content="cover" />
<meta name="output encoding" content="utf-8" />
- Ignore the first few lines of the metadata until you see <dc:title>.
- List the title of your ebook in between <dc:title> and </dc:title>
- Leave the language as en-us, meaning US English, as is.
- For the uuid, we’ll have to come up with a number to tie to your book. If you have an ISBN, put it here. If you don’t (I don’t use ISBNs, they’re really not necessary), head to bbebooksthailand.com and scroll down to the bottom of the website. Click the Developers link and then click on where it says BB Meta Pad. You should now be in the BB Meta Pad. On the top left corner of the Control Panel, under “Add Metadata,” click “Insert Required Metadata.” What you see on the Meta Pad will list the first part of an opf file with a slight difference. Look at the line where it says “dc:identifier”. Follow that line until you see urn:uuid: and then a bunch of letters and numbers. That’s your uuid for your ebook. The BB ebooks website gives you a unique number every time you press that “Insert Required Metadata” button, so don’t worry about getting the same number as someone else. Remember to make sure the Kindle and epub versions have separate uuids. They are technically two different versions even though they are both ebooks. Copy your number and paste it into your opf file after urn:uuid: and before </dc:identifier>
- Add your author name, publisher name (if applicable – I just named my own publishing imprint or if you prefer, put your author name there), description, and subject in their required fields (dc:creator and dc:publisher)
- The dc:description is your sales copy like the ones you find in the back of a book. The subject fields are your keywords.
- dc:date should be your release date.
- dc:rights should be All Rights Reserved unless it’s a public domain work. If it’s your original work, then you own it, so it’s all rights reserved.
- dc:type should be left alone and also leave the meta names for cover and output encoding alone too.
- dc:coverage should be Worldwide.
- You can name the cover in “meta name” however you wish, but make sure the rest of the opf file follows that name instead of “cover”. I just call it cover.
- meta name=”output encoding” should be left alone.
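If you’d rather not rely on a website for the uuid, Python’s standard uuid module can generate one. This is just an alternative sketch, not the method used in this guide; the function name is made up. It makes two identifiers because the epub and Kindle versions each need their own.

```python
import uuid

def new_book_identifier():
    """Return a fresh identifier in the urn:uuid: form used by dc:identifier."""
    # uuid4() produces a random UUID, so each call gives a different value.
    return "urn:uuid:" + str(uuid.uuid4())

epub_id = new_book_identifier()
kindle_id = new_book_identifier()
print(epub_id)
print(kindle_id)
```

Paste each value between the dc:identifier tags of the matching version, and keep a note of which one went where.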
Step 27: The Manifest Section of the opf file
Step 28: The Spine Section of the opf file
Step 29: The Guide Section of the opf file
Step 30: Prepare the .ncx file
- Clip starts on 44:00.
- Open the toc.ncx file we made earlier back in Step 8 of the Prep Phase.
- A template can also be found on the bbebooksthailand website. At their website, scroll down to the bottom of page, click on Developers and then click on Epub 2.0.1 Boilerplate. Copy the toc.ncx template and paste it to your ncx file.
- The NCX file is your second table of contents. The first table of contents is for the inside of your ebook, and the ncx file is the table of contents for the device itself. That way, you will swipe on your device and see the contents of the book. You can tell when someone doesn’t have an ncx file when you swipe and you only see something like Beginning and End.
- If you copied the bbebooks template, erase their instructions in green if you want to.
- Make sure the Kindle and epub versions have separate uuids. They are technically two different versions even though they are both ebooks. You have to input the uuids in both the opf and ncx file.
Step 31: The Metadata Section of the ncx file
- Metadata section clip starts on 44:40.
- Add the uuid. Remember the epub and Kindle uuids have to be different. They’re considered two different ebooks.
- Add the title of your ebook in between <text> and </text> of <docTitle>
- Add your author name in between <text> and </text> of <docAuthor>
- If you used the bbebooks template, erase any of their template placeholders.
Step 32: The Navmap Section of the ncx file
- Navmap clip starts on 45:45.
- Each navpoint corresponds with each individual part of the ebook found in your Text folder. Here’s an example of a single navpoint in our ncx:
<navPoint id="example" playOrder="1">
<navLabel><text>Chapter One</text></navLabel>
<content src="Text/chapter1.html" />
</navPoint>
For navPoint id, you can name it however you wish.
For text, type how you want it to appear. This will be the words shown in your ebook so choose carefully. For example, if you want it to say Chapter One, type Chapter One instead of Chapter 1.
For content src, make sure it leads to the desired folder. The template relative path should work but check to make sure.
- Make sure the playOrder is in ascending order and follows the linear flow of your book! playOrder should be 1 for the first part you want your reader to see, then 2, then 3, etc.
- Any last minute additions you make in the text folder need to be added into both the opf and ncx files.
- For epubs, the playOrder should start on the cover page html.
- Always save your progress
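Typing dozens of navpoints by hand is where playOrder mistakes creep in, so here is a Python sketch that numbers them for you. It assumes the standard NCX navPoint layout (a navLabel holding the text, followed by content); the function name, the id scheme navpoint-1, navpoint-2, and the sample entries are all made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

def build_nav_points(entries):
    """Build navPoint markup from (label, src) pairs given in reading order."""
    blocks = []
    # enumerate(..., start=1) yields 1, 2, 3, ... so playOrder is always ascending.
    for order, (label, src) in enumerate(entries, start=1):
        blocks.append(
            f'<navPoint id="navpoint-{order}" playOrder="{order}">\n'
            f"<navLabel><text>{escape(label)}</text></navLabel>\n"
            f"<content src={quoteattr(src)} />\n"
            "</navPoint>"
        )
    return "\n".join(blocks)

print(build_nav_points([
    ("Cover", "Text/cover.html"),
    ("Chapter One", "Text/chapter1.html"),
]))
```

List the entries in the order you want the reader to see them, and the numbering takes care of itself.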
Step 33: Open DOS program and run zip.exe
- In the Start bar search, type cmd.
- The command line DOS program opens up.
Step 34: Input commands to make the epub file. Remember, these commands assume zip.exe is at c:\zip\zip.exe. Your commands may be different depending on where you put zip.exe, and you may have to change directory if you put it somewhere else.
- Type cd Desktop.
- Type cd followed by the name of your folder. The command depends on the name of your epub folder.
- Type dir/w. This will list the contents of the folder.
- Type c:\zip\zip.exe nameofebook.epub -DX0 mimetype
- Type it exactly, except for the name of your epub book, which is usually the title of the ebook or a variation of it.
- Next, type c:\zip\zip.exe insertnameofebookhere.epub -rDX9 META-INF OEBPS
- If all goes well, you have now created your epub ebook. Head to Troubleshooting if you’re having trouble.
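For anyone without zip.exe, the same two-step packaging can be sketched with Python’s built-in zipfile module. This is an alternative under my own assumptions, not the method this guide uses: the function name build_epub is made up, and it expects a folder containing mimetype, META-INF, and OEBPS like the one described above.

```python
import os
import zipfile

def build_epub(book_dir, epub_path):
    """Zip book_dir into epub_path with the mimetype file first and uncompressed."""
    with zipfile.ZipFile(epub_path, "w") as epub:
        # Like the first zip command: store mimetype with no compression, first.
        epub.write(os.path.join(book_dir, "mimetype"), "mimetype",
                   compress_type=zipfile.ZIP_STORED)
        # Like the second zip command: add both folders recursively, compressed.
        for folder in ("META-INF", "OEBPS"):
            for root, _dirs, files in os.walk(os.path.join(book_dir, folder)):
                for name in sorted(files):
                    full = os.path.join(root, name)
                    # Zip entries always use forward slashes, even on Windows.
                    arcname = os.path.relpath(full, book_dir).replace(os.sep, "/")
                    epub.write(full, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A call might look like build_epub("nameofyourfolder", "nameofebook.epub"), with both names swapped for your own. Save the epub outside the META-INF and OEBPS folders so it doesn’t try to zip itself.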
Step 35: Check the directory folder for epub book.
- The created epub book should be in the directory of your book’s epub folder.
Step 36: You did it! You made your ebook!