DPC Jargon
To be created/developed and updated
Activity Hub
To be created/developed and updated
The Activity Hub is a special page on the DPC Website which allows you access to each of the rounds and other major stages of DPC's e-text production process. Using the links on the page you can view and explore all the "home" pages for each of the five rounds and other workflow stages, even if you are not yet authorized to actually perform work in those stages. (See Access requirements to learn what is needed to work in each stage.)
Between the links on the navigation menu across the top of the page and the links in the body of the Activity Hub page itself, you can quickly visit any of the major sections of the DPC Site. Because of this valuable hub-like function, the Activity Hub is one of the very few DPC "production" Webpages which can be accessed directly from both the DPC Forums (using the link at the top of every Forum page) and the DPC Wiki (using the link in the DPC Navigation box to the left).
avatar
An avatar is a small graphic image you can add to your DPC Forum profile. It will display below your username in posts you make in the DPC Forums and on your User Details page. It can be no larger than 80 x 80 pixels and 6 kilobytes.
For information on creating and customizing an avatar for yourself, consult the page that I'm sure someone will eventually create in the Wiki, and the Avatar Creation and Help thread in the forums.
BWL: Bad Word List
A Bad Word List (BWL) contains words that will be flagged by the WordCheck tool in the Proofreading Interface. In general, Bad Words are words in a project which are likely to be stealth scannos and should thus receive extra attention from proofers.
There are two types of Bad Word Lists. A site-wide BWL can exist for each language for which WordCheck has a spell-check dictionary; it applies to all projects with that language setting. A project-specific BWL can also be created for a project by its Project Manager.
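As a rough illustration of how the two lists combine, here is a minimal Python sketch. It is not the WordCheck implementation: the function name, the sample list entries, and the case-insensitive matching are all assumptions made for the example.

```python
import re

def flag_bad_words(page_text, site_bwl, project_bwl=()):
    """Return the words on a page that appear in either Bad Word List."""
    # Assumption: matching ignores case; the real tool may behave differently.
    bad_words = {w.lower() for w in site_bwl} | {w.lower() for w in project_bwl}
    flagged = []
    # A "word" here is a run of letters, optionally with one internal apostrophe.
    for word in re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", page_text):
        if word.lower() in bad_words:
            flagged.append(word)
    return flagged

site_list = ["arid", "modem"]   # hypothetical site-wide entries
project_list = ["tbe"]          # hypothetical project-specific entries
print(flag_bad_words("He walked to tbe modem house, arid sat.", site_list, project_list))
```

Flagged words are only candidates for stealth scannos; the proofer still decides by checking the page image.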
For more information, see also Good Word List and the WordCheck FAQ.
BAE
The Bureau of American Ethnology, usually referred to as the BAE, was a government-sponsored organization that coordinated and reported anthropologic research in the Americas. Its most notable publications, Annual Reports and Bulletins, have been harvested from scans available at gallica.bnf.fr and form a large group of projects in various stages at DP (and perhaps someday at DPC).
The papers in these bulletins and reports were written by staff members at the BAE, including many early field anthropologists, as well as other collaborators supported by the BAE. Papers were written about physical findings, social and cultural traditions, and linguistic families, among other things.
The BAE projects in progress at DP tend to be a bit challenging, as they are often long, heavily footnoted and indexed, and contain many diacritical marks that are not part of the standard Latin-1 character set used at DP, necessitating the use of not only the entire diacritical markup explained in the Proofreading Guidelines, but an extended set of markup. There are also some proofing and formatting conventions that apply to all BAE projects but which are not standard at DP; for instance, indexes in BAE projects are done differently than those in most DP projects. See individual Project Comments and the BAE UberProject thread for more information.
BEGIN/BEGINNERS ONLY project
BEGINNERS ONLY, or BEGIN, projects are some EASY projects which have been set aside for our newest volunteers. These books contain most of the elements proofers deal with most frequently, such as end-of-line hyphenation, "unclothed" dashes, page headers, "spacey quotes," wonky punctuation, and scannos/stealthos you can really sink your teeth into.
New volunteers in P1 are asked to limit themselves to a small number of pages (15-20 pages) in any BEGIN project, then to move on to another BEGIN, a Newcomers Only, or any other project. After reaching a page-limit threshold (rumor has it at between 40 and 50), BEGIN projects become no longer accessible by that particular P1er.
Once a BEGIN project has completed P1, it moves to P2 where it is labeled as MENTORS ONLY (or MENTOR), and mentors proof the work and send encouraging, detailed feedback to the appropriate P1ers. This feedback process can take up to a week, since the whole project must finish P1 before moving on to P2.
To help these books move into P2 relatively quickly, they are usually separated into parts of about 100 pages each. When all of a given book's parts are finished in F2, a friendly squirrel reassembles the book into a whole and it is made available to a post-processor.
Big Eye, The
blackletter
block quotation or block quote
CP: Content Provision/Provider
Content Providing/Provision (CP) is the process of providing the page images used in proofreading, either by scanning a book or harvesting the images from an online source.
CP can also refer to a person who does such work (a Content Provider, or CPer).
If you are interested in becoming a CPer, visit Access requirements.
You can automate some content providing tasks by using the DPtools, which are available here.
Content providing (or CPing) is a task carried out by content providers.
If you need help with content providing, you may want to refer to one of:
- Content Providing FAQ (wiki) or Content Provider's FAQ
- Content Providing Advice
- Content Providers seeking Project Managers
- Scanning FAQ
DG: Daily Goal
diacritical marks
Diacritical marks, sometimes referred to as diacritic marks or the "short-hand" term diacriticals, are small marks found above or below a basic character which change the pronunciation of that character. For example, the acute accent over the "e" in the character "é" is a diacritical mark.
Characters with diacritical marks are proofed two different ways in DPC projects, depending upon whether the needed character is or is not available in the UTF-8 character set:
- Proofing Guidelines for accented and other characters in UTF-8. (Basically use the characters available from the drop-down lists in the Proofreading Interface.)
- Proofing Guidelines for accented and other characters not in UTF-8.
diffs
Diffs (short for differences) are the changes made to the text of a project's individual pages as it progresses through each round at DPC. The term can also refer to the Webpages at DP where you can view such changes. A "diff" doesn't necessarily mean there was something wrong on a page, just that the page text coming out of the subsequent round is different.
Proofreaders often help themselves improve their proofing and formatting skills by examining and analyzing their diffs. See Checking your diffs for more details.
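To get a feel for what a diff contains, the sketch below compares two versions of a page with Python's standard difflib module. The round labels and sample text are invented for illustration, and the site's own diff pages may present changes differently.

```python
import difflib

def page_diff(before, after, from_round="P1", to_round="P2"):
    """Return unified-diff lines between two versions of one page."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=from_round,
        tofile=to_round,
        lineterm="",
    )
    return list(lines)

p1_text = "He arid she went home.\nIt was late."   # hypothetical P1 output
p2_text = "He and she went home.\nIt was late."    # hypothetical P2 output
for line in page_diff(p1_text, p2_text):
    print(line)
```

Lines starting with "-" come from the earlier round and lines starting with "+" from the later one. An empty result means the page text did not change.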
DPC: Distributed Proofreaders Canada
Distributed Proofreaders Canada (DPC) refers to this Website, the organization behind it, the community using it (sometimes referred to individually as DPCers, short for DPC users), etc., in any or all combinations.
For a general introduction to DPC, the organization, see Distributed Proofreaders Canada.
DPC Wiki
dp-feedback
DPCustomMono2
DPE or DP-Eu: Distributed Proofreaders Europe
Distributed Proofreaders Europe
DPF: Distributed Proofreaders Foundation
Distributed Proofreaders Foundation definition
DPCWiki
F1
F1 refers to Formatting round 1, the first round of formatting, in which markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., is added to individual pages in a project.
To see how you can qualify to work in F1, see the Access requirements article and the F1 round page.
See also F2 and proofreading.
F2
F2 refers to Formatting round 2, the second round of formatting, in which F1 markup is checked and corrected.
To see how you can qualify to work in F2, see the Access requirements article, the F2 round page, and this forum post.
DPC needs a team, like the F2 Fanatics, dedicated to moving projects through F2 towards completion more efficiently by concentrating efforts on a few projects.
FAQ
feedback
All well-designed systems have feedback mechanisms built into them, and the DP Workflow system is no exception in this regard. Accordingly, there are many kinds of feedback that are exchanged between volunteers here at DPC.
- One of the first types of feedback a new volunteer at DP is likely to receive, and the kind most likely to be the type being referred to when you see the word "feedback," will come from a mentor via the BEGINNERS ONLY projects and/or the dp-feedback mechanisms, through which experienced proofreaders send detailed and constructive comments to proofreaders regarding what they did correctly and incorrectly (with respect to the Guidelines) on specific pages in specific projects, and on ways that the proofreaders can learn to become even better proofreaders.
- As project pages progress through the rounds, each successive proofer and foofer may be creating diffs which can serve as feedback to the previous proofer or foofer. This is the primary mechanism by which proofers in Newcomers Only projects receive feedback.
- PP Mentors give feedback to new PPers; HTML Mentors give help and feedback to PPers who are new to HTML; and one of the PPVers' primary tasks is to give feedback to PPers.
In addition to these relatively formalized and routine feedback mechanisms, any volunteer can ask for feedback on any issue, question, procedure, etc., in a Project Thread or any other place in the DP Forums which may seem to be appropriate.
foofing/foofer
Foofing is an informal term for formatting; one who "foofs" is called a foofer.
Compare to proofing and proofer.
footnote
See out-of-line footnote.
formatting
Formatting is the process of adding markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., to a project. Formatting of a project's individual pages is performed in rounds F1 and F2. Other, more project-wide, formatting is done in the PP stage.
See also foofing, and compare to proofreading.
Formatting Guidelines
Formatting Guidelines refers to a document that contains all the default instructions and standards for formatting (such as markup for italics, illustrations, footnotes, and poetry) in rounds F1 and F2. These standards apply to all projects, unless specifically overridden by instructions from the Project Manager in the Project Comments or Project Discussion.
For complete clarity, the PM may make any exceptions to these Guidelines in the Project Comments, and the PPer can make any changes deemed appropriate in the PP stage. The only principle for the PPer to follow is consistency--either make the same consistent change throughout the book, or consistently stick to the original treatment of the author.
You can access the Formatting Guidelines from FAQ Central and from any Proofing Interface window.
IMPORTANT: this is a reference document--beginning formatters do not need to memorize the entire document, but everyone should consult it when problems or questions arise. If the matter is still not clear, raise an inquiry in the Project forum.
See also Proofreading Guidelines.
The Primary Rule
"Don't change what the author wrote!"
The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author puts italics, bold text or a footnote every third word, we mark them italicized, bolded or footnoted. We are proofreaders, not editors. (See Printer Errors/Misspellings for proper handling of obvious misprints.)
We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation). Changes such as these help us produce a consistently proofed version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Formatting Guidelines with this concept in mind. There is a separate set of Proofreading Guidelines. These guidelines are intended for formatting only. The proofreaders matched the image's content, and now as a formatter you match the image's look.
To assist the next formatter and the Post-Processor, we also preserve Line Breaks. This allows them to compare the lines in the text easily to the lines in the image.
About This Document
This document is written to explain the formatting rules we use to maintain consistency when formatting a single book that is distributed among many volunteers, each of whom is working on different pages. This helps us all do formatting the same way, which in turn makes it easier for the Post-Processor who will complete the work on this e-book.
It is not intended as any kind of a general editorial or typesetting rulebook.
We've included in this document all the items that new users have asked about formatting. There is a separate set of Proofreading Guidelines. If you come across a situation and you do not find a reference in these guidelines, it is likely that it was handled in the proofreading rounds and so is not mentioned here. If you aren't sure, please ask about it in the Project Discussion.
If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.
The original version of this document was created in May of 2008; revisions since that time are indicated at the beginning and end of any changed section like this: "(Nov 08) The original version of this..."(/Nov 08)
An important revision to the Guidelines will come into effect as of February 1st, 2009.
This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum.
Each Page is a Separate Unit
Since each project is distributed among many formatters, each of whom is working on different pages, there is no guarantee that you will see the next page of the project. With this in mind, be sure to open and close all markup tags on each page. This will make it easier for the post-processor to combine all these pages into one e-book eventually.
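A quick self-check for this rule can be sketched in Python. The function below is a hypothetical helper, not a DPC tool. It only looks at the five inline tags named in these Guidelines and reports tags that are opened but not closed on the page, or closed without being opened.

```python
import re

INLINE_TAGS = ("i", "b", "sc", "f", "g")
TAG_RE = re.compile(r"<(/?)(" + "|".join(INLINE_TAGS) + r")>")

def unbalanced_tags(page_text):
    """Return a list of inline-markup problems found on a single page."""
    problems = []
    open_tags = []  # stack of tags opened so far and not yet closed
    for match in TAG_RE.finditer(page_text):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if not is_closing:
            open_tags.append(name)
        elif open_tags and open_tags[-1] == name:
            open_tags.pop()
        else:
            problems.append(f"unexpected </{name}>")
    problems.extend(f"unclosed <{name}>" for name in open_tags)
    return problems

print(unbalanced_tags("He said <i>never"))      # markup continues onto the next page
print(unbalanced_tags("<i>Phil. Trans.</i>"))   # balanced
```

When italics run across a page break, the fix under this rule is to close the tag at the end of the first page and reopen it at the start of the next.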
Project Comments
When you select a project for formatting, the Project Page is loaded. On this page, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start formatting pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.
Please also read the Project Thread (Forum). The Project Manager may clarify project-specific guidelines here, and it is often used by formatters to alert other volunteers to recurring issues within the project and how they can best be addressed. (See the next section.)
On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other volunteers have made changes.
Forum/Discuss This Project
On the Project Page where you start formatting pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other volunteers who are working on this book.
Fixing Errors on Previous Pages
The Project Page contains links to pages from this project that you have recently worked on. (If you haven't formatted any pages yet, no links will be shown.)
Pages listed under either "DONE" or "IN PROGRESS" are available to make corrections or to finish formatting. Just click on the link to the page. Thus, if you discover that you made a mistake on a page or marked something incorrectly, you can click on that page here and reopen it to fix the error.
You may also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.
Formatting at the Character Level
Placement of Inline Formatting Markup
Inline formatting refers to markup such as <i> </i>, <b> </b>, <sc> </sc>, <f> </f>, or <g> </g>. Place punctuation outside the tags unless the markup is around an entire sentence or paragraph, or the punctuation is itself part of the phrase, title, or abbreviation that you are marking. If the formatting goes on for multiple paragraphs, put the markup around each paragraph.
The periods that mark an abbreviated word in the title of a journal such as Phil. Trans. are part of the title, so they are included within the tags, thus: <i>Phil. Trans.</i>.
Many typefaces found in older books used the same design for numbers in both regular text and italics or bold. For dates and similar phrases, format the entire phrase with one set of markup, rather than marking the words as italics (or bold) and not the numbers.
If there is a series/list of words or phrases (such as names, titles, etc.), mark each item of the list individually.
In poetry, mark each line of the poem separately if the formatting goes on for multiple lines. See the Tables section for handling markup in tables.
Examples:
Original Image: | Correctly Formatted Text:
Enacted 4 July, 1776 | <i>Enacted 4 July, 1776</i>
It cost 9l. 4s. 1d. | It cost 9<i>l.</i> 4<i>s.</i> 1<i>d.</i>
God knows what she saw in me! I spoke in such an affected manner. | <b>God knows what she saw in me!</b> I spoke in such an affected manner.
As in many other of these Studies, and | As in many other of these <i>Studies</i>, and
(Psychological Review, 1898, p. 160) | (<i>Psychological Review</i>, 1898, p. 160)
L. Robinson, art. "Ticklishness," | L. Robinson, art. "<sc>Ticklishness</sc>,"
December 3, morning. 1323 Picadilly Circus | /* <i>December 3, morning.</i> 1323 Picadilly Circus */
Volunteers may be tickled pink to read Ticklishness, Tickling and Laughter, Remarks on Tickling and Laughter and Ticklishness, Laughter and Humour. | Volunteers may be tickled pink to read <i>Ticklishness</i>, <i>Tickling and Laughter</i>, <i>Remarks on Tickling and Laughter</i> and <i>Ticklishness, Laughter and Humour</i>.
“That's the idea!” exclaimed Tacks. | "<i>That's the idea!</i>" exclaimed Tacks.
The professor set the reading assignment for E r l e b n i s G e s c h i c h t e D e u t s c h l a n d s e i t 1 8 4 5. | The professor set the reading assignment for <g>Erlebnis Geschichte Deutschland seit 1845</g>.
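The placement rule for punctuation can be sketched as a small Python helper. This is an illustrative assumption, not a DPC tool: the function name and the keep_punctuation_inside flag are invented, and a person still has to decide whether the punctuation belongs to the phrase being marked.

```python
def wrap_inline(text, tag, keep_punctuation_inside=False):
    """Wrap text in <tag>...</tag>, leaving trailing punctuation outside by default."""
    trailing = ""
    if not keep_punctuation_inside:
        stripped = text.rstrip(".,;:!?")
        trailing = text[len(stripped):]
        text = stripped
    return f"<{tag}>{text}</{tag}>{trailing}"

# Punctuation that is not part of the marked phrase stays outside the tags.
print(wrap_inline("Studies,", "i"))

# Periods that belong to an abbreviated title stay inside the tags.
print(wrap_inline("Phil. Trans.", "i", keep_punctuation_inside=True))

# Each item in a list of titles is marked individually.
titles = ["Ticklishness", "Tickling and Laughter"]
print(", ".join(wrap_inline(title, "i") for title in titles))
```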
Italics
Format italicized text with <i> inserted at the start and </i> inserted at the end of the italics. (Note the "/" in the closing tag.)
See also Placement of Inline Formatting Markup.
Bold Text
Format bold text (text printed in a heavier typeface) with <b> inserted before the bold text and </b> after it. (Note the "/" in the closing tag.)
See also Placement of Inline Formatting Markup and Chapter Headings.
Underlined Text
Format underlined text as Italics with <i> and </i>. (Note the "/" in the closing tag.) Underlining was often used to indicate emphasis when the typesetter was unable to actually italicize the text, for example in a typewritten document.
See also Placement of Inline Formatting Markup.
Some Project Managers may specify in the Project Comments that underlined text be marked up with the <u> and </u> tags.
Spaced Out Text (gesperrt)
Format s p a c e d o u t text with <g> inserted before the text and </g> after it. (Note the "/" in the closing tag.) Remove the extra spaces between letters in each word. This was a typesetting technique used for emphasis in some older books, especially in German.
See also Placement of Inline Formatting Markup and Chapter Headings.
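Collapsing spaced-out text can be sketched in Python as follows. The sketch assumes that single spaces separate the letters of a word and that two or more spaces separate words. That convention is an assumption for the example; real OCR output may not follow it, so check the result against the image.

```python
import re

def format_gesperrt(spaced_text):
    """Remove the spaces between letters and wrap the result in <g> markup."""
    # Assumption: runs of 2+ spaces mark word boundaries in the spaced-out text.
    words = re.split(r"\s{2,}", spaced_text.strip())
    collapsed = " ".join(word.replace(" ", "") for word in words)
    return f"<g>{collapsed}</g>"

print(format_gesperrt("E r l e b n i s  G e s c h i c h t e"))
```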
Font Changes
Some Project Managers may request that you mark a change of font within a paragraph or line of normal text by inserting <f> before the change in font and </f> after it. (Note the "/" in the closing tag.) This markup may be used to identify a special font or other formatting that does not already have its own markup (such as italics and bold).
Possible uses of this markup include:
- antiqua (a variant of roman font) inside fraktur
- blackletter within a section of regular font
- smaller or larger font only if it is within a paragraph in regular font (for a whole paragraph in a different font or size, see the Block Quotation section)
- upright font inside of a paragraph of italicized text
The particular use or uses of this markup in a project will usually be spelled out in the Project Comments. Formatters should post in the Project Discussion if the markup appears to be needed and has not yet been requested.
See also Placement of Inline Formatting Markup.
Words in Small Capitals
The formatting is different for Mixed Case Small Caps and all small caps:
Format words that are printed in Mixed Small Caps as Mixed Upper and Lowercase. Format words that are printed in all small caps as ALL-CAPS. For both mixed case and all small caps, surround the text with <sc> and </sc> markup.
Headings (Chapter Headings, Section Headings, Captions, etc.) may appear to be in all small caps, but this is usually the result of a change in font size and should not be marked as small caps. The first word of a chapter that is in small caps should be changed to mixed case without the tags.
See also Placement of Inline Formatting Markup.
Original Image: | Correctly Formatted Text:
This is Small Caps | <sc>This is Small Caps</sc>
You cannot be serious about aardvarks! | You cannot be serious about <sc>AARDVARKS</sc>!
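The two small-caps cases can be expressed as a short Python sketch. The function and its mixed_case parameter are hypothetical. Whether a word is printed in mixed or all small caps has to be judged from the image, and the mixed-case input is assumed to have been typed in upper and lower case already.

```python
def format_small_caps(text, mixed_case):
    """Wrap small-caps text in <sc> markup, following the two cases above."""
    # Mixed small caps keep their upper/lower case; all small caps become ALL-CAPS.
    body = text if mixed_case else text.upper()
    return f"<sc>{body}</sc>"

print(format_small_caps("This is Small Caps", mixed_case=True))
print(format_small_caps("aardvarks", mixed_case=False))
```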
Words in All Capitals
Format words that are printed in all capital letters as all capital letters.
The exception to this is the first word of a chapter: many old books typeset the first word of these in all caps; this should be changed to upper and lower case, so "ONCE upon a time," becomes "Once upon a time,"
Font Size Changes
Normally we do not do anything to mark changes in font size. The exceptions to this are when it indicates a block quotation or when the font size changes within a single paragraph or line of text (see Font Changes).
Extra Spaces or Tabs Between Words
Extra spaces between words are common in OCR output. You generally don't need to bother removing these—that can be done automatically during post-processing. However, extra spaces around punctuation, em-dashes, quote marks, etc. do need to be removed when they separate the symbol from the word. In addition, within the /* */ markup that preserves spacing, be sure to remove any extra spaces since they will not be automatically removed later on.
Finally, if you find any tab characters in the text you should remove them.
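A minimal Python sketch of this clean-up follows. It rests on several assumptions: tabs are replaced by a single space so that words do not run together, em-dashes are typed as "--", and periods are deliberately left alone so that ellipses are not altered. It is an illustration, not a site tool, and it does not handle quote marks.

```python
import re

def tidy_spacing(line):
    """Remove tabs and the extra spaces that separate punctuation from words."""
    line = line.replace("\t", " ")                 # assumption: a tab becomes one space
    line = re.sub(r"\s+([,;:!?])", r"\1", line)    # no space before these marks
    line = re.sub(r"\s*--\s*", "--", line)         # close up dashes typed as "--"
    return line

print(tidy_spacing("Hello , world -- again !"))
```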
Superscripts
Older books often abbreviated words as contractions, and printed them as superscripts. Format these by inserting an up-arrow (^) followed by the superscripted text. If the superscript continues for more than one character, then surround the text with curly braces { and } as well. For example:
Original Image:
Genrl Washington defeated Ld Cornwall's army.
Correctly Formatted Text:
Gen^{rl} Washington defeated L^d Cornwall's army.
In scientific & technical works, format superscripted characters with curly braces { and } surrounding them even if there is only one character superscripted. For example:
Original Image:
... up to xn elements in the array.
Correctly Formatted Text:
... up to x^{n} elements in the array.
If the superscript represents a footnote marker, then see the Footnotes section instead.
The Project Manager may specify in the Project Comments that superscripted text be marked differently.
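The superscript rules above can be summarized in a short Python sketch. The function name and the scientific flag are assumptions made for illustration.

```python
def format_superscript(base, superscript, scientific=False):
    """Append superscripted text to base using the ^ markup.

    Braces are added when the superscript is longer than one character,
    or always when the work is scientific or technical.
    """
    if scientific or len(superscript) > 1:
        return f"{base}^{{{superscript}}}"
    return f"{base}^{superscript}"

print(format_superscript("Gen", "rl"))
print(format_superscript("L", "d"))
print(format_superscript("x", "n", scientific=True))
```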
Subscripts
Subscripted text is often found in scientific works, but is not common in other material. Format subscripted text by inserting an underline character _ and surrounding the text with curly braces { and }. For example:
Original Image:
H2O.
Correctly Formatted Text:
H_{2}O.
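For chemical formulas, the subscript markup can be sketched in Python as below. The sketch assumes that every run of digits directly after a letter or a closing parenthesis is a subscript. That holds for a formula such as H2O but not for ordinary text, so it should only be applied to a string already known to be a formula.

```python
import re

def format_subscripts(formula):
    """Wrap digits that follow a letter or ')' in _{...} subscript markup."""
    # Assumption: the input is a chemical formula, not running text.
    return re.sub(r"(?<=[A-Za-z)])(\d+)", r"_{\1}", formula)

print(format_subscripts("H2O"))
print(format_subscripts("C6H12O6"))
```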
Page References ("See Page 123")
Format page number references within the text such as (see p. 123) as they appear in the image.
Check the Project Comments to see if the Project Manager has special requirements for page references.
Formatting at the Paragraph Level
Chapter Headings
Format chapter headings as they appear in the image. A chapter heading may start a bit farther down the page than the page header and won't have a page number on the same line. Chapter Headings are often printed all caps; if so, keep them as all caps. Mark any italics or mixed case small caps that appear in the image.
Put 4 blank lines before the "CHAPTER XXX". Include these blank lines even if the chapter starts on a new page; there are no 'pages' in an e-book, so the blank lines are needed. Then separate with a blank line each additional part of the chapter heading, such as a chapter description, opening quote, etc., and finally leave two blank lines before the start of the text of the chapter.
Old books often printed the first word or two of every chapter in all caps or small caps; change these to upper and lower case (first letter only capitalized).
While chapter headings may appear to be bold or spaced out, these are usually the result of font or font size changes and should not be marked. The extra blank lines separate the heading, so do not mark the font change as well. See the first example below.
Original Image:
Correctly Formatted Text:
GREEN FANCY
CHAPTER I
THE FIRST WAYFARER AND THE SECOND WAYFARER
MEET AND PART ON THE HIGHWAY
A solitary figure trudged along the narrow
road that wound its serpentinous way
through the dismal, forbidding depths of
the forest: a man who, though weary and footsore,
lagged not in his swift, resolute advance. Night
was coming on, and with it the no uncertain prospects
of storm. Through the foliage that overhung
the wretched road, his ever-lifting and apprehensive
eye caught sight of the thunder-black, low-lying
clouds that swept over the mountain and bore
down upon the green, whistling tops of the trees.
At a cross-road below he had encountered a small
girl driving homeward the cows. She was afraid
of the big, strange man with the bundle on his back
and the stout walking stick in his hand: to her a
remarkable creature who wore "knee pants" and
stockings like a boy on Sunday, and hob-nail shoes,
and a funny coat with "pleats" and a belt, and a
green hat with a feather sticking up from the band.
Original Image:
Correctly Formatted Text:
/#
In the United States?[A] In a railroad? In a mining company?
In a bank? In a church? In a college?
Write a list of all the corporations that you know or have
ever heard of, grouping them under the heads <i>public</i> and <i>private</i>.
How could a pastor collect his salary if the church should
refuse to pay it?
Could a bank buy a piece of ground "on speculation?" To
build its banking-house on? Could a county lend money if it
had a surplus? State the general powers of a corporation.
Some of the special powers of a bank. Of a city.
A portion of a man's farm is taken for a highway, and he is
paid damages; to whom does said land belong? The road intersects
the farm, and crossing the road is a brook containing
trout, which have been put there and cared for by the farmer;
may a boy sit on the public bridge and catch trout from that
brook? If the road should be abandoned or lifted, to whom
would the use of the land go?
#/
CHAPTER XXXV.
<sc>Commercial Paper.</sc>
<b>Kinds and Uses.</b>--If a man wishes to buy some commodity
from another but has not the money to pay for
it, he may secure what he wants by giving his written
promise to pay at some future time. This written
promise, or <i>note</i>, the seller prefers to an oral promise
for several reasons, only two of which need be mentioned
here: first, because it is <i>prima facie</i> evidence of
the debt; and, second, because it may be more easily
transferred or handed over to some one else.
If J. M. Johnson, of Saint Paul, owes C. M. Jones,
of Chicago, a hundred dollars, and Nelson Blake, of
Chicago, owes J. M. Johnson a hundred dollars, it is
plain that the risk, expense, time and trouble of sending
the money to and from Chicago may be avoided,
[Footnote A: The United States: "Its charter, the constitution. * * * Its flag the
symbol of its power; its seal, of its authority."--Dole.]
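The blank-line layout for a chapter opening can be sketched in Python as below. The helper is hypothetical and covers only the spacing: 4 blank lines before the heading, 1 blank line between the parts of the heading, and 2 blank lines before the chapter text.

```python
def format_chapter_opening(heading_parts, first_paragraph):
    """Lay out a chapter heading and the start of its text with the blank lines described above."""
    lines = [""] * 4                       # 4 blank lines before the heading
    for index, part in enumerate(heading_parts):
        if index > 0:
            lines.append("")               # 1 blank line between heading parts
        lines.extend(part.splitlines())    # keep the line breaks inside each part
    lines += ["", ""]                      # 2 blank lines before the chapter text
    lines.extend(first_paragraph.splitlines())
    return "\n".join(lines)

opening = format_chapter_opening(
    ["CHAPTER I",
     "THE FIRST WAYFARER AND THE SECOND WAYFARER\nMEET AND PART ON THE HIGHWAY"],
    "A solitary figure trudged along the narrow",
)
print(opening)
```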
Section Headings
Some books have sections within chapters. Format these headings as they appear in the image. Leave 2 blank lines before the heading and one after, unless the Project Manager has requested otherwise. If you are not sure if a heading indicates a chapter or a section, post a question in the Project Discussion, noting the page number.
Mark any italics or mixed case small caps that appear in the image. While section headings may appear to be bold or spaced out, these are usually the result of font or font size changes and should not be marked. The extra blank lines separate the heading, so do not mark the font change as well.
Original Image:
Correctly Formatted Text:
and numerous, found in collections of well-authenticated
specimens. The suggested caution implied
is not unnecessary, for the periods overlap, and there
is but little to show when such things as lamps and
lanterns were actually made.
RUSHLIGHTS AND HOLDERS.
In tracing the development of lighting from quite
homely beginnings, rushlights, prepared by the
cottager and the farm hand for the winter supply,
seem to come first on the list. Rushlights, however,
Other Major Divisions in Texts
Major Divisions in the text such as Preface, Foreword, Table of Contents, Introduction, Prologue, Epilogue, Appendix, References, Conclusion, Glossary, Summary, Acknowledgements, Bibliography, etc., should be formatted in the same way as Chapter Headings, i.e. 4 blank lines before the heading and 2 blank lines before the start of the text.
Paragraph Spacing/Indenting
Put a blank line before the start of a paragraph, even if it starts at the top of a page. You should not indent the start of the paragraph, but if it is already indented don't bother removing those spaces—that can be done automatically during post-processing.
See the Chapter Headings image/text for an example.
Extra Spacing/Stars/Line Between Paragraphs
In the image, most paragraphs start on the line immediately after the end of the previous one. Sometimes two paragraphs are separated to indicate a "thought break." A thought break may take the form of a line of stars, hyphens, or some other character, a plain or floridly decorated horizontal line, a simple decoration, or even just an extra blank line or two.
A thought break may represent a change of scene or subject, a lapse in time, or a bit of suspense. This is intended by the author, so we preserve it by putting a blank line, <tb>, and then another blank line.
Sometimes printers used decorative lines to mark the ends of chapters or sections. These are not thought breaks so they should not be marked with <tb>.
Please check the Project Comments as the Project Manager may request that additional information be retained in the thought break markup, such as <tb stars> for a row of stars.
Original Image:
Correctly Formatted Text:
like the gentleman with the spiritual hydrophobia
in the latter end of Uncle Tom's Cabin.
Unconsciously Mr. Dixon has done his best to
prove that Legree was not a fictitious character.
<tb>
Joel Chandler Harris, Harry Stillwell Edwards,
George W. Cable, Thomas Nelson Page,
James Lane Allen, and Mark Twain are Southern
men in Mr. Griffith's class. I recommend
Illustrations
Text for an illustration should be surrounded by an illustration tag [Illustration: and ], with the caption text placed in between. Format the caption text as it is printed, preserving the line breaks, italics, etc. Treat lines such as "See Page 66" as part of the caption.
If an illustration has no caption, add a tag [Illustration]. (Be sure to remove the colon and space before the ] in this case.)
If the illustration is in the middle of or at the side of a paragraph, move the illustration tag to before or after the paragraph and leave a blank line to separate them. Rejoin the paragraph by removing any blank lines left by doing so.
If there is no paragraph break on the page, mark the illustration tag with an *, like so: *[Illustration: (text of caption)]. Then move it to the top of the page and leave a blank line after it.
Original Image:
Correctly Formatted Text:
[Illustration: Martha told him that he had always been her ideal and
that she worshipped him.
/*
<i>Frontispiece</i>
<i>Her Weight in Gold</i>
*/
]
Original Image:
Correctly Formatted Text:
such study are due to Italians. Several of these instruments
have already been described in this journal, and on the present
occasion we shall make known a few others that will
serve to give an idea of the methods employed.

[Illustration: <sc>Fig. 1.</sc>--APPARATUS FOR THE STUDY OF HORIZONTAL
SEISMIC MOVEMENTS.]

For the observation of the vertical and horizontal motions
of the ground, different apparatus are required. The
Footnotes/Endnotes
Format footnotes by leaving the text of the footnote at the bottom of the page and placing a tag where it is referenced in the text. This means:
1. In the main text, the character that marks a footnote location should be surrounded with square brackets ([ and ]) and placed right next to the word being footnoted[1] or its punctuation mark,[2] as shown in the image and the two examples in this sentence. Footnote markers may be numbers, letters, or symbols. When footnotes are marked with a symbol or a series of symbols (*, †, ‡, §, etc.), we replace these with capital letters in order (A, B, C, etc.).
2. At the bottom of the page, a footnote should be surrounded by a footnote tag [Footnote #: and ], with the footnote text placed in between and the footnote number or letter placed where the # is shown in the tag. Format the footnote text as it is printed, preserving the line breaks, italics, etc. Be sure to use the same tag in the footnote as you used in the text where the footnote was referenced. Place each footnote on a separate line in order of appearance. Separate each footnote with a blank line if there is more than one.
If a footnote is incomplete at the end of the page, leave it at the bottom of the page and just put an asterisk * where the footnote ends, like this: [Footnote 1: (text of footnote)]*. The * will bring it to the attention of the post-processor, who will eventually join the parts of the footnote together.
If a footnote started on a previous page, leave it at the bottom of the page and surround it with *[Footnote: (text of footnote)] (without any footnote number or marker). The * will bring it to the attention of the post-processor, who will eventually join the parts of the footnote together.
If a continued footnote ends or starts on a hyphenated word, mark both the footnote and the word with *, thus: [Footnote 1: This footnote is continued and the last word in it is also con-*]* for the leading fragment, and *[Footnote: *tinued onto the next page.].
In some books, footnotes are separated from the main text by a horizontal line. We don't keep this so please just leave a blank line between the main text and the footnotes.
Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of on the bottom of each page. These are formatted in the same manner as footnotes. Where you find an endnote reference in the text, just surround it with [ and ]. If you are formatting one of the pages with endnotes, surround the text of each note with [Footnote #: (text of endnote)], with the endnote text placed in between, and the endnote number or letter placed where the # is. Put a blank line after each endnote so that they remain separate paragraphs when the text is rewrapped during post-processing.
Footnotes in Tables should remain where they are in the original image.
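The replacement of symbol markers with capital letters can be illustrated with a short sketch. This is not an official DPC tool: the function names letter_tags and footnote_tag are hypothetical, and the sketch assumes the caller passes the markers for whatever span the lettering should cover, in order of appearance.

```python
from string import ascii_uppercase


def letter_tags(symbol_markers):
    """Map each distinct symbol marker, in order of first appearance, to A, B, C, ..."""
    tags = {}
    for marker in symbol_markers:
        if marker in tags:
            continue
        if len(tags) >= len(ascii_uppercase):
            raise ValueError("more than 26 distinct markers")
        tags[marker] = ascii_uppercase[len(tags)]
    return tags


def footnote_tag(label, text):
    """Build the footnote tag; the same label goes in [brackets] in the main text."""
    return f"[Footnote {label}: {text}]"
```

For example, letter_tags(["*", "†"]) pairs * with A and † with B, so the reference in the text becomes [A] and the note becomes [Footnote A: (text of footnote)].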
Original Image:
Correctly Formatted Text:
The principal persons involved in this argument were Caesar[A], former military
leader and Imperator, and the orator Cicero[B]. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

[Footnote A: Gaius Julius Caesar.]

[Footnote B: Marcus Tullius Cicero.]
Original Footnoted Poetry:
Correctly Formatted Text:
/*
Mary had a little lamb[1]
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!
*/
[Footnote 1: This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.]
Paragraph Side-Descriptions (Sidenotes)
Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. Move sidenotes to just above the paragraph that they belong to. A sidenote should be surrounded by a sidenote tag [Sidenote: and ], with the text of the sidenote placed in between. Format the sidenote text as it is printed, preserving the line breaks, italics, etc. (while handling end-of-line hyphenation and dashes normally). Leave a blank line after the sidenote so that it does not get merged into the paragraph when the text is rewrapped during post-processing.
If there are multiple sidenotes for a single paragraph, put them one after another at the start of the paragraph. Leave a blank line separating each of them.
If the paragraph began on a previous page, put the sidenote at the top of the page and mark it with * so that the post-processor can see that it belongs on the previous page, like this: *[Sidenote: (text of sidenote)]. The post-processor will move it to the appropriate place.
Sometimes a Project Manager will request that you put sidenotes next to the sentence they apply to, rather than at the top or bottom of the paragraph. In this case, don't separate them out with blank lines.
Original Image:
Correctly Formatted Text:
*[Sidenote: Burning
discs
thrown into
the air.]

that such as looked at the fire holding a bit of larkspur
before their face would be troubled by no malady of the
eyes throughout the year.[1] Further, it was customary at
Würzburg, in the sixteenth century, for the bishop's followers
to throw burning discs of wood into the air from a mountain
which overhangs the town. The discs were discharged by
means of flexible rods, and in their flight through the darkness
presented the appearance of fiery dragons.[2]

[Sidenote: The Midsummer
fires in
Swabia.]

[Sidenote: Omens
drawn from
the leaps
over the
fires.]

[Sidenote: Burning
wheels
rolled
down hill.]

In the valley of the Lech, which divides Upper Bavaria
from Swabia, the midsummer customs and beliefs are, or
used to be, very similar. Bonfires are kindled on the
mountains on Midsummer Day; and besides the bonfire
a tall beam, thickly wrapt in straw and surmounted by a
cross-piece, is burned in many places. Round this cross as
it burns the lads dance with loud shouts; and when the
flames have subsided, the young people leap over the fire in
pairs, a young man and a young woman together. If they
escape unsmirched, the man will not suffer from fever, and
the girl will not become a mother within the year. Further,
it is believed that the flax will grow that year as high as
they leap over the fire; and that if a charred billet be taken
from the fire and stuck in a flax-field it will promote the
growth of the flax.[3] Similarly in Swabia, lads and lasses,
hand in hand, leap over the midsummer bonfire, praying
that the hemp may grow three ells high, and they set fire
to wheels of straw and send them rolling down the hill.
Among the places where burning wheels were thus bowled
down hill at Midsummer were the Hohenstaufen mountains
in Wurtemberg and the Frauenberg near Gerhausen.[4]
At Deffingen, in Swabia, as the people sprang over the mid-*

[Footnote 1: <i>Op. cit.</i> iv. 1. p. 242. We have
seen (p. 163) that in the sixteenth
century these customs and beliefs were
common in Germany. It is also a
German superstition that a house which
contains a brand from the midsummer
bonfire will not be struck by lightning
(J. W. Wolf, <i>Beiträge zur deutschen
Mythologie</i>, i. p. 217, § 185).]

[Footnote 2: J. Boemus, <i>Mores, leges et ritus
omnium gentium</i> (Lyons, 1541), p.
226.]

[Footnote 3: Karl Freiherr von Leoprechting,
<i>Aus dem Lechrain</i> (Munich, 1855),
pp. 181 <i>sqq.</i>; W. Mannhardt, <i>Der
Baumkultus</i>, p. 510.]

[Footnote 4: A. Birlinger, <i>Volksthümliches aus
Schwaben</i> (Freiburg im Breisgau, 1861-1862),
ii. pp. 96 <i>sqq.</i>, § 128, pp. 103
<i>sq.</i>, § 129; <i>id.</i>, <i>Aus Schwaben</i> (Wiesbaden,
1874), ii. 116-120; E. Meier,
<i>Deutsche Sagen, Sitten und Gebräuche
aus Schwaben</i> (Stuttgart, 1852), pp.
423 <i>sqq.</i>; W. Mannhardt, <i>Der Baumkultus</i>,
p. 510.]
Placement of Out-of-Line Formatting Markup
Out-of-line formatting refers to the /# #/ and /* */ markup tags. The /# #/ "rewrap" markup indicates text that is printed differently, but can still be rewrapped during post-processing. The /* */ "no-wrap" markup indicates text that should not be rewrapped later on during post-processing—where the line breaks, indentation, and spacing need to be preserved.
On any page where you use an opening marker, be sure to include the closing markup tag as well. After the text is rewrapped during post-processing, each marker will be removed along with the entire line that it is on. Because of this, leave a blank line between the regular text and the opening marker, and similarly leave a blank line between the closing marker and the regular text.
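As an illustration of these two rules (every opening marker closed on the same page, and a blank line between regular text and the markers), here is a minimal sketch of a per-page check. It is not an existing DPC tool; the function name check_out_of_line_markup and the wording of its messages are made up for this example, and it assumes each marker sits alone on its own line.

```python
OPENERS = {"/*": "*/", "/#": "#/"}
CLOSERS = set(OPENERS.values())


def check_out_of_line_markup(page_text):
    """Return a list of (line_number, message) problems found on one page."""
    lines = [line.strip() for line in page_text.split("\n")]
    problems = []
    open_stack = []  # (marker, line_number) for markers not yet closed

    def is_regular_text(index):
        # True for a non-blank line that is not itself a marker.
        if index < 0 or index >= len(lines):
            return False
        text = lines[index]
        return text != "" and text not in OPENERS and text not in CLOSERS

    for i, line in enumerate(lines):
        if line in OPENERS:
            open_stack.append((line, i + 1))
            if is_regular_text(i - 1):
                problems.append((i + 1, "no blank line before " + line))
        elif line in CLOSERS:
            if open_stack and OPENERS[open_stack[-1][0]] == line:
                open_stack.pop()
            else:
                problems.append((i + 1, "unexpected " + line))
            if is_regular_text(i + 1):
                problems.append((i + 1, "no blank line after " + line))
    for marker, number in open_stack:
        problems.append((number, marker + " is never closed on this page"))
    return problems
```

The stack also accepts properly nested markup, such as a /* */ signature block inside a /# #/ block quotation.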
Block Quotations
Block quotations are blocks of text (typically several lines and sometimes several pages) that are distinguished from the surrounding text by wider margins, a smaller font size, different indentation, or other means. Surround block quotations with /# and #/ markers. See Placement of Out-of-Line Formatting Markup for details on this markup.
Apart from adding the markers, block quotations should be formatted as any other text.
Original Image:
Correctly Formatted Text:
later day was welcomed in their home on the Hudson.
Dr. Bakewell's contribution was as follows:[24]

/#
The uncertainty as to the place of Audubon's birth has been
put to rest by the testimony of an eye witness in the person
of old Mandeville Marigny now dead some years. His repeated
statement to me was, that on his plantation at Mandeville,
Louisiana, on Lake Ponchartrain, Audubon's mother was
his guest; and while there gave birth to John James Audubon.
Marigny was present at the time, and from his own lips, I have,
as already said, repeatedly heard him assert the above fact.
He was ever proud to bear this testimony of his protection
given to Audubon's mother, and his ability to bear witness as
to the place of Audubon's birth, thus establishing the fact that
he was a Louisianian by birth.
#/

We do not doubt the candor and sincerity of the
excellent Dr. Bakewell, but are bound to say that the
incidents as related above betray a striking lapse of
Lists of Items
Surround lists with /* and */ markers. See Placement of Out-of-Line Formatting Markup for details on this markup.
Original Image:
Correctly Formatted Text:
/*
Andersen, Hans Christian
Bach, Johann Sebastian
Balboa, Vasco Nunez de
Bierce, Ambrose
Carroll, Lewis
Churchill, Winston
Columbus, Christopher
Curie, Marie
Daguerre, Louis J. M.
Darwin, Charles
Descartes, René
Earhart, Amelia
Einstein, Albert
Freud, Sigmund
Lewis, Sinclair
Magellan, Ferdinand
Melville, Herman
Newton, Isaac
Pasteur, Louis
Poe, Edgar Allan
Ponce de Leon, Juan
Pulitzer, Joseph
Shakespeare, William
Tesla, Nikola
*/
Tables
Surround tables with /* and */ markers. See Placement of Out-of-Line Formatting Markup for details on this markup. Format the table with spaces (not tabs) to look approximately like the original table. Don't make the table wider than 75 characters. Project Gutenberg's guidelines go on to say "...except where it can't be helped. Never, ever longer than 80...".
Do not use tabs for formatting—use space characters only. Tab characters will line up differently between computers, and your careful formatting will not always display the same way.
If inline formatting (italics, bold, etc.) is needed in the table, mark up each table cell separately. When aligning the text, keep in mind that inline markup will appear differently in the final text version. For example, <i>italics markup</i> normally becomes _underscores_, and most other inline markup will be treated similarly. On the other hand, <sc>Small Caps Markup</sc> is removed completely.
It's often hard to format tables in plain text; just do your best. Be sure to use a mono-spaced font, such as DPCustomMono or Courier. Remember that the goal is to preserve the author's meaning, while producing a readable table in an e-book. Sometimes this requires sacrificing the original format of the table on the printed page. Check the Project Comments and discussion thread because other volunteers may have settled on a specific format. If there is nothing there, you might find something useful in the Gallery of Table Layouts forum thread.
Footnotes in tables should remain where they are in the image. See footnotes for details.
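The tab and width rules for tables lend themselves to a quick automated check. The sketch below is illustrative only: the function name table_line_problems and its parameters are hypothetical, with the 75 and 80 character limits taken from the guideline text above.

```python
def table_line_problems(table_lines, preferred_width=75, hard_limit=80):
    """Return (line_number, message) pairs for tabs and over-wide table lines."""
    problems = []
    for number, line in enumerate(table_lines, start=1):
        if "\t" in line:
            problems.append((number, "contains a tab character"))
        width = len(line.rstrip())  # trailing spaces do not count
        if width > hard_limit:
            problems.append((number, f"{width} characters, over the hard limit of {hard_limit}"))
        elif width > preferred_width:
            problems.append((number, f"{width} characters, wider than {preferred_width}"))
    return problems
```

The width is measured on the text as formatted, so lines containing inline markup such as <i> or <sc> will come out somewhat different in width in the final text version.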
Poetry/Epigrams
Mark poetry or epigrams with /* and */ so that the line breaks and spacing will be preserved. See Placement of Out-of-Line Formatting Markup for details on this markup.
Preserve the relative indentation of the individual lines of the poem or epigram by adding 2, 4, 6 (or more) spaces in front of the indented lines to make them resemble the image. If the entire poem is centered on the printed page, don't try to center the lines of poetry during formatting. Move the lines to the left margin, and preserve the relative indentation of the lines.
When a line of verse is too long for the printed page, many books wrap the continuation onto the next printed line and place a wide indentation in front of it. These continuation lines should be rejoined with the line above. Continuation lines usually start with a lower case letter. They appear at irregular points, unlike normal indentation, which occurs at regular intervals in the meter of the poem.
If a row of dots appears in a poem, treat this as a thought break.
Footnotes in poetry should be treated the same as regular footnotes during formatting. Line Numbers in poetry should be kept.
Check the Project Comments for the specific project you are formatting. Books of poetry often have special instructions from the Project Manager. Many times, you won't have to follow all these formatting guidelines for a book that is mostly or entirely poetry.
Original Image:
Correctly Formatted Text:
to the scenery of his own country:

/*
Oh, to be in England
Now that April's there,
And whoever wakes in England
Sees, some morning, unaware,
That the lowest boughs and the brushwood sheaf
Round the elm-tree bole are in tiny leaf,
While the chaffinch sings on the orchard bough
In England--now!
And after April, when May follows,
And the whitethroat builds, and all the swallows!
Hark! where my blossomed pear-tree in the hedge
Leans to the field and scatters on the clover
Blossoms and dewdrops--at the bent spray's edge--
That's the wise thrush; he sings each song twice over,
Lest you should think he never could recapture
The first fine careless rapture!
And though the fields look rough with hoary dew,
All will be gay, when noontide wakes anew
The buttercups, the little children's dower;
--Far brighter than this gaudy melon-flower!
*/

So it runs; but it is only a momentary memory;
and he knew, when he had done it, and to his
Line Numbers
Line numbers are common in books of poetry, and usually appear near the margin every fifth or tenth line. Keep line numbers, placing them at least six spaces past the right hand end of the line, even if they are on the left side of the poetry/text in the original image. Since poetry will not be reformatted in the e-book version, the line numbers will be useful to readers.
Letters/Correspondence
Format letters and correspondence as you would paragraphs. Put a blank line before the start of the letter; do not duplicate any indenting.
Surround consecutive heading or footer lines (such as addresses, date blocks, salutations, or signatures) with /* and */ markers. See Placement of Out-of-Line Formatting Markup for details on this markup.
Don't indent the heading or footer lines, even if they are indented or right justified in the image—just put them at the left margin. The post-processor will format them as needed.
If the correspondence is printed differently than the main text, see Block Quotations.
Original Image:
Correctly Formatted Text:
<i>John James Audubon to Claude François Rozier</i>
[Letter No. 1, addressed]

/*
<sc>M. Fr. Rozier</sc>,
Merchant-Nantes.
<sc>New York</sc>, <i>10 January, 1807</i>.
<sc>Dear Sir</sc>:
*/

We have had the pleasure of receiving by the <i>Penelope</i> your
consignment of 20 pieces of linen cloth, for which we send our
thanks. As soon as we have sold them, we shall take great
pleasure in making our return.
Original Image:
Correctly Formatted Text:
/#
lack of memory which <i>baffles belief</i>, I have a certain
"uptaking" knack. My preachment will bore you, but you
will (if you read it) detect an <i>ensemble</i>; but, for goodness'
sake, <i>zitti</i>! They'll think, when they hear the P.R.A., that,
Lor' bless him! he'd known it all his life. Nevertheless,
enough for the day, &c. Best love to Gussey.--Affect. bro.,

/*
<sc>Fred.</sc>
*/
#/

I remember--when my husband and I were
sitting with him one afternoon after his return
home that autumn--his saying, "I feel distinctly I
Right-aligned Text
Surround lines of right-justified text with /* and */ markers. See Placement of Out-of-Line Formatting Markup for details on this markup.
Formatting at the Page Level:
Blank Page
Format as [Blank Page] if both the text and the image are blank.
If there is text in the formatting text area and a blank image, or if there is text in the image but none in the text box, follow the directions for a Bad Image or Bad Text.
Front/Back Title Page
Format all the text just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.
Older books often show the first letter as a large ornate graphic—format this as just the letter.
Original Image:
Correctly Formatted Text:
/*
GREEN FANCY
BY
GEORGE BARR McCUTCHEON
AUTHOR OF "GRAUSTARK," "THE HOLLOW OF HER HAND,"
"THE PRINCE OF GRAUSTARK," ETC.
<i>WITH FRONTISPIECE BY
C. ALLAN GILBERT</i>
[Illustration]
NEW YORK
DODD, MEAD AND COMPANY
1917
*/
Table of Contents
Format the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. and surround it with /* and */. See Placement of Out-of-Line Formatting Markup for details on this markup.
Page number references should be placed at least six spaces past the end of the text. Remove any periods or asterisks (leaders) used to align the page numbers.
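The leader and spacing rule can be sketched in a few lines. This is a rough illustration rather than an official tool: the name format_toc_line is hypothetical, it handles only Arabic page numbers, and it assumes it is given lines from a Table of Contents (a line with no trailing number, such as the first half of a wrapped entry, is returned unchanged).

```python
import re

# Entry text, then a run of spaces, periods, or asterisks, then the page number.
TOC_LINE = re.compile(r"^(?P<text>.*?\S)[\s.*]+(?P<page>\d+)\s*$")


def format_toc_line(line, gap=6):
    """Drop leaders and put `gap` spaces between the entry text and its page number."""
    match = TOC_LINE.match(line)
    if match is None:
        return line  # no trailing page number; leave the line alone
    return match.group("text") + " " * gap + match.group("page")
```

Because periods count as leaders here, a period that genuinely ends the entry text (for example after "etc.") is removed as well and would need to be restored by hand.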
Original Image:
Correctly Formatted Text:
CONTENTS
/*
CHAPTER      PAGE
I. <sc>The First Wayfarer and the Second Wayfarer
Meet and Part on the Highway</sc>      1
II. <sc>The First Wayfarer Lays His Pack Aside and
Falls in with Friends</sc>      15
III. <sc>Mr. Rushcroft Dissolves, Mr. Jones Intervenes,
and Two Men Ride Away</sc>      33
IV. <sc>An Extraordinary Chambermaid, a Midnight
Tragedy, and a Man Who Said "Thank You"</sc>      50
V. <sc>The Farm-boy Tells a Ghastly Story, and an
Irishman Enters</sc>      67
VI. <sc>Charity Begins Far from Home, and a Stroll in
the Wildwood Follows</sc>      85
VII. <sc>Spun-gold Hair, Blue Eyes, and Various Encounters</sc>      103
VIII. <sc>A Note, Some Fancies, and an Expedition in
Quest of Facts</sc>      120
IX. <sc>The First Wayfarer, the Second Wayfarer, and
the Spirit of Chivalry Ascendant</sc>      134
X. <sc>The Prisoner of Green Fancy, and the Lament of
Peter the Chauffeur</sc>      148
XI. <sc>Mr. Sprouse Abandons Literature at an Early
Hour in the Morning</sc>      167
XII. <sc>The First Wayfarer Accepts an Invitation, and
Mr. Dillingford Belabors a Proxy</sc>      183
XIII. <sc>The Second Wayfarer Receives Two Visitors at
Midnight</sc>      199
XIV. <sc>A Flight, a Stone-cutter's Shed, and a Voice
Outside</sc>      221
*/
Indexes
Surround the index with /* and */ tags. See Placement of Out-of-Line Formatting Markup for details on this markup. You don't need to align the numbers as they appear in the image; just put a comma followed by the page numbers.
Indexes are often printed in 2 columns; this narrower space can cause entries to split onto the next line. Rejoin these back onto a single line. This may create long lines, but they will be rewrapped to the proper width and indentation during post-processing.
Place one blank line before each entry in the index. For sub-topic listings in an index, start each one on a new line, indented 2 spaces.
Treat each new section in an index (A, B, C...) the same as a section heading by placing 2 blank lines before it.
Old books sometimes printed the first word of each section in the index in all caps or small caps; change this to match the style used for the rest of the index entries.
Please check the Project Comments as the Project Manager may request different formatting, such as treating the index like a Table of Contents instead.
Original Image:
Correctly Formatted Text:
/*
Elizabeth I, her royal Majesty the Queen, 123, 144-155.
  birth of, 145.
  christening, 146-147.
  death and burial, 152.

Ethelred II, the Unready, 33.
*/
Original Image:






















