DP Jargon

From DPCanadaWiki

To be created/developed and updated Template:Jargon Guide Preface

Activity Hub

To be created/developed and updated

The Activity Hub is a special page on the DPC Website which allows you access to each of the rounds and other major stages of DPC's e-text production process. Using the links on the page you can view and explore all the "home" pages for each of the five rounds and other workflow stages, even if you are not yet authorized to actually perform work in those stages. (See Access requirements to learn what is needed to work in each stage.)

Between the links on the navigation menu across the top of the page and the links in the body of the Activity Hub page itself, you can quickly visit any of the major sections of the DPC Site. Because of this valuable hub-like function, the Activity Hub is one of the very few DPC "production" Webpages which can be accessed directly from both the DPC Forums (using the link at the top of every Forum page) and the DPC Wiki (using the link in the DPC Navigation box to the left).


avatar

An avatar is a small graphic image you can add to your DPC Forum profile. It will display below your username in posts you make in the DPC Forums and on your User Details page. It can be no larger than 80 x 80 pixels and 6 kilobytes.

For information on creating and customizing an avatar for yourself, consult the page that I'm sure someone will eventually create in the Wiki, and the Avatar Creation and Help thread in the forums.
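
The size limits above can be checked before uploading. The following is only a minimal sketch: the function and constant names are hypothetical, and it assumes that "6 kilobytes" means 6 x 1024 bytes, which the entry does not specify.

```python
# Limits taken from the avatar entry above.
MAX_WIDTH_PX = 80
MAX_HEIGHT_PX = 80
MAX_BYTES = 6 * 1024  # assumption: "6 kilobytes" means 6 x 1024 bytes


def avatar_fits(width_px: int, height_px: int, size_bytes: int) -> bool:
    """Return True if the image is within the stated avatar limits."""
    return (
        width_px <= MAX_WIDTH_PX
        and height_px <= MAX_HEIGHT_PX
        and size_bytes <= MAX_BYTES
    )
```

For example, `avatar_fits(64, 64, 4000)` passes, while an 81-pixel-wide image fails regardless of its file size.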


BWL: Bad Word List

A Bad Word List (BWL) contains words that will be flagged by the WordCheck tool in the Proofreading Interface. In general, Bad Words are words in a project which are likely to be stealth scannos and should thus receive extra attention from proofers.

There are two types of Bad Word Lists. There can be a site-wide BWL for each language for which WordCheck has a spell-check dictionary; it applies to all projects with that language setting. There can also be a project-specific BWL created by the project's Project Manager.

For more information, see also Good Word List and the WordCheck FAQ.


BAE

The Bureau of American Ethnology, usually referred to as the BAE, was a government-sponsored organization that coordinated and reported anthropologic research in the Americas. Its most notable publications, Annual Reports and Bulletins, have been harvested from scans available at gallica.bnf.fr and form a large group of projects in various stages at DP (and perhaps someday at DPC).

The papers in these bulletins and reports were written by staff members at the BAE, including many early field anthropologists, as well as other collaborators supported by the BAE. Papers were written about physical findings, social and cultural traditions, and linguistic families, among other things.

The BAE projects in progress at DP tend to be a bit challenging, as they are often long, heavily footnoted and indexed, and contain many diacritical marks that are not part of the standard Latin-1 character set used at DP, necessitating the use of not only the entire diacritical markup explained in the Proofreading Guidelines, but an extended set of markup. There are also some proofing and formatting conventions that apply to all BAE projects but which are not standard at DP; for instance, indexes in BAE projects are done differently than those in most DP projects. See individual Project Comments and the BAE UberProject thread for more information.


BEGIN/BEGINNERS ONLY project

BEGINNERS ONLY, or BEGIN, projects are some EASY projects which have been set aside for our newest volunteers. These books contain most of the elements proofers deal with most frequently, such as end-of-line hyphenation, "unclothed" dashes, page headers, "spacey quotes," wonky punctuation, and scannos/stealthos you can really sink your teeth into.

New volunteers in P1 are asked to limit themselves to a small number of pages (15-20 pages) in any BEGIN project, then to move on to another BEGIN, a Newcomers Only, or any other project. After reaching a page-limit threshold (rumor has it at between 40 and 50), BEGIN projects become no longer accessible by that particular P1er.

Once a BEGIN project has completed P1, it moves to P2 where it is labeled as MENTORS ONLY (or MENTOR), and mentors proof the work and send encouraging, detailed feedback to the appropriate P1ers. This feedback process can take up to a week, since the whole project must finish P1 before moving on to P2.

To help these books move into P2 relatively quickly, they are usually separated into parts of about 100 pages each. When all of a given book's parts are finished in F2, a friendly squirrel reassembles the book into a whole and it is made available to a post-processor.


Big Eye, The

Big eye

blackletter

Blackletter

block quotation or block quote

Block quotation

CP: Content Provision/Provider

Content Providing/Provision (CP) is the process of providing the page images used in proofreading, either by scanning a book or harvesting the images from an online source.

Also a person who does such work (Content Provider, or CPer).

If you are interested in becoming a CPer, visit Access requirements.


You can automate some content providing tasks by using the DPtools, which are available here.

Content providing (or CPing) is a task carried out by content providers.

If you need help with content providing, you may want to refer to one of:

DG: Daily Goal

DG

diacritical marks

Diacritical marks, sometimes referred to as diacritic marks or the "short-hand" term diacriticals, are small marks found above or below a basic character which change the pronunciation of that character. For example, the acute accent over the "e" in the character "é" is a diacritical mark.

Characters with diacritical marks are proofed two different ways in DPC projects, depending upon whether the needed character is or is not available in the UTF-8 character set:


diffs

Diffs (short for differences) are the changes made to the text of a project's individual pages as it progresses through each round at DPC. The term can also refer to the Webpages at DP where you can view such changes. A "diff" doesn't necessarily mean there was something wrong on a page, just that the page text coming out of the subsequent round is different.

Proofreaders often help themselves improve their proofing and formatting skills by examining and analyzing their diffs. See Checking your diffs for more details.
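
To illustrate what a diff is, here is a rough sketch using Python's standard `difflib` module. It is not how the site produces its diff pages; the function name `page_diff` and the sample page text are made up for the example.

```python
import difflib


def page_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two round outputs of one page."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="P1",
            tofile="P2",
            lineterm="",
        )
    )


# Hypothetical page text: "hest" is a scanno that the next round corrected.
p1_text = "It was the hest of times,\nit was the worst of times,"
p2_text = "It was the best of times,\nit was the worst of times,"

for line in page_diff(p1_text, p2_text):
    print(line)
```

Lines starting with `-` show the earlier round's text and lines starting with `+` show the later round's text. If the two versions are identical, the list is empty, meaning the page has no diff.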


DPC: Distributed Proofreaders Canada

Distributed Proofreaders Canada (DPC) refers to this Website, the organization behind it, the community using it (sometimes referred to individually as DPCers, short for DPC users), etc., in any or all combinations.

For a general introduction to DPC, the organization, see Distributed Proofreaders Canada.


DPC Wiki

The DPC Wiki

dp-feedback

Dp-feedback

DPCustomMono

In order to help proofers detect OCR errors, Big Bill developed a custom font for DPC called DPCustomMono2. You can read about the history of the font and why it was developed in the Custom DP proofing font thread.


DPE or DP-Eu: Distributed Proofreaders Europe

Distributed Proofreaders Europe

DPCF: Distributed Proofreaders Canada Foundation

Distributed Proofreaders Canada Foundation definition

DPCWiki

DPCWiki

F1

F1 refers to Formatting round 1, the first round of formatting, in which markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., is added to individual pages in a project.

To see how you can qualify to work in F1, see the Access requirements article and the F1 round page.

See also F2 and proofreading.


F2

F2 refers to Formatting round 2, the second round of formatting, in which F1 markup is checked and corrected.

To see how you can qualify to work in F2, see the Access requirements article, the F2 round page, and this forum post.

DPC needs a team, like the F2 Fanatics, dedicated to moving projects through F2 towards completion more efficiently by concentrating efforts on a few projects.


FAQ

FAQ

feedback

All well-designed systems have feedback mechanisms built into them, and the DP Workflow system is no exception in this regard. Accordingly, there are many kinds of feedback that are exchanged between volunteers here at DPC.

  • One of the first types of feedback a new volunteer at DP is likely to receive, and the kind most likely to be the type being referred to when you see the word "feedback," will come from a mentor via the BEGINNERS ONLY projects and/or the dp-feedback mechanisms, through which experienced proofreaders send detailed and constructive comments to proofreaders regarding what they did correctly and incorrectly (with respect to the Guidelines) on specific pages in specific projects, and on ways that the proofreaders can learn to become even better proofreaders.
  • As project pages progress through the rounds, each successive proofer and foofer may be creating diffs which can serve as feedback to the previous proofer or foofer. This is the primary mechanism by which proofers in Newcomers Only projects receive feedback.
  • PP Mentors give feedback to new PPers; HTML Mentors give help and feedback to PPers who are new to HTML; and one of the PPVers' primary tasks is to give feedback to PPers.

In addition to these relatively formalized and routine feedback mechanisms, any volunteer can ask for feedback on any issue, question, procedure, etc., in a Project Thread or any other place in the DP Forums which may seem to be appropriate.


foofing/foofer

Foofing is an informal term for formatting; one who "foofs" is called a foofer.

Compare to proofing and proofer.


footnote

See out-of-line footnote.

formatting

Formatting is the process of adding markup for italics, boldface, SMALL CAPITALS, chapter and section headers, footnotes, etc., to a project. Formatting of a project's individual pages is performed in rounds F1 and F2. Other, more project-wide, formatting is done in the PP stage.

See also foofing, and compare to proofreading.


Formatting Guidelines

Formatting Guidelines refers to a document that contains all the default instructions and standards for formatting (such as markup for italics, illustrations, footnotes, and poetry) in rounds F1 and F2. These standards apply to all projects, unless specifically overridden by instructions from the Project Manager in the Project Comments or Project Discussion.

For complete clarity, the PM may make any exceptions to these Guidelines in the Project Comments, and the PPer can make any changes deemed appropriate in the PP stage. The only principle for the PPer to follow is consistency--either make the same consistent change throughout the book, or consistently stick to the original treatment of the author.

You can access the Formatting Guidelines from FAQ Central and from any Proofing Interface window.

See also Proofreading Guidelines.


The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author puts italics, bold text or a footnote every third word, we mark them italicized, bolded or footnoted. We are proofreaders, not editors. (See Printer Errors/Misspellings for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation). Changes such as these help us produce a consistently proofed version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Proofreading Guidelines with this concept in mind. There is a separate set of Formatting Guidelines. These guidelines are intended for proofreading only. A second group of volunteers will be working on the formatting of the text.

To assist the next proofreader, the formatter, and the Post-Processor, we also preserve Line Breaks. This allows them to easily compare the lines in the text to the lines in the image.

About This Document

This document is written to explain the formatting rules we use to maintain consistency when formatting a single book that is distributed among many volunteers, each of whom is working on different pages. This helps us all do formatting the same way, which in turn makes it easier for the Post-Processor who will complete the work on this e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about while proofreading. There is a separate set of Proofreading Guidelines. If you come across a situation and you do not find a reference in these guidelines, it is likely that it was handled in the proofreading rounds and so is not mentioned here.

If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum.

Each Page is a Separate Unit

Since each project is distributed among many formatters, each of whom is working on different pages, there is no guarantee that you will see the next page of the project. With this in mind, be sure to open and close all markup tags on each page. This will make it easier for the post-processor to eventually combine all these pages into one e-book.
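
A formatter could check a page for unpaired tags with something like the sketch below. This is an illustration only, not a DPC tool: the function name is hypothetical, it looks only at the inline tags mentioned in these Guidelines, and it assumes tags are strictly nested.

```python
import re

# Inline tags named in these Guidelines; <u> is used only in some projects.
INLINE_TAGS = ("i", "b", "sc", "f", "g", "u")
TAG_RE = re.compile(r"<(/?)(%s)>" % "|".join(INLINE_TAGS))


def unbalanced_tags(page_text: str) -> list[str]:
    """Return a list of problems; an empty list means every tag is paired."""
    problems = []
    open_stack = []
    for match in TAG_RE.finditer(page_text):
        closing, name = match.group(1) == "/", match.group(2)
        if not closing:
            open_stack.append(name)
        elif open_stack and open_stack[-1] == name:
            open_stack.pop()
        else:
            problems.append(f"unexpected </{name}>")
    # Anything still open at the end of the page was never closed.
    problems.extend(f"unclosed <{name}>" for name in open_stack)
    return problems
```

A page whose italics continue onto the next page would be reported as `unclosed <i>`, which is the situation this section asks formatters to avoid.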

Project Comments

When you select a project for formatting, the Project Page is loaded. On this page, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start formatting pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.

Please also read the Project Thread (Forum). The Project Manager may clarify project-specific guidelines here, and it is often used by formatters to alert other volunteers to recurring issues within the project and how they can best be addressed. (See the next section).

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other volunteers have made changes.

Forum/Discuss This Project

On the Project Page where you start formatting pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other volunteers who are working on this book.

Fixing Errors on Previous Pages

The Project Page contains links to pages from this project that you have recently worked on. (If you haven't formatted any pages yet, no links will be shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make corrections or to finish formatting. Just click on the link to the page. Thus, if you discover that you made a mistake on a page or marked something incorrectly, you can click on that page here and reopen it to fix the error.

You may also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.

Formatting at the Character Level

Placement of Inline Formatting Markup

Inline formatting refers to markup such as <i> </i>, <b> </b>, <sc> </sc>, <f> </f>, or <g> </g>. Place punctuation outside the tags unless the markup is around an entire sentence or paragraph, or the punctuation is itself part of the phrase, title, or abbreviation that you are marking. If the formatting goes on for multiple paragraphs, put the markup around each paragraph.

The periods that mark an abbreviated word in the title of a journal such as Phil. Trans. are part of the title, so they are included within the tags, thus: <i>Phil. Trans.</i>.

Many typefaces found in older books used the same design for numbers in both regular text and italics or bold. For dates and similar phrases, format the entire phrase with one set of markup, rather than marking the words as italics (or bold) and not the numbers.

If there is a series/list of words or phrases (such as names, titles, etc.), mark each item of the list individually.

In poetry, mark each line of the poem separately if the formatting goes on for multiple lines. See the Tables section for handling markup in tables.

Examples:

 Original Image:                            Correctly Formatted Text:
 Enacted 4 July, 1776                       <i>Enacted 4 July, 1776</i>
 It cost 9l. 4s. 1d.                        It cost 9<i>l.</i> 4<i>s.</i> 1<i>d.</i>
 God knows what she saw in me! I spoke      <b>God knows what she saw in me!</b> I spoke
 in such an affected manner.                in such an affected manner.
 As in many other of these Studies, and     As in many other of these <i>Studies</i>, and
 (Psychological Review, 1898, p. 160)       (<i>Psychological Review</i>, 1898, p. 160)
 L. Robinson, art. "Ticklishness,"          L. Robinson, art. "<sc>Ticklishness</sc>,"
                  December 3, morning.      /*
                  1323 Picadilly Circus     December 3, morning.
                                            1323 Picadilly Circus
                                            */
 Volunteers may be tickled pink to read     Volunteers may be tickled pink to read
 Ticklishness, Tickling and Laughter,       <i>Ticklishness</i>, <i>Tickling and Laughter</i>,
 Remarks on Tickling and Laughter           <i>Remarks on Tickling and Laughter</i>
 and Ticklishness, Laughter and Humour.     and <i>Ticklishness, Laughter and Humour</i>.
 "That's the idea!" exclaimed Tacks.        "<i>That's the idea!</i>" exclaimed Tacks.
 The professor set the reading assignment   The professor set the reading assignment
 for  E r l e b n i s  G e s c h i c h t e  for <g>Erlebnis Geschichte
 D e u t s c h l a n d                      Deutschland
 s e i t  1 8 4 5.                          seit 1845</g>.

Italics

Format italicized text with <i> inserted at the start and </i> inserted at the end of the italics. (Note the "/" in the closing tag.)

See also Placement of Inline Formatting Markup

Bold Text

Format bold text (text printed in a heavier typeface) with <b> inserted before the bold text and </b> after it. (Note the "/" in the closing tag.)

See also Placement of Inline Formatting Markup and Chapter Headings.

Underlined Text

Format underlined text as Italics with <i> and </i>. (Note the "/" in the closing tag.) Underlining was often used to indicate emphasis when the typesetter was unable to actually italicize the text, for example in a typewritten document.

See also Placement of Inline Formatting Markup

Some Project Managers may specify in the Project Comments that underlined text be marked up with the <u> and </u> tags.

Spaced Out Text (gesperrt)

Format s p a c e d  o u t text with <g> inserted before the text and </g> after it. (Note the "/" in the closing tag.) Remove the extra spaces between letters in each word. This was a typesetting technique used for emphasis in some older books, especially in German.

See also Placement of Inline Formatting Markup and Chapter Headings.
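
The space-removal step can be sketched as below. This is an illustration under stated assumptions, not a DPC tool: the function name is hypothetical, and it assumes the spaced-out text has single spaces between letters and two or more spaces between words. It does not decide where punctuation belongs; that is still governed by Placement of Inline Formatting Markup.

```python
import re


def format_gesperrt(spaced: str) -> str:
    """Collapse letter-spaced text and wrap it in <g> </g>."""
    # Assumption: words are separated by runs of two or more spaces.
    words = re.split(r"\s{2,}", spaced.strip())
    # Within each word, drop the single spaces between letters.
    joined = " ".join(word.replace(" ", "") for word in words)
    return f"<g>{joined}</g>"
```

For instance, `format_gesperrt("E r l e b n i s  G e s c h i c h t e")` returns `<g>Erlebnis Geschichte</g>`. When the OCR has not kept wider gaps between words, the word boundaries have to be restored by eye from the image.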

Font Changes

Some Project Managers may request that you mark a change of font within a paragraph or line of normal text by inserting <f> before the change in font and </f> after it. (Note the "/" in the closing tag.) This markup may be used to identify a special font or other formatting that does not already have its own markup (such as italics and bold).

Possible uses of this markup include:

  • antiqua (a variant of roman font) inside fraktur
  • blackletter within a section of regular font
  • smaller or larger font only if it is within a paragraph in regular font (for a whole paragraph in a different font or size, see the block quotation section)
  • upright font inside of a paragraph of italicized text

The particular use or uses of this markup in a project will usually be spelled out in the Project Comments. Formatters should post in the Project Discussion if the markup appears to be needed and has not yet been requested.

See also Placement of Inline Formatting Markup.

Words in Small Capitals

The formatting is different for Mixed Case Small Caps and all small caps:

Format words that are printed in Mixed Small Caps as Mixed Upper and Lowercase. Format words that are printed in all small caps as ALL-CAPS. For both mixed case and all small caps, surround the text with <sc> and </sc> markup.

Headings (Chapter Headings, Section Headings, Captions, etc.) may appear to be in all small caps, but this is usually the result of a change in font size and should not be marked as small caps. The first word of a chapter that is in small caps should be changed to mixed case without the tags.

See also Placement of Inline Formatting Markup.

forum

A forum is an asynchronous communication mechanism in which public messages called posts are made (posted), usually in topical threads. On other sites, this type of communication software can be called a newsgroup or a bulletin board. Here at DPC, we have the DPC Forums.

To see information about the complementary roles of the DPC Wiki and the DPC Forums, see the Forums vs Wiki page.


Fraktur

Fraktur

FTC: Fine-Toothed Comb

Fine-Toothed Comb

ftealth fcanno

Ftealth fcanno

GWL: Good Word List

A Good Word List (GWL) contains words that will not be flagged by the WordCheck tool in the Proofreading Interface even if they aren't recognized by the spell-checking component of WordCheck. In general, Good Words are words in a project which are likely to be proper nouns or frequently-occurring project-specific terminology.

There are two types of Good Word Lists. There can be a site-wide GWL for each language for which WordCheck has a spell-check dictionary; it applies to all projects with that language setting. There can also be a project-specific GWL created by the project's Project Manager.

For more information, see also Bad Word List and the WordCheck FAQ.
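
The interplay between the spell-check dictionary, the Good Word List, and the Bad Word List can be sketched roughly as follows. This is not WordCheck's actual code: the function name is hypothetical, and the sketch assumes exact, case-sensitive matching and that a Bad Word is flagged even if it also appears on a Good Word List.

```python
def flagged_words(words, dictionary, good_words=(), bad_words=()):
    """Return the words a WordCheck-like tool would flag, in input order."""
    good, bad, known = set(good_words), set(bad_words), set(dictionary)
    flagged = []
    for word in words:
        if word in bad:
            # Bad Words are flagged even when the dictionary knows them.
            flagged.append(word)
        elif word not in known and word not in good:
            # Unknown to the dictionary and not listed as a Good Word.
            flagged.append(word)
    return flagged
```

With a dictionary containing "he" and "arid", a Good Word List containing "Tacks", and a Bad Word List containing "arid" (a likely stealth scanno for "and"), the words "he" and "Tacks" pass, while "arid" and an unrecognized word such as "qwzx" are flagged.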


GoEG: Grammar of English Grammars

Grammar of English Grammars

groofing

Groofing refers to group-proofing, in which we proofread the same text together, encouraging each other and sharing wisdom about the project. We cooperate and coordinate our efforts through the DPC groupchat window on the Jabber network, the "Anyone for leapfrog?" forum thread, or the Groofers and Gfoofers team thread.

A list of scheduled and past Groofs can be found at Scheduled groofs.

Hopefully, a DPC Groofers and Gfoofers team will be established for those who enjoy Groofing and Gfoofing.


Guidelines, The

The Guidelines

Handy Guides

Handy Guides

HTML: HyperText Markup Language

HTML is the abbreviation for HyperText Markup Language. HTML text is normal (e.g. ASCII) plaintext but with certain parts of the text marked up to denote special formatting or layout or other properties, or to link it with other texts (hence the term hypertext). A browser uses this information to render the text accordingly (for example with portions in bold or italics).


Jabber

Jabber is an open-source instant messaging system commonly used by DPCers. Members meet regularly in the conference rooms available through Jabber to socialize, ask/answer DPC-related questions, and participate in group activities such as groofing and grentoring.



Jabber ID

Jabber ID

Latin-1

Latin-1 (or more formally, ISO-8859-1) is an 8-bit extension of the 7-bit ASCII character set (and a subset of the Unicode character set). It mainly adds common (mostly accented) characters (like ä, á, å, ç, ß) needed for several western European languages based on the Latin alphabet. Latin-1 is the character set used in the Proofing Interface.

More information can be found at Wikipedia.
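
Whether a character belongs to Latin-1 can be tested with Python's built-in `latin-1` codec. The sketch below is only an illustration; the function name is hypothetical and it is not part of any DPC tool.

```python
def outside_latin1(text: str) -> list[str]:
    """Return the distinct characters in text that Latin-1 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            # Record each unencodable character once, in order of appearance.
            if ch not in missing:
                missing.append(ch)
    return missing
```

The accented characters listed above (ä, á, å, ç, ß) all encode without error, whereas a character such as ā (a with macron) is reported as outside the set.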


ligature

A ligature is one typographical character which combines two letters which are usually separate characters.

Here at DP, if the ligature is available in the UTF-8 character set, we usually proof it that way. For example, the "ae" and "oe" ligatures are normally proofed as the æ, Æ, œ, or Œ characters, as appropriate.

Most other ligatures are proofed as their component characters with no special markup, but if a particular ligature will be seen frequently in a given project, the PM will usually have addressed the issue specifically in the Project Comments.

For more information related specifically to æ and œ, including how to distinguish between the two in italics fonts, see Ae and oe ligatures.


LoC/LOC

LoC

LOTE: Language-Other-Than-English

Language-Other-Than-English

markup

Here at DPC, the term markup generally refers to the various tags that are or have been inserted into documents to format or otherwise designate data for some type of special handling.

Different styles of markup are used in different types of documents. For example, to indicate a reference to footnote "number 1" in

  • an HTML document, use markup like this: <sup>[<a name="1" href="#1">1</a>]</sup>
  • page text in the Proofing Interface, use markup like this: [1]

And to bold text in

  • a BBCode forum posting, use markup like this: [b]bold text[/b]
  • a DPC Wiki article, use markup like this: '''bold text'''
  • an HTML document, use markup like this: <b>bold text</b>

For italic text in

  • a BBCode forum posting, use markup like this: [i]italic text[/i]
  • a DPC Wiki article, use markup like this: ''italic text''
  • an HTML document, use markup like this: <i>italic text</i>
  • a plain text document, use markup like this: _italic text_
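
The bold and italic examples above can be collected into a small lookup table. This is just a sketch to show how the same text is wrapped differently per document type; the names `MARKUP` and `mark_up` and the target keys are hypothetical, and only the combinations listed above are included.

```python
# (opening, closing) markup per document type, copied from the lists above.
MARKUP = {
    "bbcode": {"bold": ("[b]", "[/b]"), "italic": ("[i]", "[/i]")},
    "wiki": {"bold": ("'''", "'''"), "italic": ("''", "''")},
    "html": {"bold": ("<b>", "</b>"), "italic": ("<i>", "</i>")},
    "plain": {"italic": ("_", "_")},
}


def mark_up(text: str, style: str, target: str) -> str:
    """Wrap text in the markup for the given style and document type."""
    open_tag, close_tag = MARKUP[target][style]
    return f"{open_tag}{text}{close_tag}"
```

For example, `mark_up("bold text", "bold", "bbcode")` gives `[b]bold text[/b]`. Asking for a combination that is not listed above, such as bold in a plain text document, raises a `KeyError` rather than guessing a convention.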


mentoring

Mentoring definition

Newcomers Only project

Newcomers Only projects are (usually EASY) projects that have been set aside for our newest volunteers. These books contain most of the elements proofers need to deal with most frequently, such as "spacey quotes" and other wonky punctuation, a few diacritical marks here and there, an occasional "unclothed" dash or hyphen, and scannos/stealthos you can really sink your teeth into.

New volunteers in P1 are asked to limit themselves to about 10 pages in each Newcomer project, and then move on to a BEGIN or any other project.

Once a Newcomers project has been completed in P1, it is usually transformed into a Proof-only Mentoring project in round P3 (for now anyway), but it may occasionally become a P3 Qual project in round P2. Once the second proofing pass (in either P3 or P2) is done on the project, the new proofreaders who worked on the project in round P1 are informed that their "diff files" for the project are ready, and they are given instructions on how to check their diffs, and ask questions about and interpret them.

In order to help ensure that the "turnaround time" on diffs is as short as possible (usually around two to three days), Newcomers Only projects are released into P1 more slowly than are Beginners Only projects. Thus, there will very likely be times when no Newcomers project is available in that round. Also, to help these books move into their second round of proofing relatively quickly, they are usually separated into parts of approximately 100 pages each. When all of a given book's parts are finished in F2, a friendly squirrel reassembles the book into a whole and it is made available to a post-processor.


OCR

OCR definition

OED: Oxford English Dictionary

Oxford English Dictionary

out-of-line footnote

Out-of-line footnote

P1

P1 refers to Proofreading round 1, which is the first of two or three rounds of proofing that each project goes through at DP. The raw OCR-ed text from the scan is checked and corrected (except in the relatively infrequent Type-in projects).

See also P2, P3, and formatting.


P2

P2 refers to Proofreading round 2, which is the second round of proofing. The page-texts have already been proofread, and now need to have the text spellchecked and carefully compared to the image.

Because P2 proofreaders are more experienced than most P1 proofreaders, P2 is expected to fix a variety of mistakes and oversights common in the P1 round. During P2 proofing, proofers can mentor P1 proofers by providing encouraging, helpful feedback via a PM.

To see how you can qualify to work in P2, see the Access requirements article and the P2 round page.

See also P1, P3 and formatting, and a discussion of the differences between the rounds.


P2alt

P2alt definition

P3

P3 refers to Proofing Round 3, which is the optional third round of proofing, in which the version of the page text produced in P2 is checked and corrected. See also P1 and formatting.

If you want to work in P3, you must satisfy the numerical requirements, and then apply for P3 qualification.

There is a team, P3 Junkies, dedicated to moving projects through P3 towards completion more efficiently by concentrating their efforts on a few projects. The P3 Junkies project list shows the team's current and previous projects.


P3 Quals

Some projects in P2 have (P3 Qual) after their titles. After finishing P2, these projects will move quickly into P3, skipping the queue, and will be proofed quickly once there. They are a way for proofers who have requested (or will request) P3 access to get diffs on their pages without having to wait a long time due to the length of the P3 queues. See more information at P3 qualification.


PCs: Project Comments

PCs

PF: Project Facilitator

A Project Facilitator (PF) is an administrative position at DPC, below that of Site Administrator (SA). The Project Facilitators' primary function is to help Project Managers, but only when the PMs need it. A PF can do anything a PM can do, with the difference that the PF can do it for all projects.

Visit DPC Administrators to see a list of current Project Facilitators.

For example, a PF can look at an image or proofed page to answer a specific question in a project thread. A PF can move a project into Unavailable if there are serious problems (such as text/image mismatch) so that the PM can fix it later. A PF can edit project comments to add clarifying detail if the need for it becomes apparent in the discussion thread.

PFs have been instructed to be VERY careful about doing things to projects without prior permission from the PM.

What PFs can't do: move projects from one PM to another; change a project state in a manner the system doesn't expect; or grant PM, PP, or PPV status. For these things you will need to contact a site administrator.

There are currently 4 PFs, although if this function proves useful I expect that there will be more as the site grows.


PG: Project Gutenberg

You know: that place where all of the finished DP projects go.

Project Gutenberg (PG) is DP's "parent site," which hosts a growing online archive of public-domain electronic texts available freely to all. See gutenberg.org.


PGC: Project Gutenberg Canada

You know: that place where all of the finished DPC projects go.

Project Gutenberg Canada (PGC) is DPC's "sister site," which hosts a growing online archive of public-domain electronic texts available freely to all. See gutenberg.ca.


PM

  1. Project Manager
  2. Private Message


png: Portable Network Graphics

PNG, or png, which stands for Portable Network Graphics, is a lossless compressed image file format commonly used for the project page images on this site. See a detailed explanation at Wikipedia.

In Project Discussions, individual scan pages in a project are usually referred to by the "png numbers" found in their filenames.



PP: Post-Processing

Post-Processing (PP) is the process of formatting and reassembling the pages of a project after it has completed the rounds of proofing and formatting. (Also called Post-Proofing.)

Also, a person who does such work (also Post-Proofer, or PPer).

If you are interested in becoming a PPer, visit Access requirements.

See also the Post-Processing FAQ, and Hands-on PPer.


Proof-only Mentoring project

Proof-only Mentoring projects, which are found only in P3 (at least for right now), are the "reincarnation" of some of the Newcomers Only projects from P1. Except for the fact that these projects move straight from P1 to the active P3 list, the pages in these projects are proofed just as they would be in any other project.

Once all the pages in a given Proof-only Mentoring project have been proofed, the proofers who worked on the project in round P1 are notified via PM that their "diff files" for the project are ready, and they are given instructions on how to check their diffs, ask questions about them, and interpret them. The projects themselves move into F1 and complete the rest of their time at DP following the same process as all other projects.

Anyone with P3 status can do this type of mentoring. See the main Mentoring page for more information on the mentoring process.


PPV: Post-Processing Verification

Post-Processing Verification (PPV) is the process of final checking a post-processed text, done by a very experienced PPer. This is the last stage a project goes through at DPC before being sent to the PGC Whitewashers. The PPV Guidelines are available at FAQ Central.

Also, a person who does such work (also PPVer).


Pre-processing

Pre-processing is the process of preparing a book (which becomes known as a "project") for proofreading here at DPC. Steps include scanning the book (or "book-like thing"), running the OCR software (which generally includes some WordChecking function), and uploading the files to the DPC servers. These tasks are performed by a person known as the Content Provider (CP), who may also serve as the Project Manager (PM).

Private Message (PM)

A Private Message (PM), sometimes called Personal Message, can be sent through DPC's phpBB forum. This is basically the same as e-mail, but can be sent only from one DPCer to another.

Check your DP Inbox here.


project

A project at DPC is basically a book (or book-like thing) that is making its way through the DPC system for creating e-texts. (This term is used to avoid the unfortunate abbreviation "BM" for Book Manager.)


Project Discussion

See Project Thread.

Project Hospital

Project hospital definition

Project Manager (PM)

The Project Manager (PM) is the person in charge of a project and its progress through the rounds. The ultimate goal of the PM is to help the project be as consistently proofed and formatted as possible for the PPer. One way the PM (usually) does this is by writing Project Comments.

Different PMs have different styles. Some provide a handful of books that they pre-process themselves, then during proofreading monitor the project discussions closely, and finally post-process the project themselves; others provide large quantities of books and rely on others to PP them. Other PMs fall somewhere between, perhaps closely following some books, while only glancing in on others, as questions are asked in the project thread.

If you are interested in becoming a PM, visit Access requirements.


Project Page

Each project going through DPC has a sort of "home page," called its Project Page, which serves as a nexus to the various resources on the DPC site related to the project. The Project Page provides basic information about each project, including its PM, PPer (if assigned), difficulty level, genre, Special Days (if any), its current stage (round, etc.), the date it was last worked on, its Project Comments, a link to its Project Thread, and other information. The page can be displayed in four different levels of detail.

Project Pages are customized for each individual DPCer, providing easy access to the last five pages that each proofreader has started but not completed, and the last five pages each proofreader has finished processing in that project's current round. Access can also be gained to other pages in the project, including the "diffs" for the project, via the Page Details.


Project Thread

A Project Thread or Project Discussion is a thread in the forums dedicated to a specific project.

On each Project Page (where you start proofreading pages), there is a line labeled "Forum," which contains a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to the discussion thread that deals specifically with that one project. That is the place to ask questions about the book, inform the Project Manager about problems, ask questions about ambiguous items, etc. Using the project's dedicated forum thread is the recommended way to communicate with the project's PM, PP, and other proofreaders who are working on the project. Once the Discussion Thread has been created for a project, the thread follows the project through all its various queues, rounds, and other processing stages at DPC.

To post a new question or make a new comment in the Project Thread, you must click on the Post Reply button (not New Topic). This is because each Project Discussion is a dedicated thread in the Projects Waiting, Projects Being Proofed or Formatted, Projects Being Post-Processed or Verified, or Archive of Posted Projects forum, as appropriate, and not a separate forum of its own.

This fact also means that while a Project Thread is sometimes referred to as a Project Forum, that name is not really technically accurate, although no one is likely to complain if you use it.


proofing/proofer

Proofing (Proofer)

Proofing/Proofreading Interface

The Proofreading Interface, or Proofing Interface, is the part of the DPC site where users can proofread a single page in a project. It shows an image of the page, and a textbox containing the text for that page (as produced by other DPCers up to that point). The user compares the two and attempts to ensure that the text correctly reflects the content or formatting shown in the image.

The Proofing Interface comes in two versions, Standard and Enhanced, each of which can be toggled between horizontal and vertical layouts. For more information, see Working with the Proofing Interface.


proofreading

  1. In a specific sense, proofreading is the process of carefully correcting the OCR characters to match the text shown on the scanned pages of a project. This is often called "proofing", and is normally performed in rounds P1, P2, and P3. Compare to formatting.
  2. When used in a more generic sense, proofreading can refer to the entire process of getting a project ready for posting to the PG site. This is the sense in which the term is used in the name Distributed Proofreaders Canada.


Proofreading Guidelines

Proofreading Guidelines refers to a document which contains all the "default" instructions and standards for proofreading (such as how to handle hyphenated words and letters with diacritical marks) in rounds P1, P2, and P3. These standards apply to all projects, unless specifically overridden by instructions from the Project Manager in the Project Comments or Project Discussion.

For complete clarity, the PM may make any exceptions to these Guidelines in the Project Comments, and the PPer can make any changes deemed appropriate in the PP stage. The only principle for the PPer to follow is consistency--either make the same consistent change throughout the book, or consistently stick to the original treatment of the author.

Please note that, where it may seem that the Guidelines encourage something other than a strict "match the scan" approach, it is because we are converting paper pages into unpaginated and rewrapped e-text.

You can access the Proofreading Guidelines from FAQ Central and from any Proofing Interface window.

IMPORTANT: this is a reference document--beginning proofers do not need to memorize the entire document, but everyone should consult it when problems or questions arise. If the matter is still not clear, raise an inquiry in the Project forum.

The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author puts italics, bold text or a footnote every third word, we mark them italicized, bolded or footnoted. We are proofreaders, not editors. (See Printer Errors/Misspellings for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation). Changes such as these help us produce a consistently proofed version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Proofreading Guidelines with this concept in mind. These guidelines are intended for proofreading only; there is a separate set of Formatting Guidelines, and a second group of volunteers will be working on the formatting of the text.

To assist the next proofreader, the formatter, and the Post-Processor, we also preserve Line Breaks. This allows them to easily compare the lines in the text to the lines in the image.

About This Document

This document is written to explain the proofreading rules we use to maintain consistency when proofreading a single book that is distributed among many proofreaders, each of whom is working on different pages. This helps us all do proofreading the same way, which in turn makes it easier for the formatter and for the Post-Processor who will complete the work on this e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about while proofreading. If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum.

The original version of this document was created in May of 2008; revisions since that time are indicated at the beginning and end of any changed section like this: "(Nov 08) The original version of this... (/Nov 08)"

An important revision to the Guidelines comes into effect on November 15th, 2008.

Project Comments

On the main Project Page, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start proofreading pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.

Please also read the Project Thread (Forum). The Project Manager may clarify project-specific guidelines there, and proofreaders often use it to alert one another to recurring issues within the project and how they can best be addressed. (See the next section.)

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other proofreaders have made changes.

Forum/Discuss This Project

On the proofreading interface page (Project Page) where you start proofreading pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other proofreaders who are working on this book.

Fixing your own errors on Previous Pages

When you select a project for proofreading, the main Project Page is loaded. This page contains links to pages from this project that you have recently proofread. (If you haven't proofread any pages yet, there will be no links shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make proofreading corrections or to finish proofreading. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and re-open it to fix the error.

You can also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Comments page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.

For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you prefer.

Marking Errors and/or Comments

Throughout this document you will see instructions about marking possible errors and/or your comments with [** comment]. Unless the Project Manager instructs otherwise, the best place to insert your comment is directly after the word or character or other item that you have a concern with or question about; that will allow the next person who sees the page as well as the Post-Processor to know precisely what you are commenting on.

At the very least, put your comment on the same line as the item in question, but be clear as to where, in the line, the problem is. Placing your comment on the line above or below or somewhere else on the page can be confusing for others unless your comment relates to the whole page--in which case, place it at the top of the page.

Even when you post a note in the Project Discussion Forum, it's a good idea to mark the item that you would like someone to look at for you.

There is a [** ] button at the bottom of the Proofreading Interface screen that you can use for inserting your comments.
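
As an illustration of why placement matters, here is a minimal Python sketch of how someone reviewing a page might list every note together with the line it sits on. The regular expression and the function name find_notes are assumptions made for this example, based only on the [** comment] notation described above; this is not a tool provided by the site.

```python
import re

# Assumed pattern for a proofer note such as [** typo for York?].
# It is inferred from the notation described in the text, not taken
# from any site software.
NOTE_PATTERN = re.compile(r"\[\*\*\s*([^\]]*)\]")


def find_notes(page_text):
    """Return (line_number, comment) pairs for every [** ...] note."""
    notes = []
    for line_number, line in enumerate(page_text.splitlines(), start=1):
        for match in NOTE_PATTERN.finditer(line):
            notes.append((line_number, match.group(1).strip()))
    return notes


sample = "He rode to Yrok[** typo for York?] at dawn.\nNothing to report here."
print(find_notes(sample))
```

Because the note sits directly after the word in question, the line number alone is enough to locate the problem in the page image.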

How To Proof

Paragraph and Line Spacing

Paragraph Spacing/Indenting

Use a blank line to separate paragraphs. You should not indent the start of paragraphs, but if all paragraphs are already indented, don't bother removing those spaces—that can be done automatically during post-processing.

If the page starts with a new paragraph, place a blank line at the top of the page. However, remove any extra blank lines between paragraphs or at the top or bottom of the pages. It is not necessary to leave blank lines at the bottom of the pages.

See the Page Headers/Page Footers image/text for an example.
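
The blank-line rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name normalize_blank_lines and the starts_with_new_paragraph flag are hypothetical, and the proofer still has to decide from the page image whether the page opens with a new paragraph.

```python
def normalize_blank_lines(page_text, starts_with_new_paragraph):
    """Keep one blank line between paragraphs and none at the bottom.

    A single blank line is kept at the top only when the page starts
    with a new paragraph (a judgment the proofer makes from the image).
    """
    result = []
    for line in page_text.splitlines():
        is_blank = line.strip() == ""
        # Skip blank lines at the top and any repeated blank lines.
        if is_blank and (not result or result[-1] == ""):
            continue
        result.append("" if is_blank else line)
    # Drop a blank line left at the bottom of the page.
    while result and result[-1] == "":
        result.pop()
    if starts_with_new_paragraph and result:
        result.insert(0, "")
    return "\n".join(result)


page = "\n\nFirst paragraph line one\nline two\n\n\n\nSecond paragraph\n\n"
print(repr(normalize_blank_lines(page, starts_with_new_paragraph=True)))
```

Line breaks inside paragraphs are left untouched, in keeping with the rule on preserving line breaks.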

Line Breaks

Leave all line breaks in so that later in the process other volunteers can easily compare the lines in the text to the lines in the image. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.

Extra Spaces or Tabs Between Words

Extra spaces and tab characters between words are common in OCR output. You don't need to bother removing these—that can be done automatically during post-processing.

However, extra spaces around punctuation, em-dashes, quote marks, ellipses, etc. do need to be removed when they separate the punctuation mark from the word.

For example, in the sentence A horse ;   my kingdom for a horse, the space between the word "horse" and the semicolon should be removed. But the 2 spaces after the semicolon are fine—you don't have to delete one of them.

Trailing Space at End-of-line

Do not bother inserting spaces at the ends of lines of text. It is a waste of your time for something that we can take care of automatically later. Similarly, do not waste your time removing extra spaces at the ends of lines.
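
To show why these spaces are not worth a proofer's time, here is a rough Python sketch of the kind of automatic cleanup that could be run later. It is an assumption about what such a step might look like, not the actual post-processing software, and the function name tidy_spacing is made up for this example.

```python
import re


def tidy_spacing(line):
    """Collapse runs of spaces or tabs between words and trim the line end."""
    # Remove trailing spaces and tabs at the end of the line.
    line = line.rstrip(" \t")
    # Replace any run of spaces/tabs that sits between two non-space
    # characters with a single space; leading indentation is untouched.
    return re.sub(r"(?<=\S)[ \t]+(?=\S)", " ", line)


print(tidy_spacing("A horse;   my\tkingdom  "))
```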

Punctuation

Punctuation

In general, there should be no space before punctuation characters except opening quotation marks. If scanned text has a space before punctuation, remove it.

Spaces before punctuation sometimes appear because books typeset in the 1700s and 1800s often used partial spaces before punctuation such as a semicolon or comma. Many OCR applications interpret these partial spaces as full spaces, so spaces appear in the text where they shouldn't be.

For example, proofread:

and so it goes ; ever and ever.

as:

and so it goes; ever and ever.
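
The same correction can be expressed as a small Python sketch, which may help when checking a page for missed cases. The pattern and the function name close_up_punctuation are assumptions for illustration only. Periods and quotation marks are deliberately left out of the pattern, because a space before them can be legitimate (for example before an opening quotation mark), so those cases still need a human eye.

```python
import re

# Spaces or tabs that come directly before one of these marks.
# Periods and quote marks are intentionally excluded (see the note above).
SPACE_BEFORE_PUNCT = re.compile(r"[ \t]+(?=[;:,!?])")


def close_up_punctuation(line):
    """Remove spaces that separate a word from the punctuation after it."""
    return SPACE_BEFORE_PUNCT.sub("", line)


print(close_up_punctuation("and so it goes ; ever and ever."))
```

Spaces after the punctuation mark are not changed, which matches the guidance that extra spaces between words can be left for post-processing.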

End of Sentence Periods

Proofread periods that end sentences with a single space after them.

You do not need to remove extra spaces after periods if they're already in the scanned text—that can be done automatically during post-processing. See the Sidenotes or paragraph Side-Descriptions image and text for an example.

Dashes, Hyphens, and Minus Signs

There are generally four such marks you will see in books (see the examples below):

Hyphens

These are used to join words together, or sometimes to join prefixes or suffixes to a word. Leave these as a single hyphen, with no spaces on either side. Note that there is a common exception to this shown in the second example below.

En-dashes

These are just a little longer, and are used for a range of numbers, or for a mathematical minus sign. Proofread these as a single hyphen, too. Spaces before or after are determined by the way it was done in the book: generally there are no spaces in number ranges; usually there are spaces around mathematical minus signs, sometimes both sides, sometimes just before.

Em-dashes & long dashes

These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat——! Proofread these as two hyphens if the em-dash is short and four hyphens if the em-dash is long. Don't leave a space before or after, even if it looks as if there was a space in the original book image. An easy "rule of thumb", to use if you are unsure if the dash should be "two" or "four" hyphens long, is to look at