Editorial policies

Screen presentation in the edition and in text files

Tooltips, shown on hover or touch depending on the device, explain features such as unclear and supplied text and (in diplomatic transcriptions only) deleted and added text. Other features with tooltips, especially biographical information on persons, are signalled by blue text. Elsewhere, the diplomatic and normalised screen versions show the corresponding text in red wherever they differ, apart from ſ ~ s. Tooltips for annotations in the diplomatic version give information on the annotator.

Conventions in screen and file versions are summarised in the table below. Editorial footnotes appear only in the diplomatic transcription and are numbered afresh on each page.

| handwritten original | TEI/XML file (filename.xml) | diplomatic text, on-screen | normalised text, on-screen | diplomatic plain text file (filename.txt) | normalised plain text file (filename-n.txt) |
| --- | --- | --- | --- | --- | --- |
| long s | ſ (UTF-8 long s) | ʃ (UTF-8 esh) | s (normal s) | ſ (long s) | s (normal s) |
| any symbol for and | & | & | & | & | & |
| other characters | normal print equivalent | normal print equivalent | normal print equivalent | normal print equivalent | normal print equivalent |
| &c. = 'et cetera', No. = 'Number', Dr = 'Doctor' (as title), Mr, Mrs, Messrs, PS, St = 'Saint' | as written (note 1), untagged | as written (note 1) | as written (note 1) (inline) | as written (note 1) (inline) | as written (note 1) (inline) |
| other abbrev'ns, incl. Dr = 'Dear', 'Doctor' (as common noun), 'Dowager', St = 'Street' | tagged (note 1) <abbr> ~ <expan> | as written (note 1) | expanded (if known) | as written (note 1) (inline) | expanded (if known) |
| initial for name | tagged <abbr> ~ <expan> | as written | expanded (if known) | as written | expanded (if known) |
| dash as punctuation (note 2) | -- | -- | -- | -- | -- |
| obsolete spelling known at the period (acc. to OED) | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
| idiosyncratic spelling or error | tagged <sic> ~ <corr> | as written [sic] | normalised | as written | normalised |
| initial capital (note 3) | as written, untagged | as written | as written | as written | as written |
| (non)use of possessive apostrophe, incl. e.g. gen. sg. any bodies | as written | as written | as written | as written | as written |
| verb with d, 'd for -ed | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
| foreign word or phrase (note 4) | tagged | unmarked | unmarked | unmarked | unmarked |
| obsolete morphology (e.g. had wrote, She eat some chicken) | tagged <orig> ~ <reg> | as written | normalised | as written | normalised |
| text supplied by editors | tagged | [supplied text] + tooltip | [supplied text] + tooltip | supplied text, unmarked | supplied text, unmarked |
| text added by writer | tagged | \|added text\| + tooltip | added text, unmarked | added text, unmarked | added text, unmarked |
| substitution by writer | tagged | deleted text + tooltip + \|substitute\| + tooltip | substitute only, unmarked | substitute only, unmarked | substitute only, unmarked |
| deleted text | tagged | deleted text + tooltip | text absent | text absent | text absent |
| deleted text, unreadable or uncertain | tagged <del> + <gap> | [------] + tooltip | text absent | text absent | text absent |
| unreadable or uncertain text | tagged <gap> | [------] + tooltip | [------] + tooltip | <GAP: nn units> (characters, words, lines) | <GAP: nn units> (characters, words, lines) |
| unclear or damaged but reasonably certain text | tagged | unclear text (wavy underline) + tooltip | unclear text (wavy underline) + tooltip | unmarked | unmarked |
| superscript, subscript, position above/below line | tagged <hi> or <add> | formatting displayed | formatting absent | formatting absent | formatting absent |
| underline (various styles) | tagged <emph> | formatting displayed (single underline) | formatting absent | formatting absent | formatting absent |
| boundary stroke or line (≠ word underline) | (some) tagged <milestone> [in progress] | thin horizontal line | thin horizontal line | ignored | ignored |
| new line | tagged | as written | as written | as written | as written |
| word split across lines (note 5) | tagged <orig> ~ <reg> | as written (note 5) | reassembled on first line without internal punctuation | as written (note 5) | reassembled on first line without internal punctuation |
| new paragraph at linebreak ± indent | tagged | as written | as written | no indent | no indent |
| centred text (note 6) | tagged | centred, on new line | centred, on new line | left-aligned, on new line | left-aligned, on new line |
| right-aligned text (note 6) | tagged | right-aligned, on new line | right-aligned, on new line | left-aligned, on new line | left-aligned, on new line |
| new column or page | tagged | ruled line | ruled line | blank line | blank line |
| catchword | tagged <fw> | catchword + tooltip | text absent | <CATCHWORD: word> | text absent |
| surplus word | tagged | surplus word + tooltip | word absent | <SURPLUS: word> | word absent |
| editorial footnote | tagged <note/@resp> | lemma numeral + tooltip | note absent | note absent | note absent |
| quoted speech | tagged <q> | unmarked | unmarked | unmarked | unmarked |
| literary or biblical quotation | tagged <cit/quote + bibl> | quoted text | quoted text | quoted text | quoted text |
| line of verse | tagged <l> | unmarked | unmarked | unmarked | unmarked |
| change of hand in letter as sent | tagged <handShift> | unmarked unless footnote needed | unmarked | <HANDSHIFT> | <HANDSHIFT> |
| annotation not present in letter as sent | tagged <note/@hand> | annotation + tooltip | annotation absent | <ANNOTATION: annotation> | annotation absent |
| moved section (note 7) | original and destination locations tagged <anchor>, <ref> | at original location + tooltip, footnote at destination | at original location | no indication at original location, <MOVED> at destination | no indication at original location, <MOVED> at destination |

Notes to table

1 Any punctuation under superscripted letter(s) in abbreviations is placed last, regardless of relative left-right orientation in the original. Thus, Mr. Mr: Mr– Mr may occur (inline versions Mr. Mr: Mr- Mr), but M.r M:r M-r will not. A letter+macron abbreviation (ac̄ept, com̄and, etc.; Bāloon) is generally expanded as doubling of that letter (accept, command) or an adjacent one (Balloon), but note Com̄ps, thrō, wc̄h (Compliments, through, which).

2 The dash as punctuation, represented by two hyphens, always has a space on either side. By contrast, a single unspaced hyphen character is used for a normal hyphen (well-known) and for the horizontal stroke under a superscript abbreviation (Mrs). An unspaced double em-dash is used for a dash that suppresses all or part of a name or place (Miſs —— = ‘Miss Goldsworthy’, their —— = ‘their Majesties’, to —— = ‘to Windsor’, Mr. H—— = ‘Mr. Hodges’, Ly– S.—— = ‘Lady Stormont’, the K——g = ‘the King’), shortens a word (by T——w = ‘by Tomorrow’) or euphemistically blanks all or part of a profanity (D——d = ‘Damned’).

3 In some hands it can be difficult to distinguish upper and lower case in word-initial position. Decisions are based on close comparison with other letter-forms in the same hand, but some arbitrariness is inevitable.

4 French and other foreign languages are not normalised – neither corrected nor regularised to present-day grammar and orthography. Place-names and personal names are not generally normalised either.

5 Words split across two lines may have a hyphen on the first, the second or both fragments (reco-|ver, imperfect|-ly, satisfacti-|-on); or a double hyphen (pur=|port, dan|=ger, qua=|=litys); or none (respect|ing).

6 Centred text and right alignment are simulated on-screen by extra indentation.

7 Insertions that interrupt the text are moved to their logical point or to the start or end of a letter; address panels are placed at the end.
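
The reassembly of split words described in note 5 can be illustrated with a short Python sketch. It is not the project's own code: it assumes the line break is written as "|" (as in the examples in note 5) and that a double hyphen is written as "=". The function name reassemble is hypothetical.

```python
import re

# An optional hyphen or double hyphen ("=") on either side of the
# line-break marker "|". Both the marker and the use of "=" follow the
# examples in note 5; they are assumptions about the input format.
_SPLIT = re.compile(r"[-=]?\|[-=]?")


def reassemble(word):
    """Join the two fragments of a split word, dropping the break punctuation."""
    return _SPLIT.sub("", word)


for example in ["reco-|ver", "imperfect|-ly", "satisfacti-|-on",
                "pur=|port", "dan|=ger", "qua=|=litys", "respect|ing"]:
    print(example, "->", reassemble(example))
```

A limitation of this sketch is that a genuine compound broken at its own hyphen (for instance well-|known) would also lose that hyphen, so a real workflow would need an editorial check for such cases.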

Project files

The master-copy of each document in the project is an XML file conforming to TEI P5. End-of-line is LF only.
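
As a rough illustration of how the paired tags listed in the table (<orig> ~ <reg>, <sic> ~ <corr>, <abbr> ~ <expan>) could yield a diplomatic or a normalised reading, the following Python sketch keeps one member of each pair. It is an assumption-laden example, not the project's transformation: the sample sentence is invented, the pairs are assumed to sit inside a TEI <choice> element (the code also works without that wrapper), and the function name render is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"   # TEI P5 namespace
DIPLOMATIC = {"orig", "sic", "abbr"}    # readings as written
NORMALISED = {"reg", "corr", "expan"}   # editorially regularised readings


def render(elem, normalised):
    """Return the text of elem, keeping only one member of each tag pair."""
    skip = DIPLOMATIC if normalised else NORMALISED
    parts = [elem.text or ""]
    for child in elem:
        local = child.tag.replace(TEI, "")
        if local not in skip:
            parts.append(render(child, normalised))
        parts.append(child.tail or "")  # text after the child is always kept
    return "".join(parts)


# Invented sample; real project files will be more richly tagged.
SAMPLE = (
    "<p xmlns='http://www.tei-c.org/ns/1.0'>I "
    "<choice><orig>receiv'd</orig><reg>received</reg></choice> your letter from "
    "<choice><abbr>Ly</abbr><expan>Lady</expan></choice> Stormont.</p>"
)

root = ET.fromstring(SAMPLE)
print(render(root, normalised=False))  # I receiv'd your letter from Ly Stormont.
print(render(root, normalised=True))   # I received your letter from Lady Stormont.
```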

Two different TXT files are derived from each XML file: a diplomatic plain-text file (filename.txt) and a (partially) normalised one (filename-n.txt). The main purpose of normalisation is to facilitate research and improve part-of-speech tagging; coverage is subject to change. EOL is CR + LF.
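
For users who process the TXT files programmatically, the sketch below shows one possible way to convert CR + LF line endings to LF and to locate or remove the inline markers listed in the table (<GAP: ...>, <CATCHWORD: ...>, <SURPLUS: ...>, <ANNOTATION: ...>, <HANDSHIFT>, <MOVED>). It assumes the markers appear literally as in the table and never contain ">"; the sample text and the function names are hypothetical.

```python
import re

# Marker names are taken from the table; the exact syntax in the released
# files is assumed to match the table's notation.
MARKER = re.compile(
    r"<(GAP|CATCHWORD|SURPLUS|ANNOTATION|HANDSHIFT|MOVED)(?::\s*([^>]*))?>"
)


def to_lf(text):
    """Convert CR + LF line endings to LF only."""
    return text.replace("\r\n", "\n")


def find_markers(text):
    """List (name, content) pairs; content is None for bare markers."""
    return [(m.group(1), m.group(2)) for m in MARKER.finditer(text)]


def strip_markers(text):
    """Remove all inline markers, leaving the surrounding text untouched."""
    return MARKER.sub("", text)


# Invented sample in the style of a diplomatic plain-text file.
sample = "I hope to see you <GAP: 2 words> on Monday\r\n<HANDSHIFT>Yours ever\r\n"

print(find_markers(to_lf(sample)))  # [('GAP', '2 words'), ('HANDSHIFT', None)]
print(strip_markers(to_lf(sample)))
```

Removing a marker can leave a double space behind, as in the sample; whether to collapse such whitespace depends on the analysis being run.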

The corpus edited and released to date, with each TXT format in a separate zip file, is freely available for non-profit use to anyone who registers using our simple online form.