ISO/IEC JTC/1 SC/2 WG/2 N2474. Xerox Research Center Europe. 25 April 2002, marked revisions 17 May 2002


ISO/IEC JTC/1 SC/2 WG/2 N2474
2002-05-17

Proposal to Modify the Encoding of Deseret Alphabet in Unicode

Kenneth R. Beesley
Xerox Research Center Europe
Ken.Beesley@xrce.xerox.com
25 April 2002, marked revisions 17 May 2002

1 Summary

It is proposed that the encoding of Deseret Alphabet in Unicode be augmented at the earliest opportunity to include four additional characters, used in some versions of the Alphabet, being

1. DESERET CAPITAL LETTER OI (IPA /O I /)
2. DESERET SMALL LETTER OI (IPA /O I /)
3. DESERET CAPITAL LETTER EW (IPA / j u/)
4. DESERET SMALL LETTER EW (IPA / j u/)

[Revision 17 May 2002: Citation glyphs for these characters are shown in N2473.]

2 A Short History of Deseret Alphabet Versions

The Deseret Alphabet went through a number of versions during its history. Several of the versions were used to write significant, and still extant, manuscripts that are of interest to historians for their content and to linguists for the phonological clues they provide to the speech of the writers. Some of the versions of the Deseret Alphabet had only 38 letters, each in uppercase and lowercase; other versions had 40 letters, the two extra letters being the ones proposed herein for addition to the Unicode encoding.

Summary of known Deseret Alphabet versions:

1. Stout Version. Printed on a four-page broadside (copies are extant) on or shortly before 21 March 1854, when it was seen by Hosea Stout and copied into his journal. 38 letters. No known manuscript texts exist.

2. Remy Version. Seen in Salt Lake City in 1855 by travel writers Jules Remy and Julius Brenchley and printed as a plate in their published account (A Journey to Great-Salt-Lake City, London: W. Jeffs, 1861). 40 letters. No known manuscript texts exist, but Remy wrote, "The new characters, intended for the printing-presses of the Salt Lake, were cast at St. Louis; but up to this day nothing has been published, as far as we know, with these singular types. We have known them used in private correspondence, and seen them on some shop signs." [Revision 17 May 2002: The Remy Version was used in a letter from George D. Watt to Brigham Young, 21 Aug 1854, Brigham Young Collection, Archives, Church of Jesus Christ of Latter-Day Saints. Watt's letter contains parallel example texts in the Remy Version and in a radical proposed revision of the Alphabet.]

3. Haskell Version. Used for several months in 1859 by Thales H. Haskell to write his journal, which is now held in the library of Brigham Young University. 40 letters. This version was also used briefly, again in an extant journal, by M.J. Shelton.

4. Speller Version. Used in an extant undated manuscript, "The Deseret Phonetic Speller", in the Archives of the Church of Jesus Christ of Latter-Day Saints, Salt Lake City, Utah. 40 letters. The glyphs of this version suggest that it was written circa 1859. [Revision 17 May 2002: On reexamination, the Speller looks like it could have been written anytime between 1855 and 1859.]

5. Deseret News Version. Used in 1859, 1860 and 1864 to print articles in the Deseret News newspaper. 38 letters.

6. Book Version. Used in 1868 and 1869 to print four books: The Deseret First Book (a primer), The Deseret Second Book (a primer), The Book of Mormon, Part I (intended as an advanced reader), and The Book of Mormon (full text). 38 letters.

At least four versions of the Alphabet, the Haskell Version (40 letters), the Speller Version (40 letters), the Deseret News Version (38 letters) and the Book Version (38 letters), are used in significant extant texts that historians and linguists might want to encode in Unicode. Other texts may come to light.
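The inventory above can be tabulated as a small data structure to make the arithmetic explicit. The following is only an illustrative sketch: the class name, field names and helper functions are hypothetical, and the `significant_texts` flag simply mirrors the four versions named in the preceding paragraph (it does not account for the Watt letter mentioned in the revision note). Since each letter exists in uppercase and lowercase, a 38-letter version needs 76 characters and a 40-letter version needs 80.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    name: str
    letters: int              # size of the letter inventory, as listed above
    significant_texts: bool   # named above as used in significant extant texts


# Values copied from the list of versions; the structure itself is hypothetical.
VERSIONS = [
    Version("Stout", 38, False),
    Version("Remy", 40, False),
    Version("Haskell", 40, True),
    Version("Speller", 40, True),
    Version("Deseret News", 38, True),
    Version("Book", 38, True),
]


def characters_needed(version):
    """Each letter has an uppercase and a lowercase form: two characters per letter."""
    return 2 * version.letters


def versions_needing(count):
    """Names of versions with significant extant texts that need `count` characters."""
    return [v.name for v in VERSIONS
            if v.significant_texts and characters_needed(v) == count]


print(versions_needing(76))  # versions served by a 38-letter encoding
print(versions_needing(80))  # versions that need the two extra letters
```

Under these assumptions, two of the four versions with significant extant texts cannot be represented letter-for-letter by a 38-letter encoding.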

There is one other known version, the Watt Version, proposed and briefly used in a single extant letter from George D. Watt to Brigham Young, dated 21 August 1854. However, this was a radical proposal for changing the Alphabet, and there is no evidence (so far) that it was ever used outside the original letter of proposal. [Revision 17 May 2002: As noted above, this letter does contain an example text written in the 40-character Remy Version of the Alphabet.]

3 History of Deseret Alphabet in Unicode

Currently only the Book Version (38 letters, 76 characters) of the Deseret Alphabet is accommodated in Unicode. John Jenkins of Apple, who championed the addition of the Deseret Alphabet, was honestly unaware that earlier 40-letter versions of the Alphabet had really been used.

4 Justification for the Augmentation

4.1 Practical Need for 40-Letter Encoding

It has been proposed that the two extra letters in the 40-character versions could be handled as ligature glyphs, thus existing only in the realm of fonts rather than in the underlying encoding. I believe that this would be a mistake. Phonological arguments for the phonemic status of /O I / and / j u/ are presented below. In the 40-letter versions of the Alphabet, the two letters in question were presented and used in a manner parallel to the other 38 letters.

In practical use today, if the two extra letters were treated as ligatures, then the underlying encoding would presumably be something like the sequences corresponding to OI (or oi) and ju; when using 40-letter fonts, special rendering routines would presumably need to detect these sequences and render them with single glyphs. This would preclude the possibility of encoding a distinction between the /O I / and the / j u/ diphthongs, on one hand, and the sequences /OI/ (or /oi/) and /ju/ on the other.

Such sequences of vowels might appear in 40-letter texts as spelling mistakes [1] or as useful and accurate encodings of words such as sawing /soin/, drawing /droin/, cawing /koin/ or blowing /bloin/, throwing /TroIN/, etc., where there really is a sequence of vowels, separated by a morpheme boundary, rather than a diphthong. It would be highly annoying to have a rendering engine collapse an intentionally written sequence of vowels as in /soin/ or /bloin/ into an unintended diphthong glyph. I reiterate my belief that treating OI and EW as ligature glyphs would be a mistake.

[1] I believe it should be possible to encode spelling mistakes accurately.
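The practical objection in section 4.1 can be illustrated with a toy renderer. This is a minimal sketch under stated assumptions, not a description of any real rendering engine or font: the letter labels (`AW`, `SHORT_I`, `OI`, and so on) are hypothetical placeholders rather than Unicode names or code points, the two-letter spelling assumed for the diphthong is only an example, and `render` is an invented helper that collapses ligature pairs from left to right.

```python
# Hypothetical letter labels for illustration; not Unicode names or code points.
B, S, NG = "B", "S", "NG"
AW, SHORT_I = "AW", "SHORT_I"   # assumed two-letter spelling of the diphthong
OI = "OI"                       # the proposed single letter

# Assumed ligature rule of a hypothetical 40-letter font.
LIGATURES = {(AW, SHORT_I): OI}


def render(letters, ligatures=None):
    """Map encoded letters to glyph labels, collapsing any ligature pairs."""
    ligatures = ligatures or {}
    glyphs, i = [], 0
    while i < len(letters):
        pair = tuple(letters[i:i + 2])
        if pair in ligatures:
            glyphs.append(ligatures[pair])
            i += 2
        else:
            glyphs.append(letters[i])
            i += 1
    return glyphs


sawing = [S, AW, SHORT_I, NG]   # vowel sequence across a morpheme boundary

# Ligature approach: the diphthong in "boy" is encoded as two letters.
print(render([B, AW, SHORT_I], LIGATURES))  # ['B', 'OI']
print(render(sawing, LIGATURES))            # ['S', 'OI', 'NG'] (unintended collapse)

# Separate-character approach: the diphthong has its own encoding.
print(render([B, OI]))                      # ['B', 'OI']
print(render(sawing))                       # ['S', 'AW', 'SHORT_I', 'NG']
```

In the ligature approach the encoded text cannot say whether a diphthong or a vowel sequence was intended, so the renderer collapses both. With a separately encoded letter, the writer's choice survives in the encoding and no special rendering rule is needed.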

4.2 Historical Practice and Arguments

In texts written in the 38-letter Book Version, the phonemes represented by OI and EW were necessarily written with two letters. However, the variations in the alphabets and statements from the time indicate real doubt and debate about the status of EW and OI. In the Deseret News Version, which is almost identical to the Book Version, the editors (at least in the article printed 23 February 1859) offered the following apology: "Since the arrival of the matrices, &c, for casting the Deseret Alphabet, it has been determined to adopt another character to represent the sound of EW, but until we are prepared to cast that character, the characters [corresponding to] IU will be used to represent the sound of EW in NEW." [2]

The Pitman "phonotypy" alphabets of the day, which were the inspiration for the Deseret Alphabet, also vacillated on the issue of 38 vs. 40 letters.

[2] The editorial introduction is otherwise rather confused, suggesting also that a new letter would be added to represent the vowel in hair.

4.3 Modern Phonological Justification

In the 19th century, "phonetics" was a hot science, but the concepts of the phoneme and phonology were not fully understood. Isaac Pitman was constantly modifying his phonotypy alphabets to capture new "sounds" that he thought he was hearing. From a modern phonological point of view, the Deseret Alphabet was clearly intended to be a phonemic alphabet, and the question properly reduces to this: Are OI and EW really phonemes in (standard dialects of) English?

For OI, the answer is uncontroversially yes. The diphthong in boy /O I / is parallel phonetically to the other English diphthongs in high /a I / and how /a U /. The vowels in hey and hoe are also diphthongized in most cases: /e I / and /o U /. It is of course possible to represent all such diphthongs orthographically using sequences of two separate characters, but this does not reflect the reality of their single-phoneme status in English. A diphthong, as described by phonologist Peter Ladefoged [3], is a single vowel phoneme that "involves a change in quality within the one vowel" (op. cit. p. 76). In a phonemic alphabet like Deseret Alphabet, where /a I /, /e I /, /a U / and /o U / are (properly) encoded as single characters and rendered with single glyphs, there is no good justification for encoding or rendering the diphthong /O I / differently.

[3] A Course in Phonetics, 2nd ed. New York: Harcourt Brace Jovanovich, 1982.

The status of EW, i.e. / j u/, is rather more interesting, and still somewhat controversial. Peter Ladefoged (op. cit., pp. 77-78) argues that it should be treated as a diphthong (i.e. a single phoneme) in English:

    The last diphthong, [ju] as in "cue", differs from all the other diphthongs in that the more prominent part occurs at the end. Because it is the only vowel of this kind, many books on English phonetics do not even consider it as a diphthong; they treat it as a sequence of a consonant followed by a vowel and symbolize it by [ju] (or [yu], in the case of books not using the IPA system of transcription). I have considered it to be a diphthong because of the way it patterns in English. Historically, it is a vowel, just like the other vowels we have been considering. Furthermore, if it is not a vowel, then we have to say that there is a whole series of consonant clusters in English that can occur before only one vowel. The sounds at the beginning of "pew, beauty, cue, spew, skew" and (for most speakers of British English) "tune, dune, sue, Zeus, new, lieu, stew" occur only before /u/. There are no English words beginning with /pje/ or /kj /, for example. In stating the distributional properties of English sounds, it seems much simpler to recognize /ju/ as a diphthong and thus reduce the complexity of the statements one has to make about the English consonant clusters.

4.4 Comparison with the Shavian Alphabet

The Shavian Alphabet was a 20th-century attempt at promoting a phonemic alphabet for English. It includes single symbols for the diphthongs / j u/ and /O I /, parallel to all the other diphthongs.

4.5 Fonts and the Deseret Alphabet

The versions of the Deseret Alphabet differ not only in their phonemic inventory (38 vs. 40 letters) but also in the glyphs. Even the Deseret News Version and the Book Version, both 38 letters and both of which were cast into printing type, differ in the shape of the letter used for /e I /: in the Deseret News Version, it opens to the left, much like the digit 3; in the Book Version, it opens to the right. The glyphs used (in 40-letter versions of the Alphabet) for EW and OI varied, and there were other variations in short vowels and even consonants.
Variant glyph shapes are almost certainly best handled by using different fonts to render phonemically standardized underlying encodings.

5 Summary

Based on the arguments above, I urge that the encoding of Deseret Alphabet in Unicode be augmented to provide single-character encodings for the following diphthongs:

1. DESERET CAPITAL LETTER OI /O I /
2. DESERET SMALL LETTER OI /O I /
3. DESERET CAPITAL LETTER EW / j u/
4. DESERET SMALL LETTER EW / j u/

I understand that sufficient code space was left after the current encoding for just such an eventuality. This expanded encoding will allow faithful (and phonologically justified) encoding of Deseret Alphabet documents written in 40-letter versions of the Alphabet without prejudicing the encoding of 38-letter versions. It will allow, with 40-letter fonts, a justifiable distinction between the encodings of diphthongs in words like boy /bo I / and non-diphthong vowel sequences in words like sawing /soin/ or blowing /bloin/.

[Revision 17 May 2002: There are two reasonable candidates for the Unicode citation glyphs for OI and EW:

1. The glyphs used in the letter from George D. Watt to Brigham Young, 21 Aug 1854, and which were seen by Remy & Brenchley (op. cit.) in 1855. The same OI and EW glyphs are used in the (undated) Deseret Phonetic Speller Version.
2. Or the glyphs used in the Haskell and Shelton journals of 1859.

See N2473 for the recommended citation glyphs.]