Date: Mon, 13 Jan 1997 03:57:38 -1000
From: Norman Roberts nroberts[AT SYMBOL GOES HERE]HAWAII.EDU
Subject: Re: IPA to Internet?
Does anybody know where I can find an IPA to Internet legend?
Here is one I downloaded some time ago. It's rather long so I hope your
mail server accepts it.
sci.lang #38113 (11 more) [1]
From: Georgy Pruss georgy[AT SYMBOL GOES HERE]zs.kiev.ua
[1] Repost: FAQ: Representing IPA Phonetics in ASCII
Date: Wed May 10 10:22:59 HST 1995
Organization: Zest Systems
Lines: 524
Distribution: world
NNTP-Posting-Host: render.gu.kiev.ua
X-Return-Path: zs!zs.kiev.ua!georgy[AT SYMBOL GOES HERE]figaro.gu.kiev.ua
Some people asked me to re-send it. Here you are.
Newsgroups: sci.lang,alt.usage.english
From: evan[AT SYMBOL GOES HERE]hplerk.hpl.hp.com (Evan Kirshenbaum)
Subject: FAQ: Representing IPA Phonetics in ASCII
Sender: news[AT SYMBOL GOES HERE]hplabsz.hpl.hp.com (News Subsystem (Rigel))
Message-ID: D25JCv.DzA[AT SYMBOL GOES HERE]hplabsz.hpl.hp.com
Date: Mon, 9 Jan 1995 18:58:07 GMT
Reply-To: kirshenbaum[AT SYMBOL GOES HERE]hpl.hp.com
Nntp-Posting-Host: hplerk.hpl.hp.com
Organization: Hewlett-Packard Laboratories
Lines: 502
Xref: lyra.csx.cam.ac.uk sci.lang:16066 alt.usage.english:38835
[Last Modified, 4 Jan 1993]
This article describes a standard scheme for representing IPA
transcriptions in ASCII for use in Usenet articles and email. The
following guidelines were kept in mind:
o It should be usable for both phonemic and narrow phonetic
transcription.
o It should be possible to represent *all* symbols and
diacritics in the IPA.
o The previous guideline notwithstanding, it is expected that
(as in the past) most use will be in transcribing English,
so where tradeoffs are necessary, decisions should be made
in favor of ease of representation of phonemes which are
common in English.
o The representation should be readable.
o It should be possible to mechanically translate from the
representation to a character set which includes IPA. The
reverse would also be nice.
In order to be able to represent a wide range of segments while making
common segments easy to type, we allow more than one representation
for a given segment. Each segment has an "explicit" representation,
which is a set of features between curly braces ("{" and "}"). Each
feature is represented as a three letter abbreviation taken from a
standardized set. The phoneme /b/ (a voiced, bilabial stop) could be
represented as /{vcd,blb,stp}/. A first cut at the feature set
appears in appendix A below.
The word "tag" could thus be represented phonemically as
/{vls,alv,stp}{low,fnt,unr,vwl}{vcd,vel,stp}/
and phonetically as
[{vls,asp,alv,stp}{low,fnt,lng,unr,vwl}{unx,vcd,vel,stp}]
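As a rough illustration of the explicit form, the following Python sketch splits such a transcription into one feature set per segment. The function name parse_explicit is invented for this example, and the sketch assumes well-formed input: one pair of enclosing slashes or square brackets, with comma-separated features inside each pair of braces.

```python
import re

def parse_explicit(transcription):
    """Return one frozenset of feature abbreviations per segment."""
    # Drop the enclosing /.../ (phonemic) or [...] (phonetic) delimiters.
    body = transcription.strip()[1:-1]
    # Each segment is a brace-delimited, comma-separated feature list.
    groups = re.findall(r"\{([^{}]*)\}", body)
    return [frozenset(f.strip() for f in g.split(",")) for g in groups]

tag_phonemic = parse_explicit("/{vls,alv,stp}{low,fnt,unr,vwl}{vcd,vel,stp}/")
tag_phonetic = parse_explicit(
    "[{vls,asp,alv,stp}{low,fnt,lng,unr,vwl}{unx,vcd,vel,stp}]"
)
```

Sets are used because the order of features inside the braces does not appear to carry meaning in the examples above.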
This works, but it's a bit of a pain. To simplify transcription, we
allow an "implicit" representation for a segment which consists of a
(generally alphabetic) symbol followed by diacritics. Thus /b/ stands
for /{vcd,blb,stp}/. Case is significant (/n/ and /N/ are different
segments). The segment symbols are given in appendix B below.
The word "tag" can thus be represented phonemically as
/t&g/
The diacritics for a segment are represented between angle brackets
("<" and ">") and consist of symbols or features. (In the common case
where the diacritic symbol is a single character which does not encode
a segment, the brackets may be removed.) The features which the
diacritics map to override those of the segment.
The word "tag" thus becomes narrowly
[t<asp>&<lng>g<unx>]
or
[t<h>&<:>g<o>]
or
[t<h>&:g<o>]
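The expansion from the implicit form (with diacritics written between "<" and ">") back to feature sets can be sketched in Python as below. The tables hold only the symbols used in the "tag" examples, not the full appendices, and the names SEGMENTS, DIACRITICS, BARE_DIACRITICS and expand are invented for this example. Two simplifications are assumed: only ":" may be written without brackets (the examples keep the brackets on "h" and "o"), and a diacritic's feature is simply added to the segment, with no attempt to remove a conflicting feature.

```python
import re

# Illustrative subset only: symbols taken from the "tag" examples.
SEGMENTS = {
    "b": {"vcd", "blb", "stp"},
    "t": {"vls", "alv", "stp"},
    "&": {"low", "fnt", "unr", "vwl"},
    "g": {"vcd", "vel", "stp"},
}
DIACRITICS = {"h": "asp", ":": "lng", "o": "unx"}
# Assumption: only these diacritic symbols may appear without brackets.
BARE_DIACRITICS = {":"}

# Either a bracketed diacritic such as <asp> or a single character.
TOKEN = re.compile(r"<([^<>]+)>|(.)", re.S)

def _attach(result, feature):
    if not result:
        raise ValueError("diacritic with no preceding segment")
    result[-1].add(feature)

def expand(transcription):
    """Expand an implicit transcription into a list of feature sets."""
    body = transcription.strip()[1:-1]  # drop /.../ or [...]
    result = []
    for match in TOKEN.finditer(body):
        bracketed, char = match.groups()
        if bracketed is not None:
            # A bracketed item is a diacritic symbol or a feature name.
            _attach(result, DIACRITICS.get(bracketed, bracketed))
        elif char in SEGMENTS:
            result.append(set(SEGMENTS[char]))
        elif char in BARE_DIACRITICS:
            _attach(result, DIACRITICS[char])
        elif not char.isspace():
            raise ValueError(f"unknown symbol: {char!r}")
    return result

narrow = expand("[t<h>&:g<o>]")
```

With these tables the three narrow spellings of "tag" expand to the same feature sets as the explicit phonetic form given earlier.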
Some diacritic symbols encode more than one feature set. Which one is
meant should be apparent from context. For example, "." stands for
"{rnd}" when attached to a vowel, but "{rfx}" when attached to a
consonant.
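One way a translation program might resolve such a symbol is to look at the segment it attaches to. The Python sketch below does this for "." only; the function name resolve_dot is invented here, and it assumes that a vowel can be recognized by the presence of the "vwl" feature, as in the "tag" examples.

```python
def resolve_dot(segment_features):
    """Pick the feature that "." stands for on the given segment."""
    # Assumption: vowels always carry the "vwl" feature.
    return "rnd" if "vwl" in segment_features else "rfx"

on_vowel = resolve_dot({"low", "fnt", "unr", "vwl"})
on_consonant = resolve_dot({"vls", "alv", "stp"})
```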
Clicks are common to many languages (especially in Africa), but there
is no IPA diacritic that means "click". Rather than use up several
characters for clicks (which are infrequent in the languages most
often discussed), we instead use the diacritic "!" after the
homorganic unvoiced stop. Thus /t!/ (= /t<clk>/ = /{alv,clk}/) is the
sound commonly written "tsk" and used in English to show disapproval.
The complete set of diacritic symbols appears in appendix C below.
Appendices D and E contain representations of segments more or less
ordered by feature (appendix D in tabular form, appendix E as a list).
Appendix F contains a list of all of the ASCII characters and the uses
they have been pressed to.
For transcription of any specific language a group can by convention
alter the character mappings (as an example, for Spanish /R/ may be
better used to represent /{alv,trl}/ than /{mid,cnt,rzd,vwl}/). An
author may also press a little-used symbol (for the language under
consideration) into service to highlight a distinction. Such an
alteration should be made explicitly to avoid confusion.
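In a program, a per-language convention of this kind could be kept as a small table of overrides laid over the default mapping, which also makes the alteration explicit. The Python sketch below shows the Spanish /R/ example; the names DEFAULT_SEGMENTS, SPANISH_OVERRIDES and segment_table are invented for this illustration, and the default table lists only /R/.

```python
# Default mapping (illustrative subset: only /R/ is listed here).
DEFAULT_SEGMENTS = {"R": frozenset({"mid", "cnt", "rzd", "vwl"})}

# Explicitly declared convention for transcribing Spanish.
SPANISH_OVERRIDES = {"R": frozenset({"alv", "trl"})}

def segment_table(overrides=None):
    """Return the default table with any language overrides applied."""
    table = dict(DEFAULT_SEGMENTS)
    table.update(overrides or {})
    return table

default_table = segment_table()
spanish_table = segment_table(SPANISH_OVERRIDES)
```

Copying the default table before updating it keeps one language's convention from leaking into another's.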
The diacritics "+" and "=" and the segment symbols "$" and "%" are
explicitly left unspecified so that they can be used to mark
language-specific features (that are otherwise cumbersome to mark).
Such symbols can be assigned either by convention for a specific
language or in an ad-hoc manner by an individual author.
Stress marks are prepended to the syllable they attach to. "'"
signals primary stress, "," signals secondary stress. Spaces should
be employed to separate words (cliticized words may be written
unseparated). When discussing single words, it may be helpful to
insert a space before each syllable that doesn't carry a
suprasegmental marker.
The sentence "I hear the secretary" for an American might be something like
/aI hir D@ 'sEkrI,t&ri/
while to an Englishman it might be more like
/aI hi@ DI 'sEkr^tri/
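Because stress marks are prepended, a program can find them by scanning for "'" and ",". The Python sketch below cuts a transcription (without its enclosing slashes) at each stress mark or space and labels each piece. The function name stress_chunks is invented here. Note that a piece runs up to the next mark or space, so it may cover more than one syllable when unmarked syllables are not set off by spaces.

```python
import re

STRESS = {"'": "primary", ",": "secondary"}

def stress_chunks(text):
    """Return (stress level, text) pairs, cut at stress marks and spaces."""
    chunks = []
    # An optional stress mark followed by everything up to the next
    # mark or whitespace.
    for piece in re.findall(r"[',]?[^',\s]+", text):
        if piece[0] in STRESS:
            chunks.append((STRESS[piece[0]], piece[1:]))
        else:
            chunks.append(("unmarked", piece))
    return chunks

american = stress_chunks("'sEkrI,t&ri")
```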
Transcribing tone is harder. Here's an attempt. For register tone
languages (e.g., Hausa, Navajo), numbers should be used with one being
the lowest. Thus in Navajo, "1" is low tone and "2" is high. In
Yoruba "1" is low, "2" is mid, and "3" is high. The language's
"default" tone need not be specified. For contour tone languages
(e.g., Mandarin, Thai), there is generally a numeric system in place
(Mandarin: "1" is high, "2" is rising, "3" is falling-rising, "4" is
falling). The tone indication should follow the syllable (vowel?).
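Since the same digit means different things in different languages, a reader or program needs the language's tone table to interpret it. The Python sketch below records the three tables mentioned above and reads a tone digit placed after a syllable. The names TONES and read_tone are invented for this example, and the syllable "ma" is used purely as a placeholder, not as a claim about any particular word.

```python
import re

# Tone numbering as described in the text above.
TONES = {
    "navajo": {1: "low", 2: "high"},
    "yoruba": {1: "low", 2: "mid", 3: "high"},
    "mandarin": {1: "high", 2: "rising", 3: "falling-rising", 4: "falling"},
}

def read_tone(syllable, language):
    """Split a syllable from its trailing tone digit, if it has one."""
    match = re.fullmatch(r"(\D+)(\d)?", syllable)
    if match is None:
        raise ValueError(f"cannot read syllable: {syllable!r}")
    text, digit = match.groups()
    if digit is None:
        # No digit: the language's default tone is left unspecified.
        return text, None
    return text, TONES[language][int(digit)]

placeholder = read_tone("ma3", "mandarin")
```

A digit that the language's table does not define raises KeyError here, which seems preferable to guessing.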
The symbol "#" is used to represent a syllable or word boundary.
Appendix A. Feature Abbreviations
----