Date: Tue, 1 Oct 1996 14:04:32 -0400
From: Allan Metcalf AAllan@AOL.COM
Subject: ADS Workshops at Annual Meeting
WORKSHOPS IN STATISTICAL METHODS FOR LINGUISTIC ANALYSIS
Sponsored by the AMERICAN DIALECT SOCIETY
January 2, 1997
Chicago, Illinois
Sheraton Chicago Hotel and Towers
LINGUISTIC SOCIETY OF AMERICA ANNUAL MEETING
The American Dialect Society, to celebrate its first general meeting held
jointly with LSA, is sponsoring six workshops on the quantificational
(statistical) treatment of a variety of kinds of linguistic data. Each
workshop, conducted by an internationally recognized authority, will be
presented twice, and participants who stay for the full day's sessions may
attend as many as four different workshops.
These workshops are open to all who register for the LSA meeting and are free
of charge (except for a small fee for some workshops in which materials are
distributed).
There will be a limit on participation in these workshops. If you want to be
assured a place, please send a letter, enclosing a self-addressed stamped
post card, to
American Dialect Society
Allan Metcalf, Executive Secretary
English Department
MacMurray College
Jacksonville, Illinois 62650
or an e-mail message to him at
AAllan@aol.com
For each workshop you wish to attend, please list the name of the presenter
and the time (e.g., Kretzschmar 8:00, Finegan 1:30). Do not forget the time,
since each workshop will be given twice.
The workshops were organized by Dennis Preston of Michigan State University.
Schedule:
 8:00-10:00   Kretzschmar   Cichocki      Berdan
10:30-12:30   Bayley        Labov         Berdan
 1:30-3:30    Bayley        Finegan       Cichocki
 4:00-6:00    Labov         Kretzschmar   Finegan
THE WORKSHOPS:
1) VARBRUL analysis of linguistic variation
Robert Bayley
University of Texas, San Antonio
This session will provide a rationale for and demonstration of the VARBRUL
computer programs (Pintzuk 1988; Rand and Sankoff 1990; Sankoff 1988). The
demonstration uses data from a study of consonant cluster reduction in
Mexican-American English (Bayley 1994) and relative pronoun choice in speech
and writing (Guy and Bayley 1995) to show the steps in the heuristic process
of hypothesis generation, testing, and revision as it is carried out with the
help of VARBRUL, including the following: 1) generating initial hypotheses to
account for observed variation; 2) coding the data for the potentially large
number of independent factors affecting variation; 3) conducting the initial
VARBRUL run and interpreting the factor probabilities generated; 4) recoding
the data to refine hypotheses on the basis of factor probabilities generated
in step 3; 5) testing the significance of individual factors and factor groups
by means of log likelihood estimation. In addition, the workshop will
consider several questions that are likely to arise when conducting a VARBRUL
analysis, including dealing with suspected interaction among factors and
choosing between competing analyses.
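As a rough illustration of step 5, the sketch below compares the log likelihood of a model that includes one factor group with that of a model that leaves it out. It is a minimal Python sketch under stated assumptions, not the VARBRUL procedure itself: the token counts are invented, only a single factor group with two factors is handled, and the function names are hypothetical.

```python
import math

def log_likelihood(applications, total, rate):
    """Binomial log likelihood (constant term omitted) of the observed
    rule applications out of `total` tokens at a given application rate."""
    ll = 0.0
    if applications > 0:
        ll += applications * math.log(rate)
    if total - applications > 0:
        ll += (total - applications) * math.log(1.0 - rate)
    return ll

def likelihood_ratio(cells):
    """cells: one (applications, total) pair per factor in ONE factor group.
    Returns G = 2 * (log likelihood with the group - log likelihood without)."""
    all_apps = sum(a for a, _ in cells)
    all_tokens = sum(n for _, n in cells)
    pooled = all_apps / all_tokens  # single rate when the group is ignored
    ll_without = sum(log_likelihood(a, n, pooled) for a, n in cells)
    ll_with = sum(log_likelihood(a, n, a / n) for a, n in cells)
    return 2.0 * (ll_with - ll_without)

def p_value_df1(g):
    """Upper-tail chi-square probability; valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(max(g, 0.0) / 2.0))

# Hypothetical counts (not from any study): (reduced, total) tokens
# before a consonant and before a vowel.
cells = [(120, 200), (60, 200)]
g = likelihood_ratio(cells)
print(f"G = {g:.2f}, p = {p_value_df1(g):.3g}")
```

With two factors the group adds one free parameter, so G is referred to a chi-square distribution with one degree of freedom; a group with more factors, or a model with several groups, would need a more general treatment than this sketch provides.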
2) The analysis of vowel systems
William Labov
University of Pennsylvania
This workshop will deal with the display and analysis of vowel formant data,
with particular emphasis on the study of change in progress, through use of
the Macintosh program PLOTNIK 03. Workshop participants should have a body of
formant measurements in hand, or the opportunity to acquire them, through the
use of such programs as Kay Elemetrics CSL, Eric Keller's Signalyze, GSW
Soundscope, or Cornell Ornithology Lab's Canary. The workshop will show how
vowel tokens are plotted, normalized, and automatically analyzed for
segmental environment; how relevant sub-sets of vowels may be selected,
plotted or highlighted; how means and standard deviations are plotted; how to
carry out t-tests on the difference of any two means; how subsets of vowels
may be plotted or highlighted by any combination of segmental environment,
stress, or style. Particular attention will be given to methods for
determining the extent to which vowel systems participate in the Northern
Cities Shift, the Southern Shift, the Canadian Shift, or the low back merger.
Participants will receive copies of PLOTNIK 03 along with a tutorial
and full documentation. PLOTNIK 03 includes several dozen features introduced
following the NWAVE 24 workshop with PLOTNIK 02, including adaptation to
other languages, shift from color to black and white, and the addition of
vectors from nuclei to glide targets. In addition, methods for superimposing
large numbers of vowel systems will be introduced through the use of the
program PLOTNIK MAJOR.
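For readers unfamiliar with a t-test on the difference of two means, the following Python sketch shows one way it might be computed for two sets of formant measurements. It is not PLOTNIK's own code: the announcement does not say which variety of t-test the program uses, so Welch's unequal-variance form is an assumption here, and the F1 values and the function name are invented for illustration.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for the difference of two means,
    without assuming equal variances (Welch's test)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 divisor)
    var_b = statistics.variance(sample_b)
    se_a, se_b = var_a / n_a, var_b / n_b  # squared standard errors
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical F1 measurements in Hz for one vowel in two environments.
f1_before_nasal = [612, 598, 640, 605, 623, 631]
f1_elsewhere = [702, 688, 731, 695, 710, 720, 699]

t, df = welch_t(f1_before_nasal, f1_elsewhere)
print(f"t = {t:.2f} on about {df:.1f} degrees of freedom")
```

The resulting t would then be compared against a t distribution with the reported degrees of freedom; that lookup is left out to keep the sketch within the standard library.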
3) Computer plotting and mapping of areal linguistic data
William A. Kretzschmar, Jr.
University of Georgia
This session will present a discussion of methods of computer plotting and
mapping of linguistic data drawn from American Linguistic Atlas surveys. We
will begin with the basic issues of the possible relationships between
linguistic data and geographical locations, and of the nature of GIS
(Geographical Information Systems). Computer plotting, and generalizations to
be made from observation of plots, will be illustrated with the Graphic
Plotter Grid from the Linguistic Atlas of the Gulf States, the LAMSASplot
program from the Linguistic Atlas of the Middle and South Atlantic States
(LAMSAS), and the LAMSAS Internet plotter. We will then consider use of
statistical procedures to assess geographical distribution of linguistic
features drawn from LAMSAS: t-test, chi-square, and multiple comparison for
fixed regions; spatial autocorrelation; and density estimation. Finally, we
will consider uses of GIS software to assist in visualization of
distributions.
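To give a sense of what a spatial autocorrelation statistic measures, the Python sketch below computes Moran's I for a feature recorded at a handful of localities. This is an assumption made for illustration: the announcement does not name the statistic used with LAMSAS data, and the coordinates, values, distance threshold, and function name are all invented.

```python
import math

def morans_i(points, values, max_distance):
    """Moran's I with binary weights: two localities count as neighbors
    when they lie within max_distance of each other.
    Assumes at least one neighbor pair and non-constant values."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0         # sum of dev[i] * dev[j] over neighbor pairs
    weight_total = 0.0  # number of (ordered) neighbor pairs
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.hypot(points[i][0] - points[j][0],
                              points[i][1] - points[j][1])
            if dist <= max_distance:
                cross += dev[i] * dev[j]
                weight_total += 1.0
    return (n / weight_total) * cross / sum(d * d for d in dev)

# Hypothetical survey: six localities along a line; 1 = variant attested.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
clustered = [1, 1, 1, 0, 0, 0]
alternating = [1, 0, 1, 0, 1, 0]

print(f"clustered:   I = {morans_i(points, clustered, 1.0):.2f}")
print(f"alternating: I = {morans_i(points, alternating, 1.0):.2f}")
```

With these invented data the clustered pattern gives I = 0.60 and the alternating pattern gives I = -1.00: positive values indicate that neighboring localities tend to share a feature, negative values that they tend to differ.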
4) Advanced multivariate analyses of linguistic data
Robert Berdan
California State University, Long Beach
This session will focus principally on logistic regression, the general
statistical approach underlying VARBRUL analyses. The generalized
application is particularly useful for data sets that are well described by
both categorical and continuous variables, a frequent situation both for
language acquisition and for historical data sets, in which time is best
considered as a continuous variable, but various linguistic and demographic
characteristics are categorical (or continuous). The SPSS implementation of
logistic regression will be demonstrated in the workshop. The workshop will
demonstrate the progression of analysis from text files to reportable
graphics and statistics. Topics considered will include optimizing the coding
for the data set, developing and testing hypotheses, evaluating competing
analyses, treating interactions among factors, and interpreting error and
reliability. We will also compare assumptions of continuous change
over time, versus discontinuities and restructuring. The SPSS graphics tools
will be explored both as analytic techniques and for reporting findings.
Where comparable, SPSS reporting will be converted to VARBRUL terms.
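One way such a conversion might look is sketched below in Python. It assumes the convention that a factor weight is the inverse-logit transform of a log-odds coefficient that has been centered on zero within its factor group. The unweighted centering, the coefficient values, and the function name are assumptions made for illustration; actual VARBRUL programs may center differently, for example by weighting factors by their token counts.

```python
import math

def to_factor_weights(coefficients):
    """coefficients: factor -> log-odds coefficient for ONE factor group,
    with the reference level included as 0.0 (indicator coding).
    Returns factor -> weight between 0 and 1; 0.5 means no effect."""
    # Assumption: center on the unweighted mean of the coefficients.
    mean = sum(coefficients.values()) / len(coefficients)
    return {factor: 1.0 / (1.0 + math.exp(-(b - mean)))
            for factor, b in coefficients.items()}

# Hypothetical coefficients for a "following segment" factor group,
# as a logistic regression with indicator coding might report them.
coefficients = {"pause": 0.0, "consonant": 1.2, "vowel": -0.6}

for factor, weight in to_factor_weights(coefficients).items():
    print(f"{factor:10s} {weight:.3f}")
```

Weights above 0.5 favor application of the rule and weights below 0.5 disfavor it, which is how factor probabilities are conventionally read in VARBRUL output.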
5) Factor analytic procedures in language analysis
Ed Finegan
University of Southern California
In its linguistic applications, the statistical technique called factor
analysis can be used to uncover patterned variation by deriving a relatively
small set of underlying variables (called 'factors') from large sets of
variable linguistic features. The workshop demonstrates the use of this
technique for identifying factors that underlie large-scale variation of
linguistic features across texts and for interpreting those factors as
linguistic constructs (usually called 'dimensions'). Also included: the
Promax rotation technique for minimizing the number of factors on which any
linguistic feature loads; appropriateness of factor analysis to different
kinds of linguistic investigations; pros and cons of factor analysis for
linguistic inquiry in general.
6) Correspondence (Dual Scaling) Analysis
Wladyslaw Cichocki
University of New Brunswick
This session demonstrates correspondence analysis (CA), a statistical
technique which is closely related to multidimensional scaling and factor
analysis. CA is particularly helpful in studying the type of categorical,
ordinal and frequency data commonly found in empirical linguistic
investigations. While CA is predominantly an exploratory technique, it
can be used to formulate hypotheses. The presentation will avoid complicated
algebraic formulas and will emphasize instead the simple graphical displays
that are used to interpret and understand data structure. Applications will
be chosen from dialectology, phonetics, sociolinguistics and syntax.
Discussion will include issues of interpretation, stability and statistical
significance as well as a review of available computer software.