Publication | Closed Access

Models, forests, and trees of York English: Was/were variation as a case study for statistical practice

Citations: 736

References: 51

Year: 2012

TLDR

Prior work on was/were variation has relied on generalized linear models, but recent statistical advances such as mixed‑effects models, random forests, and conditional inference trees offer new avenues for analysis. This study seeks to explain the strong variation between was and were in plural existential constructions and to identify the most effective analytical tool. We demonstrate how mixed‑effects models assess random‑effect factors, random forests rank predictor importance even in unbalanced, multicollinear designs, and conditional inference trees visualize interactions among predictors. The analysis shows that polarity, verb‑DP distance, and DP type are significant predictors, with ongoing linguistic change and social reallocation evident, and suggests testable predictions that broaden the methodological toolkit for variationist research.

Abstract

What is the explanation for vigorous variation between was and were in plural existential constructions, and what is the optimal tool for analyzing it? Previous studies of this phenomenon have used the variable rule program, a generalized linear model; however, recent developments in statistics have introduced new tools, including mixed-effects models, random forests, and conditional inference trees, that may open additional possibilities for data exploration, analysis, and interpretation. In a step-by-step demonstration, we show how this well-known variable benefits from these complementary techniques. Mixed-effects models provide a principled way of assessing the importance of random-effect factors such as the individuals in the sample. Random forests provide information about the importance of predictors, whether factorial or continuous, and do so also for unbalanced designs with high multicollinearity, cases for which the family of linear models is less appropriate. Conditional inference trees straightforwardly visualize how multiple predictors operate in tandem. Taken together, the results confirm that polarity, distance from verb to plural element, and the nature of the DP are significant predictors. Ongoing linguistic change and social reallocation via morphologization are operational. Furthermore, the results make predictions that can be tested in future research. We conclude that variationist research can be substantially enriched by an expanded tool kit.
