Concepedia

Abstract

In this article, we provide the multivariate generating function counting texts according to their length and to the number of occurrences of words from a finite set. The application of the inclusion-exclusion principle to word counting due to Goulden and Jackson [1979, 1983] is used to derive the result. Unlike some other techniques which suppose that the set of words is reduced (i.e., where no two words are factor of one another), the finite set can be chosen arbitrarily. Noonan and Zeilberger [1999] already provided a Maple package treating the nonreduced case, without giving an expression of the generating function or a detailed proof. We provide a complete proof validating the use of the inclusion-exclusion principle. Some formulæ for expected values, variance, and covariance for number of occurrences when considering two arbitrary sets of finite words are given as an application of our methodology.

References

YearCitations

Page 1