Algorithmic exploration of American English dialects
Abstract
In this paper, we use a novel algorithmic approach to explore dialectal variation in American English speech. Because the approach does not require human phonemic annotations, we are able to use an existing corpus that is transcribed in text form only. Our results show that, in general, American English dialects can be divided into two larger groups: dialects of the South (Texas to North Carolina, except for peninsular Florida) and those of the rest of the country. They confirm some well-known findings from dialectology, such as the pin-pen merger, but show that others, such as the cot-caught merger, may be losing their isogloss boundaries. Moreover, we demonstrate that our algorithm can be extended to dialectal features in other languages.