Date Approved


Degree Type

Open Access Thesis

Degree Name

Master of Arts (MA)

Department or School

English Language and Literature

Committee Member

Anthony Aristar

Committee Member

Helen Aristar-Dry


The electronic age has expanded human capabilities to such an extent that expectations about appropriate empirical linguistic analysis are changing. A hundred years ago, linguistics was largely a manual empirical process that produced information intended for human readers. Today, inexpensive computing power and the prevalence of information in electronic form encourage processing information by automated, scalable means wherever possible, with results that computers can use and interpret. Sustainable and usable observations are best created through a standards-based approach that meets long-term persistence and usability goals. This thesis presents a scalable architecture for creating linguistic observations in the form of string frequency measurements and instantiates those measurements in a machine-readable, standards-based format, the Resource Description Framework (RDF).
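
As a rough illustration of what a string frequency measurement expressed in RDF might look like, the sketch below counts whitespace-delimited strings and serializes each count as N-Triples. It is not the architecture described in the thesis: the `http://example.org/freq/` namespace, the `string` and `count` property names, and the functions `string_frequencies` and `to_ntriples` are all hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical namespace; the vocabulary actually used in the thesis is not given here.
BASE = "http://example.org/freq/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def string_frequencies(text):
    """Count lowercased, whitespace-delimited strings in the text."""
    return Counter(text.lower().split())


def to_ntriples(counts):
    """Serialize each (string, count) pair as two N-Triples statements."""
    lines = []
    for i, (string, n) in enumerate(sorted(counts.items())):
        subject = f"<{BASE}obs/{i}>"
        # Escape backslashes and double quotes for an N-Triples literal.
        escaped = string.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{BASE}string> "{escaped}" .')
        lines.append(f'{subject} <{BASE}count> "{n}"^^<{XSD_INTEGER}> .')
    return "\n".join(lines)


counts = string_frequencies("the cat saw the dog")
print(to_ntriples(counts))
```

Each observation becomes a subject with one triple for the string and one for its typed integer count, so the output can be loaded by any RDF tool that reads N-Triples.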

Included in

Linguistics Commons