Date Approved

11-5-2013

Date Posted

4-4-2014

Degree Type

Open Access Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

College of Technology

Committee Member

Dr. Ali Eydgahi, Ph.D. (Chair)

Committee Member

Dr. Daniel Fields, Ph.D.

Committee Member

Dr. Huei Lee, Ph.D.

Committee Member

Dr. Alphonso Bellamy, Ph.D.

Abstract

Web robots, also known as crawlers or spiders, are used by search engines, hackers, and spammers to gather information about web pages. Timely detection and prevention of unwanted crawlers increases the privacy and security of websites. This research proposes a novel method for identifying web crawlers in order to prevent unwanted crawlers from accessing websites. The method uses a five-factor identification process to detect unwanted crawlers. The study reports pretest and posttest results, along with a systematic evaluation of web pages that use the proposed identification technique versus web pages that do not. A repeated-measures experiment was performed with two groups, each containing ninety web pages. The results of the logistic regression analysis of the treatment and control groups confirm that the five-factor identification process is an effective mechanism for preventing unwanted web crawlers. The study concludes that the proposed five-factor identification process is a very effective technique, as demonstrated by the experimental outcome.
