What can be Found on the Web and How: A Characterization of Web Browsing Patterns
Abstract
In this paper, we suggest a novel approach to studying user
browsing behavior, i.e., the ways users get to different pages
on the Web. Namely, we classify all user browsing paths
leading to web pages into several types, or browsing patterns.
In order to define browsing patterns, we consider several
important points of the browsing path: its origin, the last page
before the user gets to the domain of the target page, and
the target page referrer. Each point can be of several types,
which leads to 56 possible patterns. The distribution of the
browsing paths over these patterns forms the navigational
profile of a web page.
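
As a rough illustration of the definition above, the following minimal sketch computes a navigational profile as the share of browsing paths that fall into each pattern. The tuple layout and the type labels ("search", "bookmark", "social", and so on) are assumptions made for this example only; the actual point types that yield the 56 patterns are not listed here.

```python
from collections import Counter


def navigational_profile(paths):
    """Return the fraction of browsing paths that fall into each pattern.

    Each path is a hypothetical (origin_type, entry_type, referrer_type)
    tuple, one element per point of the browsing path.
    """
    counts = Counter(paths)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


# Hypothetical type labels, for illustration only.
paths = [
    ("search", "search", "internal"),
    ("search", "search", "internal"),
    ("bookmark", "none", "none"),
    ("social", "social", "external"),
]
profile = navigational_profile(paths)
```

Under these assumptions, each distinct tuple plays the role of one browsing pattern, and the resulting dictionary is the page's profile: a distribution over patterns that sums to 1.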
We conducted a comprehensive large-scale study of navigational
profiles of different web pages. First, we demonstrated
that the navigational profile of a web page carries crucial
information about the properties of this page (e.g., its popularity
and age). Second, we found that the Web consists
of several typical non-overlapping clusters formed by pages
of similar ranges of incoming traffic. These clusters can be
characterized by the functionality of their pages.