3.1 Key Trends in Long-Term Search Engine Usage
We
first analyze the data using dimensionality reduction techniques to
develop
insights
about key trends. In particular, we use non-negative matrix
factorization
(NNMF)
[10] of the data to summarize key behavioral patterns using a small
number
of
basis vectors. NNMF is a method to obtain a representation of data
using
non-negativity
constraints. This leads to part-based representation such that the
original
data can be represented as an additive combination of a small number
of
basis
vectors.
Formally,
we construct A,
an M×N
matrix,
where the columns represent users (N=
10,000
users) and the rows are the vector representation of 26 weeks of
search engine
usage
statistics. The observations we used for our initial analysis are the
proportion
of
queries that each user issued to each of the three search engines,
thus M=
26
×
3.