A query facet is a set of items which describe and summarize one important aspect of a
query. Here a facet item is typically a word or a
phrase. A query may have multiple facets that summarize the information about the query from different perspectives. Table 1 shows sample facets for
some queries. Facets for the query “watches” cover
five distinct aspects of watches: brands, gender categories, supporting features, styles, and colors. The query “visit
Beijing” has a query facet about popular resorts
in Beijing (tiananmen square, forbidden city,
summer palace, ...) and a facet on travel-related
topics (attractions, shopping, dining, ...).
Query facets provide interesting and useful knowledge about a query and thus can be used to improve
search experiences in many ways. First, we can display query facets together with the original search
results in an appropriate way. Thus, users can understand some important aspects of a query without
browsing tens of pages. For example, a user could
learn different brands and categories of watches. We
can also implement faceted search based on the mined query facets. Users can clarify
their specific intent by selecting facet items, and the
search results can then be restricted to the documents
that are relevant to those items. For example, a user could drill down
to women’s watches if he is looking for a gift for his wife. Multiple groups of query facets are
particularly useful for vague or ambiguous queries,
such as “apple”. We could show the products of
Apple Inc. in one facet and different types of the fruit
apple in another. Second, query facets may provide
direct information or instant answers that users are
seeking. For example, for the query “lost season 5”,
all episode titles are shown in one facet and main
actors are shown in another. In this case, displaying
query facets could save browsing time. Third, query
facets may also be used to improve the diversity of the
ten blue links. We can re-rank search results so that pages
that are near-duplicates in terms of query facets do not
crowd the top positions. Query facets also contain structured
knowledge covered by the query, and thus they can
be used in other fields besides traditional web search,
such as semantic search or entity search.
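
The faceted-search use described above can be illustrated with a minimal sketch. It assumes that documents are plain strings and that a document counts as "relevant" to a facet item when it mentions the item as a whole word; the function name `filter_by_item` and this matching rule are illustrative assumptions, not part of the proposed system.

```python
import re

def filter_by_item(documents, item):
    """Keep documents that mention the selected facet item as a whole word.

    Assumption: whole-word, case-insensitive matching stands in for a real
    relevance model, which the text does not specify.
    """
    pattern = re.compile(r"\b" + re.escape(item) + r"\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc)]

# Hypothetical search results for the query "watches".
results = [
    "Rolex women's watches on sale",
    "Casio men's digital watch review",
    "Watch straps in ten colors",
]

# The user selects the facet item "women" to drill down.
print(filter_by_item(results, "women"))
```

Whole-word matching is used so that selecting "men" does not also match "women"; a production system would more plausibly re-issue the query or filter on indexed metadata.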
We assume that the important aspects of a query are usually presented and repeated
in the query’s top retrieved documents in the form of lists, and that query facets can be mined by aggregating these significant lists.
We propose a systematic solution, which we refer to as QDMiner, to automatically mine query facets by extracting and grouping
frequent lists from free text, HTML tags, and repeat regions within top search results.
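
The aggregation idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the QDMiner algorithm itself: it assumes the lists have already been extracted from each document, groups lists greedily by Jaccard overlap, and keeps items that occur in at least `min_docs` distinct documents. The function names and the thresholds `min_docs` and `min_overlap` are hypothetical.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mine_facets(doc_lists, min_docs=2, min_overlap=0.3):
    """Group overlapping lists and keep items repeated across documents.

    doc_lists: one entry per document; each entry is a list of lists of
    strings (the lists assumed to be already extracted from that document).
    """
    groups = []  # each group maps item -> set of ids of documents containing it
    for doc_id, lists in enumerate(doc_lists):
        for raw in lists:
            items = {s.strip().lower() for s in raw if s.strip()}
            if not items:
                continue
            # Attach the list to the first sufficiently similar group,
            # or start a new group if none overlaps enough.
            for group in groups:
                if jaccard(items, set(group)) >= min_overlap:
                    break
            else:
                group = defaultdict(set)
                groups.append(group)
            for item in items:
                group[item].add(doc_id)

    facets = []
    for group in groups:
        kept = sorted(i for i, docs in group.items() if len(docs) >= min_docs)
        if len(kept) >= 2:  # a facet needs at least two items
            facets.append(kept)
    return facets

# Hypothetical lists extracted from three top results for "watches".
doc_lists = [
    [["Rolex", "Casio", "Seiko"], ["men", "women", "kids"]],
    [["casio", "seiko", "Omega"], ["Men", "Women"]],
    [["Home", "About", "Contact"]],
]
print(mine_facets(doc_lists))
```

In this toy input, the navigation list from the third document is discarded because none of its items recur in another document, which reflects the assumption that important aspects are repeated across the top results.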