Analysis of Search and Browsing Behavior of Young Users on the Web

by Sergio Duarte Torres, Ingmar Weber and Djoerd Hiemstra 

In this journal paper we expanded the study presented in the paper What and How Children Search on the Web in two directions. First, we provide a more detailed analysis of the topics that children search for on a state-of-the-art search engine, using a novel classification based on fine-grained topics derived from the categories of the Yahoo! Answers service. The findings of this analysis allow us to provide concrete recommendations for the development of modern IR systems for young users in specific age ranges.


Second, we employed toolbar logs from the Yahoo! search engine to characterize the browsing behavior of young users, in particular to understand which activities on the Internet trigger search. We quantified the proportion of browsing and search activity in the toolbar sessions, and we estimated the likelihood that a user carries out a search on the Web vertical or on the multimedia verticals (i.e., videos and images) given that the previous event is another search event or a browsing event. We found that certain groups of young users are more likely to carry out multimedia search and that certain browsing events, such as visits to knowledge-related websites (e.g., Wikipedia), are more likely to trigger web search.
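
The likelihood of one event type following another can be illustrated with a simple count over adjacent events in a session. The sketch below is only an illustration under stated assumptions: the event labels (`browse`, `web_search`, `image_search`, `video_search`), the toy sessions, and the function name are hypothetical, and the paper's actual estimation procedure may differ.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next event type | previous event type) from adjacent pairs.

    sessions: list of sessions, each a list of event-type labels in
    chronological order (labels are hypothetical, not from the paper).
    """
    pair_counts = defaultdict(Counter)
    for events in sessions:
        # Pair every event with the one that immediately follows it.
        for prev, nxt in zip(events, events[1:]):
            pair_counts[prev][nxt] += 1

    probs = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        probs[prev] = {nxt: n / total for nxt, n in counter.items()}
    return probs


# Toy data, invented for illustration only.
sessions = [
    ["browse", "web_search", "browse", "image_search"],
    ["web_search", "web_search", "browse"],
    ["browse", "web_search", "video_search"],
]
probs = transition_probabilities(sessions)
print(probs["browse"])
```

Each row of `probs` sums to 1, so the values can be read directly as the estimated chance that, for example, a browsing event is followed by a web search.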

Published at TWEB ACM, March 2014, Volume 8 Issue 2. Read the paper.

Query Recommendation in the Domain of Information for Children

by Sergio Duarte Torres, Djoerd Hiemstra, Ingmar Weber, Pavel Serdyukov. 

Children represent an increasing share of web users. Key problems that hamper their search experience are their limited vocabulary, their difficulty in choosing the right keywords, and the inappropriateness of general-purpose query suggestions. In this journal paper, we expanded the biased random walk introduced in our paper Query recommendation for Children by combining the score of the random walk with topical and language modeling features to further emphasize the child-related aspects of the query suggestions.
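
The general idea of a biased random walk whose score is combined with other features can be sketched as follows. This is a minimal illustration, not the method from the paper: the graph, the bias weights, the restart probability, the feature scores, and the linear combination are all assumptions chosen for the example.

```python
def biased_random_walk(graph, bias, restart=0.15, iterations=50):
    """Random walk with restart, computed by power iteration.

    graph: node -> list of neighbour nodes; every node needs at least one
           neighbour and every neighbour must itself be a key of graph.
    bias:  node -> non-negative weight; restarts land on nodes in
           proportion to this weight (e.g. higher for child-oriented nodes).
    """
    total_bias = sum(bias.values())
    teleport = {n: bias.get(n, 0.0) / total_bias for n in graph}
    scores = dict(teleport)
    for _ in range(iterations):
        # Restart mass goes to the biased nodes.
        nxt = {n: restart * teleport[n] for n in graph}
        # Remaining mass is spread evenly over each node's neighbours.
        for node, neighbours in graph.items():
            share = (1 - restart) * scores[node] / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        scores = nxt
    return scores


def combine(walk_scores, feature_scores, weight=0.5):
    """Linear mix of walk score and an extra feature score (an assumption)."""
    return {
        q: weight * walk_scores[q] + (1 - weight) * feature_scores.get(q, 0.0)
        for q in walk_scores
    }


# Hypothetical query graph and scores, invented for illustration.
graph = {
    "dinosaurs": ["dinosaur games", "dinosaur fossils"],
    "dinosaur games": ["dinosaurs"],
    "dinosaur fossils": ["dinosaurs"],
}
bias = {"dinosaur games": 1.0}  # pretend this suggestion is child-oriented
walk = biased_random_walk(graph, bias)
features = {"dinosaur games": 0.9, "dinosaur fossils": 0.4}  # made-up scores
ranked = sorted(combine(walk, features).items(), key=lambda kv: -kv[1])
print(ranked)
```

In this toy graph the two candidate suggestions are structurally symmetric, so any difference in their walk scores comes only from the bias in the restart distribution.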


We evaluate our methods using a large sample of queries submitted by children, drawn from the Yahoo! Search logs. We show that our method outperforms, by a large margin, the query suggestions of modern search engines and state-of-the-art query suggestion methods based on random walks.

Published at JASIST, February 2014. Read the paper.

Vertical Selection in the Information Domain of Children

by Sergio Duarte Torres, Djoerd Hiemstra and Theo Huibers 

In this paper we explore vertical selection methods for aggregated search in the specific domain of topics for children between 7 and 12 years old. We built a test collection consisting of 25 verticals, 3.8K queries, and relevance assessments that map relevant verticals to a large sample of these queries. We gather relevance assessments by envisaging two aggregated search systems: one in which the Web vertical is always displayed and one in which each vertical is assessed independently from the Web vertical. We show that the two approaches lead to different sets of relevant verticals and that the former is prone to bias toward visually oriented verticals.

In the second part of this paper we estimate the size of the verticals for the target domain. We show that employing the global size and the domain-specific size estimates of the verticals leads to significant improvements when using state-of-the-art methods of vertical selection. We also introduce a novel vertical and query representation based on tags from social media, and we show that its use leads to significant performance gains. Read the paper

This paper has been nominated for the best student paper award at JCDL 2013.