Improving Search in Social Tagging Systems by Constructing Hierarchical Structures of Concepts

Type: 
Thesis
Authors: 
JooHee Song
Joseph Davis
Date: 
2012, September
Published in: 
The University of Sydney
Abstract: 
Conventional tools for organizing information, such as taxonomies and directories, are becoming impractical for handling the huge volume of data being created at an increasing pace, as in the information explosion on the Internet. They require highly trained professionals or experts to organize information, and the process is time-consuming even for such experts. Meanwhile, social tagging systems such as Flickr and Delicious have attracted significant attention and gained much popularity because of their practicality for both information retrieval and information organization, especially when handling a huge volume of information on the Internet. Despite this practicality, social tagging systems suffer from a lack of hierarchical information structure and from an uncontrolled vocabulary. This thesis investigates these problems and proposes solutions, with the goal of improving social tagging systems from an information retrieval perspective. It proposes a combined strategy to overcome the limitations observed in traditional information organization systems and in social tagging systems, with a particular focus on the latter. We approach the existing problems, including the flat structure of information and the uncontrolled vocabulary, by exploiting some of the features of social tagging systems in combination with a more traditional information organization tool, namely directories. Our solutions include constructing hierarchical structures using sets and collections, which are directory-like tools in Flickr, a social tagging system, and pre-processing noisy terms in social tagging systems to improve the search facility. Based on a prototype of our system model that reflects these solutions, we constructed hierarchical structures of concepts using a real-world data set, and we conducted a carefully designed user case study to evaluate the resulting structures.
We also addressed other problems that result from the uncontrolled vocabulary. To support our approach, we conducted a series of experiments, and the results indicated that the approach is promising. The search facility in social tagging systems can be improved by exploiting the hidden context in these systems. Through an experiment and illustrative examples, we showed that this context can be beneficial to information retrieval. Finally, we conclude the thesis by suggesting possible future work, such as exploiting user information to improve the search and the visualization of information in social tagging systems.