A Multi-view Approach for the Quality Assessment of Wiki Articles

Daniel Hasan Dalip, Marcos André Gonçalves, Thiago Cardoso, Marco Cristo, Pável Calado

Abstract


Wikipedia is a prominent example of a very large, freely accessible, and openly editable repository of information, created collaboratively by its community. However, this large amount of information, made available democratically and virtually without any control, raises questions about its quality. To deal with this problem, some studies attempt to assess the quality of Wikipedia articles automatically. In these studies, a large number of quality indicators are usually collected and then combined to obtain a single value representing the quality of the article. In this work, we propose to group these indicators into semantically meaningful views of quality and investigate a new approach to combine these views based on a meta-learning method known as stacking. In particular, we grouped the indicators into three views (textual, review history, and citation graph) and demonstrated that this approach can be used in collaborative encyclopedias such as Wikipedia and Wikia. In our experimental evaluation, we obtained gains of up to 18% compared to the state-of-the-art quality assessment method, which considers all indicators at once.
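
The multi-view stacking idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the keywords suggest SVM-based learners, but a simple k-nearest-neighbour regressor is swapped in here so the example needs only the Python standard library. The function names, the single indicator per view (assumed to stand for article length, revision count, and incoming links), the toy values, and the 0 to 5 quality scale are all hypothetical.

```python
from math import dist


def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training points nearest to x (stand-in learner)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def out_of_fold_predictions(X, y, k=3):
    """Leave-one-out predictions: each article is scored by a learner that never saw it."""
    preds = []
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        preds.append(knn_predict(rest_X, rest_y, X[i], k))
    return preds


def fit_stacking(views, y, k=3):
    """views maps a view name to a list of feature vectors, one per article."""
    names = sorted(views)
    # Level 0: one learner per view, producing held-out quality estimates.
    per_view = [out_of_fold_predictions(views[n], y, k) for n in names]
    # Level 1 input: one row per article, one column per view-level estimate.
    meta_X = [list(row) for row in zip(*per_view)]
    return {"names": names, "views": views, "y": y, "meta_X": meta_X, "k": k}


def predict_stacking(model, article):
    """article maps a view name to the feature vector of one unseen article."""
    level0 = [
        knn_predict(model["views"][n], model["y"], article[n], model["k"])
        for n in model["names"]
    ]
    # Level 1: combine the per-view estimates into a single quality value.
    return knn_predict(model["meta_X"], model["y"], level0, model["k"])


# Hypothetical toy data: six articles, one made-up indicator per view.
views = {
    "textual": [[120.0], [900.0], [3000.0], [4500.0], [7000.0], [9500.0]],
    "review_history": [[2.0], [10.0], [40.0], [65.0], [120.0], [200.0]],
    "citation_graph": [[0.0], [3.0], [12.0], [20.0], [45.0], [80.0]],
}
quality = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # assumed quality scale

model = fit_stacking(views, quality, k=2)
score = predict_stacking(
    model,
    {"textual": [5000.0], "review_history": [70.0], "citation_graph": [25.0]},
)
print(score)
```

The level-1 learner is trained on held-out view predictions rather than on the raw indicators, which is the property that distinguishes stacking from simply concatenating all indicators into one feature vector.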

Keywords


Quality Assessment, Wikipedia, Machine Learning, SVM, Multi-View


An official publication of the Brazilian Computer Society Special Interest Group on Databases.