Reducing Fragmentation in Incremental Author Name Disambiguation

Luciano Vilas Boas Esperidião, Anderson A. Ferreira, Alberto H. F. Laender, Marcos André Gonçalves, David Menotti Gomes, Andrea Iabrudi Tavares, Guilherme Tavares de Assis


Author name ambiguity is a hard problem that occurs when several authors publish articles under the same name or when the same author publishes articles under different names. Traditionally, automatic disambiguation methods process the author names of all citation records in a repository. Aiming at efficiency, incremental methods disambiguate author names only as new citation records are inserted into the repository. As a side effect, several citation records of the same author may be associated with different authors, which is known as the fragmentation problem. To diminish this problem, we propose a new merge-oriented incremental method capable of reducing this side effect without the need to apply a traditional disambiguation method to the whole repository. Our experimental evaluation shows that our method produces significant improvements when compared to an incremental baseline and is very competitive with batch-mode methods.


author name ambiguity; bibliographic citation; incremental disambiguation


An official publication of the Brazilian Computer Society Special Interest Group on Databases.