Publication

Excavating the mother lode of human-generated text: A systematic review of research that uses the Wikipedia corpus
Mohamad Mehdi, Mostafa Mesgari, Finn Nielsen, Arto Lanamäki
2017, Information Processing and Management, 53(2), pp.505-529
Abstract
Although primarily an encyclopedia, Wikipedia’s expansive content provides a knowledge base that has been continuously exploited by researchers in a wide variety of domains. This article systematically reviews the scholarly studies that have used Wikipedia as a data source, and investigates the means by which Wikipedia has been employed in three main computer science research areas: information retrieval, natural language processing, and ontology building. We report and discuss the research trends of the identified and examined studies. We further identify and classify a list of tools that can be used to extract data from Wikipedia, and compile a list of currently available data sets extracted from Wikipedia.
