Inferring standard name form, gender and nobility from historical texts using stable model semantics


Authors: Davor Lauc, Chair of Logic, Department of Philosophy, Faculty of Humanities and Social Sciences, University of Zagreb
Darko Vitek, Department of History, Centre of Croatian Studies, University of Zagreb

Reposted from: Digital Humanities Quarterly, 2021, Volume 15, Number 1

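This paper investigates the problem of parsing personal name expressions and inferring standard name form, gender and nobility from serial historical sources. This is a small but important part of modelling how historians analyse such sources, because they extract a great deal of information from the names in a text, and that information constrains their search. Parsing proper names looks like an easy task, yet it is a difficult problem even for modern languages and more challenging still for the languages of historical sources. The test case used in this research comes from a mid-19th-century census of the old town centre of Zagreb. To evaluate and compare the suitability of probabilistic and rule-based models for inferring standard name forms, two models were developed: a conditional random field (CRF) and a rule-based model built on stable model semantics (Answer Set Programming rules). Our results indicate that the rule-based approach is better suited than the more widespread statistical methods to inferring standard name forms from historical texts.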
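
To make the inference task concrete, the following is a minimal sketch of what a rule-based mapping from a name expression to a standard name form, gender and nobility flag might look like. It is written in plain Python rather than Answer Set Programming and is not the authors' model. The lookup tables, the `ParsedName` type and the `parse_name` function are hypothetical and invented for illustration; none of the entries are taken from the census data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup tables, invented for illustration only.
# Maps a lower-cased spelling variant to (standard given name, gender).
GIVEN_NAME_VARIANTS = {
    "joannes": ("Ivan", "male"),
    "johann": ("Ivan", "male"),
    "georgius": ("Juraj", "male"),
    "catharina": ("Katarina", "female"),
}

# Particles treated here as markers of nobility (an assumption of this sketch).
NOBILITY_PARTICLES = {"de", "von"}


@dataclass
class ParsedName:
    given: Optional[str]    # standard given name, if a variant was recognised
    surname: Optional[str]  # remaining tokens, kept as written
    gender: Optional[str]   # inferred from the given name, if recognised
    noble: bool             # True if a nobility particle was seen


def parse_name(expression: str) -> ParsedName:
    """Apply three simple rules, token by token, to a name expression."""
    tokens = expression.replace(",", " ").split()
    given = None
    gender = None
    noble = False
    surname_parts = []
    for token in tokens:
        key = token.lower().rstrip(".")
        if key in NOBILITY_PARTICLES:
            # Rule 1: a nobility particle marks the person as noble.
            noble = True
        elif given is None and key in GIVEN_NAME_VARIANTS:
            # Rule 2: the first known variant fixes given name and gender.
            given, gender = GIVEN_NAME_VARIANTS[key]
        else:
            # Rule 3: everything else is treated as part of the surname.
            surname_parts.append(token)
    surname = " ".join(surname_parts) or None
    return ParsedName(given, surname, gender, noble)


if __name__ == "__main__":
    for text in ("Joannes von Horvat", "Horvat, Georgius", "Catharina Novak"):
        print(text, "->", parse_name(text))
```

A sketch like this applies its rules in a fixed order and returns a single answer. An Answer Set Programming formulation would instead state the rules declaratively and let a solver compute the stable models, which is what allows conflicting or ambiguous readings of a name to be handled explicitly.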


Davor Lauc 

Davor Lauc is an associate professor at the Faculty of Humanities and Social Sciences at the University of Zagreb, where he serves as chair for logic, and Chief Data Scientist at He graduated in Information Science and Philosophy at the University of Zagreb in 1994, and attained his PhD in logic in 2004 at the Faculty of Humanities and Social Sciences. He has published over 30 papers and three books in the general area of Applied Logic, Philosophy of Science and Data Science. He also worked at the Institute for Business Intelligence (now Bisnode Croatia) as a founder, CTO and member of the supervisory board (1997-2009), and at the National Genealogical Institute as the founder and CEO.

Darko Vitek 

Darko Vitek was born in 1970 in Vukovar. He graduated in history from the Faculty of Philosophy in Zagreb. His research interests are the history of the early modern period, urban history, theory of history and digital humanities. He is employed at the Centre of Croatian Studies at the University of Zagreb, where, as an associate professor, he teaches several courses in his fields of interest.
