Data Completeness – Knowledge Bases

Simon Razniewski

Research Scientist
NLP and Neuro-Symbolic AI group
Bosch Center for AI
Renningen, Germany
srazniew@mpi-inf.mpg.de

About

Simon Razniewski is a research scientist in the NLP and Neuro-Symbolic AI group at the Bosch Center for AI, where he develops novel methods for extracting and consolidating knowledge.

He was previously a senior researcher at the Max Planck Institute for Informatics (2017-2021), where he headed the Knowledge Base Construction and Quality area, and an assistant professor at the Free University of Bozen-Bolzano¹ (2014-2017). He holds a PhD from the Free University of Bozen-Bolzano (2014), and a Diplom (MSc.) from TU Dresden (2010; not this Dresden). He spent time as a visitor at the Max Planck Institute for Informatics (2016), the University of Queensland (2015), AT&T Labs-Research (2013), and the University of California, San Diego (2012), and has previous industrial experience from Globalfoundries (2010) and Siemens IT (2009). He has published 20 papers at premier² conferences in the area of data science and management (and more than 60 papers in total).

¹ 2018 world’s 9th best small university according to THE

² A* or A in the CORE 2018 ranking

Publications

Premier conference publications¹

[23] Blerta Veseli, Sneha Singhania, Simon Razniewski, Gerhard Weikum. Evaluating Language Models for Knowledge Base Completion, ESWC, 2023 – Acceptance rate: 19%
[22] Tuan-Phong Nguyen, Simon Razniewski, Aparna Varde, Gerhard Weikum. Extracting Cultural Commonsense Knowledge at Scale, WWW, 2023 – Acceptance rate: 19%
[21] Julien Romero and Simon Razniewski. Do Children Texts Hold The Key To Commonsense Knowledge?, EMNLP, 2022
[20] Hiba Arnaout, Simon Razniewski, Gerhard Weikum and Jeff Z. Pan. UnCommonSense: Informative Negative Knowledge about Everyday Concepts, CIKM, 2022 – Acceptance rate: 23%
[19] Shrestha Ghosh, Simon Razniewski, Gerhard Weikum. Answering Count Queries with Explanatory Evidence, SIGIR, 2022 – Acceptance rate: 25%
[18] Vinh Thinh Ho, Koninika Pal, Simon Razniewski, Klaus Berberich and Gerhard Weikum. Extracting Contextualized Quantity Facts from Web Tables, WWW, 2021 – Acceptance rate: 21%
[17] Tuan-Phong Nguyen, Simon Razniewski and Gerhard Weikum. Advanced Semantics for Commonsense Knowledge Extraction, WWW, 2021 – Acceptance rate: 21%
[16] Sreyasi Nag Chowdhury, Simon Razniewski and Gerhard Weikum. SANDI: Story-and-Images Alignment, EACL, 2021
[15] Julien Corman, Davide Lanti, Diego Calvanese, Simon Razniewski. Counting Query Answers over a DL-Lite KB, IJCAI, 2020 – Acceptance rate: 13%
[14] Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum. ENTYFI: Entity Typing in Fictional Texts, WSDM, 2020 – Acceptance rate: 15%
[13] Simon Razniewski, Nitisha Jain, Paramita Mirza, Gerhard Weikum. Coverage of Information Extraction from Sentences and Paragraphs, EMNLP, 2019 – Acceptance rate: 20%
[12] Julien Romero, Simon Razniewski, Koninika Pal, Jeff Z. Pan, Archit Sakhadeo, Gerhard Weikum. Commonsense Properties from Query Logs and Question Answering Forums, CIKM, 2019 – Acceptance rate: 20%
[11] Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum. TiFi: Taxonomy Induction for Fictional Domains, WWW, 2019 – Acceptance rate: 20%
[10] Paramita Mirza, Simon Razniewski, Fariz Darari and Gerhard Weikum. Enriching Knowledge Bases with Counting Quantifiers, ISWC, 2018 – Acceptance rate: 23%
[9] Thomas Pellissier Tanon, Daria Stepanova, Simon Razniewski, Paramita Mirza and Gerhard Weikum. Completeness-aware Rule Learning from Knowledge Graphs, ISWC, 2017 – Acceptance rate: 22%
[8] Paramita Mirza, Simon Razniewski, Fariz Darari, Gerhard Weikum. Cardinal Virtues: Extracting Relation Cardinalities from Text, ACL, 2017 – Acceptance rate: 18%
[7] Luis Galárraga, Simon Razniewski, Antoine Amarilli, Fabian M. Suchanek. Predicting Completeness in Knowledge Bases, WSDM, 2017 – Acceptance rate: 16%
[6] Simon Razniewski. Optimizing Update Frequencies for Decaying Information, CIKM, 2016 – Acceptance rate: 23%
[5] Simon Razniewski, Flip Korn, Werner Nutt, Divesh Srivastava. Identifying the Extent of Completeness of Query Answers over Partially Complete Databases, SIGMOD, 2015 – Acceptance rate: 26%
[4] Simon Razniewski, Marco Montali and Werner Nutt. Verification of Query Completeness over Processes, BPM, 2013 – Acceptance rate: 14%
[3] Fariz Darari, Werner Nutt, Giuseppe Pirro, Simon Razniewski. Completeness Statements about RDF Data Sources and Their Use for Query Answering, ISWC, 2013 – Acceptance rate: 22%
[2] Werner Nutt and Simon Razniewski. Completeness of Queries over SQL Databases, CIKM, 2012 – Acceptance rate: 13%
[1] Simon Razniewski and Werner Nutt. Completeness of Queries over Incomplete Databases, VLDB, 2011 – Acceptance rate: 18%

¹ Rank A* or A in the CORE 2018 ranking (http://www.core.edu.au/conference-portal)

Journal publications

[8] Shrestha Ghosh, Simon Razniewski, Gerhard Weikum. Answering Count Questions with Structured Answers from Text, JWS, 2023
[7] Tuan-Phong Nguyen, Simon Razniewski, Julien Romero and Gerhard Weikum. Refined Commonsense Knowledge from Large-Scale Web Contents, TKDE, 2022
[6] Sneha Singhania, Simon Razniewski, Gerhard Weikum. Predicting Document Coverage for Relation Extraction, TACL, 2022
[5] Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan. Negative Statements Considered Useful, JWS, 2021
[4] Gerhard Weikum, Xin Luna Dong, Simon Razniewski, Fabian Suchanek. Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases, FnT, 2021
[3] Shrestha Ghosh, Simon Razniewski, Gerhard Weikum. Uncovering Hidden Semantics of Set Information in Knowledge Bases, JWS, 2020
[2] Fariz Darari, Werner Nutt, Simon Razniewski, Sebastian Rudolph. Completeness and Soundness Guarantees for Conjunctive SPARQL Queries over RDF Data Sources with Completeness Statements, SWJ, 2018
[1] Fariz Darari, Werner Nutt, Giuseppe Pirro, Simon Razniewski. Completeness Management for RDF Data Sources, TWEB, 2018

Demo publications²

[11] Hiba Arnaout, Tuan-Phong Nguyen, Simon Razniewski, Gerhard Weikum. UnCommonSense in Action! Informative Negations for Commonsense Knowledge Bases, WSDM, 2023
[10] Shrestha Ghosh, Simon Razniewski, Gerhard Weikum. CoQEx: Entity Counts Explained, WSDM, 2023
[9] Aditya Bikram Biswas, Hiba Arnaout, Simon Razniewski. Neguess: Wikidata-entity guessing game with negative clues, ISWC, 2021
[8] Tuan-Phong Nguyen, Simon Razniewski and Gerhard Weikum. Inside ASCENT: Exploring a Deep Commonsense Knowledge Base and its Usage in Question Answering, ACL, 2021
[7] Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan. Wikinegata: A Knowledge Base with Interesting Negative Statements, VLDB, 2021
[6] Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum. ENTYFI: A System for Fine-grained Entity Typing in Fictional Texts, EMNLP, 2020
[5] Yohan Chalier, Simon Razniewski, Gerhard Weikum. Dice: A Joint Reasoning Framework for Multi-Faceted Commonsense Knowledge, ISWC, 2020
[4] Julien Romero and Simon Razniewski. Inside Quasimodo: Exploring Construction and Usage of Commonsense Knowledge, CIKM, 2020
[3] Shrestha Ghosh, Simon Razniewski, Gerhard Weikum. CounQER: A System for Discovering and Linking Count Information in Knowledge Bases, ESWC, 2020
[2] William Cheng, Sreyasi Nag Chowdhury, Gerard de Melo, Simon Razniewski and Gerhard Weikum. SANDI: A Tool for Alignment of Images within Text, WSDM, 2020
[1] Fariz Darari, Radityo Eko Prasojo, Simon Razniewski and Werner Nutt. COOL-WD: A Completeness Tool for Wikidata, ISWC, 2017

² These tracks require fully-fledged implemented systems, and typically come with acceptance rates around 30%.

Other peer-reviewed publications

[37] Sneha Singhania, Tuan-Phong Nguyen, Simon Razniewski. Knowledge Base Construction from Pre-trained Language Models, ISWC, 2022
[36] Tuan-Phong Nguyen and Simon Razniewski. Materialized Knowledge Bases from Commonsense Transformers, CSRR@ACL, 2022
[35] Hiba Arnaout, Trung-Kien Tran, Daria Stepanova, Mohamed H. Gad-Elrab, Simon Razniewski and Gerhard Weikum. Utilizing LM Probes for KG Repair, Wiki Workshop@WWW, 2022
[34] Simon Razniewski, Andrew Yates, Nora Kassner, Gerhard Weikum. Language Models As or For Knowledge Bases, DL4KG@ISWC, 2021
[33] Cuong Xuan Chu, Simon Razniewski, Gerhard Weikum. KnowFi: Knowledge Extraction from Long Fictional Texts, AKBC, 2021
[32] Simon Razniewski, Hiba Arnaout, Shrestha Ghosh, Fabian Suchanek. Completeness, Recall, and Negation in Open-World Knowledge Bases, VLDB tutorial, 2021
[31] Sreyasi Nag Chowdhury, Rajarshi Bhowmik, Hareesh Ravi, Gerard de Melo, Simon Razniewski and Gerhard Weikum. Exploiting Image–Text Synergy for Contextual Image Captioning, Lantern@EACL, 2021
[30] Hiba Arnaout, Simon Razniewski, Gerhard Weikum, Jeff Z. Pan. Discovering the boundaries of open-world Wikidata with Wikinegata, Wiki Workshop@WWW, 2021
[29] Nadyah Hani Ramadhana, Fariz Darari, Panca O. Hadi Putra, Werner Nutt, Simon Razniewski, and Refo Ilmiya Akbar. User-Centered Design for Knowledge Imbalance Analysis: A Case Study of ProWD, VOILA@ISWC, 2020
[28] Hiba Arnaout, Simon Razniewski and Gerhard Weikum. Enriching Knowledge Bases with Negative Statements, ISWC best sister conference track, 2020
[27] Simon Razniewski and Priyanka Das. Structured knowledge: Have we made progress? An empirical study of KB coverage over 19 years, CIKM poster, 2020
[26] Julien Corman, Davide Lanti, Diego Calvanese, Simon Razniewski. Rewriting Count Queries over DL-Lite TBoxes with Cardinality Restrictions, DL, 2020
[25] Yohan Chalier, Simon Razniewski and Gerhard Weikum. Joint Reasoning for Multi-Faceted Commonsense Knowledge, AKBC, 2020
[24] Hiba Arnaout, Simon Razniewski and Gerhard Weikum. Enriching Knowledge Bases with Negative Statements, AKBC, 2020
[23] Avicenna Wisesa, Fariz Darari, Adila Krisnadhi, Werner Nutt and Simon Razniewski. Wikidata Completeness Profiling Using ProWD, K-CAP, 2019
[22] Ioannis Dikeoulias, Jannik Strötgen, Simon Razniewski. Epitaph or Breaking News? Analyzing and Predicting the Stability of Knowledge Base Properties, TempWeb@WWW, 2019
[21] Simon Razniewski and Gerhard Weikum. Knowledge Base Recall: Detecting and Resolving the Unknown Unknowns, SIGWEB newsletter, 2018
[20] Thomas Pellissier Tanon, Daria Stepanova, Simon Razniewski, Paramita Mirza and Gerhard Weikum. Completeness-aware Rule Learning from Knowledge Graphs, IJCAI best sister conference track, 2018
[19] Fariz Darari, Werner Nutt, Simon Razniewski. Comparing Index Structures for Completeness Reasoning, IWBIS, 2018
[18] Vevake Balaraman, Simon Razniewski and Werner Nutt. Recoin: Relative Completeness in Wikidata, Wiki Workshop@WWW, 2018
[17] Simon Razniewski, Vevake Balaraman, Werner Nutt. Doctoral Advisor or Medical Condition: Towards Entity-specific Rankings of Knowledge Base Properties, ADMA, 2017
[16] Luis Galárraga, Katja Hose, Simon Razniewski. Enabling Completeness-aware Querying in SPARQL, WebDB@SIGMOD, 2017
[15] Albin Ahmeti, Simon Razniewski, Axel Polleres. Assessing the Completeness of Entities in Knowledge Bases, ESWC poster, 2017
[14] Paramita Mirza, Simon Razniewski and Werner Nutt. Expanding Wikidata’s Parenthood Information by 178%, or How To Mine Relation Cardinalities, ISWC poster, 2016
[13] Simon Razniewski, Shazia Sadiq, and Xiaofang Zhou. Exploiting Hierarchies for Efficient Detection of Completeness in Stream Data, ADC, 2016
[12] Radityo Eko Prasojo, Fariz Darari, Simon Razniewski, and Werner Nutt. Managing and Consuming Completeness Information for Wikidata Using COOL-WD, COLD@ISWC, 2016
[11] Simon Razniewski, Fabian Suchanek and Werner Nutt. But What Do We Actually Know?, AKBC, 2016
[10] Simon Razniewski, Ognjen Savkovic and Werner Nutt. Turning The Partial-closed World Assumption Upside Down, AMW, 2016
[9] Fariz Darari, Simon Razniewski, Radityo Eko Prasojo, Werner Nutt. Enabling Fine-grained RDF Data Completeness Assessment, ICWE, 2016
[8] Simon Razniewski and Werner Nutt. Long-term Optimization of Update Frequencies for Decaying Information, WebDB@SIGMOD, 2015
[7] Vincenzo Del Fatto, Gabriella Dodero, Rosella Gennari, Alessandra Melonio, Marco Montali, Simon Razniewski, Santina Torello, Xiaofeng Wang, Floriano Zini. Gamified Children Universities: An Exploratory Study, CHI PLAY, 2014
[6] Simon Razniewski and Werner Nutt. Adding Completeness Information to Query Answers over Spatial Data, SIGSPATIAL, 2014
[5] Simon Razniewski and Werner Nutt. Databases under the Partial Closed-world Assumption: A Survey, GvDB, 2014
[4] Fariz Darari, Simon Razniewski and Werner Nutt. Bridging the Semantic Gap between RDF and SPARQL using Completeness Statements, ISWC poster, 2014
[3] Simon Razniewski and Werner Nutt. Assessing the Completeness of Geographical Data, BNCOD, 2013
[2] Werner Nutt, Simon Razniewski and Gil Vegliach. Incomplete Databases: Missing Records and Missing Values, DQDI@DASFAA, 2012
[1] Simon Razniewski and Werner Nutt. Checking Query Completeness over Incomplete Data, LID@EDBT/ICDT, 2011

Patent applications

[1] Simon Razniewski, Ioannis Dikeoulias, Jannik Stroetgen. Method for Predicting a Persistence Over Time of Entries of a Knowledge Base, US Patent application 16/667,673, 2020

Thesis

Query-driven Data Completeness Management, PhD Thesis, Free University of Bozen-Bolzano, 2014

Demos

Ascent (2021): Advanced semantics for commonsense knowledge
Wikinegata (2021): Negative knowledge for Wikidata
Entyfi (2020): Entity typing in fiction
Sandi (2020): Story-image alignment
Dice (2020): Joint reasoning for multifaceted commonsense knowledge
CounQER (2020): Counting queries and entity valued relations
ReCoin (2017): A user script for adding relative completeness annotations to Wikidata. Developed by Vevake Balaraman and Albin Ahmeti in the context of the TaDaQua project (Video in Danish)
COOL-WD (2017): A completeness tool for Wikidata. Developed by Radityo Eko Prasojo and Fariz Darari
MAGIC (2012): A tool for reasoning about the completeness of relational databases, developed by Ognjen Savkovic, Paramita Mirza and Alex Tomasi
Lookslikescanned: A website to make PDFs look like they were scanned

Other scripts and datasets

Cinex: Code, experiment data and SPARQL endpoint with Wikipedia results for counting quantifier extraction
Property Ranking: Dataset of 350 (entity, property1, property2) pairs for humans in Wikidata, along with a preference judgment [O18]
A dataset of about 2000 crowdsourced completeness assertions for YAGO and Wikidata.

Media appearances

Research

Topics

My research is centered around the theme of knowledge base construction and curation. It is rooted in foundations of logics/data management, machine learning and natural language processing, and finds current application in KB recall assessment, and encyclopedic, fictional and common-sense knowledge bases (sample slides). KB recall assessment also unifies much of my research, with further details on this project page, and sample slides here.

Acquired grants (selection)

My research is supported by:

DFG individual research grant of 312,000 € for compiling negative knowledge at web scale, 2021
Diffbot grant of $30,000 for KB construction in fiction, 2020
Google Cloud Credit grant of $5,000 for research on commonsense knowledge, 2019
NVIDIA hardware grant (~$1,100) for research on information extraction, 2018
Free University of Bozen-Bolzano, acquired project grants Recall (20,000 €, 2016-17), TQTK (20,000 €, 2016-17), TaDaQua (50,000 €, 2016-18)
Province of South Tyrol, open research call, coauthored project MAGIC (250,000 €, 2013-16)

Other grant proposals

ERC starting grant 2020: Proposal received grade A but not funded due to budget constraints

Organization

Guest editor of SWJ special issue on Wikidata (2023), on commonsense knowledge (2021)
Co-organizer, Wikidata workshop at ISWC 2023, 2022, 2021
Co-organizer, KBC-LM workshop at ISWC 2023
Co-organizer, LM-KBC challenge at ISWC 2024, 2023, 2022
Website chair, ESSLLI 2016 summer school

Reviewing

2024: Senior PC member of CIKM, PC member of ACL ARR, IJCAI demos, KaLLM workshop@ACL, KI, AKR³ workshop
2023: Area chair of EMNLP and EACL, senior PC member of CIKM, ESWC, PC member of ACL ARR, NeurIPS, VLDB demos, IJCAI demos, LDK, VLDB tutorials, board member of TGDK, reviewer for TODS
2022: Senior PC member of WWW, CIKM, PC member of WSDM, ESWC, EMNLP, VLDB demos, reviewer for NeurIPS
2021: Area chair of IJCAI, senior PC member of CIKM, PC member of WSDM, ACL, NAACL, EMNLP, ESWC, WWW, TKDE, CSKGs workshop@AAAI, LDK, QOD
2020: Senior PC member of CIKM, IJCAI, ISWC, PC member of WSDM, ACL, ESWC, QOD, Wikidata workshop@ISWC, SemDeep-6@IJCAI
2019: PC member of WWW, ACL, AAAI, NAACL, ISWC, K-CAP, AKBC, LDK, Data Quality in Wikidata workshop, Quality of Open Data workshop, Commonsense Inference workshop@EMNLP
2018: PC member of ACL, CIKM, EMNLP, DASFAA, SemDeep-4, DL4KGs, QOD, RoD. Reviewer for VLDB Journal
2017: PC member of CIKM, WebDB. Reviewer for SWJ
2011-2016: External reviewer for AAMAS 2016, SEBD 2015, CIKM 2014, COOPIS 2013, TbiLLC 2013, CIKM 2011, BNCOD 2011.

Grant reviewing

DFG (2023, 2020)
GACR (2021)

Teaching

Saarland University

Advanced lecture “Automated Knowledge Base Construction”, summer term 2022
Seminar “Commonsense Knowledge Extraction and Consolidation”, winter term 2020/21
Advanced lecture “Information Extraction”, winter term 2019/20
Seminar “Advanced Topics in Knowledge Bases”, winter term 2018/19
Seminar “Knowledge Bases”, winter term 2017/18

FU Bozen-Bolzano

Lecture+Lab “Distributed Systems”, 2017
Lecture+Lab “Distributed Systems”, 2016
Lecture+Lab “Distributed Systems”, 2015
Lab “Data Structures and Algorithms”, 2013
Lab “Data Structures and Algorithms”, 2012

Tutorials on commonsense knowledge

Information to Wisdom: Commonsense Knowledge Extraction and Compilation, tutorial at WSDM 2021 (w/ Niket Tandon, Aparna S. Varde)
Commonsense Knowledge Acquisition and Representation, tutorial at AAAI 2021 (w/ Filip Ilievski, Antoine Bosselut, Mayank Kejriwal) [Youtube] [slides]
Commonsense Knowledge Extraction and Consolidation, tutorial at KI 2020

Tutorials on completeness, coverage and negation in KBs

Completeness, Recall, and Negation in Open-World Knowledge Bases, tutorial at ISWC 2021 (w/ Hiba Arnaout, Shrestha Ghosh, Fabian Suchanek)
At VLDB’21
At KR’21
At WWW’22

Theses and internships

*** As of February 2022, I have no capacity for hosting internships, research assistants (HiWis), or BSc./MSc./PhD theses. The only exception is internships for PhD students. ***

Other

I hold a glider pilot license, which I used until recently to fly over the Eastern Alps with the Aeroclub Bozen-Bolzano. My flightbook: [1]. Some favourite videos (not by me): [2], [3]. Very old one by myself: [4].
If not replying to email, I might occasionally be traveling in places like Kyrgyzstan, Iran, or Uganda.
I’ve circled the globe (crossed the international date line without return) 3 times.
Curious fact: The extent of Indonesia (main islands only) at 5300 km is more than the distance from Germany to China, Kenya, or the North Pole (4630, 5200, 3920 km, respectively).
Apologies for any typos in my emails, I didn’t write them