Large-scale KB construction from LLMs


Multiple works [1,2,3] have explored how, in principle, to construct KBs from LLMs. Yet none has actually done so at scale while striving for quality. The goal of this thesis is to construct a KB for 1 million Wikidata entities from an LLM alone. The output might serve as a complementary resource, and the endeavour would be an enriching research experience, showing what really works, and what doesn’t, among the previously presented academic ideas.


[1] Veseli, Blerta, et al. “Evaluating language models for knowledge base completion.” European Semantic Web Conference. Cham: Springer Nature Switzerland, 2023.

[2] Sun, Kai, et al. “Head-to-Tail: How knowledgeable are large language models (LLMs)? A.K.A. will LLMs replace knowledge graphs?” arXiv preprint arXiv:2308.10168 (2023).

[3] Cohen, Roi, et al. “Crawling The Internal Knowledge-Base of Language Models.” Findings of the Association for Computational Linguistics: EACL 2023. 2023.