Reverse-engineering reality from biased knowledge bases


Knowledge bases (KBs) provide interesting yet incomplete samples of reality, and therefore generally do not represent it faithfully (e.g., Wikidata contains many more professors than janitors). These notability-based biases are somewhat systematic, however, which raises the question of whether one can reverse-engineer (de-bias) parts of reality from such KBs.
The thesis shall develop and formalize a parametric model of the process by which data about reality enters KBs, and then use stochastic optimization to estimate the parameters of this model. The approach should be evaluated on selected subsets of reality.
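
To make the idea concrete, the following is a minimal sketch of one possible instantiation, not the method the thesis is expected to arrive at. It assumes, purely for illustration, that each entity has a single notability score x in [0, 1], that it enters the KB with probability sigmoid(a + b * x), and that a reference subset of reality with known inclusion labels is available. The parameters a and b are estimated by stochastic gradient ascent on the log-likelihood, and the true population size is then estimated by weighting each KB entity with the inverse of its inclusion probability. All names and values (simulate, fit_sgd, debiased_count, TRUE_A, TRUE_B) are hypothetical, and the data is simulated.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, a, b, rng):
    """Simulated 'reality': each entity has a notability score x in [0, 1]
    and enters the KB with probability sigmoid(a + b * x) (an assumption)."""
    data = []
    for _ in range(n):
        x = rng.random()
        included = 1 if rng.random() < sigmoid(a + b * x) else 0
        data.append((x, included))
    return data


def fit_sgd(data, rng, epochs=100, lr=0.1, decay=0.2):
    """Estimate (a, b) by stochastic gradient ascent on the
    Bernoulli log-likelihood, one entity per update."""
    a, b = 0.0, 0.0
    data = list(data)
    for epoch in range(epochs):
        rng.shuffle(data)
        step = lr / (1.0 + decay * epoch)  # decaying step size
        for x, y in data:
            err = y - sigmoid(a + b * x)   # gradient of log-likelihood w.r.t. a
            a += step * err
            b += step * err * x
    return a, b


def debiased_count(kb_scores, a, b):
    """Inverse-probability weighting: an entity that enters the KB with
    probability p stands for roughly 1/p entities in reality."""
    return sum(1.0 / sigmoid(a + b * x) for x in kb_scores)


rng = random.Random(42)
TRUE_A, TRUE_B = -3.0, 4.0  # hypothetical ground-truth parameters

# Reference subset of reality where inclusion is known for every entity.
reference = simulate(5000, TRUE_A, TRUE_B, rng)
a_hat, b_hat = fit_sgd(reference, rng)

# Larger population of which only the KB-included part is observed.
population = simulate(20000, TRUE_A, TRUE_B, rng)
kb_scores = [x for x, included in population if included]
estimate = debiased_count(kb_scores, a_hat, b_hat)

print(f"estimated a={a_hat:.2f}, b={b_hat:.2f}")
print(f"KB size={len(kb_scores)}, de-biased estimate={estimate:.0f}, "
      f"true size={len(population)}")
```

In this toy setting the raw KB size strongly underestimates the population, while the reweighted estimate should land much closer to the true size. A realistic model would need richer features than a single score and a way to cope with entities whose inclusion probability is close to zero, since their weights dominate the estimate.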