Open world probabilistic databases semantics, algorithms, complexity
➤ Gửi thông báo lỗi ⚠️ Báo cáo tài liệu vi phạmNội dung chi tiết: Open world probabilistic databases semantics, algorithms, complexity
Open world probabilistic databases semantics, algorithms, complexity
Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity 0004-3702(21)00025-4DOI:https://doi.org/IO. IOl6/j.artim.2O21.103474Reference:ARTINT 103474To appear in:Artificial IntelligenceReceived dale:Revised date:Accepted date;438584418444237Please cite this article as: i.i. Ccylan. A. Darwichc and G. Van den Brocck. Open-world probabilistic databases: Sema Open world probabilistic databases semantics, algorithms, complexity ntics, algorithms, complexity. ArtificialIntelligence, 103474. doi: https://dc>i.oig/IO.IÕIÍx'j.artint.2021.103474.This is a PDF file of an article thOpen world probabilistic databases semantics, algorithms, complexity
at has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet tJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity ut we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.Ộ 2021 Published by Elsevier.Journal Pre-proofOpen-World Pro Open world probabilistic databases semantics, algorithms, complexity babilistic Databases: Semantics, Algorithms, ComplexityIsmail ilkan Ceylana'*, Adnan Darwicheb, Guy Van den Broeckb“Department of Computer Science, UnOpen world probabilistic databases semantics, algorithms, complexity
iversity of Oxford. Wolfson liullding. t’arts Hoad. Oxford oxt 3QD. UK ^Computer Science Department, University of California, £Open world probabilistic databases semantics, algorithms, complexity
e closed-world assumption of probabilistic databases, i.e., the assumption that facts not appearing in the database have the probability zero. conflicJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity a new probabilistic data model. In this new data model, the probabilities of unknown facts, also called open facts, can be assigned any probability value from a default probability interval. Our analysis entails that our model aligns better with many real-world tasks such as query answering, relatio Open world probabilistic databases semantics, algorithms, complexity nal learning, knowledge base completion, and rule mining. We make various technical contributions. We show that the data complexity dichotomy, betweenOpen world probabilistic databases semantics, algorithms, complexity
polv normal time ami #p. for evaluating unions of conjunctive queries on probabilistic databases can be lifted to our open-world model. This result iJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity ng safe queries is in linear time for probabilistic databases, under reasonable assumptions. This remains true in open-world probabilistic datalwises for a more restricted class of safe queries. We extend our dat a complexity analysis beyond unions of conjunctive queries, and obtain a host of comple Open world probabilistic databases semantics, algorithms, complexity xity results for both classical and open-world probabilistic databases. We conclude our analysis with an in-depth investigation of the combined compleOpen world probabilistic databases semantics, algorithms, complexity
xity in the respective models.Keywords: knowledge bases, probabilistic databases, semantics, closed-world assumption, open-world assumption, inferenceJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity throughout information extraction, natural language processing (e.g.. question answering), relational learning, knowledge representation and reasoning. and databases are coming together to build large-scale knowledge bases and reason over them.6 Academic systems such as NELL [1]. DeepDive [2], Reve Open world probabilistic databases semantics, algorithms, complexity rb [3], and YAGO pl] continuously crawl the Web to extract structured information. Industry projects such as Microsoft's Probase [5], or Google’s KnowOpen world probabilistic databases semantics, algorithms, complexity
ledge Vault [6] similarly learn structured data from text, and thus populate their databases with millions of entities and billions of facts. Thus, reJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity eylan@cs.ox.ac.uk (Ismail llkan Ceylíin), darxicheộcs.uela.cdu (Adnan Diirwichc), quyvdbPcs.ucla.edu (Guy Van den Brocck)Prvpnnf submitted to Artificial Intelligence44239https://khothuvien.cori!Journal Pre-proof10 Systems such as DeepDive have been used by scientists to build knowledge bases of 0 Open world probabilistic databases semantics, algorithms, complexity e interactions, paleobiology, and geoscience, all by reading scientific publications and extracting structured knowledge [7, 8]. One of the most visibOpen world probabilistic databases semantics, algorithms, complexity
le applications of these probabilistic knowledge bases is in search engines (see, e.g., Google search results), i.e.. the standard list of relevant weJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity .15 Knowledge bases contain data which is necessarily uncertain. To go from the raw text to structmed data, information extraction systems employ a sequence of statistical machine learning techniques, from part-of-speech tagging until relation extraction [!)]. For knowledge-l»ase completion the task Open world probabilistic databases semantics, algorithms, complexity of deriving new facts from existing knowledge statistical relational learning algorithms make use of embeddings (10. 11] or probabilistic rules [12,Open world probabilistic databases semantics, algorithms, complexity
13). In both settings, the output is a predicted fact with a probability, or confidence, a value. It is therefore common to interpret such knowledge tJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity es (PDBs) [14], which indeed underlies some of these systems [6, 2], Probabilistic databases, however, lack a suitable handling of incompleteness. In particular, each of the above systems encodes only a portion of the real-world, and this description is necessarily incomplete. For example, according Open world probabilistic databases semantics, algorithms, complexity to YAGO, the average 26 number of children per person is 0.02 [15]. However, when it comes to querying, most of these systems employ the closed-worldOpen world probabilistic databases semantics, algorithms, complexity
assumption (CWA) [16] according to the tuple-independent PDB semantics, each database atom Ls an independent Bernoulli random variable, and all otherJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity paper, we revisit the CWA of probabilistic databases, and observe that the CIF.4 is violated in » the deployment of these systems, which makes it problematic to reason, learn, or mine on top of these databases. We will argue the following salient points in detail. First, query answering under the CW Open world probabilistic databases semantics, algorithms, complexity A does not take into account the effect the open-world can have on the query probability. This makes it impossible to distinguish queries whose probabOpen world probabilistic databases semantics, algorithms, complexity
ility should intuitively differ. Second, knowledge bases are part of a larger machine learning loop that continuously updates belief^ about facts baseJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity probability. The CWA does not accurately represent this mode of operation and puts it on weak footing. Third, the CWA is problematic for higher level tasks that one Is usually interested in performing on probabilistic databases, including some principled approaches to knowledge base completion and Open world probabilistic databases semantics, algorithms, complexity mining. Finally, we note that these issues are not temporary: it will never be possible to complete 40 probabilistic knowledge bases of even the mostOpen world probabilistic databases semantics, algorithms, complexity
trivial relations, as the memory requirements quickly become excessive. This already manifests itself today: statistical classifiers output facts at aJournal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0 Open world probabilistic databases semantics, algorithms, complexity or example, 99% of the facts in NELL have a probability larger than 0.91.We propose a new semantics for probabilistic knowledge liases to address these problems, based on the 46 open-world assumption (OWA). In contrast to the CWA, the OWA does not presume that the knowledge of a domain is complete, Open world probabilistic databases semantics, algorithms, complexity and as a consequence, all open atoms remain possible. Our proposal for open-world probabilistic databases (OpenPDBs) builds on the theory of impreciseOpen world probabilistic databases semantics, algorithms, complexity
probabilities, and credal sets [18], to allow interval-based probabilities for open atoms. OpenPDBs define a probability threshold to determine whichGọi ngay
Chat zalo
Facebook