KHO THƯ VIỆN 🔎

Open world probabilistic databases semantics, algorithms, complexity

➤  Gửi thông báo lỗi    ⚠️ Báo cáo tài liệu vi phạm

Loại tài liệu:     PDF
Số trang:         46 Trang
Tài liệu:           ✅  ĐÃ ĐƯỢC PHÊ DUYỆT
 













Nội dung chi tiết: Open world probabilistic databases semantics, algorithms, complexity

Open world probabilistic databases semantics, algorithms, complexity

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity 0004-3702(21)00025-4DOI:https://doi.org/IO. IOl6/j.artim.2O21.103474Reference:ARTINT 103474To appear in:Artificial IntelligenceReceived dale:Revised d

ate:Accepted date;438584418444237Please cite this article as: i.i. Ccylan. A. Darwichc and G. Van den Brocck. Open-world probabilistic databases: Sema Open world probabilistic databases semantics, algorithms, complexity

ntics, algorithms, complexity. ArtificialIntelligence, 103474. doi: https://dc>i.oig/IO.IÕIÍx'j.artint.2021.103474.This is a PDF file of an article th

Open world probabilistic databases semantics, algorithms, complexity

at has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet t

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity ut we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered wh

ich could affect the content, and all legal disclaimers that apply to the journal pertain.Ộ 2021 Published by Elsevier.Journal Pre-proofOpen-World Pro Open world probabilistic databases semantics, algorithms, complexity

babilistic Databases: Semantics, Algorithms, ComplexityIsmail ilkan Ceylana'*, Adnan Darwicheb, Guy Van den Broeckb“Department of Computer Science, Un

Open world probabilistic databases semantics, algorithms, complexity

iversity of Oxford. Wolfson liullding. t’arts Hoad. Oxford oxt 3QD. UK ^Computer Science Department, University of California, £

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity They are continuously extended with new data, powered by modem information extraction tools that associate probabilities with knowledge base facts. T

he state of the art to store and process such data is founded on probabilistic databases. Many systems based on probabilistic databases, however, stil Open world probabilistic databases semantics, algorithms, complexity

l have certain semantic deficiencies, which limit their potential applications. We revisit the semantics of probabilistic databases, and argue that th

Open world probabilistic databases semantics, algorithms, complexity

e closed-world assumption of probabilistic databases, i.e., the assumption that facts not appearing in the database have the probability zero. conflic

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity a new probabilistic data model. In this new data model, the probabilities of unknown facts, also called open facts, can be assigned any probability va

lue from a default probability interval. Our analysis entails that our model aligns better with many real-world tasks such as query answering, relatio Open world probabilistic databases semantics, algorithms, complexity

nal learning, knowledge base completion, and rule mining. We make various technical contributions. We show that the data complexity dichotomy, between

Open world probabilistic databases semantics, algorithms, complexity

polv normal time ami #p. for evaluating unions of conjunctive queries on probabilistic databases can be lifted to our open-world model. This result i

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity ng safe queries is in linear time for probabilistic databases, under reasonable assumptions. This remains true in open-world probabilistic datalwises

for a more restricted class of safe queries. We extend our dat a complexity analysis beyond unions of conjunctive queries, and obtain a host of comple Open world probabilistic databases semantics, algorithms, complexity

xity results for both classical and open-world probabilistic databases. We conclude our analysis with an in-depth investigation of the combined comple

Open world probabilistic databases semantics, algorithms, complexity

xity in the respective models.Keywords: knowledge bases, probabilistic databases, semantics, closed-world assumption, open-world assumption, inference

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity throughout information extraction, natural language processing (e.g.. question answering), relational learning, knowledge representation and reasonin

g. and databases are coming together to build large-scale knowledge bases and reason over them.6 Academic systems such as NELL [1]. DeepDive [2], Reve Open world probabilistic databases semantics, algorithms, complexity

rb [3], and YAGO pl] continuously crawl the Web to extract structured information. Industry projects such as Microsoft's Probase [5], or Google’s Know

Open world probabilistic databases semantics, algorithms, complexity

ledge Vault [6] similarly learn structured data from text, and thus populate their databases with millions of entities and billions of facts. Thus, re

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity eylan@cs.ox.ac.uk (Ismail llkan Ceylíin), darxicheộcs.uela.cdu (Adnan Diirwichc), quyvdbPcs.ucla.edu (Guy Van den Brocck)Prvpnnf submitted to Artifici

al Intelligence44239https://khothuvien.cori!Journal Pre-proof10 Systems such as DeepDive have been used by scientists to build knowledge bases of 0 Open world probabilistic databases semantics, algorithms, complexity

e interactions, paleobiology, and geoscience, all by reading scientific publications and extracting structured knowledge [7, 8]. One of the most visib

Open world probabilistic databases semantics, algorithms, complexity

le applications of these probabilistic knowledge bases is in search engines (see, e.g., Google search results), i.e.. the standard list of relevant we

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity .15 Knowledge bases contain data which is necessarily uncertain. To go from the raw text to structmed data, information extraction systems employ a se

quence of statistical machine learning techniques, from part-of-speech tagging until relation extraction [!)]. For knowledge-l»ase completion the task Open world probabilistic databases semantics, algorithms, complexity

of deriving new facts from existing knowledge statistical relational learning algorithms make use of embeddings (10. 11] or probabilistic rules [12,

Open world probabilistic databases semantics, algorithms, complexity

13). In both settings, the output is a predicted fact with a probability, or confidence, a value. It is therefore common to interpret such knowledge t

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity es (PDBs) [14], which indeed underlies some of these systems [6, 2], Probabilistic databases, however, lack a suitable handling of incompleteness. In

particular, each of the above systems encodes only a portion of the real-world, and this description is necessarily incomplete. For example, according Open world probabilistic databases semantics, algorithms, complexity

to YAGO, the average 26 number of children per person is 0.02 [15]. However, when it comes to querying, most of these systems employ the closed-world

Open world probabilistic databases semantics, algorithms, complexity

assumption (CWA) [16] according to the tuple-independent PDB semantics, each database atom Ls an independent Bernoulli random variable, and all other

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity paper, we revisit the CWA of probabilistic databases, and observe that the CIF.4 is violated in » the deployment of these systems, which makes it prob

lematic to reason, learn, or mine on top of these databases. We will argue the following salient points in detail. First, query answering under the CW Open world probabilistic databases semantics, algorithms, complexity

A does not take into account the effect the open-world can have on the query probability. This makes it impossible to distinguish queries whose probab

Open world probabilistic databases semantics, algorithms, complexity

ility should intuitively differ. Second, knowledge bases are part of a larger machine learning loop that continuously updates belief^ about facts base

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity probability. The CWA does not accurately represent this mode of operation and puts it on weak footing. Third, the CWA is problematic for higher level

tasks that one Is usually interested in performing on probabilistic databases, including some principled approaches to knowledge base completion and Open world probabilistic databases semantics, algorithms, complexity

mining. Finally, we note that these issues are not temporary: it will never be possible to complete 40 probabilistic knowledge bases of even the most

Open world probabilistic databases semantics, algorithms, complexity

trivial relations, as the memory requirements quickly become excessive. This already manifests itself today: statistical classifiers output facts at a

Journal Pre-proofOpen-world probabilistic databases: Semantics, algorithms, complexityIsmail llkan Ceylan. Adnan Darwiche and Guy Van den BroeekPll:S0

Open world probabilistic databases semantics, algorithms, complexity or example, 99% of the facts in NELL have a probability larger than 0.91.We propose a new semantics for probabilistic knowledge liases to address thes

e problems, based on the 46 open-world assumption (OWA). In contrast to the CWA, the OWA does not presume that the knowledge of a domain is complete, Open world probabilistic databases semantics, algorithms, complexity

and as a consequence, all open atoms remain possible. Our proposal for open-world probabilistic databases (OpenPDBs) builds on the theory of imprecise

Open world probabilistic databases semantics, algorithms, complexity

probabilities, and credal sets [18], to allow interval-based probabilities for open atoms. OpenPDBs define a probability threshold to determine which

Gọi ngay
Chat zalo
Facebook