Do you care where your data comes from?
WeST
Steffen Staab
[email protected]
Semantic Days 1
How to lose 1,000,000,000 US$ in half a day
Via @Bauckhage
+++ "Los Angeles (dpa) - According to a report by the local broadcaster vpk-tv, a suicide attack is said to have taken place in the small Californian town of Bluewater. There were reportedly two explosions in a restaurant..." +++
German Press Agency DPA, 10 Sep 2009
Guerilla Marketing
Losing your reputation quickly…
Hoax: better check who said what, when, and whether you actually want to trust a piece of information
Web Science & Technologies University of Koblenz ▪ Landau, Germany
Knowing Something About Your Semantic Web Data
Steffen Staab
Joint work with Simon Schenk, Renata Dividino, Christoph Ringelstein
The situation…
• Call to Ontoprise from an insurance company: "Can you integrate our 5000 databases?"
• EU IP experience (Large Engineering Company): "oh, we just found another PC that has several tens of thousands of relevant documents"
• Linked open data cloud
Some of the problems…
I have this piece of data. Can I actually believe it? Default answer: Find some expert and ask him.
I have this inconsistency in my data. Who has introduced it and why? Default answer: Try to find it in the sources.
I have this piece of data. How can I use it? Can I show it to anyone? Default answer: You are not allowed to do anything with it. Just throw it away.
Defining Provenance
Provenance … means the origin… of something, or the history of the ownership or location of an object. The term was …used for works of art, but is now … including science and computing. … In most fields, the primary purpose of provenance is to confirm or gather evidence as to the time, place, and, when appropriate, the person responsible for the creation, production, or discovery of the object. This will typically be accomplished by tracing the whole history of the object up to the present.
Source: http://en.wikipedia.org/wiki/Provenance (May 31, 2011)
Two Types of Provenance Knowledge
Provenance labels for facts
Fact: Bluewater is a City
• Which confidence?
• Who?
• Which privileges?
• Which authority?
• When?
Two Types of Provenance Knowledge
Provenance labels for facts
Fact: Bluewater is a City
• Which confidence?
• Who?
• Which privileges?
• Which authority?
• When?
Open Provenance Model
• RDF graph representing who did what, when, why … to a data item
• Workflow steps in the figure: 1 admission, 2 examination, 3 asking permit, 4 examination, 5 prepare share
• "ex post" workflow instance: audit/re-enact
OWL REASONING USING PROVENANCE
• Works also for RDF + SPARQL (with many technical differences of course):
  R. Dividino, S. Sizov, S. Staab, B. Schüler. Querying for Provenance, Trust, Uncertainty and other Meta Knowledge in RDF. In: Journal of Web Semantics. Elsevier, 7(3), 2009, pp. 204-219.
• Our work on using provenance with OWL reasoning:
  S. Schenk, R. Dividino, S. Staab, N. Kurz. Ontology Debugging Using Provenance. In: Journal of Web Semantics, Elsevier, accepted for publication.
Do we trust that Bluewater is a real city?
Neverest, Low trust, 2009-09-09
German Press Agency, Highest trust, 2001-01-03
Explanation (Pinpointing)
Given an ontology $O$, an axiom $\alpha$, and $O' \subseteq O$:
$O'$ is an explanation (pinpoint) for $\alpha$ w.r.t. $O$ iff $O' \models \alpha$ and $O^* \not\models \alpha$ for all $O^* \subset O'$.
Figure: axioms 1, 2, 3, 4 and the queried axiom (?).
Explanation formula: $(1 \wedge 2) \vee (3 \wedge 4)$
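The definition of a pinpoint can be made concrete with a small sketch. It is not the authors' implementation: the toy entailment check (forward chaining over hypothetical propositional rules) only stands in for an OWL reasoner, and the axiom names `a1` to `a4` are assumptions chosen to echo the four numbered axioms in the figure.

```python
from itertools import combinations

# Hypothetical toy ontology: each axiom is a rule (premises, conclusion).
# A real system would call an OWL reasoner instead of `entails`.
ONTOLOGY = {
    "a1": (frozenset(), "p"),          # fact p
    "a2": (frozenset({"p"}), "goal"),  # p -> goal
    "a3": (frozenset(), "q"),          # fact q
    "a4": (frozenset({"q"}), "goal"),  # q -> goal
}

def entails(axiom_names, query, ontology=ONTOLOGY):
    """Forward chaining: does the selected subset of axioms derive `query`?"""
    known, changed = set(), True
    while changed:
        changed = False
        for name in axiom_names:
            premises, conclusion = ontology[name]
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return query in known

def pinpoints(query, ontology=ONTOLOGY):
    """All minimal subsets O' of O that entail `query` (exponential brute force)."""
    found = []
    names = sorted(ontology)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            candidate = frozenset(subset)
            # Supersets of an already found pinpoint cannot be minimal.
            if any(p <= candidate for p in found):
                continue
            if entails(candidate, query, ontology):
                found.append(candidate)
    return found
```

For the toy ontology, `pinpoints("goal")` yields the two sets `{a1, a2}` and `{a3, a4}`, which corresponds to an explanation formula of the shape (1 AND 2) OR (3 AND 4).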
Finding Pinpoints
Figure: ontology O and a pinpoint O'.
Computation of meta knowledge for OWL
• Query: meta knowledge for $\alpha$
• Compute the pinpointing formula for $\alpha$ w.r.t. $O$: $(A_1 \wedge \dots \wedge A_m) \vee \dots \vee (Z_1 \wedge \dots \wedge Z_n)$
• Insert meta knowledge degrees and operators: $\min(\max(lrm(A_1), \dots, lrm(A_m)), \dots, \max(lrm(Z_1), \dots, lrm(Z_n)))$
• Evaluate
[KI 2009, SWPM2009]
"Least recently modified?"
Pinpointing formula: $(A_1 \wedge \dots \wedge A_m) \vee \dots \vee (Z_1 \wedge \dots \wedge Z_n)$
Meta knowledge: $\min(\max(lrm(A_1), \dots, lrm(A_m)), \dots, \max(lrm(Z_1), \dots, lrm(Z_n)))$
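A minimal sketch of how such a formula could be evaluated, assuming the pinpoints are already known. The axiom names and modification dates are hypothetical, and the docstring gives one reading of why conjunction maps to max and disjunction to min.

```python
from datetime import date

def least_recently_modified(pinpoints, lrm):
    """Evaluate min over pinpoints of max over the axioms in each pinpoint.

    Conjunction inside a pinpoint maps to max: the pinpoint is complete only
    once its latest axiom is in place. Disjunction across pinpoints maps to
    min: the earliest complete pinpoint already yields the entailment.
    """
    return min(max(lrm[a] for a in pinpoint) for pinpoint in pinpoints)

# Hypothetical modification dates for four axioms (illustrative values only).
lrm = {
    "a1": date(2009, 9, 9),
    "a2": date(2009, 9, 10),
    "a3": date(2001, 1, 3),
    "a4": date(2005, 6, 1),
}
# Pinpointing formula (a1 AND a2) OR (a3 AND a4), one set per conjunction.
formula = [{"a1", "a2"}, {"a3", "a4"}]
result = least_recently_modified(formula, lrm)
```

Other meta knowledge dimensions (for example trust degrees) would need their own pair of operators; the slides only show the pair for modification time.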
Optimization: Syntactic relevance
Optimized Computation of Provenance
Figure sequence (six build steps): axioms labeled 9, 9, 9, 9, 8, 7, 7, 7, 5, 5, 3, 2, 2. Panel labels: Time Order; Syntactic Relevance; Color codes reachability; Oracle for you: relevant pinpoint; Relevant pinpoint only contained.
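The slides do not spell out the optimized procedure, so the following is an assumption and not the authors' algorithm. It illustrates why a time order can help: with a monotonic entailment check, adding axioms oldest first and stopping at the first entailment gives the same value as the min/max formula, without enumerating all pinpoints. The rule encoding, the `derives` stand-in for a reasoner, and the time labels are hypothetical.

```python
def derives(rules, query):
    """Toy stand-in for a reasoner: forward chaining over (premises, conclusion)."""
    known, changed = set(), True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return query in known

def earliest_entailment_time(timed_rules, query):
    """Add axioms in time order; return the time at which `query` first follows."""
    selected = []
    for timestamp, rule in sorted(timed_rules, key=lambda item: item[0]):
        selected.append(rule)
        if derives(selected, query):
            return timestamp
    return None  # the query is not entailed by the full set

# Hypothetical axioms with time labels.
timed_rules = [
    (9, ((), "p")),
    (7, (("p",), "goal")),
    (2, ((), "q")),
    (5, (("q",), "goal")),
]
```

Here the two pinpoints have maxima 9 and 5, so the formula gives min(9, 5) = 5, and the scan stops at the axiom labeled 5 as well.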
Evaluation: Computing Provenance in Milliseconds
Real-world provenance!
PROVENANCE AWARE POLICY LANGUAGE
Middle Rhine Hospital
Figure: step 1 (admission) with create operations on the Health Record, the Policies, and the Sticky Log.
(P1): ukob is allowed to process health records for research purposes. However, ukob is not allowed to transfer the health records of patients to other organizations.
(P2): The mrh demands that ukob accesses the record only after the patient has approved the sharing of the health records and a doctor has confirmed the approval.
Middle Rhine Hospital
Figure: workflow steps 1 admission, 2 examination, 3 asking permit, 4 examination, 5 prepare share, 6 share for research, with their operations (create, update, de-id., encrypt, transfer, fulfill, check) on the Health Record, the Policies, and the Sticky Log.
Sticky Log:
step (record, {mrh}, {}, create, patient_treatment, 1, {0})
step (record, {mrh}, {}, update, examination, 2, {1})
reduced (record, hidden, hidden, update, hidden, 4, {2})
step (record, {mrh}, {}, de-identified, privacy, 5, {4})
attribute (record, de-identified, true, 5)
step (record, {mrh}, {ukob}, transfer, research, 6, {5})
Middle Rhine Hospital
Figure: workflow steps 1 admission, 2 examination, 3 asking permit, 4 examination, 5 prepare share, 6 share for research, now with the question at "You": permit (6)?
(P3): permit (ID) IF (step (record, _, _, transfer, _, ID, _) AND attribute (record, de-identified, true, ID)).
Sticky Log:
step (record, {mrh}, {}, create, patient_treatment, 1, {0})
step (record, {mrh}, {}, update, examination, 2, {1})
reduced (record, hidden, hidden, update, hidden, 4, {2})
step (record, {mrh}, {}, de-identified, privacy, 5, {4})
attribute (record, de-identified, true, 5)
step (record, {mrh}, {ukob}, transfer, research, 6, {5})
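A rough sketch of how a rule like (P3) might be checked against the sticky log. This is not the policy language's actual evaluator. The field names are assumptions, since the slide does not label the tuple positions, and the `reduced` entry is left out for brevity. The slide also does not say whether an attribute logged at step 5 still counts for step 6, so the carry-over from direct predecessors is an explicit, optional assumption.

```python
# Sticky log entries transcribed from the slide as plain tuples.
# Assumed field order for steps:
# (data, actors, recipients, action, purpose, step_id, predecessors)
STEPS = [
    ("record", {"mrh"}, set(), "create", "patient_treatment", 1, {0}),
    ("record", {"mrh"}, set(), "update", "examination", 2, {1}),
    ("record", {"mrh"}, set(), "de-identified", "privacy", 5, {4}),
    ("record", {"mrh"}, {"ukob"}, "transfer", "research", 6, {5}),
]
# Assumed field order for attributes: (data, name, value, step_id)
ATTRIBUTES = [
    ("record", "de-identified", True, 5),
]

def permit(step_id, inherit_from_predecessors=False):
    """Check (P3): a transfer step is permitted if the record is de-identified.

    With inherit_from_predecessors=False the attribute must be logged for the
    very same step ID (literal reading of the rule). With True, an attribute
    logged at a direct predecessor also counts (an assumption, see lead-in).
    """
    for data, _actors, _recipients, action, _purpose, sid, preds in STEPS:
        if sid != step_id or action != "transfer":
            continue
        valid_at = {sid} | (preds if inherit_from_predecessors else set())
        if any(d == data and n == "de-identified" and v is True and s in valid_at
               for d, n, v, s in ATTRIBUTES):
            return True
    return False
```

Under the literal reading `permit(6)` fails, because the attribute is logged at step 5; under the carry-over assumption it succeeds. Which behavior is intended depends on the semantics of the policy language.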
CONCLUSION
Data Value lies in
Past
• Knowing what happened to your data
• Knowing why it happened to your data
Present
• Drawing the right conclusions from your data
Future
• Deciding upon the destiny of your data
Your Strategy is based on Provenance! You better take care!
Core References
Provenance in RDF
• R. Dividino, S. Sizov, S. Staab, B. Schüler. Querying for Provenance, Trust, Uncertainty and other Meta Knowledge in RDF. In: Journal of Web Semantics. Special issue on "The Web of Data". Elsevier, 7(3), 2009, pp. 204-219.
Provenance in OWL
• S. Schenk, R. Dividino, S. Staab, N. Kurz. Ontology Debugging Using Provenance. In: Journal of Web Semantics. Special issue on "Ontology Dynamics", Elsevier, accepted for publication.
Provenance for Policy Languages
• C. Ringelstein, S. Staab. Provenance-aware Policy Definition and Execution. In: IEEE Internet Computing, special issue on Provenance in Web Applications, Jan/Feb 2011, pp. 49-58.
Capturing Provenance in Distributed Workflows
• C. Ringelstein, S. Staab. DiALog: A Distributed Model for Capturing Provenance and Auditing Information. International Journal of Web Services Research (JWSR), Idea Group Publishing, 7(2): 1-20, 2010.
Thank You! http://west.uni-koblenz.de
See you again at…