
About: Difference between revisions

From geokb
6,017 bytes added ,  6 months ago
no edit summary
(Created page with "Welcome to the Geoscience Knowledgebase (GeoKB) = Project Intent = The Geoscience Knowledgebase (GeoKB) is an experimental R&D effort in the U.S. Geological Survey, with this particular instance on the Wikibase Cloud our current skunkworks. We're ultimately trying to develop a new way of organizing and encoding all applicable knowledge that we develop through our institution about the earth system in a way that connects to the much broader global knowledgebase. As a go...")
 
 
Line 27:
= Knowledgebase Development =


Building the GeoKB on the Wikibase/Wikimedia tech platform lets us use other features available here for development. Notable are the discussion pages on items and properties, where we can capture development work centered on those concepts. This is all very much a work in progress. While we have some formal ontologies and models to draw from, much of the content we are working to organize here comes from discrete, not particularly well-connected datasets or from less structured forms.


A central point of discussion, and a directory to the other parts of the framework where we are developing concepts and structure, is the [[SPARQL examples|SPARQL examples]] page. Visit that page for a description of the major entity types, example queries associated with those entities, and links to deeper-level details on the technical development.
= Graph Federation PLUS Graph Interaction =
After experimenting with this platform for about a year, we're starting to get some clarity on how to frame out more of an operational infrastructure for the Geoscience Knowledgebase idea. We ended up rebuilding quite a number of entities here that don't really need to be in this Wikibase instance, other than to let us link to them effectively from other entities. This includes reference material such as geographic places, "minerals," commodities, etc. Some of these were/are not in the best shape anyway in terms of their own alignment with linked open data concepts (e.g., things in source datasets that should explicitly link to something with semantic depth and definition do not). So, doing some work on those to get them into proper RDF+OWL alignment is not wasted effort. However, it would be better overall to do that work at the source or somewhere in between, produce better-formed OWL-structured data, and then federate with those graphs on persistent resolvable identifiers.
Where we really need the Wikibase functionality is to support human interaction with evolving graphs and the knowledgebase as a whole. One major area we've been exploring here is claim uncertainty reduction. We might generate a whole set of claims based on an AI-assisted process, but then we need a place where sometimes competing claims about the same thing can be evaluated by subject matter experts who can record something about their expert judgments. Some of this might be surfaced by watching how queries are formed (e.g., what claims are trusted in practical use). But we also need a place where users can interact with the knowledgebase as a whole and introduce new information. This is part of why we started down the Wikibase path: tooling such as OpenRefine has already been developed for it, and it has a mature API.
Two big architectural issues currently need to be addressed for both effective graph federation and user interactions to flourish.
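
To make the claim uncertainty idea a little more concrete, here is a minimal sketch of how recorded expert judgments on competing claims might be folded into the statement ranks Wikibase already has (preferred, normal, deprecated). The voting scheme, the function name, and the claim identifiers are all assumptions for illustration, not something we have implemented.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# One judgment is (claim_id, expert, vote), where vote is +1 (endorse) or -1 (reject).
Judgment = Tuple[str, str, int]


def ranks_from_judgments(claim_ids: List[str],
                         judgments: Iterable[Judgment]) -> Dict[str, str]:
    """Map net expert votes onto Wikibase-style ranks (hypothetical policy)."""
    score: Counter = Counter()
    for claim_id, _expert, vote in judgments:
        if vote not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        score[claim_id] += vote

    # Highest net score among the competing claims (0 if there are none).
    best = max((score[c] for c in claim_ids), default=0)

    ranks: Dict[str, str] = {}
    for claim_id in claim_ids:
        if score[claim_id] < 0:
            ranks[claim_id] = "deprecated"   # net rejected by experts
        elif best > 0 and score[claim_id] == best:
            ranks[claim_id] = "preferred"    # most endorsed claim(s)
        else:
            ranks[claim_id] = "normal"       # unjudged or tied at zero
    return ranks
```

A real policy would likely need to weigh expertise and keep the reasoning behind each judgment, which is the kind of thing qualifiers and references on a statement could hold.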
== Effective Foreign Ontology/Graph Linking ==
What I've experimented with so far to federate content from the GeoKB Wikibase with other graphs is the QLever SPARQL platform. I think we can get a ton of great functionality at the SPARQL interface level with this capability. I'm also exploring the full-text indexing functionality in QLever, which would be an enhancement on what we have been doing with loading larger unstructured text and/or full original content into "item talk" pages. Doing this completely within Wikibase means we would need to build some kind of abstraction that integrates MediaWiki API search with SPARQL.
In Wikibase, we really want as many properties as possible to be of the "item" type. What this essentially means is that they have to have something "real" on the other end to link to, as opposed to a text value with no explicit semantic definition. We need some type of architectural shift that allows a Wikibase instance to be aware of and federated with many other graphs, including formal ontologies, and then a new property type whose value is an item/entity in one of those other graphs. We want those items to act just like a local Wikibase item, especially in terms of the UI that I'll discuss in the next section.
If we could federate with, rather than recreate locally, the various external graphs that a given knowledgebase needs to work with, we would need those foreign items to masquerade as local to the given Wikibase. This includes things like type-ahead search support and perhaps even functionality that lets one of those foreign items have a linkable web page directly within a given Wikibase instance. This could get really sophisticated, with configurable settings that make those localized federated entities non-editable, or perhaps a way to allow users of a Wikibase instance to add to but not take away from a federated entity.
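
As a rough illustration of federation at the SPARQL level, the sketch below only builds a query string that uses the standard SPARQL 1.1 `SERVICE` clause to pull labels from a second graph. The endpoint URLs, the prefix layout, and the property ID passed in are placeholders I made up; nothing here is specific to QLever, and the query is not sent anywhere.

```python
from textwrap import dedent

# Hypothetical values: substitute the real instance prefix and remote endpoint.
LOCAL_PREFIX = "https://example.wikibase.cloud/"
REMOTE_ENDPOINT = "https://example.org/remote-graph/sparql"


def build_federated_query(link_property: str, limit: int = 10) -> str:
    """Return a SPARQL 1.1 query joining local items to labels in a remote graph.

    link_property is assumed to be a local property whose values are IRIs
    of entities that live in the remote graph.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    return dedent(f"""\
        PREFIX wdt: <{LOCAL_PREFIX}prop/direct/>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?item ?remote ?remoteLabel WHERE {{
          ?item wdt:{link_property} ?remote .
          SERVICE <{REMOTE_ENDPOINT}> {{
            ?remote rdfs:label ?remoteLabel .
          }}
        }}
        LIMIT {limit}
        """)


if __name__ == "__main__":
    print(build_federated_query("P99", limit=5))
```

The join only works when both graphs use the same persistent identifier for `?remote`, which is the reason for fixing identifiers at the source.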
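
One way to picture the proposed property type is as a third kind of claim value sitting between a local item and a bare string. The sketch below is purely illustrative: the registry of federated graphs, the `ClaimValue` class, and the classification rule are assumptions, not an existing Wikibase feature.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical registry of external graphs this Wikibase would be "aware of".
FEDERATED_GRAPHS = {
    "wikidata": "http://www.wikidata.org/entity/",
    "example-ontology": "https://example.org/ontology/",
}


@dataclass(frozen=True)
class ClaimValue:
    kind: str                    # "local-item", "foreign-item", or "plain-text"
    value: str
    graph: Optional[str] = None  # registry key, set only for foreign items


def classify_value(raw: str) -> ClaimValue:
    """Decide what sits on the other end of a claim."""
    # A local item is referenced by its Q-identifier.
    if re.fullmatch(r"Q[1-9]\d*", raw):
        return ClaimValue("local-item", raw)
    # A foreign item is an IRI inside one of the registered graphs.
    for name, prefix in FEDERATED_GRAPHS.items():
        if raw.startswith(prefix) and len(raw) > len(prefix):
            return ClaimValue("foreign-item", raw, name)
    # Anything else is text with no explicit semantic definition.
    return ClaimValue("plain-text", raw)
```

Under this framing, the goal is to move as many values as possible out of the "plain-text" bucket without having to mint a local item for each one.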
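
Here is a small sketch of what "masquerading as local" could mean in practice: one type-ahead search over local and federated entities, plus an add-but-not-remove rule for the federated ones. The `Entity` class, its fields, and the ordering rule are hypothetical and exist only to illustrate the behavior described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entity:
    iri: str
    label: str
    foreign: bool = False
    source_statements: Tuple[str, ...] = ()  # as published by the owning graph
    local_additions: List[str] = field(default_factory=list)


def typeahead(prefix: str, entities: List[Entity], limit: int = 5) -> List[Entity]:
    """Case-insensitive prefix match over local and federated labels alike."""
    needle = prefix.casefold()
    hits = [e for e in entities if e.label.casefold().startswith(needle)]
    # Local entities first, then alphabetical by label.
    hits.sort(key=lambda e: (e.foreign, e.label.casefold()))
    return hits[:limit]


def add_statement(entity: Entity, statement: str) -> None:
    """Anyone may add local statements, even to a federated entity."""
    entity.local_additions.append(statement)


def remove_statement(entity: Entity, statement: str) -> None:
    """Only locally added statements can be removed from a federated entity."""
    if statement in entity.local_additions:
        entity.local_additions.remove(statement)
    elif entity.foreign:
        raise PermissionError("statements from the source graph are read-only")
    else:
        raise ValueError("statement not found")
```

Whether local entities should sort ahead of federated ones is a design choice; a configurable setting per federated graph would probably be more useful than a fixed rule.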
== UI Enhancements and Plugin Support ==
The current Wikibase.cloud environment does not support the addition of the many plugins and components that have been developed in the open-source Wikimedia landscape. This is partly because the dated but still effective MediaWiki technology base comes with some serious risks of user malfeasance via JavaScript and other injection attacks. However, there are some things that we really should get in play.
At the top of the list is inline conformance checking, where deviation from property constraints is highlighted visually. As I understand it, there are basically two methods that Wikibase/Wikidata currently supports to highlight or force compliance with a specific schema: ShEx and property constraints (which require a Wikimedia plugin and configuration). The property constraints approach pre-dates the work with ShEx, and so Wikidata developed and incorporated the UI methods around that architecture. I'd like to take this a step further and build something at the data layer that indicates non-conformance with applicable schema definitions from ShEx. I'd like SPARQL results to come along with what would amount to a "buyer beware" statement: go ahead and use this information as you see fit, but be aware that there are these specific issues according to the schemas this knowledgebase conforms with.
Lower down the list are the visual helpers for globe coordinate and Commons image links. There's cool functionality built into the Wikidata UI that would be great to see for Wikibase.cloud instances. The little map preview for coordinate location properties is the most important feature we'd like to see, as we often record multiple competing location claims for a given entity. It would be great to be able to see this visually, perhaps even enhancing the existing plugin to show multiple points on a map for a given set of claims on the same property. The Commons image preview is important but secondary, with a larger background issue of being able to point to multiple image repositories beyond Wikimedia Commons.
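
The "buyer beware" idea can be sketched as a thin layer that attaches warnings to each result row instead of filtering rows out. In the sketch below, two hand-written checks stand in for real ShEx validation; the check names, row fields, and output shape are all assumptions.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


def _lat_ok(value: Optional[str]) -> bool:
    """A missing latitude is not a violation; an unparseable or out-of-range one is."""
    if value is None:
        return True
    try:
        return -90.0 <= float(value) <= 90.0
    except ValueError:
        return False


# Hypothetical checks standing in for constraints compiled from a ShEx schema.
CHECKS: Dict[str, Callable[[Row], bool]] = {
    "item has a label": lambda row: bool(row.get("label")),
    "latitude within [-90, 90]": lambda row: _lat_ok(row.get("lat")),
}


def annotate(rows: List[Row]) -> List[dict]:
    """Return every row unchanged, paired with the names of the checks it fails."""
    return [
        {"row": row,
         "warnings": [name for name, check in CHECKS.items() if not check(row)]}
        for row in rows
    ]
```

Nothing is dropped from the results; the consumer decides what to do with a row that carries warnings.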
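
For the multiple-points-on-a-map idea, one simple approach would be to gather all coordinate claims for an entity into a single GeoJSON FeatureCollection that any web map could render. This is only a sketch: the function name, the `(lat, lon, source)` tuple layout, and the property names in the output are assumptions, and nothing here talks to the existing Wikidata map plugin.

```python
import json
from typing import List, Tuple

# One claim is (latitude, longitude, source description).
CoordinateClaim = Tuple[float, float, str]


def claims_to_geojson(entity_id: str, claims: List[CoordinateClaim]) -> dict:
    """Turn competing coordinate claims into a GeoJSON FeatureCollection."""
    features = []
    for lat, lon, source in claims:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinate out of range: {lat}, {lon}")
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude, then latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"entity": entity_id, "source": source},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    example = claims_to_geojson("Q1", [(44.5, -110.25, "dataset A"),
                                       (44.6, -110.3, "dataset B")])
    print(json.dumps(example, indent=2))
```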


= Partnering =