Main Page
Welcome to the Geoscience Knowledgebase (GeoKB)
Project Intent
The Geoscience Knowledgebase (GeoKB) is an experimental R&D effort in the U.S. Geological Survey, and this particular instance on Wikibase Cloud is our current skunkworks. We're ultimately trying to develop a new way of organizing and encoding the knowledge about the earth system that we develop through our institution, in a way that connects to the much broader global knowledgebase. As a government science institution, it's not really our role to muck about in the public commons knowledge-bank (e.g., Wikidata), but at the same time, it is our mandate to fully release (donate) what we know into the public domain for the public good.
What we're experimenting with here is how we can organize and project our data, information, and knowledge in a way that is more connected and more accessible for others to pick up and run with. By putting it all into the very granular structure that Wikibase is built on and leveraging pertinent aspects of the semantics in use in Wikidata and formal ontologies, our hope is that the curatorial pathway from "our knowledge" to "everyone's shared knowledge" is as frictionless as possible.
We are also working on how this knowledgebase resource can be baked into our ongoing geoscientific research as a living tool - building it by using it in practice. Rather than being an afterthought, only contributed to by someone with enough interest at the end of a project, we are seeking to build it as a scientific instrument used directly in research and analysis. Working with this capability to solve important problems in information and knowledge management that are impeding our research practices today should result in a much more efficient pathway to usable knowledge that others can take advantage of for their own purposes.
The Wikidata/Wikibase approach is fascinating in that it specifically promotes and supports having multiple competing claims/assertions about the same things at the same time, leaving it up to the inquirer to determine what the characteristics of each claim indicate about its trustworthiness or fitness for purpose. Too often, these judgments are made in the background without a sufficient record describing the reasoning. We have thousands of examples of individual scientific datasets developed this way, without information granular enough for the next scientist to make their own reasoned decisions. We're interested in how the knowledgebase approach helps us correct that dynamic in our work.
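To make the idea of competing claims concrete, here is a minimal Python sketch. It is not the Wikibase data model itself, only a simplified stand-in: the `Claim` class, the property name, the values, and the source labels are all hypothetical and chosen purely for illustration. It shows two claims about the same property coexisting, with the inquirer applying their own selection rule (here, keeping only claims that carry at least one reference).

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """Simplified, hypothetical stand-in for a statement on an item."""
    prop: str                                        # property being asserted
    value: object                                    # asserted value
    references: list = field(default_factory=list)   # where the value came from
    qualifiers: dict = field(default_factory=dict)   # context for the value


def claims_for(claims, prop):
    """Return every claim made about one property, competing or not."""
    return [c for c in claims if c.prop == prop]


def referenced_only(claims):
    """One possible inquirer rule: keep claims with at least one reference."""
    return [c for c in claims if c.references]


# Two competing (made-up) claims about the same property of one item.
claims = [
    Claim("annual production", 1200,
          references=["hypothetical source report A"],
          qualifiers={"point in time": "2019"}),
    Claim("annual production", 1500,
          qualifiers={"point in time": "2019"}),
]

competing = claims_for(claims, "annual production")
trusted = referenced_only(competing)
```

Both claims stay in the record. The choice between them is made at query time by whoever is asking, and a different inquirer could apply a different rule to the same data.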
Some of what we bring together in the GeoKB will originate from our work in USGS, while much of it will come from the many other public data and information sources we consult in our research. We'll be working hard to get things right in terms of references and qualifiers on claims and careful provenance tracing through item and property history and annotation. We'll share the code we use in developing bots to handle as much of this work as possible, starting with this project.
Wherever possible, we will pull directly from and build associations with Wikidata properties and classification items, though we are making judgment calls on where we agree/disagree with the specific semantics. While we may pull whole groups of items from Wikidata through bots, we are being selective in what claims we leverage from Wikidata, focusing on the parts that matter to our work and that we trust sufficiently to use. We'll dig a bit into what other groups are doing in this regard to follow useful conventions so we make our stuff as linkable as possible.
Knowledgebase Development
We are taking advantage of building the GeoKB within the Wikibase/Wikimedia tech platform to use other features available here for development. Notable are the discussion pages on items and properties where we can capture development work centered on those concepts. Listed here are pointers to specific knowledge development projects we have in the works:
- Mining Facilities - classification and integration of source data
Usage
We are continuing to work out usage patterns, from queries and graph-based analyses to pathways for contributing items and claims. The user interface provides some browse and search capability for exploring specific concepts. Most regular query users will want to invest a little time in learning at least the basics of SPARQL. We are working to build up common SPARQL usage patterns via examples, which may serve as a starting point.
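As a starting point for the kind of SPARQL usage mentioned above, the following Python sketch builds a simple label-search query and flattens a response in the standard SPARQL 1.1 JSON results format. It is an illustration under stated assumptions, not an official GeoKB pattern: the function names are hypothetical, the sample response and its `example.org` URIs are made up, and no endpoint address is assumed, so the sketch does not send anything over the network.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def build_label_query(text, lang="en", limit=10):
    """Build a SPARQL query for items whose label contains `text`.

    Only `text` is escaped here; `lang` is assumed to be a plain
    language code supplied by the caller.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX rdfs: <{RDFS}>\n"
        "SELECT ?item ?label WHERE {\n"
        "  ?item rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{lang}")\n'
        f'  FILTER(CONTAINS(LCASE(?label), LCASE("{escaped}")))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def rows_from_results(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


# Made-up response in the SPARQL 1.1 JSON results layout (hypothetical data).
sample_response = {
    "head": {"vars": ["item", "label"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "https://example.org/entity/Q1"},
                "label": {"type": "literal", "xml:lang": "en",
                          "value": "Example mining facility"},
            }
        ]
    },
}

query = build_label_query("mining", limit=5)
rows = rows_from_results(sample_response)
```

The query text can be pasted into a query service interface or sent to a SPARQL endpoint with any HTTP client. Keeping the query construction and the result parsing as separate functions makes each one easy to check without network access.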
Partnering
In USGS, we are committed to the goals and ideals of open science and are working to improve our practices in line with those principles. We don't as yet have a formal route to collaborate on this particular project, but we're working to introduce it and bring it into the community for collaboration via the Earth Science Information Partners (ESIP). Look for a presentation on this work at the ESIP Summer Meeting in Burlington, VT in July 2023.
Disclaimer
This is an experimental effort that will only deal with the organization and portrayal of already publicly available data, information, and knowledge. Everything we incorporate here that is directly attributable to the USGS is from officially released, peer-reviewed material. Our pathway from institutional knowledge development processes to releasable data, software, and knowledge products is guided by policies in our Fundamental Science Practices. How these policies apply to the very different form of an integrated knowledgebase is part of our experimentation and development work.