Entities
Entities ("Q" identifiers, or items) are the primary objects in the GeoKB and the focus of our knowledge organization scheme. Most properties ("P" identifiers) have an item type classification, meaning that the object they connect to a subject entity is another entity in the knowledgebase. Most of our entities are not "native" to this knowledgebase; they are sourced from some other data or information system, and the entity in the knowledgebase is a basic representation of the "foreign-sourced" item. When the source item is linkable with a resolvable URL, we often capture the relationship with "same as" claims, and we use references on claims to record the specific information drawn from one or more sources.
Entities are built iteratively over time, with software used to introduce them to the knowledgebase. The tools continue to evolve: this GitHub project contains many of the original workflows used to build entities, and a newer, cleaner set of codes lives in an internal GitLab instance. wbmaker, a light abstraction over the WikibaseIntegrator toolkit, was built to handle some of the more routine aspects of working with the GeoKB.
We may start an entity representation with only a few pieces of information. The bare minimum is a label, a description, and either an "instance of" or a "subclass of" claim, depending on the purpose of the entity in the knowledgebase. Most claims (or statements) made about an entity should have at least one reference that cites the source for the statement. References often point to an item representing the source, and many use "reference URL" when the source is simply a single linkable online resource. Over time, we may revisit the starter code that introduced an entity to change how it operates and bring in further claims. We may also change how an entity specifies its source by changing or adding to references.
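The minimum described above can be expressed as a small pre-flight check. The following is only a sketch: the EntityDraft class and the missing_minimum function are hypothetical names invented for illustration and are not part of wbmaker or WikibaseIntegrator. The property IDs P1 ("instance of") and P2 ("subclass of") are taken from the queries on this page.

```python
from dataclasses import dataclass, field

# Property IDs as used in the queries on this page.
INSTANCE_OF = "P1"
SUBCLASS_OF = "P2"


@dataclass
class EntityDraft:
    """Hypothetical in-memory draft of an item before it is written to the GeoKB."""
    label: str
    description: str
    claims: dict = field(default_factory=dict)  # property ID -> list of values


def missing_minimum(draft: EntityDraft) -> list:
    """Return the names of the minimum pieces that are still missing."""
    problems = []
    if not draft.label.strip():
        problems.append("label")
    if not draft.description.strip():
        problems.append("description")
    # Either an "instance of" or a "subclass of" claim satisfies the minimum.
    if not (draft.claims.get(INSTANCE_OF) or draft.claims.get(SUBCLASS_OF)):
        problems.append("instance of / subclass of claim")
    return problems
```

An empty list means the draft meets the bare minimum; references on the claims would still need to be checked separately.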
This page serves as iterative documentation of the entities in the knowledgebase with a focus toward how they can be retrieved and interacted with through SPARQL queries. It is organized into sections describing a few things about the major entity types. Each section may contain a link to an associated "item talk" page in this Wikibase instance that discusses the evolution of the entity. Each section will also contain a link to the associated ShExC schema for the entity type as those are developed iteratively. Schemas facilitate interactions with the knowledgebase such as ongoing multimodal introduction of new entity instances and claims.
Minerals Related Entities
USGS Mineral Resource Assessment work provided the seminal use cases for the GeoKB. We are using the knowledge graph model to transform and modernize a number of older databases, online tools, and information systems into an interconnected framework that can better scale into the future. The following section lays out the entity types in the GeoKB related to mineral resources and provides some example queries.
Mineral Resource Assessments
We are experimenting with how to define the concept of a mineral resource assessment in the knowledgebase and how to use it in practice. As a start, we are using the concept in the sense of the assessment as a tool or end product of a scientific practice. Accordingly, we assigned it as an "instance of" classifier for some publications that are considered mineral resource assessment products. The following query pulls their basic details:
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel
(YEAR(?pub_date) AS ?year)
(CONCAT("https://doi.org/", ?doi) AS ?doi_link)
(CONCAT("https://pubs.er.usgs.gov/publication/", ?indexId) AS ?usgs_link)
WHERE {
?item wdt:P1 wd:Q152682 .
OPTIONAL {
?item wdt:P7 ?pub_date .
}
OPTIONAL {
?item wdt:P74 ?doi .
}
OPTIONAL {
?item wdt:P114 ?indexId .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
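Queries like the one above can also be run from a script. The following is a minimal sketch using only the Python standard library. The endpoint URL is an assumption based on the usual wikibase.cloud layout and should be confirmed before use; the function names and the User-Agent string are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint (usual wikibase.cloud pattern); confirm before relying on it.
SPARQL_ENDPOINT = "https://geokb.wikibase.cloud/query/sparql"


def build_request(query: str, endpoint: str = SPARQL_ENDPOINT) -> urllib.request.Request:
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={
        "Accept": "application/sparql-results+json",
        "User-Agent": "geokb-docs-example/0.1",  # illustrative value
    })


def flatten_bindings(results: dict) -> list:
    """Turn SPARQL JSON results into a list of {variable: value} dicts.

    Variables left unbound by OPTIONAL blocks are simply absent from a row.
    """
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in results["results"]["bindings"]
    ]


def run_query(query: str) -> list:
    """Send the query and return flattened rows (requires network access)."""
    with urllib.request.urlopen(build_request(query), timeout=60) as response:
        return flatten_bindings(json.load(response))
```

Because the optional claims in the query may be missing, code that reads the rows should use row.get("doi_link") rather than row["doi_link"].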
Mineral Commodity
Items were sourced and identified as commodities from a combination of sources. The following query pulls items identified as mineral commodities with their associated subclass relationships where applicable.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?itemDescription
?subclassOf ?subclassOfLabel
WHERE {
?item wdt:P1 wd:Q406 .
OPTIONAL {
?item wdt:P2 ?subclassOf .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Mines
This query returns the first 100 items representing mines with their identifiers, names (labels), and point coordinates. The coordinates are WKT points that can be pulled into a mapping application.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?mine ?mineLabel ?coordinate_location
WHERE {
?mine wdt:P1 wd:Q3646 .
?mine wdt:P6 ?coordinate_location .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
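To use the coordinates outside a mapping application, the WKT point literal has to be split into numbers. The sketch below assumes the value arrives as a string such as "Point(-87.1 33.1)" with longitude first, which is the order used in the geospatial query on this page. The function name is illustrative.

```python
import re

# One signed decimal number, optionally in exponent notation.
_NUMBER = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_WKT_POINT = re.compile(
    rf"^\s*POINT\s*\(\s*({_NUMBER})\s+({_NUMBER})\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(wkt: str) -> tuple:
    """Return (longitude, latitude) from a WKT point literal."""
    match = _WKT_POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))
```

For anything beyond simple points, a full WKT parser from a geospatial library would be a better choice than a regular expression.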
Geospatial Search for Mines
The following query finds mines within a 10 km radius of the specified point and orders them by distance. It uses the wikibase:around geospatial search service that is built into the GeoKB query service and described in the MediaWiki documentation.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?location ?distance
WHERE {
?item wdt:P1 wd:Q3646 .
SERVICE wikibase:around {
?item wdt:P6 ?location .
bd:serviceParam wikibase:center "POINT(-87.107680869 33.10434839)"^^geo:wktLiteral .
bd:serviceParam wikibase:radius "10" .
bd:serviceParam wikibase:distance ?distance.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ASC(?distance)
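When the search point and radius vary, it can be convenient to generate the query text from parameters. The sketch below reproduces the query above with the center and radius filled in. The function name is hypothetical, and the radius is assumed to be in kilometers, matching the 10 km description above.

```python
def mines_near_query(lon: float, lat: float, radius_km: float) -> str:
    """Return the geospatial mine search query for a given center and radius."""
    if not -180 <= lon <= 180 or not -90 <= lat <= 90:
        raise ValueError("expected longitude in [-180, 180] and latitude in [-90, 90]")
    if radius_km <= 0:
        raise ValueError("radius must be positive")
    # Doubled braces are literal braces inside an f-string.
    return f"""PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?location ?distance
WHERE {{
  ?item wdt:P1 wd:Q3646 .
  SERVICE wikibase:around {{
    ?item wdt:P6 ?location .
    bd:serviceParam wikibase:center "POINT({lon} {lat})"^^geo:wktLiteral .
    bd:serviceParam wikibase:radius "{radius_km}" .
    bd:serviceParam wikibase:distance ?distance .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
ORDER BY ASC(?distance)"""
```

Validating the coordinates before building the query catches the common mistake of passing latitude and longitude in the wrong order, at least when the swapped latitude falls outside its valid range.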
Mining Facility
As we continue to evolve the model for how we organize information and knowledge about mining in the GeoKB, we are homing in on the need to have specific mining facilities classified and organized as top-level entities that can then be associated with a specific "mining project" as a higher level concept. We established mining facility as a subclass of a more general facility and then added a set of more specific subclasses from the USMIN topographic mine symbol digitization project.
There is a class of mining facility for mine, which is what we used to classify all of the mine feature classes we pulled from GNIS. We will either continue to use that as a specific type of mining facility and use "mining project," "mining prospect," or something else as the higher level container or else elevate and reclassify "mine" to mean something else.
The following query returns the mining facility classification.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel
WHERE {
?item wdt:P2* wd:Q44143 . # subclass of (transitive) mining facility
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Rock Classification
We've started the GeoKB understanding of rock classification via the Mindat system. We will likely augment this over time with other interpretations and classification systems, but since we are pulling minerals from Mindat and want to link to rock types included in those records, starting with Mindat made some sense. The following queries show a bit of how to work with the classification itself in addition to what we link to specific classes.
Igneous Rocks
The following query starts with igneous rock and pulls the full classification from that point (* on the end of the predicate). Because we pull identifiers in this, you can use something like the graph view in Wikibase to visualize and explore the items through their connections.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?rock ?rockLabel ?subclass_of
WHERE {
?rock wdt:P2* wd:Q41459 .
?rock wdt:P2 ?subclass_of .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
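Because each row pairs a rock class with its direct parent, the results can be assembled into a tree outside of the Wikibase graph view. The following sketch assumes rows shaped like the query output (dicts with "rock" and "subclass_of" keys); the function names are illustrative.

```python
from collections import defaultdict


def children_by_parent(rows: list) -> dict:
    """Map each parent class to the sorted list of its direct subclasses."""
    tree = defaultdict(set)
    for row in rows:
        tree[row["subclass_of"]].add(row["rock"])
    return {parent: sorted(kids) for parent, kids in tree.items()}


def outline(tree: dict, root: str, depth: int = 0, seen: set = None) -> list:
    """Return an indented outline of the classification starting at root.

    The seen set guards against cycles and repeats when a class has more
    than one parent.
    """
    seen = set() if seen is None else seen
    if root in seen:
        return []
    seen.add(root)
    lines = ["  " * depth + root]
    for child in tree.get(root, []):
        lines.extend(outline(tree, child, depth + 1, seen))
    return lines
```

Note that the starting class itself appears in the results (the * path includes zero steps), so its own parent shows up as a key in the mapping even though it sits above the root of interest.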
Items that are an instance of multiple things at the same time
One of the things we are experimenting with, geared toward making the knowledge content in the GeoKB more transferable to non-expert domains like the global knowledge commons, runs somewhat counter to how Mindat has handled the situation. Many of the things that we would give the same basic label are actually different things in different circumstances or contexts. We're experimenting with focusing on the label and then asserting that the entity can be an instance of multiple things. This probably violates some semantic modeling rules, so it may or may not stand the test of time, but it's a thought experiment in motion. The following query focuses on commodities, pulling those that have more than one "instance of" claim.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel (COUNT(?instance_of) AS ?num_instance_of)
WHERE {
?item wdt:P1 wd:Q406 ;
wdt:P1 ?instance_of .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?item ?itemLabel
HAVING (COUNT(?instance_of) > 1)
Mindat Identifiers
We are using Mindat as a key reference for a number of things (rocks, minerals, etc.). The Mindat identifier provides the linkage back to Mindat for gathering additional information on the subject items. Mindat IDs are incorporated as a qualifier on the "instance of" statements made for items sourced from the Mindat API. The following query is one example of how to return rock items with their Mindat IDs.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
PREFIX p: <https://geokb.wikibase.cloud/prop/>
PREFIX pq: <https://geokb.wikibase.cloud/prop/qualifier/>
SELECT ?item ?itemLabel ?mindat_id
WHERE {
?item wdt:P1 wd:Q41261 . # "instance of" "rock"
OPTIONAL {
?item p:P1 ?statement . # Get the instance of statement to operate on
OPTIONAL { ?statement pq:P99 ?mindat_id . } # Pull the Mindat ID qualifier as a value
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
We've made a design decision in the GeoKB for things that share a name in common usage: we declare only one uniquely identified item, which is then classified and characterized to indicate the different ways the concept can be used. Examples are items that can be a mineral, a mineral commodity, and a chemical element. This essentially deals with the issue of the same word or phrase meaning different things in different contexts. The alternative would be to declare separate entities, each with its own specific classification and other characteristics, and then use relationships between the different items or disambiguation features in the knowledgebase to distinguish between them. We will have to determine which approach makes the most sense in practical use over time.
The following query searches for items that are classified as chemical element, mineral, and mineral commodity in the same logical labeled entity.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?commodity ?commodityLabel ?instance_ofLabel
WHERE {
?commodity wdt:P1 wd:Q406;
wdt:P1 wd:Q280;
wdt:P1 wd:Q24 .
?commodity wdt:P1 ?instance_of .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Organizations
Entities representing organizations are important in the GeoKB in a couple of areas. We have information on entities such as publications and projects connected to USGS "sub-organizations" such as Science Centers and Labs. We also have information associated with external organizations such as mining companies used to retrieve and organize prospecting history for mineral resource assessments.
People
Items representing people associated with the USGS are another type of entity built out in the GeoKB. We use these as reference points and connections to the overall scientific record captured in this knowledge graph. Person records come from public sources such as our USGS Staff Profiles and are further discussed on the person classification talk page.
People by employer
This query pulls people linked to a specific employer (wd:Q44210 in this example) along with their email address (already publicly visible), reference URL (pointer to the USGS staff profile), and ORCID where available.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?email ?profile_url ?orcid
WHERE {
?item wdt:P1 wd:Q3 .
?item wdt:P107 wd:Q44210 .
OPTIONAL {
?item wdt:P109 ?email .
}
OPTIONAL {
?item wdt:P31 ?profile_url .
}
OPTIONAL {
?item wdt:P106 ?orcid .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 10000
Occupations and Roles
We organized a number of concepts for the major occupations/professions of USGS staff along with a set of specialized leadership roles that help to understand the organization in capacity assessment use cases.
- Discussion on standardized occupations
- Discussion on USGS leadership roles
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?item_alt_label
WHERE {
?item wdt:P2* ?classes .
VALUES ?classes {wd:Q159568 wd:Q159617}
OPTIONAL {
?item skos:altLabel ?item_alt_label .
FILTER (lang(?item_alt_label)='en')
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Places
The GeoKB includes references for many named places that are necessary links from many other items. The following are some examples for this part of the knowledgebase.
U.S. States and Territories
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?fips_alpha
WHERE {
?item wdt:P13 ?fips_alpha . # Both states and territories in the U.S. have two-character FIPS codes
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
U.S. Counties or Equivalent Subdivision
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?fips_code ?coordinates
WHERE {
?item wdt:P1 wd:Q481 . # "instance of" "county or equivalent"
?item wdt:P34 wd:Q256 . # in the state of Colorado
?item wdt:P22 ?fips_code .
?item wdt:P6 ?coordinates .
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
Publications/Documents
Another fundamental entity type we are bringing together in the GeoKB is the document. This includes a full representation of all records in the USGS Publications Warehouse catalog because so many other types of content (claims and other entities) link to these as source material.
For right now, we put "document" right at the top of the classification as a foundational type of "entity." You can query for the classification structure for documents with the following transitive query from the document root.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel
WHERE {
?item wdt:P2* wd:Q5 . # subclass of, transitive "document"
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
We can build on the query for document classes to get document instances and some of the statements we might need to use.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?instance_ofLabel (YEAR(?publication_date) AS ?year) ?doi
WHERE {
?classes wdt:P2* wd:Q5 . # subclass of, transitive "document"
?item wdt:P1 ?classes ; # get entities that are instances of any of the classes
wdt:P1 ?instance_of ; # get the item classification to display
wdt:P7 ?publication_date ; # only get items that have a publication date
wdt:P74 ?doi . # only get items that have a doi
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
USGS Numbered Series
The "USGS Numbered Reports Series" are an important reference source in the GeoKB for many other things we are working on. We keep a representation of USGS reports in sync from the USGS Publications Warehouse source. The different USGS numbered series are part of a classification of documents. The following query uses that classification to pull all USGS reports and specific claims.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?item ?itemLabel ?itemAltLabel (YEAR(?pub_date) AS ?pub_year) ?doi
?pw_index_id ?country ?countryLabel ?us_state ?us_stateLabel
?county ?countyLabel
WHERE {
?item wdt:P1/wdt:P2* wd:Q11 .
OPTIONAL {
?item wdt:P7 ?pub_date . # publication date; the year is extracted in the SELECT clause
}
OPTIONAL {
?item wdt:P74 ?doi .
}
OPTIONAL {
?item wdt:P114 ?pw_index_id .
}
OPTIONAL {
?item wdt:P33 ?country .
}
OPTIONAL {
?item wdt:P34 ?us_state .
}
OPTIONAL {
?item wdt:P35 ?county .
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
NI 43-101 Technical Reports
These are a special type of document that is part of an early use case for the GeoKB. We manage the metadata and stored document content for these reports in a Zotero collection as part of the GeoArchive. From that data management foundation, we create a representation of basic identification information in the GeoKB for knowledge graph purposes. The following query is designed to assist operators who want to use the GeoKB as a route to retrieve metadata for NI 43-101 reports for external processing. It provides the crucial information needed to identify individual reports and their PDF file attachments.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
PREFIX p: <https://geokb.wikibase.cloud/prop/>
PREFIX pq: <https://geokb.wikibase.cloud/prop/qualifier/>
SELECT ?report ?reportLabel ?meta_url ?content_url ?attachment_key ?file_size
WHERE {
?report wdt:P1 wd:Q10 ; # instance of NI 43-101 Technical Report
wdt:P141 ?meta_url ; # permanent URL to the online representation (responds to application/json content negotiation)
wdt:P136 ?content_url ; # read URL to attachment content (only accessible if authenticated to Zotero web UI)
wdt:P143 ?attachment_key ; # attachment key that can be used to download PDF file content
p:P143 ?attachment_key_statement . # get attachment key statement so we can get the file size qualifier
?attachment_key_statement pq:P144 ?file_size . # attachment file size
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 100
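The comment on ?meta_url notes that the URL responds to application/json content negotiation. A minimal sketch of such a request with the Python standard library follows. The function names are illustrative, and the structure of the returned JSON is not described on this page, so the sketch returns it unparsed beyond JSON decoding.

```python
import json
import urllib.request


def json_metadata_request(meta_url: str) -> urllib.request.Request:
    """Build a request that asks the server for a JSON representation."""
    return urllib.request.Request(meta_url, headers={"Accept": "application/json"})


def fetch_metadata(meta_url: str):
    """Fetch and decode the JSON metadata (requires network access)."""
    with urllib.request.urlopen(json_metadata_request(meta_url), timeout=60) as response:
        return json.load(response)
```

Downloading the PDF content itself depends on the attachment key and on Zotero access, which this sketch does not cover.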
Research Methods
Linkable research methods are incorporated into the GeoKB to support cases where we need these as filters or other factors in analyses.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
SELECT ?method ?methodLabel ?method_alt_label ?subclass_of ?subclass_ofLabel
WHERE {
?method wdt:P2* wd:Q152412 ; # Gets all subclasses of research method
wdt:P2 ?subclass_of . # gets the subclass identifier itself so we can graph this
OPTIONAL {
?method skos:altLabel ?method_alt_label . # gets any aliases individually so we can run name matching
FILTER (lang(?method_alt_label)='en')
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
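Since the query returns one row per alias, the results lend themselves to a simple lookup table for name matching. The sketch below assumes rows shaped like the query output, with "method", "methodLabel", and an optional "method_alt_label" key. The function names and the case-insensitive exact-match strategy are illustrative choices, not how matching is necessarily done for the GeoKB.

```python
def build_name_index(rows: list) -> dict:
    """Map each normalized label or alias to the set of matching method items."""
    index = {}
    for row in rows:
        for key in ("methodLabel", "method_alt_label"):
            name = row.get(key)
            if name:
                index.setdefault(name.strip().casefold(), set()).add(row["method"])
    return index


def match_method(text: str, index: dict) -> list:
    """Return the method items whose label or alias equals the text, ignoring case."""
    return sorted(index.get(text.strip().casefold(), set()))
```

An alias shared by several methods returns several items, which leaves the disambiguation to the caller.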