Item talk:Q164044
Introduction
This class of government report represents an archived collection of reports from the U.S. Bureau of Mines (disbanded in 1996). Each item that is an instance of this class represents a digital scan of an original report, with basic bibliographic metadata and pointers to the file download locations in the ScienceBase repository.
Query
The following query pulls most of the relevant details needed to build basic bibliographic metadata for Bureau of Mines reports and fetch the report content via URL.
PREFIX wd: <https://geokb.wikibase.cloud/entity/>
PREFIX wdt: <https://geokb.wikibase.cloud/prop/direct/>
PREFIX p: <https://geokb.wikibase.cloud/prop/>
PREFIX ps: <https://geokb.wikibase.cloud/prop/statement/>
PREFIX pq: <https://geokb.wikibase.cloud/prop/qualifier/>
# Declared explicitly for the label service; harmless if the endpoint predefines them
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?report ?title ?publisher ?publisherLabel
       (YEAR(?publication_date) AS ?date)
       ?author_name
       ?meta_url ?content_url ?mime_type ?checksum
WHERE {
  # Instances of this class, with their required bibliographic properties
  ?report wdt:P1 wd:Q164044 ;
          wdt:P66 ?title ;
          wdt:P141 ?meta_url ;
          wdt:P136 ?content_url ;
          wdt:P7 ?publication_date ;
          p:P136 ?content_url_statement .
  OPTIONAL {
    ?report wdt:P198 ?publisher .
  }
  OPTIONAL {
    ?report wdt:P196 ?author_name .
  }
  # Qualifiers on the content URL statement
  ?content_url_statement ps:P136 ?content_url ;
                         pq:P65 ?mime_type ;
                         pq:P197 ?checksum .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
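As a rough illustration of how the query above might be run from a script, the following Python sketch sends a query string to a SPARQL endpoint and flattens the standard SPARQL JSON results into plain dictionaries. The endpoint URL is an assumption based on the usual Wikibase Cloud layout and has not been confirmed here; the function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Wikibase Cloud instances usually expose SPARQL at /query/sparql.
ENDPOINT = "https://geokb.wikibase.cloud/query/sparql"


def flatten_bindings(result):
    """Turn a SPARQL JSON results document into a list of plain dicts.

    Variables left unbound by an OPTIONAL block are absent from a binding,
    so they are filled with None to give every row the same keys.
    """
    variables = result["head"]["vars"]
    rows = []
    for binding in result["results"]["bindings"]:
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


def run_query(query, endpoint=ENDPOINT):
    """Send the query with HTTP GET and return flattened rows."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "bom-report-example/0.1",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return flatten_bindings(json.load(response))
```

Calling `run_query` with the query text above should return one dictionary per result row, keyed by the variable names in the SELECT clause.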
Notes
- Publisher values are either a link to the GeoKB entity for the organization listed as publisher or the explicit "unknown value" type in Wikibase. The latter covers cases where we have not resolved the publisher to a defined entity and are unsure of the completeness and quality of the metadata.
- The authors in this collection are all name-only values that we may not formalize in the GeoKB. We used a separate string-type property to hold this information.
- Qualifiers for the content URL include an MD5 checksum that comes from the original ScienceBase Item information. Initially, some duplicate checksum values will be found. These represent a problem in the original content that we are working to clean up at the source.
- The query does no grouping, so further processing of the results may need to group rows by report. Items may have multiple publishers, authors, and content URLs, and each combination produces its own row. Each unique GeoKB item corresponds to a unique ScienceBase Item, which should represent and contain an individual report.
- The meta URL values here, pointing to ScienceBase Items, are unique and should be reasonably persistent for the long term. They can be retrieved as HTML or as JSON, either via content negotiation or by adding "?format=json" to the URL, to get the full content.
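The notes above suggest some post-processing of the query results. The Python sketch below shows one possible approach, assuming each row is a dictionary keyed by the SELECT variable names. It groups rows into one record per report, checks downloaded bytes against the MD5 qualifier, and builds the JSON form of a meta URL. All function names and the record layout are hypothetical and not part of the GeoKB or ScienceBase.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def group_by_report(rows):
    """Collapse flat result rows into one record per report URI."""
    reports = {}
    for row in rows:
        record = reports.setdefault(
            row["report"],
            {
                "title": row["title"],
                "date": row["date"],
                "meta_url": row["meta_url"],
                "publishers": [],
                "authors": [],
                "files": [],
            },
        )
        # Prefer the label; fall back to the raw publisher value if present.
        publisher = row.get("publisherLabel") or row.get("publisher")
        if publisher and publisher not in record["publishers"]:
            record["publishers"].append(publisher)
        author = row.get("author_name")
        if author and author not in record["authors"]:
            record["authors"].append(author)
        file_info = {
            "url": row["content_url"],
            "mime_type": row["mime_type"],
            "md5": row["checksum"],
        }
        if file_info not in record["files"]:
            record["files"].append(file_info)
    return reports


def md5_matches(data, expected_hex):
    """Compare the MD5 digest of downloaded bytes with the stored checksum."""
    return hashlib.md5(data).hexdigest() == expected_hex.strip().lower()


def as_json_url(meta_url):
    """Add format=json to a meta URL, keeping any existing query parameters."""
    parts = urlsplit(meta_url)
    query = dict(parse_qsl(parts.query))
    query["format"] = "json"
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because duplicate checksum values are known to exist for now, a matching MD5 confirms that a download is intact but does not by itself identify a unique report.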