Item talk:Q3: Difference between revisions

7 bytes removed, 3 months ago

@@ Line 2: / Line 2: @@
 = Caching raw data as schema.org documents =
-The information we are using to build representations of people comes from a couple of different online sources.
+The information we are using to build representations of people comes from several sources.
 * USGS Staff Profile pages (via a web scraping routine)
 * ORCID records
+* OpenAlex records
 USGS Staff Profiles are our primary source for personnel information in this knowledge graph, but they offer no programmatic or structured data access path, so we must periodically pull from the pages with a web scraper. Working toward the ideal we would like to see in the future, we have started organizing all of the scraped content into notional [https://schema.org/Person schema.org/Person] documents. These are cached, encoded in YAML, on the associated "item talk" page for the person entity, and that cached state is then used to set labels, descriptions, aliases, and claims.
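
As a rough illustration of the caching step described above, the sketch below assembles a notional schema.org/Person document from scraped fields and serializes it to YAML text that could be stored on an item talk page. It is not the project's actual code: the function names <code>build_person_doc</code> and <code>to_yaml</code>, the choice of fields, and the sample values are all assumptions made for illustration. The serializer is deliberately minimal (flat keys, string values, and lists of strings only, all double-quoted); a real pipeline would more likely rely on a full YAML library.

```python
import json


def build_person_doc(name, profile_url, job_title=None, email=None, orcid=None):
    """Assemble a notional schema.org/Person document as a plain dict.

    The selection of properties here is an assumption; only values that
    were actually scraped are included.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": profile_url,
    }
    if job_title:
        doc["jobTitle"] = job_title
    if email:
        doc["email"] = email
    if orcid:
        # Link the person to their ORCID record via schema.org sameAs.
        doc["sameAs"] = [f"https://orcid.org/{orcid}"]
    return doc


def to_yaml(doc):
    """Serialize a flat dict of strings and string lists to YAML text.

    Keys and values are written as double-quoted scalars using JSON
    escaping, which keeps keys such as "@context" valid in YAML.
    """
    lines = []
    for key, value in doc.items():
        if isinstance(value, list):
            lines.append(f"{json.dumps(key)}:")
            for item in value:
                lines.append(f"  - {json.dumps(item)}")
        else:
            lines.append(f"{json.dumps(key)}: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values, not a real staff profile.
    person = build_person_doc(
        "Jane Doe",
        "https://example.org/staff/jane-doe",
        job_title="Hydrologist",
        orcid="0000-0002-1825-0097",
    )
    print(to_yaml(person))
```

Keeping the cached document in a single, predictable shape means the later step that sets labels, descriptions, aliases, and claims can read from the talk page without needing to know which source a value originally came from.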