These are the overarching geoscience categories that drove the generation of the full keyword cloud:
Geologic Units; Geologic Temporal; Geologic Terms; Geohazard Terms; CGS Research/Expertise Areas; GIS/map-related; Publication Type Codes; Medium/format; Locations: Geologic Regional, Mtns/Range, Rivers, Towns, Counties; Decades
This is the final, specific keyword cloud for the Colorado Geological Survey, which was applied across the entire data, metadata, and information space wherever possible:
Adams, aggregates, Alamosa*, alluvial fan, alluvium, Animas River, anticline, aquifer, Arapahoe, Archuleta, Arkansas River, arkose, Aurora, avalanche (snow, debris), B, Baca, basalt, basin, batholith, Bent, bibliography, BLM, book, Book Cliffs, Boulder*, Broomfield*, Browns Park, Buena Vista, Bull Lake, CAIC, Cambrian, Canon City Embayment, Carboniferous, Castle Rock*, CBM (coalbed methane), CD, Cenomanian, Cenozoic, Chaffee, Cheyenne, Cheyenne Basin, claystone, Clear Creek, coal, collapsible, Colorado Piedmont, Colorado Plateau, Colorado River, Colorado Spgs, Conejos, conglomerate, correlation, Costilla, county, cover, Cretaceous, Crowley, CSM, Custer, Dakota, data, Dawson, debris fan, debris flow, Delta, Denver Basin, Denver*, derivative, Devonian, diamonds, digital, dikes, Dillon, district, Dolores, Dolores River, Douglas, download, Dry Union, Durango, Eagle, Eagle Basin, Eagle River, earthquake, EG, El Paso, Elbert, Elk Mtns, Energy, engineering, Entrada, environmental, Eocene, eolian, epicenter, erosion, evaporite, expansive, Fairplay, fault, fieldtrip, fold, Fort Union, fossil, Fountain, Fox Hills, free, Fremont, Front Range, Ft Collins, Garfield, geochemistry, Geology, geophysics, geothermal, Gilpin, GIS/gis, glacial, Glenwood Spgs, gneiss, gold, Gore Range, Grand, Grand Junction, granite, granodiorite, gravel, Greeley, groundwater, GSA, Gunnison, Gunnison River, HA, Hazards, HAZUS, Hinsdale, Historic, historic, HM, Holocene, Huerfano, Hugoton Embayment, hydrogeology, hydrology, hydrothermal, igneous, Iles, IS, Jackson, Jefferson, Jurassic, karst, Kiowa, Kit Carson, kml, kmz, La Plata, laccolith, Lake, land use, Land Use Review, landslide, Laramide, Laramie, Larimer, Las Animas, lava, Leadville*, limestone, Lincoln, loess, Logan, Longmont, Loveland, Mancos, map, Maroon, Mesa, Mesaverde, Mesoproterozoic, Mesozoic, metals, metals, metamorphic, meteorite, MI, migmatite, Mineral, Minerals, mines, mining, Minturn, Miocene, Mississippian, Modern, Moffat, 
molybdenum, monocline, Montezuma, Montrose, monzonite, moraine, Morgan, Morrison, Mosquito Range, MS, mudflow, mudstone, Neogene, Neoproterozoic, Never Summer Mtns, Niobrara, North Park, North Platte River, o&g (oil and gas), oblique, OF, oil shale, ON, Oligocene, Ordovician, orogeny, Otero, Ouray, Paleocene, paleocurrent, Paleogene, paleontology, Paleoproterozoic, Paleozoic, Palmer Divide, Paradox Basin, Park, Park Range, pdf, pegmatite, Pennsylvanian, Permian, Phillips, Piceance Basin, Pierre, Pinedale, Pitkin, Pleistocene, Pliocene, Pliocene-Quaternary, pluton, PO, Post-Laramide, postcard, poster, Precambrian, print, proceedings, Proterozoic, Prowers, Pueblo, Pueblo, quartzite, Quaternary, range/mtns, Raton Basin, region, resources, rhyolite, rift, Rio Blanco, Rio Grande, Rio Grande River, rivers, Roaring Fork River, rockfall, Routt, RS, RT, Saguache, San Juan, San Juan Basin, San Juan Mtns, San Juan River, San Luis Valley, San Miguel, Sand Wash Basin, sandstone, Sangre de Cristo Mtns, Sawatch Range, scarp, schist, Sedgwick, sedimentary, sedimentology, shale, shp, Silurian, silver, sinkhole, slump, slump, software, South Park, South Platte River, SP, Statemap, Steamboat Spgs, subsidence, Summit, susceptibility, susceptibility, swelling soil, syncline, talus, Teller, Tenmile Range, Tertiary, till, travertine, Triassic, tufa, tuff, Uncompahgre River, undermined, uplift, uranium, USFS, USGS, vanadium, video, volcanic, Wall Mountain, Walsenberg, Wasatch, Washington, Water, water quality, Weld, well, well-log, West Elk Mtns, Western Slope, Wet Mtns, White River, Williams Fork, x-section, xls, xml, Yampa River, Yuma, zip, 1850s, 1860s, 1870s, 1880s, 1890s, 1900s, 1910s, 1920s, 1930s, 1940s, 1950s, 1960s, 1970s, 1980s, 1980s, 1990s, 2000s, 2010s, 2020s, 2030s
* denotes keywords that refer to multiple items with the same name (cities, counties, etc.); and the bolded terms are the five overarching categories of expertise for the organization. These five were used as an ‘upper-level’ basis for organizing the web presence/navigation system.
I received no meaningful support from CGS management for the process I undertook in implementing a keywording system. Their info/dataspace was in disarray, and between the public-facing and internal spaces there was no consistency in even the most basic data-organization principles (like file naming). Given that the organization was, ostensibly, about science, this seemed to be a major lapse, if only from the standpoint of consistency and repeatability. And, given that my own responsibilities (publications, editing, social media, and web work) were directly related to the space, it was essential to do something!
Generating an effective keyword system requires careful balance and deliberate constraints if it is to remain useful. While it’s tempting to create extensive tag clouds, an over-saturated system quickly becomes unwieldy and defeats its purpose of improving information findability. For most systems, an upper limit of ~500 keywords is optimal.
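To illustrate the kind of constraint described above, here is a minimal sketch of a vocabulary audit. It assumes the keyword cloud is kept as a simple comma-separated string. The function name `audit_vocabulary`, the `limit` parameter, and the sample input are hypothetical and are not part of any CGS tooling.

```python
MAX_KEYWORDS = 500  # assumption: the ~500 ceiling suggested in the text


def audit_vocabulary(raw: str, limit: int = MAX_KEYWORDS) -> dict:
    """Split a comma-separated keyword list; report duplicates and size.

    Note: this naive split would break a term that itself contains a
    comma, so such terms would need special handling.
    """
    seen = set()
    unique = []
    duplicates = []
    for term in raw.split(","):
        term = term.strip()
        if not term:
            continue  # ignore empty entries left by trailing commas
        if term in seen:
            duplicates.append(term)  # exact (case-sensitive) repeat
        else:
            seen.add(term)
            unique.append(term)
    return {
        "unique": unique,
        "duplicates": duplicates,
        "count": len(unique),
        "within_limit": len(unique) <= limit,
    }


# Hypothetical sample input in the style of the list above.
report = audit_vocabulary("alluvium, aquifer, metals, metals, slump, slump, 1980s, 1980s, ")
print(report["duplicates"])    # ['metals', 'slump', '1980s']
print(report["count"])         # 5
print(report["within_limit"])  # True
```

The comparison is deliberately case-sensitive, so that a capitalized category term and its lowercase counterpart are treated as separate keywords. Whether that is desirable is a governance decision rather than a technical one.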
A strong keyword strategy needs to reflect both organizational needs and user behavior patterns, incorporating essential discipline-specific terminology and terms that apply to dispersed-but-related information. However, the keyword universe must be intentionally limited—being selective and strategic about which terms make the cut rather than allowing unlimited growth.
User perspectives are crucial in determining these constraints. I spent time observing my colleagues and asking them about their search methodologies and how they managed their own data. The resulting information generally matched what I had observed among my media students: a lack of understanding of what “search” means in a digital environment. Keywords should reflect how people actually search for content, although most people have rather idiosyncratic ways of dealing with information, often based on their first encounters with the digital tools involved. Generational differences in this regard are substantial and change all the time. The introduction of AI into the process is radically changing search, along with the fundamentals of how information is presented on the Internet itself.
A focus on standardization is crucial, but this restriction will run up against that idiosyncrasy instantly, often with problematic results. Recognized and respected processes for responsible governance of a keyword system are necessary. That said, because of the resources needed to do the actual content tagging, a more-or-less static set of agreed-upon terms is best. If, for example, a new term is added, the entire dataspace should be reviewed for relevant placement of that new term. Otherwise some datasets will not show up in search returns, rendering them effectively useless, reducing the overall value of the data assets, and negating the entire concept of standards.
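The review step described above (checking the whole dataspace whenever a term is added) can be partly automated. This is a rough sketch only, under stated assumptions: each record is a dictionary with hypothetical `id`, `text`, and `keywords` fields, and the function name `records_needing_review` is invented for illustration. It only produces candidates; a person still decides whether the new term actually belongs on each record.

```python
import re


def records_needing_review(term: str, records: list) -> list:
    """Return ids of records whose text mentions `term` but lack the tag."""
    # Whole-word, case-insensitive match. This suits plain terms such as
    # "sinkhole"; terms that begin or end with punctuation would need a
    # different pattern.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [
        rec["id"]
        for rec in records
        if term not in rec["keywords"] and pattern.search(rec["text"])
    ]


# Hypothetical records; the field names and contents are assumptions.
records = [
    {"id": "rec-1", "text": "Rockfall hazard study near Glenwood Spgs",
     "keywords": {"rockfall", "Hazards"}},
    {"id": "rec-2", "text": "Sinkhole and karst mapping in the Eagle Basin",
     "keywords": {"karst", "Eagle Basin"}},
    {"id": "rec-3", "text": "Sinkhole inventory",
     "keywords": {"sinkhole"}},
]

print(records_needing_review("sinkhole", records))  # ['rec-2']
```

A text match like this will miss records that describe the concept without using the word, which is one reason a largely static vocabulary is cheaper to maintain than one that keeps growing.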
According to sources, the CGS has apparently abandoned the keyword application process and even basic file-naming and filing conventions. So counterproductive! Good luck finding stuff!