These are the overarching geoscience categories that drove the generation of the full keyword cloud:
Geologic Units; Geologic Temporal; Geologic Terms; Geohazard Terms; CGS Research/Expertise Areas; GIS/map-related; Publication Type Codes; Medium/format; Locations: Geologic Regional, Mtns/Range, Rivers, Towns, Counties; Decades
This is the final specific keyword cloud for the Colorado Geological Survey that was applied across the entire (meta)/data/information space when/wherever possible:
* denotes keywords that refer to multiple items with the same name (cities, counties, etc.); and the bolded terms are the five overarching categories of expertise for the organization. These five were used as an ‘upper-level’ basis for organizing the web presence/navigation system.
I received no meaningful support from CGS management for the keywording system I set out to implement. Their info/dataspace was in disarray, and between the public-facing and internal spaces there was no consistency in even the most basic of data organizational principles (like file naming). Given that the organization was, ostensibly, about science, this seemed to be a major lapse, if only from the standpoint of consistency and repeatability. And, given that my own responsibilities (publications, editing, social media and web work) related directly to the space, it was essential to do something!
Generating an effective keyword system requires careful balance and deliberate constraints to remain useful. While it’s tempting to create extensive tag clouds, an over-saturated system quickly becomes unwieldy and defeats its purpose of improving information findability. For most systems, an upper limit of ~500 keywords is optimal.
A strong keyword strategy needs to reflect both organizational needs and user behavior patterns, incorporating essential discipline-specific terminology and terms that apply to dispersed-but-related information. However, the keyword universe must be intentionally limited—being selective and strategic about which terms make the cut rather than allowing unlimited growth.
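To make the idea of an intentionally limited keyword universe concrete, here is a minimal sketch of a capped vocabulary that accepts or rejects proposed tags. It is not the system I used at CGS; the class name `ControlledVocabulary`, its methods, and the sample terms are all hypothetical, and the default cap simply mirrors the ~500 figure above.

```python
MAX_KEYWORDS = 500  # assumption: mirrors the ~500 upper limit suggested above


class ControlledVocabulary:
    """A capped, case-insensitive set of approved keywords (hypothetical helper)."""

    def __init__(self, terms=(), limit=MAX_KEYWORDS):
        self.limit = limit
        self._terms = {}  # normalized key -> canonical spelling
        for term in terms:
            self.add(term)

    @staticmethod
    def _key(term):
        # Collapse whitespace and ignore case so "  LANDSLIDE " matches "Landslide".
        return " ".join(term.split()).casefold()

    def add(self, term):
        """Add a term; return False if it already exists, raise if the cap is hit."""
        key = self._key(term)
        if not key:
            raise ValueError("empty keyword")
        if key in self._terms:
            return False
        if len(self._terms) >= self.limit:
            raise ValueError(f"vocabulary is capped at {self.limit} keywords")
        self._terms[key] = " ".join(term.split())
        return True

    def validate(self, tags):
        """Split proposed tags into (approved canonical spellings, rejected tags)."""
        approved, rejected = [], []
        for tag in tags:
            key = self._key(tag)
            if key in self._terms:
                approved.append(self._terms[key])
            else:
                rejected.append(tag)
        return approved, rejected


# Illustrative terms only, not taken from the actual CGS keyword cloud.
vocab = ControlledVocabulary(["Landslide", "Avalanche"], limit=3)
approved, rejected = vocab.validate(["landslide", " AVALANCHE ", "rocks"])
print(approved, rejected)
```

The rejected list is the useful part: it surfaces the idiosyncratic terms people actually reach for, which can then be mapped to an approved keyword or considered deliberately rather than added ad hoc.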
User perspectives are crucial in determining these constraints. I did spend time observing my colleagues, asking them about their search methodologies and how they managed their own data. What they told me generally matched what I had observed among my media students: a lack of understanding of what “search” means in a digital environment. Keywords should reflect how people actually search for content, although most people have rather idiosyncratic ways of dealing with information, often based on their first encounters with the digital tools involved. Generational differences in this regard are substantial, and they shift all the time. The introduction of AI into the process is radically changing search, along with the fundamentals of how information is presented on the Internet itself.
A focus on standardizing is crucial, but this restriction will run up against that idiosyncrasy instantly, often with problematic results. Recognized and respected processes for responsible governance of a keyword system are necessary. That said, because of the resources needed to do the actual content tagging, a more-or-less static set of agreed-upon terms is best. If, for example, a new term is added, the entire dataspace should be reviewed for relevant placement of that new term. Otherwise there will be datasets that do not show up in search returns, rendering them effectively invisible, reducing the overall value of data assets, and negating the entire concept of standards.
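The review step described above can be partly automated. The following is a rough sketch, assuming each record in the dataspace carries some free text and a list of tags; the record layout, the ids, and the function name `review_candidates` are invented for illustration. It only produces a list for a human to review, since a text match is no guarantee that the tag belongs.

```python
import re


def review_candidates(records, new_term):
    """Return ids of records whose text mentions new_term but that lack it as a tag."""
    # Whole-word, case-insensitive match on the new term.
    pattern = re.compile(r"\b" + re.escape(new_term) + r"\b", re.IGNORECASE)
    wanted = new_term.casefold()
    candidates = []
    for rec_id, rec in records.items():
        tagged = {tag.casefold() for tag in rec["tags"]}
        if wanted not in tagged and pattern.search(rec["text"]):
            candidates.append(rec_id)
    return candidates


# Hypothetical records, not real CGS publications.
records = {
    "pub-001": {"text": "Radon levels in residential basements", "tags": ["geohazard"]},
    "pub-002": {"text": "Debris-flow mapping after wildfire", "tags": ["geohazard"]},
    "pub-003": {"text": "Indoor radon testing guide", "tags": ["radon"]},
}
print(review_candidates(records, "radon"))
```

Even with a helper like this, every new term still costs a pass over the whole dataspace, which is the practical argument for keeping the term list more or less static.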
According to sources, the CGS has apparently abandoned the keyword application process and even basic file-naming and filing conventions. So counterproductive! Good luck finding stuff!
Generational differences in this regard are substantial… LOL – so true. To extend this notion, I will stick my neck out and say that, in my experience, the following happens frequently with all generations… My last name is difficult to spell, so when someone at a retailer or some “service desk” (library, DMV, recreation center) asks for my name to search their system, I pronounce it and recite the first 4 letters. Confused, they ask how to spell the rest of it. I prompt them to look at their screen. They then say they see a name come up, but it must not be mine because I only provided 4 letters. My observation is that this time-wasting obstacle comes more often from the younger generation than from the older, more experienced crowd, who know that pressing on and risking a misheard letter can lead to a misspelling and serves no one. An efficiency gain is at the doorstep for these organizations, but many do not care. Another abandoned opportunity piles on top of the rubble.