Taxonomy Questions

Theresa Regli and Seth Earley


Leading taxonomy consultants Theresa Regli and Seth Earley presented several taxonomy questions at the CM Professionals Fall 2005 Summit. We can start our FAQ by answering some of their questions.

Do taxonomies make managing content easier?

Managing content begins with organizing information. Taxonomies (using the term broadly to include even shallow taxonomies like subject headings) are critical tools for organizing information. Indeed, whenever content or information is arranged or organized, the work can be seen as a taxonomic process.

Do content management systems need a taxonomy to function?

Even the simplest CMS has some means of navigating the content. And we shall see that although not all taxonomy work is for navigation, there is no navigation system that does not have an implied taxonomy.

Is there such a thing as a galactic taxonomy/uber taxonomy?

There are hundreds of thousands, perhaps hundreds of millions, of taxonomies in use in the world. Many of them are online. Every website navigation scheme has an underlying taxonomy.

The great attempts to produce a universal classification system for knowledge, from Aristotle's original categories and Francis Bacon's creation of the modern natural sciences to the great Dewey Decimal and Library of Congress classification schemes, all have a taxonomy.

They are all attempts at a galactic taxonomy.

Does a taxonomy have to have at least three levels?

Although many taxonomy requirements documents and requests for proposals specify that the taxonomy include three (or even four or more) levels, the right number of levels is highly dependent on the particular content being organized. There are many large taxonomies in use with tens of thousands of nodes arranged in ten or more levels. Many web portals drill down to well over four levels. And many file/folder structures in our personal computers go deeper than four levels.
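As a rough illustration of what "levels" and "nodes" mean here, the following minimal sketch counts both for a small taxonomy stored as nested dictionaries. The category names and the nested-dictionary representation are assumptions made for this example only, not a recommended storage format.

```python
# Hypothetical taxonomy: each key is a node, each value holds its children.
TAXONOMY = {
    "Products": {
        "Hardware": {"Laptops": {}, "Printers": {}},
        "Software": {},
    },
    "Support": {},
}


def depth(tree):
    """Return the number of levels in the tree; an empty tree has depth 0."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


def node_count(tree):
    """Return the total number of nodes at every level of the tree."""
    return sum(1 + node_count(children) for children in tree.values())


print("levels:", depth(TAXONOMY))
print("nodes:", node_count(TAXONOMY))
```

Running a count like this against real content is one way to check whether a "three levels" requirement actually fits the material being organized.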

Isn’t taxonomy the same as navigation?

Although there can be a taxonomy that is not used for website navigation, there is no website navigation without taxonomy. And a taxonomy is perforce a means of navigating the content being arranged. Taxonomy is derived from the Greek taxis (arrangement), from tassein, to arrange.

Note that a navigation taxonomy, as opposed to a large classification scheme, should generally not go deeper than a few levels.

How does a thesaurus differ from a taxonomy?

A thesaurus is an arrangement of terms (words or phrases), with a simple hierarchy of parent-child relationships described as broader terms (BT) and narrower terms (NT).

But thesauri also introduce the concept of synonym relationships: sets of equivalent terms that may be substituted for one another, usually with one preferred term (PT).

In addition, thesauri allow arbitrary associative relationships between terms. References to a term such as See and See also are examples of related terms (RT).
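The three kinds of relationship described above (BT/NT hierarchy, synonyms with a preferred term, and related terms) can be sketched as a small data structure. This is a minimal illustration, not a standard thesaurus format; the class names, method names, and example terms are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry. Field names follow the BT/NT/RT abbreviations."""
    label: str
    broader: set = field(default_factory=set)    # BT: parent terms
    narrower: set = field(default_factory=set)   # NT: child terms
    related: set = field(default_factory=set)    # RT: See / See also
    synonyms: set = field(default_factory=set)   # non-preferred equivalents


class Thesaurus:
    def __init__(self):
        self.terms = {}      # preferred label -> Term
        self.preferred = {}  # synonym -> preferred term (PT)

    def add(self, label):
        """Return the Term for label, creating it if needed."""
        return self.terms.setdefault(label, Term(label))

    def add_narrower(self, parent, child):
        """Record both directions of a parent-child (BT/NT) link."""
        self.add(parent).narrower.add(child)
        self.add(child).broader.add(parent)

    def add_synonym(self, preferred, synonym):
        """Register synonym as an equivalent of the preferred term."""
        self.add(preferred).synonyms.add(synonym)
        self.preferred[synonym] = preferred

    def add_related(self, a, b):
        """Record a symmetric associative (RT) link."""
        self.add(a).related.add(b)
        self.add(b).related.add(a)

    def resolve(self, label):
        """Map a synonym to its preferred term; other labels pass through."""
        return self.preferred.get(label, label)


# Hypothetical example terms.
thesaurus = Thesaurus()
thesaurus.add_narrower("Vehicles", "Cars")
thesaurus.add_synonym("Cars", "Automobiles")
thesaurus.add_related("Cars", "Roads")

print(thesaurus.resolve("Automobiles"))
print(thesaurus.terms["Cars"])
```

Resolving every incoming term to its preferred term before tagging or searching is what lets equivalent words lead to the same content.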

How does an ontology differ from a taxonomy?

Ontologies share the hierarchical structure of taxonomies, but they make extra demands on the objects they include.

Explicit rules or axioms describe the relationship between a node (e.g., parent or container) and the objects included in that node (children or contained objects).

The Linnaean biological taxonomy is an ontology in which each species is also a member of the containing genus.

The Semantic Web uses ontologies that conceptualize (describe in terms agreed to by participants in a community of discourse) some domain of phenomenal knowledge in a formal way that allows computers to make inferences about a term from its containing relationships. For example, a computer can infer that a bulldog is a dog in some contexts.
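The kind of inference described here can be sketched with a single explicit rule: "is a" is transitive, so membership in a node implies membership in every containing node. This is only a toy illustration in plain Python, not how Semantic Web tools represent ontologies; the terms and the function names are assumptions made for the example.

```python
# Hypothetical "is a" statements: each term maps to its direct containers.
IS_A_EDGES = {
    "bulldog": {"dog"},
    "dog": {"mammal"},
    "mammal": {"animal"},
}


def ancestors(term, edges=IS_A_EDGES):
    """Return every container reachable from term by following is-a links."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in edges.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_a(term, candidate):
    """Infer whether term belongs to candidate, directly or indirectly."""
    return candidate in ancestors(term)


print(is_a("bulldog", "dog"))
print(is_a("bulldog", "animal"))
print(is_a("dog", "bulldog"))
```

Only the direct statements are stored; the fact that a bulldog is an animal is never written down and is derived from the rule.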

How does a faceted classification use taxonomies?

A single facet in a faceted classification scheme typically has an enumerated set of possible values. For example, a size facet might take the values small, medium, or large. However, a facet may contain a taxonomy of possible values.

So a faceted classification might contain many taxonomies.
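A minimal sketch of the two kinds of facet follows: one with a flat enumerated list of values and one whose values come from a small taxonomy. The size values come from the example above; the topic taxonomy, the facet names, and the function names are assumptions made for illustration.

```python
# A facet is either a flat list of values or a nested taxonomy of values.
FACETS = {
    "size": ["small", "medium", "large"],
    # Hypothetical taxonomy facet, stored as nested dictionaries.
    "topic": {
        "Science": {"Biology": {}, "Physics": {}},
        "Arts": {"Music": {}},
    },
}


def allowed_values(facet):
    """Return every value a facet permits, at any level of its taxonomy."""
    if isinstance(facet, list):
        return set(facet)
    values = set()
    stack = [facet]
    while stack:
        node = stack.pop()
        for label, children in node.items():
            values.add(label)
            stack.append(children)
    return values


def invalid_facets(item):
    """Return the facet names in item whose value is not permitted."""
    return [
        name
        for name, value in item.items()
        if name not in FACETS or value not in allowed_values(FACETS[name])
    ]


print(invalid_facets({"size": "medium", "topic": "Biology"}))
print(invalid_facets({"size": "huge", "topic": "Music"}))
```

Each facet is validated independently, which is what allows one classification scheme to combine several unrelated taxonomies.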

Should one organization have one master taxonomy or allow specialized departmental taxonomies?

At a minimum, there will be different taxonomies for groups working with different content. In the grand vision of Enterprise Content Management (ECM), a single uber-taxonomy may have a place, with different groups using different sections of the master taxonomy.

But in large enterprises with multiple public Internet and private intranet websites, each of them likely has its own navigation taxonomy. And it would be a big surprise if records management and document management taxonomies resembled web content management (WCM) taxonomies.

What's the difference between categorization and classification?

These are two distinct stages in information organization - first categorization, then classification.

First comes the design and construction of a taxonomy, thesaurus, ontology, or faceted classification. At this stage the terms, keywords, ideas, concepts, memes, topics, facets, etc. are identified. These are the categories. However they are arranged, they are a controlled vocabulary. This process is categorization. Each node in the resulting classification scheme should provide a globally unique meme ID.

Next comes the classification of content into the categories. This can be described as attaching metadata to the documents, "tagging" them with the appropriate terms or keywords. In advanced memography, they are tagged with globally unique meme IDs to support a high precision memetic search.
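The two stages can be sketched in a few lines. Stage one fixes a controlled vocabulary in which every category has its own ID; stage two tags documents with those IDs and searches by ID rather than by free text. The IDs, labels, and document names below are hypothetical, and the IDs are unique only within this sketch, not globally.

```python
# Stage 1, categorization: a controlled vocabulary of ID -> label.
VOCABULARY = {
    "T001": "Content management",
    "T002": "Taxonomy",
    "T003": "Records management",
}
LABEL_TO_ID = {label.lower(): term_id for term_id, label in VOCABULARY.items()}


def classify(doc_tags, doc_name, labels):
    """Stage 2, classification: tag a document with vocabulary IDs.

    Labels outside the controlled vocabulary are rejected.
    """
    ids = set()
    for label in labels:
        term_id = LABEL_TO_ID.get(label.lower())
        if term_id is None:
            raise ValueError(f"{label!r} is not in the controlled vocabulary")
        ids.add(term_id)
    doc_tags.setdefault(doc_name, set()).update(ids)


def search(doc_tags, term_id):
    """Return the names of all documents tagged with term_id, sorted."""
    return sorted(name for name, ids in doc_tags.items() if term_id in ids)


# Hypothetical documents.
doc_tags = {}
classify(doc_tags, "faq.html", ["Taxonomy", "Content management"])
classify(doc_tags, "retention-policy.doc", ["Records management"])

print(search(doc_tags, "T002"))
```

Because documents store IDs rather than words, a category label can be reworded later without retagging any content.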

Where can you find existing taxonomies, thesauri, and ontologies?

We are assembling a list of taxonomy sources.

If you know some good sources, please tell us and we will list them here.