The Knowledge Management Connection   

   Communication is the common thread of knowledge management.
 

Conventional Categorization — a Complement to Faceted Classification


 
 

 

 

Effective access to a knowledgebase must combine faceted classification with “conventional” categorization, because faceted classification by itself has intrinsic weaknesses. For example, the top-level categories in faceted schemas often don't give users a quick grasp of the scope of the knowledgebase or quick access to popular topics. Index-like or Yahoo!®-like directories are one way to respond to the need for popularity-based access.

Benefits of conventional categorization for a shared knowledgebase

We sometimes refer to conventional categorization as “popularity-based” because categories in such schemas often float to the top as interest in a topic increases. Conventional categorization is vital for the following reasons:

  • Popular topics at or near the top. Why do people get on the Web? The top-level categories on the home page of Yahoo! are your answer. The category shipbuilding may be very important to a small audience (and of vital importance to national defense), but it will never make it to the home page of Yahoo!. Yet that home page does find room for very specific hot topics like Harry Potter and Xbox, often outside of the top-level hierarchies.
  • Familiarity of the interface. The proliferation of copycats, including Altavista and DMOZ, seems testament enough that many people like the Yahoo! interface and classification model. And we all know how to use a back-of-the-book index. The conventions are simple, but usable.
  • Limitations on abstractness. Casual information-seekers have little interest in the elegance and power of classification systems, and even less interest in highly abstract top-level categories. For example, the top-level category information objects is very useful in faceted classification for the KMconnection site, but only information-seeking geeks interested in faceted classification are likely to go to the trouble of navigating through that facet hierarchy to improve their searches.

    Casual users, in particular, are interested only in results, and they're happiest when they can go as directly as possible to the topics that interest them. The classification experts at Yahoo! understand this, so they place very specific, concrete subcategories under more abstract top-level categories. For example, in January 2002, Internet, WWW, Software, and Games are placed directly under Computers & Internet, a top-level category cobbled together from a current popular understanding of closely related topics. (We can assume that the top-level organization might have been very different had Yahoo! existed 10 years ago.)

  • Polyhierarchy. Yahoo!, traditional library classification schemas, back-of-the-book indexes, and most thesauri are polyhierarchical — that is, concepts in the schema may have more than one parent … and therefore belong to multiple hierarchies. Most users are familiar with this kind of organization, and it is a common characteristic of most large classification systems. (In any case, it's your only choice for organizing complex subjects in hierarchies.)
  • No “empty” categories. By design, popularity-based directories (as in book indexes) never present an “empty” category — that is, a category with no associated instances or subcategories that have associated instances. Empty categories are very frustrating for the typical information seeker.

    By contrast, faceted schemas and thesaurus-style implementations of knowledgebases may have many categories with no associated instances of content. They are designed to be as exhaustive as possible and to provide a very structured navigational path among related concepts before instances are assigned to those categories. Some categories in thesaurus-style classification systems may always remain “empty.”

  • A contributory model. Some Yahoo!-like directory-construction tools support candidate submissions by others in the organization as well as ranking of the quality of the candidate sites. This helps the ramp-up process by building content quickly and identifying what's important to members of the organization. The results may not be ideal, but richness of categorization often beats purity of structure.
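
Three of the properties listed above (polyhierarchy, hiding “empty” categories, and popular topics floating to the top) can be pictured with a minimal Python sketch. It is only an illustration, not a description of how Yahoo! or any directory tool is actually built. The class name Directory, its methods, the "TOP" root label, the visit counter, and the Entertainment category are all assumptions made for the example; Computers & Internet, Games, and Software are borrowed from the Yahoo! example above.

```python
from collections import defaultdict


class Directory:
    """Toy polyhierarchical directory: a category may have several parents."""

    def __init__(self):
        self.children = defaultdict(set)  # parent -> child categories
        self.items = defaultdict(list)    # category -> assigned items
        self.hits = defaultdict(int)      # category -> visit count (popularity)

    def add_category(self, name, parents):
        # Registering one category under several parents is the polyhierarchy.
        for parent in parents:
            self.children[parent].add(name)

    def add_item(self, category, item):
        self.items[category].append(item)

    def is_empty(self, category, _seen=None):
        """True if neither the category nor any descendant holds an item."""
        seen = _seen if _seen is not None else set()
        if category in seen:  # already examined (or an accidental cycle)
            return True
        seen.add(category)
        if self.items.get(category):
            return False
        return all(self.is_empty(c, seen) for c in self.children.get(category, ()))

    def visible_children(self, category):
        """Non-empty children, most visited first, ties broken by name."""
        kids = [c for c in self.children.get(category, ()) if not self.is_empty(c)]
        return sorted(kids, key=lambda c: (-self.hits[c], c))


d = Directory()
d.add_category("Computers & Internet", ["TOP"])
d.add_category("Entertainment", ["TOP"])                            # hypothetical
d.add_category("Games", ["Computers & Internet", "Entertainment"])  # two parents
d.add_category("Software", ["Computers & Internet"])                # no items yet
d.add_item("Games", "Xbox review")

print(d.visible_children("Computers & Internet"))  # Software is hidden
print(d.visible_children("Entertainment"))         # Games appears here too
```

In this sketch a category with no content is never deleted, only hidden, so it reappears as soon as someone contributes an item to it or to one of its descendants.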

Limitations of conventional categorization

Conventional categorization techniques, including Yahoo!-like directories and back-of-the-book indexes, impose restrictions that limit their usefulness for managing organizational knowledge.

  • Most experts in classifying organizational knowledge recommend multiple conventional categorization schemas to accommodate different audiences. This is a good recommendation … up to a point. Unfortunately, it's not possible to discern and satisfy every audience's perspective or even satisfactorily accommodate the vocabulary of multiple audiences and individuals in such schemas. You could spend many months trying to identify these requirements. In addition, conventional categorization schemas are inherently brittle. The richer the directory structure and the greater the number of audience-specific directories, the more difficult management becomes.
  • Conventional categorization schemas are essentially simplified versions of the traditional domain-centered library card catalogue. Although Yahoo! has gained wide acceptance for the Internet as a whole (judging by the number of imitators) and we relish professional indexes in books, such schemas do not make key concepts and their semantic relationships explicit. In other words, they don't really add much “structural knowledge.”
  • Conventional categorization schemas are not easily scalable. That may seem like an odd statement about a directory like Yahoo!, which includes millions of indexed pages, but you have to remember that Yahoo! employs hundreds of indexing professionals to categorize new pages. And the results are still not consistently effective at providing high relevance for narrowly defined queries.

In summary, conventional categorization is a necessary complement to faceted access to organizational knowledge resources. By contrast, a faceted classification schema can provide:

  • Explicit representation of the structural knowledge of the organization. Not anywhere near all of it, of course, but many key basics.
  • Precise meaning of key concepts. In a sense, faceted knowledge organization provides a “digital” representation of meaning, contrasted with the looser meaning of concepts in a variety of document contexts.
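
One way to picture that “digital” quality is a small Python sketch in which each resource carries exactly one controlled value per facet, so a query is an exact intersection of facet values. This is an illustration under stated assumptions, not the schema used on this site: the document ids, the facet values, the topic facet, and the function names facet_search and facet_counts are hypothetical. Only the facet name information object echoes the example given earlier.

```python
# Hypothetical resources described along two facets.
documents = {
    "doc-1": {"information object": "white paper", "topic": "shipbuilding"},
    "doc-2": {"information object": "white paper", "topic": "taxonomy design"},
    "doc-3": {"information object": "mailing list post", "topic": "taxonomy design"},
}


def facet_search(docs, criteria):
    """Return sorted ids of documents that match every facet/value pair."""
    return sorted(
        doc_id
        for doc_id, facets in docs.items()
        if all(facets.get(facet) == value for facet, value in criteria.items())
    )


def facet_counts(docs, facet):
    """Count how many documents carry each value of one facet."""
    counts = {}
    for facets in docs.values():
        value = facets.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


print(facet_search(documents, {"information object": "white paper",
                               "topic": "taxonomy design"}))
print(facet_counts(documents, "topic"))
```

Because every value comes from a controlled list, a match is either exact or absent, which is the contrast with the looser, context-dependent wording of a directory label.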


The impact of “managing knowledge” must be more than measurable; it must be predictable.

   

NOTE: As of December 2007, this web site will no longer be updated.

Please go to Phil Murray's The Semantic Advantage web site or his Semantic Advantage blog for up-to-date information and opinion from Phil Murray.

 

Interested in faceted classification of information? Take a look at the Faceted Classification Discussion (FCD) mailing list.