Saturday, April 4, 2009

Semantic Search and Ontologies

What is Ontology? It Depends on What the Meaning of "Is" Is.
It is ironic that the word "ontology", which has to do with making clear and explicit statements about entities in a particular domain, has so many conflicting definitions. I'll offer two general ones.

The term "ontology" comes from the field of philosophy that is concerned with the study of being or existence. In philosophy, one can talk about an ontology as a theory of the nature of existence (e.g., Aristotle's ontology offers primitive categories, such as substance and quality, which were presumed to account for All That Is). In computer and information science, ontology is a technical term denoting an artifact that is designed for a purpose, which is to enable the modeling of knowledge about some domain, real or imagined.

The term was adopted by early Artificial Intelligence (AI) researchers, who recognized the applicability of the work from mathematical logic and argued that AI researchers could create new ontologies as computational models that enable certain kinds of automated reasoning. In the 1980s, the AI community came to use the term ontology to refer to both a theory of a modeled world and a component of knowledge systems.
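To make "a computational model that enables automated reasoning" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names, the `IS_A` links, the `RULES` table and the helper functions are hypothetical and do not come from any real ontology language or library. It only shows the general idea that once entities and their relations are stated explicitly, a program can derive facts nobody wrote down directly.

```python
# Hypothetical toy ontology: the classes and rules are illustrative only.
IS_A = {
    "dog": "mammal",
    "mammal": "animal",
    "animal": "living_thing",
}

# Property rules: anything that belongs to <key> has the listed properties.
RULES = {
    "living_thing": {"breathes"},
    "mammal": {"has_fur"},
}


def ancestors(cls):
    """Return cls followed by every superclass reached via IS_A links."""
    chain = [cls]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def inferred_properties(cls):
    """Collect the properties implied by cls and all of its superclasses."""
    props = set()
    for c in ancestors(cls):
        props |= RULES.get(c, set())
    return props


print(ancestors("dog"))
print(sorted(inferred_properties("dog")))
```

Nothing in the tables says that a dog breathes; the program infers it by walking from "dog" up to "living_thing" and picking up the rule attached there.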

Classification
Categorization and classification are acts of organizing a collection of entities, whether things or concepts, into related groups. Ontological classification or categorization goes a step further: it organizes a set of entities into groups based on their essences and possible relations.

Now, anyone who deals with categorization for a living will tell you they can never get a perfect system. In working classification systems, success is not "Did we get the ideal arrangement?" but rather "How close did we come, and on what measures?" The idea of a perfect scheme is simply a Platonic ideal.

Ontological classification works well in some places, of course. You need a card catalog if you are managing a physical library. You need a hierarchy to manage a file system. So what you want to know, when thinking about how to organize anything, is whether that kind of classification is a good strategy.

When Does Ontological Classification Work Well?
Domain: small corpus, formal categories, stable entities, restricted entities, clear edges
Participants: expert catalogers, authoritative judgment, coordinated users, expert users
When Does Ontological Classification Fail?
Domain: big corpus, informal categories, unstable and unrestricted entities, unclear edges
Participants: uncoordinated users, amateur users, naive catalogers, no authority

The list of factors making ontology a bad fit is also an almost perfect description of the Web: the largest corpus, the most naive users, no global authority, and so on. The more you push in the direction of scale, fluidity, and flexibility, the harder it becomes to handle the expense of starting a cataloging system and the hassle of maintaining it.

Rigid structures will evidently fail at tasks such as semantic search. Succeeding at that kind of task requires structures that are not predefined, that emerge from the context, and that have the ability to evolve.
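One way to picture a structure that emerges from context instead of being predefined is free-form tagging. The sketch below is only an illustration under stated assumptions: the pages, the tags, and the functions `build_index`, `co_occurrence` and `related_tags` are all hypothetical, and counting tag co-occurrence is just one simple technique among many, not a description of how any particular semantic search engine works.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical free-form tags applied by uncoordinated users.
TAGGED = {
    "page1": {"ontology", "semantic-web", "ai"},
    "page2": {"ontology", "philosophy"},
    "page3": {"semantic-web", "search", "ai"},
    "page4": {"search", "ai"},
}


def build_index(tagged):
    """Map each tag to the set of items carrying it (no fixed hierarchy)."""
    index = defaultdict(set)
    for item, tags in tagged.items():
        for tag in tags:
            index[tag].add(item)
    return index


def co_occurrence(tagged):
    """Count how often each unordered pair of tags appears on the same item."""
    counts = Counter()
    for tags in tagged.values():
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


def related_tags(tag, counts):
    """Rank the other tags by how often they co-occur with the given tag."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == tag:
            related[b] += n
        elif b == tag:
            related[a] += n
    return related.most_common()


counts = co_occurrence(TAGGED)
print(related_tags("ai", counts))
```

No category tree is declared anywhere. The relations between tags come entirely from how the items were tagged, so they change as soon as users tag new items or tag old ones differently.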

6 comments:

Ben Stein - Semantic Web said...

Great post Mariana!

You can check out http://www.urlclassifier.com for a great semantic web classification tool, just like the one you've mentioned.
It uses http://www.contextin.com, dealing with the exact challenges you're writing about.

Mariana Soffer said...

Hi Ben, great tip. I did not know those search tools; they were just what I wanted. In fact I am working on something similar, not for searching but for opinion mining, so as soon as I have something running (which is not easy at all) I will let you know.

julio said...

I understood ontology as the definition of "being", which is perhaps some distance from taxonomy, which is what I take from the post.

That is: definition first, then classification.

Mariana Soffer said...

I am speaking mostly from the field of artificial intelligence. One of the clear differences between a taxonomy and an ontology is that an ontology can include inference rules, e.g. is alive -> breathes.

Paul said...

I'm afraid I fit firmly into the "incapable of classification into finer definitions" category. The syllables start to fascinate me beyond their meaning.

Mariana Soffer said...

The beauty of the words and syllables is an important part of the magic that ontologies and taxonomies have. That is a very good start. Hugs.