Project status: Active

This project is currently (as of 2011) being developed as part of my graduate degree research at the Ota Laboratory, Tokyo University. It is a web search application that crawls the web to find information about activities related to the query given by a user.

An activity is defined as:

  • Something that you can do related to the query
  • An action that can be done
  • A task to be accomplished

An activity is not:

  • A definition of the query
  • Information about the query

The application's operation can be described in three phases. In the first phase, the query is analyzed syntactically, mainly with the help of WordNet, a lexical database created by Princeton University.
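
As a rough illustration of the first phase, the sketch below sorts the words of a query into nouns and verbs. It is only a sketch under stated assumptions: the small `TOY_LEXICON` dictionary is a hypothetical stand-in for WordNet, and the name `analyze_query` is invented here and is not taken from the application.

```python
# Toy stand-in for WordNet: maps a word to the parts of speech it can take.
# These entries are illustrative assumptions, not data read from WordNet.
TOY_LEXICON = {
    "bake": {"verb"},
    "cake": {"noun"},
    "guitar": {"noun"},
    "play": {"noun", "verb"},
}


def analyze_query(query, lexicon=TOY_LEXICON):
    """Sort the words of a query into nouns, verbs, and unknown words."""
    analysis = {"nouns": [], "verbs": [], "unknown": []}
    for word in query.lower().split():
        tags = lexicon.get(word)
        if not tags:
            # The lexicon has no entry for this word.
            analysis["unknown"].append(word)
            continue
        # A word may be both a noun and a verb, so both checks run.
        if "noun" in tags:
            analysis["nouns"].append(word)
        if "verb" in tags:
            analysis["verbs"].append(word)
    return analysis


if __name__ == "__main__":
    print(analyze_query("bake cake"))
```

With a real WordNet binding, the dictionary lookup could be replaced by a query to the lexical database; the structure of the result would stay the same.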

In the second phase, the query is expanded: the results of the syntactic analysis from the previous phase are used to query ConceptNet, an ontology database of common-sense knowledge maintained by MIT. Depending on the query's syntactic analysis, ConceptNet is queried with questions such as "What is 'query noun' used for?" or "What can receive 'query verb' as an action?". The answers yield a host of related keywords that expand the original query.
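
The expansion step can be sketched as a lookup over (start, relation, end) assertions. This is a minimal sketch, not the application's actual code: the sample assertions are made up for illustration, the function name `expand_query` is hypothetical, and mapping the two questions to the relation names `UsedFor` and `ReceivesAction` is an assumption about how the questions translate to ConceptNet relations.

```python
# Hypothetical sample assertions in (start, relation, end) form.
# They are invented for this sketch and are not an extract of ConceptNet.
SAMPLE_ASSERTIONS = [
    ("cake", "UsedFor", "eating"),
    ("cake", "UsedFor", "dessert"),
    ("cookie", "ReceivesAction", "bake"),
    ("bread", "ReceivesAction", "bake"),
    ("oven", "UsedFor", "baking"),
]


def expand_query(nouns, verbs, assertions=SAMPLE_ASSERTIONS):
    """Collect keywords related to the nouns and verbs of a query."""
    keywords = []
    for start, relation, end in assertions:
        if relation == "UsedFor" and start in nouns:
            # Answers "What is <query noun> used for?"
            keywords.append(end)
        elif relation == "ReceivesAction" and end in verbs:
            # Answers "What can receive <query verb> as an action?"
            keywords.append(start)
    # Remove duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(keywords))


if __name__ == "__main__":
    print(expand_query(nouns=["cake"], verbs=["bake"]))
```

The noun and verb lists are the kind of output the syntactic analysis of the first phase would provide.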

In the third and final phase, websites containing how-to articles and guides are crawled for all of the generated keywords as well as the original query. Finally, the results are shown in the application.
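
To make the third phase more concrete, the sketch below builds the list of search URLs a crawler might visit, one per keyword and site. The endpoints are hypothetical placeholders (the sites actually crawled are not named here), the `q` parameter and the name `build_crawl_targets` are assumptions, and no network request is made.

```python
from urllib.parse import urlencode

# Hypothetical search endpoints of how-to sites; placeholders only.
HOWTO_SEARCH_ENDPOINTS = [
    "https://howto.example.com/search",
    "https://guides.example.org/find",
]


def build_crawl_targets(query, expanded_keywords, endpoints=HOWTO_SEARCH_ENDPOINTS):
    """Return one search URL per (term, endpoint) pair."""
    # The original query comes first; duplicate terms are dropped.
    terms = list(dict.fromkeys([query, *expanded_keywords]))
    targets = []
    for term in terms:
        # urlencode escapes spaces and special characters in the term.
        query_string = urlencode({"q": term})
        for endpoint in endpoints:
            targets.append(endpoint + "?" + query_string)
    return targets


if __name__ == "__main__":
    for url in build_crawl_targets("bake cake", ["dessert", "cookie"]):
        print(url)
```

A real crawler would then fetch each URL and extract the how-to articles from the result pages.
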
This is a screenshot of the Activity Search Modeler:

Usage Guide

For a guide on using the software demo, see the documentation page.

Future Work & Comments

The creation of this data mining application is only the first part of the research into activity modeling using the web. From now on, machine learning algorithms will be employed to parse the contents of the activity articles found by this application and to create a hierarchical activity tree for each activity. In a sense, this will allow computer software to interpret and comprehend the contents of each activity. Within the scope of this research, the goal is to filter the search results according to the part of the activity tree that the user has selected.

But in our opinion this research has greater implications. It will enable complete modeling of what any activity is and how it can be accomplished and, most importantly, make this data comprehensible to a computer or a robot. That in turn can have many applications in the software world or even in robotics, where robots could be trained to understand the model of an environment not through expert knowledge but through a simple web search.

Here you can download a demo of the activity search application and play around with it. It is not without bugs, as it is still very early in its development. Feedback is very important for this research, so please do not hesitate to send any questions, comments, or ideas for improvements to lefteris _at_ realintelligence.net .