Remember my previous post about IKHarvester? There, I briefly described how I collect metadata for blog posts that support SIOC. Then I thought it would be a good idea to describe in one place what IKHarvester really is and how it works.

IKHarvester (Informal Knowledge Harvester) is a web service characterized by two core features: harvesting data and providing it to eLearning frameworks. It benefits from the core postulate of the Semantic Web, which calls for rich descriptions of resources available online. Thus, the content of web pages is understandable not only by humans but also by machines.

Data harvesting

IKHarvester captures RDF data from Social Semantic Information Sources (SSIS). The current version works with semantic blogs, semantic wikis, and JeromeDL (the Social Semantic Digital Library).

IKHarvester looks for RDF documents related to the given resource, which are indicated by invisible links in the HTML (look there for more details on capturing RDF). Besides reading pure RDF data, IKHarvester is supposed to use Microformats, which allow embedding structured, machine-readable data into HTML documents. Moreover, IKHarvester is capable of creating RDF descriptions for non-semantic information sources, like Wikipedia. To do so, it scrapes the HTML code of an article and collects some data from it (a title, external links, "see also" links, references, etc.).
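To make the "invisible links" idea concrete, here is a minimal sketch of RDF autodiscovery using only the Python standard library. It is not IKHarvester's actual code: the names `RdfLinkFinder` and `find_rdf_links` are hypothetical, and it assumes that a page advertises its RDF document through a `<link>` element whose `type` is `application/rdf+xml`.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RdfLinkFinder(HTMLParser):
    """Collects the targets of <link> elements that advertise an RDF document.

    Hypothetical helper for illustration; not part of IKHarvester.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attributes = dict(attrs)
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        # Assumption: RDF documents are advertised with this media type.
        if media_type == "application/rdf+xml" and href:
            # Resolve relative links against the address of the page.
            self.rdf_urls.append(urljoin(self.base_url, href))


def find_rdf_links(html, base_url):
    """Return absolute URLs of all RDF documents linked from the page."""
    finder = RdfLinkFinder(base_url)
    finder.feed(html)
    return finder.rdf_urls
```

Given the HTML of a blog post and its address, `find_rdf_links` returns the list of RDF documents to fetch. An empty list would be the signal to fall back to scraping the page itself.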

All in all, the RDF document for an online resource, whether read directly or generated by scraping, is saved to the informal knowledge repository.

Data providing

Once the informal knowledge repository is filled with data, it can be used by clients such as a Learning Management System (LMS). Because of its eLearning background, IKHarvester provides informal knowledge in the form of Learning Objects (LOs). In general, a Learning Object is something you can acquire, manage and use; LOs are reusable, modular, flexible, portable and compatible. We have followed the SCORM CAM (Content Aggregation Model) instructions in defining how LOs are created and managed. This standard suggests using Learning Object Metadata (LOM) for describing learning material.

LOM groups this information into nine categories, each of which focuses on a different aspect:

  • General – general information about the LO as a whole
  • Lifecycle – features related to the history and current state of the LO and those who have affected it during its evolution
  • MetaMetadata – information about the metadata instance itself
  • Technical – groups the technical requirements and technical characteristics of the LO
  • Educational – educational and pedagogic characteristics of the LO
  • Rights – intellectual property rights and conditions of use for the LO
  • Relation – group of features defining the relationship between the LO and other related LOs
  • Annotation – comments on the educational use of the LO and information on the author of the comment and time when it was written
  • Classification – describes the LO in relation to a particular classification system
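
As a rough illustration of how harvested metadata could be sorted into these nine categories, consider the sketch below. It is only an assumption about one possible layout: the field names (`title`, `author`, `format`, `license`, `external_links`), the mapping `FIELD_TO_CATEGORY`, and the function `build_lom_record` are all hypothetical and not taken from IKHarvester or from the LOM specification.

```python
from enum import Enum


class LomCategory(Enum):
    """The nine LOM categories, named as in the list above."""
    GENERAL = "General"
    LIFECYCLE = "Lifecycle"
    META_METADATA = "MetaMetadata"
    TECHNICAL = "Technical"
    EDUCATIONAL = "Educational"
    RIGHTS = "Rights"
    RELATION = "Relation"
    ANNOTATION = "Annotation"
    CLASSIFICATION = "Classification"


# Hypothetical mapping from harvested field names to LOM categories.
FIELD_TO_CATEGORY = {
    "title": LomCategory.GENERAL,
    "author": LomCategory.LIFECYCLE,
    "format": LomCategory.TECHNICAL,
    "license": LomCategory.RIGHTS,
    "external_links": LomCategory.RELATION,
}


def build_lom_record(harvested):
    """Group harvested fields by LOM category; unknown fields are skipped."""
    record = {category: {} for category in LomCategory}
    for field, value in harvested.items():
        category = FIELD_TO_CATEGORY.get(field)
        if category is not None:
            record[category][field] = value
    return record
```

Every category is present in the resulting record even when it is empty, so a client can iterate over all nine without checking for missing keys.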

Soon I’ll write about the progress and success stories of IKHarvester.
