IKHarvester

Some time ago I wrote about Didaskon, a framework for composing a curriculum for a specific user, based on their profile and using formal and informal knowledge. I am a member of its development team.

At the moment, I am developing one of its components: IKHarvester (Informal Knowledge Harvester). It aims at collecting (harvesting) data from Social Semantic Information Sources (SSIS) and providing it to Didaskon as informal Learning Objects (LOs). By SSIS, I mean community sites (blogs, wikis, social semantic digital libraries, bookmark sharing, video sharing etc.) with semantic annotations added. The prototype will use only wikis based on the MediaWiki engine, blogs that support SIOC, and JeromeDL. For the general idea, look at the poster I presented earlier.

In this post, I will focus only on blog posts.

Metadata for blog posts will be collected with SIOC data exporters. A blog that supports SIOC places some additional information in the meta tag (inside the head tag) of its HTML code. For instance, my blog, which is available at http://dobrzanski.net, has the following statement:

The href attribute value is the URL of the RDF representation of the data on the current page. Its value changes as you browse the blog, so it always points to up-to-date RDF output. In general, the output consists of some information about the blog itself and its posts.
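
To illustrate how such a URL could be picked up from a page, here is a minimal sketch in Python. It assumes the blog advertises its SIOC data through a link element with rel="meta" and type="application/rdf+xml"; the exact markup, the SiocLinkFinder class, the find_sioc_urls function and the example.org page are all assumptions made for illustration, not IKHarvester's actual code.

```python
from html.parser import HTMLParser


class SiocLinkFinder(HTMLParser):
    """Collects href values of <link> elements that advertise RDF data."""

    def __init__(self):
        super().__init__()
        self.sioc_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Assumed autodiscovery convention: rel="meta" plus an RDF/XML type.
        if (attributes.get("rel") == "meta"
                and attributes.get("type") == "application/rdf+xml"
                and attributes.get("href")):
            self.sioc_urls.append(attributes["href"])


def find_sioc_urls(html):
    """Return every SIOC/RDF URL advertised in the given HTML text."""
    finder = SiocLinkFinder()
    finder.feed(html)
    return finder.sioc_urls


# Hypothetical page; example.org stands in for a real blog.
SAMPLE_PAGE = """
<html><head>
<title>Some post</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="meta" type="application/rdf+xml" title="SIOC"
      href="http://example.org/blog/post-1.rdf" />
</head><body></body></html>
"""

print(find_sioc_urls(SAMPLE_PAGE))
```

Only the standard library is used here, so the sketch stays independent of any particular RDF toolkit.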

IKHarvester is supposed to create metadata about a blog post so that it can be used as an informal learning object. For that reason, it employs the SIOC exporter. Given the SIOC URL of a post, it invokes the exporter and receives RDF triples. There are a number of them, and some do not carry information that is crucial for IKHarvester. So the system filters the output and saves only the important triples to the repository. When its providing feature is called, IKHarvester collects those triples from the repository and transforms them so that they describe the post in a way compatible with the LOM standard.
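
The filter-and-store step could look roughly like the Python sketch below. Triples are modelled as plain (subject, predicate, object) tuples, the repository is a simple in-memory class, and the set of important predicates is a subset of the ones used in this post; IMPORTANT_PREDICATES, filter_triples, InMemoryRepository and the sample data are hypothetical names and values, not the real IKHarvester API.

```python
from collections import defaultdict

# Assumed set of predicates worth keeping (a subset of those used in this post).
IMPORTANT_PREDICATES = frozenset({
    "dc:title", "sioc:has_creator", "sioc:content", "content:encoded",
    "sioc:topic", "sioc:has_reply", "sioc:links_to",
})


def filter_triples(triples, keep=IMPORTANT_PREDICATES):
    """Return only the triples whose predicate is in `keep`."""
    return [triple for triple in triples if triple[1] in keep]


class InMemoryRepository:
    """Toy stand-in for the triple repository, indexed by subject."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def save(self, triples):
        for triple in triples:
            self._by_subject[triple[0]].append(triple)

    def triples_for(self, subject):
        return list(self._by_subject.get(subject, []))


# Hypothetical exporter output for one post.
POST = "http://example.org/blog/post-1"
exporter_output = [
    (POST, "dc:title", "Hello"),
    (POST, "sioc:topic", "e-learning"),
    (POST, "sioc:has_container", "http://example.org/blog"),  # filtered out
]

repository = InMemoryRepository()
repository.save(filter_triples(exporter_output))
for triple in repository.triples_for(POST):
    print(triple)
```

Filtering before saving keeps the repository small, and the providing step then only has to read back the triples stored for a given post.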

The following table presents how posts’ attributes (first column) are mapped to SIOC ontology predicates (second column) and then to LOM attributes (third column). Some of the LOM attributes cannot be collected from the SIOC exporter output, so they are set to default values.

Blog posts

Attribute	Predicate	LOM
	sioc:Post	Educational.LearningResourceType=“BlogPost”
URI	–	Technical.Location General.Identifier.Catalog=“URI” General.Identifier.Entry Meta-Metadata.Identifier.Catalog=“URI” Meta-Metadata.Identifier.Entry
title	dc:title	General.Identifier.Title
creator	sioc:has_creator	Lifecycle.Contribute.Role=“Author” Lifecycle.Contribute.Entity=“Personal info.” Lifecycle.Contribute.Date=“Date of creation” Meta-Metadata.Contribute.Role=“Author” Meta-Metadata.Contribute.Entity=“Personal info.” Meta-Metadata.Contribute.Date=“Date”
creation date	dcterms:created	Lifecycle.Version=“Date”
description	sioc:content	General.Description Educational.Description Classification.Description
rich content (HTML)	content:encoded	–
topic	sioc:topic	General.Keyword Classification.Keyword
reply	sioc:has_reply	Annotation.Entity=“About author” Annotation.Date=“Date” Annotation.Description=“Content”
external link*	sioc:links_to	Relation.Kind=“references” Relation.Resource.Identifier.Catalog=“URI” Relation.Resource.Identifier.Entry Relation.Resource.Description=“references”
language	–	General.Language Educational.Language Meta-Metadata.Language
–	–	Educational.InteractivityType=“expositive”
–	–	Educational.InteractivityLevel=“medium”
–	–	Educational.SemanticDensity=“medium”
–	–	Educational.IntendedEndUserRole=“learner”
–	–	Educational.Context=“school” Educational.Context=“higher education” Educational.Context=“training” Educational.Context=“other”
–	–	Educational.Difficulty=“easy”
–	–	Rights.Cost=“no”
–	–	General.Structure=“atomic”
–	–	General.AggregationLevel=“1”
–	–	Meta-Metadata.MetadataSchema=“LOMv1.0”
–	–	Technical.Requirement.OrComposite… .Type=“operating system” .Name=“multi-os” .Type=“browser” .Name=“any”
–	–	Lifecycle.Status=“revised”
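
As a rough illustration of how a few rows of this table could be applied, the Python sketch below turns stored triples into a dictionary of LOM fields. Only some rows are covered, and PREDICATE_TO_LOM, LOM_DEFAULTS, to_lom and the sample post are hypothetical names and data chosen for the example, not the actual implementation.

```python
# A few predicate-to-LOM rows copied from the table above.
PREDICATE_TO_LOM = {
    "dc:title": ["General.Identifier.Title"],
    "sioc:content": ["General.Description", "Educational.Description",
                     "Classification.Description"],
    "sioc:topic": ["General.Keyword", "Classification.Keyword"],
}

# A few of the default values that do not come from the exporter output.
LOM_DEFAULTS = {
    "Educational.LearningResourceType": "BlogPost",
    "Educational.InteractivityType": "expositive",
    "Educational.Difficulty": "easy",
    "Rights.Cost": "no",
    "General.Structure": "atomic",
    "General.AggregationLevel": "1",
}


def to_lom(post_uri, triples):
    """Build a {LOM field: [values]} description of one post."""
    lom = {field: [value] for field, value in LOM_DEFAULTS.items()}
    # The URI row: the post's own address fills location and identifier.
    lom["Technical.Location"] = [post_uri]
    lom["General.Identifier.Catalog"] = ["URI"]
    lom["General.Identifier.Entry"] = [post_uri]
    for subject, predicate, obj in triples:
        if subject != post_uri:
            continue
        for field in PREDICATE_TO_LOM.get(predicate, []):
            lom.setdefault(field, []).append(obj)
    return lom


# Hypothetical stored triples for one post.
POST_URI = "http://example.org/blog/post-1"
stored = [
    (POST_URI, "dc:title", "Hello"),
    (POST_URI, "sioc:topic", "e-learning"),
    (POST_URI, "sioc:topic", "semantic web"),
]

lom_record = to_lom(POST_URI, stored)
for field in sorted(lom_record):
    print(field, "=", lom_record[field])
```

Values are kept in lists because attributes such as topic can occur more than once for a single post.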

So, for blog posts I use the SIOC ontology. I use it for wiki articles as well. For JeromeDL resources, I employ the jeromedl and marcont ontologies. There is no point in showing the tables for each resource type, since they are similar. Looking forward to hearing some feedback :)
