GSOC2013_Progress_Hady Elsahar
hady elsahar edited this page Aug 8, 2013 · 36 revisions
- Sebastian Hellmann
- Dimitris Kontokostas
#Project Progress:
- public clone of Extraction framework
- preparing development environment
- compiling the Extraction framework
- Getting to know the main class structures of the DBpedia extraction framework
readings
- papers #1, #2 and #4 from the DBpedia publications
- http://wiki.dbpedia.org/Documentation
important discussions :
- exploring the PubSubHubbub Protocol
- installing a local hub and subscribing to an RSS feed
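A rough sketch of what subscribing to a hub involves, assuming the 0.3 version of the PubSubHubbub spec: the subscriber POSTs a form-encoded request to the hub, and the hub then calls the callback URL with a `hub.challenge` value that has to be echoed back. The object name, method names and example URLs are made up for illustration; only the `hub.*` parameter names come from the spec.

```scala
import java.net.URLEncoder

object PubSubHubbubSketch {
  // Encode parameters as application/x-www-form-urlencoded.
  def formEncode(params: Seq[(String, String)]): String =
    params.map { case (k, v) =>
      URLEncoder.encode(k, "UTF-8") + "=" + URLEncoder.encode(v, "UTF-8")
    }.mkString("&")

  // Body of the POST sent to the hub to subscribe `callback` to `topic`.
  // "hub.verify" is part of spec version 0.3 (assumed here).
  def subscribeBody(topic: String, callback: String): String =
    formEncode(Seq(
      "hub.mode"     -> "subscribe",
      "hub.topic"    -> topic,
      "hub.callback" -> callback,
      "hub.verify"   -> "sync"
    ))

  // The hub verifies the subscription with a GET on the callback URL.
  // Return the challenge to echo back, or None if the request is not
  // a subscribe verification for the topic we asked for.
  def verificationResponse(query: Map[String, String],
                           expectedTopic: String): Option[String] =
    for {
      mode      <- query.get("hub.mode") if mode == "subscribe"
      topic     <- query.get("hub.topic") if topic == expectedTopic
      challenge <- query.get("hub.challenge")
    } yield challenge
}
```

The body from `subscribeBody` would be POSTed to the hub's URL, and the callback endpoint would answer the hub's GET with the string from `verificationResponse` and a 2xx status.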
Overview of the PubSubHubbub protocol
readings
- PubSubHubbub home page : https://code.google.com/p/pubsubhubbub/
important discussions :
- Create an RDF dump from 1-2K Wikidata entities
- work on the language links from the API:
  - process Wikidata info and generate the master IL (interlanguage) links file
  - produce language-specific same_as files from the master IL links file
- Create a few mappings in the mappings wiki (as owl:equivalentProperty) for the most common properties in the dumps
important discussions :
- Step 1: creating the master LLinks file (replacing the old bash commands with Scala code)
- Step 2: creating the language-specific LLinks extractions in folders (after several code iterations we agreed that we can rely on links coming in blocks); algorithm implemented
- updating the code to use existing Extraction framework utilities instead of rewriting them
- Code reviews 1, 2, 3
- More code reviews, some code conflicts
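The idea behind Step 2 can be sketched roughly as follows. This is not the code from the branch, only a minimal illustration of why it helps that links come in blocks: if all links of one entity sit on consecutive lines, only one block has to be held in memory at a time. The `Link` record, the object name and the in-memory output map are assumptions made for this example; the real code writes into per-language folders.

```scala
object LangLinksSketch {
  // One record of the master links file: entity ID, language code, resource URI.
  // This record shape is assumed for illustration only.
  final case class Link(entity: String, lang: String, uri: String)

  val SameAs = "http://www.w3.org/2002/07/owl#sameAs"

  // Group consecutive records that share the same entity into blocks.
  // Relies on the assumption that all links of an entity are adjacent.
  def blocks(links: Iterator[Link]): Iterator[Vector[Link]] =
    new Iterator[Vector[Link]] {
      private val in = links.buffered
      def hasNext: Boolean = in.hasNext
      def next(): Vector[Link] = {
        val entity = in.head.entity
        val block = Vector.newBuilder[Link]
        while (in.hasNext && in.head.entity == entity) block += in.next()
        block.result()
      }
    }

  // For every block, emit sameAs triples from each language's URI to the
  // URIs of all other languages, collected per source language.
  def sameAsByLanguage(links: Iterator[Link]): Map[String, Vector[String]] = {
    var out = Map.empty[String, Vector[String]]
    for (block <- blocks(links); from <- block; to <- block if to.lang != from.lang) {
      val triple = s"<${from.uri}> <$SameAs> <${to.uri}> ."
      out = out.updated(from.lang, out.getOrElse(from.lang, Vector.empty) :+ triple)
    }
    out
  }
}
```

Each key of the resulting map corresponds to one language-specific file. An entity with a link in only one language produces no triples.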
important links/Discussions :
- implicit conversions in Scala
- Master branch uses Scala 2.9, dump branch uses Scala 2.10
- Updating RichReader.foreach to support end-of-line detection
- recent commits: https://github.com/hadyelsahar/extraction-framework/commits/lang-link-extract
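To illustrate what end-of-line detection in a line-by-line `foreach` can look like, here is a small standalone sketch. It is not the actual `RichReader` code: `foreachLine` and its `(line, terminated)` callback are hypothetical names chosen for this example. It reads character by character so that the caller learns whether a line really ended with a terminator (`\n`, `\r` or `\r\n`) or was cut off by the end of the input.

```scala
import java.io.Reader

object LineReaderSketch {
  // Calls f(line, terminated) for every line read from `reader`.
  // `terminated` is false only for a last line that ends at end-of-input
  // without a line terminator.
  def foreachLine(reader: Reader)(f: (String, Boolean) => Unit): Unit = {
    val sb = new StringBuilder
    var pendingCr = false // previous char was '\r'; a following '\n' belongs to it
    var c = reader.read()
    while (c != -1) {
      val ch = c.toChar
      if (ch == '\n') {
        // A '\n' right after '\r' completes "\r\n" and starts no new line.
        if (!pendingCr) { f(sb.toString, true); sb.clear() }
        pendingCr = false
      } else if (ch == '\r') {
        f(sb.toString, true); sb.clear()
        pendingCr = true
      } else {
        sb.append(ch)
        pendingCr = false
      }
      c = reader.read()
    }
    // Leftover characters form an unterminated last line.
    if (sb.nonEmpty) f(sb.toString, false)
  }
}
```

A caller can use the flag to spot a truncated last line, for example in a file that is still being written.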
--- off to Leipzig 2-8 > 6-8
- Running the wda-export-data.py script on the lgd server
important discussions/Links :