Fri 11 Jul 2008
I’ve pretty much got my setup as needed now: parallel Jena and AllegroGraph triple stores, each with an identical set of 21 million triples. So I’m doing experiments with queries - trying to find good tests that demonstrate particular points I want to make. I’m having some trouble with AG queries - my setup isn’t quite right I think, but Franz are being very helpful. With Jena I’m running queries via JDBC and having a lot of grief over Java heap space in Eclipse. Inching forwards though…
Next I should convert and load the thesauri, I think. I need them in place before I can do my query summarisation stuff. Designing the schema should be pretty straightforward, but I’ve found a couple of papers on the subject from the Amsterdam people, so I’ll check out their thoughts first. I’m hoping it’ll only take a day or two to do the whole thing: design the schema, export from Oracle in the right format, load into my two stores. The plan is to use separate named graphs within the same physical store.
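As a rough illustration of the named-graph idea (a toy sketch in plain Python, not the Jena or AllegroGraph API): every triple is stored alongside the name of the graph it belongs to, so the thesauri can share one store with the main data and still be queried on their own. The `QuadStore` class, the graph names and the sample triples are all made up for the example.

```python
from collections import defaultdict


class QuadStore:
    """Toy in-memory store: one physical store holding many named graphs."""

    def __init__(self):
        # graph name -> set of (subject, predicate, object) triples
        self._graphs = defaultdict(set)

    def add(self, graph, s, p, o):
        """Add a triple to the named graph, creating the graph if needed."""
        self._graphs[graph].add((s, p, o))

    def match(self, s=None, p=None, o=None, graph=None):
        """Return (graph, s, p, o) tuples matching the pattern.

        None acts as a wildcard. If graph is None, every graph is searched.
        """
        graphs = [graph] if graph is not None else sorted(self._graphs)
        results = []
        for g in graphs:
            # .get() avoids creating an empty graph for an unknown name
            for triple in sorted(self._graphs.get(g, ())):
                if all(want is None or want == got
                       for want, got in zip((s, p, o), triple)):
                    results.append((g, *triple))
        return results


# Hypothetical graph names and data, for illustration only.
store = QuadStore()
store.add("urn:example:graph:collection",
          "ex:object1", "dc:subject", "ex:concept/teapot")
store.add("urn:example:graph:thesaurus-a",
          "ex:concept/teapot", "skos:prefLabel", "teapot")
store.add("urn:example:graph:thesaurus-b",
          "ex:term/42", "skos:prefLabel", "teapot")

# Search across the whole store, or restrict to a single thesaurus graph.
labels_everywhere = store.match(p="skos:prefLabel")
labels_in_a = store.match(p="skos:prefLabel",
                          graph="urn:example:graph:thesaurus-a")
```

In a real triple store the same restriction would normally be expressed in the query itself (for example with a SPARQL `GRAPH` clause) rather than in application code.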
I need to write up my query experiments in time for the IACH deadline in August, as they’ve accepted my submission. My attempt for ISWC wasn’t in the 16% of good ones, alas. Pity, as I’d have liked to go, but actually ECDL might be more useful in terms of meeting people. It’s certainly likely to be a more familiar milieu for me.