Sunday, 15 March 2015

rdf - Computing custom histogram metrics to understand graph structure using SPARQL -




I am looking to analyze the structure of a graph, and one particular query I want to write would extract the different combinations of subject type - edge type - object type in the graph.

This follows on from a couple of my earlier questions:

How to generate all triples that fit a particular node type or/and edge type using a SPARQL query?

How to list and count the different types of node and edge entities in the graph data using a SPARQL query?

For example, if there is a semantic graph with edge types (property/predicate types) such as:

    iscapitalof
    iscityof
    haspopulation
    etc.

and node types such as:

    cities
    countries
    rivers
    mountains
    etc.

then I should get:

    city->iscapitalof->country     4 tuples
    city->iscityof->country        21 tuples
    river->ispartof->country       3
    river->passesthrough->city     11

and so on...

Note: there should be no literals in the object field. I want every unit subgraph pattern that fits (subjecttype edgetype objecttype).

To summarize, I think the way I'd approach this would be:

a) compute the distinct subject types in the graph
b) compute the distinct edge types in the graph
c) compute the distinct object types in the graph
(a/b/c have been answered in my previous questions)

Now d) generate all possible combinations of subject type -> edge type -> object type (no literals), and the counts (like a histogram) of such patterns.

I hope the question is articulated reasonably well.

Edit: adding sample data [a few rows from the entire dataset]. This is the YAGO dataset, which is publicly available.

    <alabama> rdf:type <wordnet_country_108544813> .
    <abraham_lincoln> rdf:type <wordnet_president_110467179> .
    <aristotle> rdf:type <wordnet_writer_110794014> .
    <academy_award_for_best_art_direction> rdf:type <wordnet_award_106696483> .
    <academy_award> rdf:type <wordnet_award_106696483> .
    <actrius> rdf:type <wordnet_movie_106613686> .
    <animalia_(book)> rdf:type <wordnet_book_106410904> .
    <ayn_rand> rdf:type <wordnet_novelist_110363573> .
    <allan_dwan> rdf:type <wikicategory_american_film_directors> .
    <algeria> rdf:type <wordnet_country_108544813> .
    <andre_agassi> rdf:type <wordnet_player_110439851> .
    <austro-asiatic_languages> rdf:type <wordnet_language_106282651> .
    <afroasiatic_languages> rdf:type <wordnet_language_106282651> .
    <andorra> rdf:type <wordnet_country_108544813> .
    <animal_farm> rdf:type <wordnet_novelette_106368962> .
    <alaska> rdf:type <wordnet_country_108544813> .
    <aldous_huxley> rdf:type <wordnet_writer_110794014> .
    <andrei_tarkovsky> rdf:type <wordnet_film_maker_110088390> .

Suppose you've got data like this:

    @prefix : <http://stackoverflow.com/q/24313367/1281433/> .

    :city1 a :city .
    :city2 a :city .

    :country1 a :country .
    :country2 a :country .
    :country3 a :country .

    :river1 a :river .
    :river2 a :river .
    :river3 a :river .

    :city1 :iscapitalof :country1 .

    :river1 :ispartof :country1, :country2 .
    :river2 :ispartof :country2, :country3 .

    :river1 :passesthrough :city1, :city2 .
    :river2 :passesthrough :city2 .

Then this query gives the kind of results you want, I think:

    prefix : <http://stackoverflow.com/q/24313367/1281433/>

    select ?type1 ?p ?type2 (count(distinct *) as ?count) {
      [ a ?type1 ; ?p [ a ?type2 ] ]
    }
    group by ?type1 ?p ?type2

    ----------------------------------------------
    | type1  | p              | type2    | count |
    ==============================================
    | :river | :passesthrough | :city    | 3     |
    | :city  | :iscapitalof   | :country | 1     |
    | :river | :ispartof      | :country | 4     |
    ----------------------------------------------
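To make it clearer what the query counts, here is a minimal plain-Python sketch of the same grouping over the sample data. It is only an illustration, not a replacement for running the query against a SPARQL endpoint. The tuple representation and the function name `type_pattern_histogram` are assumptions made for this example.

```python
from collections import Counter

TYPE = "rdf:type"  # stands in for the `a` keyword in the query

# The sample data above, written out as (subject, predicate, object) tuples.
triples = [
    (":city1", TYPE, ":city"),
    (":city2", TYPE, ":city"),
    (":country1", TYPE, ":country"),
    (":country2", TYPE, ":country"),
    (":country3", TYPE, ":country"),
    (":river1", TYPE, ":river"),
    (":river2", TYPE, ":river"),
    (":river3", TYPE, ":river"),
    (":city1", ":iscapitalof", ":country1"),
    (":river1", ":ispartof", ":country1"),
    (":river1", ":ispartof", ":country2"),
    (":river2", ":ispartof", ":country2"),
    (":river2", ":ispartof", ":country3"),
    (":river1", ":passesthrough", ":city1"),
    (":river1", ":passesthrough", ":city2"),
    (":river2", ":passesthrough", ":city2"),
]


def type_pattern_histogram(triples):
    """Count (subject type, predicate, object type) patterns (hypothetical helper)."""
    # Map each node to the set of types declared for it.
    types = {}
    for s, p, o in triples:
        if p == TYPE:
            types.setdefault(s, set()).add(o)

    histogram = Counter()
    # set(...) drops duplicate triples, in the spirit of count(distinct *).
    for s, p, o in set(triples):
        # Only edges whose subject AND object both have a type contribute,
        # which is why literals and untyped nodes never show up.
        for t1 in types.get(s, ()):
            for t2 in types.get(o, ()):
                histogram[(t1, p, t2)] += 1
    return histogram


histogram = type_pattern_histogram(triples)
for (t1, p, t2), count in sorted(histogram.items()):
    print(t1, p, t2, count)
```

A node with several types contributes one count per type combination, which matches how the query binds ?type1 and ?type2.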

If you're not comfortable with the [ … ] blank node syntax, it might help to see the expanded form:

    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    select ?type1 ?p ?type2 (count(distinct *) as ?count) {
      _:b0 rdf:type ?type1 .
      _:b0 ?p _:b1 .
      _:b1 rdf:type ?type2
    }
    group by ?type1 ?p ?type2

This only catches things that have types, though. If you want to include things that don't have rdf:types, you'd want:

    select ?type1 ?p ?type2 (count(distinct *) as ?count) {
      ?x ?p ?y
      optional { ?x a ?type1 }
      optional { ?y a ?type2 }
    }
    group by ?type1 ?p ?type2
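The effect of the two optional blocks can be sketched in plain Python as well. This is a rough illustration under the assumption that an unbound type is represented as None; the function name and the tiny data set (including the :thing node and :near property) are made up for the example. Note that with this variant every triple matches ?x ?p ?y, so the rdf:type triples themselves, and any triples with literal objects, also end up in the counts.

```python
from collections import Counter

TYPE = "rdf:type"  # stands in for the `a` keyword in the query


def type_pattern_histogram_with_untyped(triples):
    """Like the typed-only grouping, but keeps edges whose ends lack a type."""
    types = {}
    for s, p, o in triples:
        if p == TYPE:
            types.setdefault(s, set()).add(o)

    histogram = Counter()
    for s, p, o in set(triples):
        # `or {None}` mimics OPTIONAL: a node without a type still yields
        # one row, with the type left unbound (None here).
        for t1 in types.get(s) or {None}:
            for t2 in types.get(o) or {None}:
                histogram[(t1, p, t2)] += 1
    return histogram


# Hypothetical mini data set: :country1 and :thing have no declared type.
sample = [
    (":city1", TYPE, ":city"),
    (":city1", ":iscapitalof", ":country1"),
    (":thing", ":near", ":city1"),
]

untyped_histogram = type_pattern_histogram_with_untyped(sample)
for pattern, count in untyped_histogram.items():
    print(pattern, count)
```

If the rdf:type rows or literal objects are unwanted, they would need to be filtered out, either in the query or after the fact.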

graph rdf sparql
