Help:WikiPathways SPARQL queries
Revision as of 09:40, 29 November 2012
The SPARQL endpoint at sparql.wikipathways.org hosts a replica of the WikiPathways content. The endpoint is still under development and is updated at irregular intervals.
Resources
- WikiPathways internal vocabularies: http://vocabularies.wikipathways.org
- WikiPathways data as RDF: http://rdf.wikipathways.org
- WikiPathways SPARQL endpoint: http://sparql.wikipathways.org
- Identifiers.org: http://identifiers.org
- Sparqlbin: http://sparqlbin.org
Other SPARQL endpoints
- Gene Wiki: http://genewiki.semwebinsi.de
Prefixes
The example queries below omit the prefix declarations for readability. They use the following prefixes (the list is not yet complete):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX cas: <http://identifiers.org/cas/>
PREFIX wprdf: <http://rdf.wikipathways.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX pubmed: <http://www.ncbi.nlm.nih.gov/pubmed/>
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX biopax: <http://www.biopax.org/release/biopax-level3.owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX ncbigene: <http://identifiers.org/ncbigene/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX gpml: <http://vocabularies.wikipathways.org/gpml#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
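Because the queries on this page omit their prefixes, they have to be completed before they can be sent to an endpoint. The following is a minimal Python sketch of one way to do that. The helper name `with_prefixes` and the `PREFIXES` dictionary are illustrative assumptions, not part of any WikiPathways tooling; the namespace IRIs are copied from the list above.

```python
# Hypothetical helper: prepend the page's PREFIX declarations to a query body.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cas": "http://identifiers.org/cas/",
    "wprdf": "http://rdf.wikipathways.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "pubmed": "http://www.ncbi.nlm.nih.gov/pubmed/",
    "wp": "http://vocabularies.wikipathways.org/wp#",
    "biopax": "http://www.biopax.org/release/biopax-level3.owl#",
    "dcterms": "http://purl.org/dc/terms/",
    "ncbigene": "http://identifiers.org/ncbigene/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "gpml": "http://vocabularies.wikipathways.org/gpml#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def with_prefixes(query, prefixes=PREFIXES):
    """Return the query with one PREFIX line per entry placed in front of it."""
    header = "\n".join(
        "PREFIX {}: <{}>".format(name, iri) for name, iri in prefixes.items()
    )
    return header + "\n" + query


# Example: complete the "Get all datasources" query from this page.
full_query = with_prefixes(
    "SELECT DISTINCT ?datasource WHERE { ?concept dc:source ?datasource }"
)
```

Declaring a prefix that a query does not use is harmless, so the sketch adds the whole list rather than detecting which prefixes are needed.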
Example queries
Queries marked with a * need a bit more time to return results.
Data curation oriented queries
Get the pathways that contain entries with the erroneous data source "null"
SELECT DISTINCT ?identifier ?pathway ?label
WHERE {
?concept dc:source "null"^^xsd:string .
?concept dc:identifier ?identifier .
?concept dcterms:isPartOf ?pathway .
?concept rdfs:label ?label
}
Get all gene products that lack either a DataSource or an Identifier
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?pathway ?label
WHERE {
?geneProduct a wp:GeneProduct .
?geneProduct rdfs:label ?label .
?geneProduct dcterms:isPartOf ?pathway .
FILTER regex(str(?geneProduct), "^node") .
FILTER regex(str(?pathway), "^http") .
}
Pathway oriented queries
Get the species currently in WikiPathways with their respective URIs
SELECT DISTINCT ?organism ?label
WHERE {
?concept wp:organism ?organism .
?organism rdfs:label ?label .
}
List pathways and their species
SELECT DISTINCT ?title ?label
WHERE {
?pathway dc:title ?title .
?pathway wp:organism ?organism .
?organism rdfs:label ?label .
}
List the species captured in WikiPathways and the number of pathways per species
SELECT ?organism ?label (COUNT(?pathway) AS ?noPathways)
WHERE {
?pathway dc:title ?title .
?pathway wp:organism ?organism .
?organism rdfs:label ?label .
}
GROUP BY ?organism ?label
ORDER BY DESC(?noPathways)
Count the pathways per pathway category
SELECT ?category (COUNT(?category) AS ?noCategories)
WHERE {
?pathway wp:category ?category .
?pathway dc:title ?title .
}
GROUP BY ?category
ORDER BY ?category
List all pathways of category Metabolic Process
SELECT DISTINCT *
WHERE {
?pathway wp:category wp:MetabolicProcess .
?pathway dc:title ?title .
}
Get all pathways that contain a CYP protein
select distinct ?pathway ?label where {
?geneProduct a wp:GeneProduct .
?geneProduct rdfs:label ?label .
?geneProduct dcterms:isPartOf ?pathway .
FILTER regex(str(?label), "CYP").
FILTER regex(str(?pathway), "^http").
}
Datasource oriented queries
Get all datasources currently captured in WikiPathways
SELECT DISTINCT ?datasource
WHERE {
?concept dc:source ?datasource
}
Get the number of entries per datasource in WikiPathways
SELECT ?datasource (COUNT(?datasource) AS ?numberEntries)
WHERE {
?concept dc:source ?datasource
}
GROUP BY ?datasource
ORDER BY DESC(?numberEntries)
Count the identifiers per data source
SELECT ?datasource ?identifier (COUNT(?identifier) AS ?numberEntries)
WHERE {
?concept dc:source ?datasource .
?concept dc:identifier ?identifier
}
GROUP BY ?datasource ?identifier
Count the identifiers per data source and order them from high to low
SELECT ?datasource ?identifier (COUNT(?identifier) AS ?numberEntries)
WHERE {
?concept dc:source ?datasource .
?concept dc:identifier ?identifier
}
GROUP BY ?datasource ?identifier
ORDER BY DESC(?numberEntries)
Return all ChEMBL compounds in WikiPathways and the pathways they are in
SELECT DISTINCT ?identifier ?pathway
WHERE {
?concept dcterms:isPartOf ?pathway .
?concept dc:source "ChEMBL compound"^^xsd:string .
?concept dc:identifier ?identifier .
}
Curator oriented queries
Extract contributors
SELECT DISTINCT ?contributor
WHERE {
?pathway dc:contributor ?contributor
}
Extract the number of pathways edited per contributor
SELECT ?contributor (COUNT(?pathway) AS ?pathwaysEdited)
WHERE {
?pathway dc:contributor ?contributor
}
GROUP BY ?contributor
ORDER BY DESC(?pathwaysEdited)
Find the pathways a user has edited so far
SELECT DISTINCT ?pathway ?pathwayLabel
WHERE {
?pathway dc:contributor wpuser:Andra .
?pathway dc:contributor ?contributor .
?pathway rdfs:label ?pathwayLabel .
}
Federated queries
WikiPathways with GeneWiki
SELECT DISTINCT ?wplabel ?identifier ?snp where {
?s dc:identifier <http://identifiers.org/ncbigene/53975> .
?s dc:identifier ?identifier .
?s rdfs:label ?wplabel .
?s dc:source ?source .
SERVICE <http://genewiki.semwebinsi.de/> {
?gws dc:identifier ?identifier .
?gws rdf:type ?gwtype .
?gws <http://genewikiplus.org/wiki/Special:URIResolver/Property-3AHasSNP> ?snp .
}
}
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix dcterms: <http://purl.org/dc/terms/>
select distinct * where {
?pwEntity dc:identifier ?identifier .
?pwEntity dcterms:isPartOf ?pathway .
SERVICE <http://genewiki.semwebinsi.de/> {
?concept dc:identifier <http://identifiers.org/ncbigene/12189> .
?concept dc:identifier ?identifier .
?concept <http://genewikiplus.org/wiki/Special:URIResolver/Property-3AIs_associated_with_disease> ?disease .
}
}
WikiPathways with ChEMBL: ChEMBL compounds in WikiPathways (without BridgeDB)
SELECT *
WHERE {{
SELECT DISTINCT ?pathway ?concept iri(bif:concat("http://linkedchemistry.info/chembl/chemblid/", bif:regexp_substr('http://identifiers.org/chembl.compound/(.*)',?identifier, 1))) as ?ChEMBLId where {
?concept dcterms:isPartOf ?pathway .
?concept dc:source "ChEMBL compound"^^xsd:string .
?concept dc:identifier ?identifier .
FILTER regex(str(?identifier), "^http").
}
} SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/>{
?ChEMBLId ?p ?o .
} }
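The inner SELECT of the query above rewrites each identifiers.org URI into the URI scheme used by the ChEMBL endpoint, using the Virtuoso built-ins bif:regexp_substr and bif:concat. As a rough illustration of that string rewrite only, here is a Python sketch. The function name `to_chembl_rdf_iri` is a hypothetical helper, and `CHEMBL25` is just a sample identifier chosen for the example.

```python
import re

# Pattern and base IRI copied from the query above.
IDENTIFIERS_PATTERN = re.compile(r"http://identifiers\.org/chembl\.compound/(.*)")
CHEMBL_RDF_BASE = "http://linkedchemistry.info/chembl/chemblid/"


def to_chembl_rdf_iri(identifier):
    """Map an identifiers.org ChEMBL compound URI to the linkedchemistry.info form.

    Returns None when the input does not match the identifiers.org pattern.
    """
    match = IDENTIFIERS_PATTERN.search(identifier)
    if match is None:
        return None
    # Group 1 is the bare compound identifier captured by "(.*)" in the pattern.
    return CHEMBL_RDF_BASE + match.group(1)


# Sample input; the identifier is illustrative, not taken from the WikiPathways data.
rewritten = to_chembl_rdf_iri("http://identifiers.org/chembl.compound/CHEMBL25")
```

The two ChEMBL assay queries that follow apply the same idea to UniProt URIs, with "http://bio2rdf.org/uniprot:" as the target prefix.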
WikiPathways with ChEMBL: all ChEMBL assays for pathways
SELECT ?pathway ?target ?assay WHERE {
{
SELECT DISTINCT
?pathway ?uniprot
iri(
bif:concat("http://bio2rdf.org/uniprot:",
bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
) as ?chembluniprot
WHERE {
?s ?p ?uniprot .
?s dcterms:isPartOf ?pathway .
FILTER regex(?uniprot, "uniprot")
}
}
SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
?target owl:sameAs ?chembluniprot .
?score chembl:forTarget ?target .
?assay chembl:hasTargetScore ?score .
}
}
WikiPathways with ChEMBL: all molecules targeting pathways
SELECT ?pathway ?target ?assay ?smiles WHERE {
{
SELECT DISTINCT
?pathway ?uniprot
iri(
bif:concat("http://bio2rdf.org/uniprot:",
bif:regexp_substr('http://identifiers.org/uniprot/(.*)',?uniprot, 1))
) as ?chembluniprot
WHERE {
?s ?p ?uniprot .
?s dcterms:isPartOf ?pathway .
FILTER regex(?uniprot, "uniprot")
}
}
SERVICE <http://rdf.farmbio.uu.se/chembl/sparql/> {
?target owl:sameAs ?chembluniprot .
?score chembl:forTarget ?target .
?assay chembl:hasTargetScore ?score .
?activity chembl:onAssay ?assay ;
chembl:forMolecule ?molecule .
?molecule bo:smiles ?smiles .
}
}
Code examples
Java
For Java, we recommend the Jena framework.
// These imports use the Jena 2.x package names; Jena 3 and later use org.apache.jena.query instead.
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.QuerySolution;
import com.hp.hpl.jena.query.ResultSet;

public class JavaCodeExample {
    public static void main(String[] args) {
        // Fetch ten arbitrary triples from the endpoint.
        String sparqlQueryString = "SELECT * WHERE {?s ?p ?o} LIMIT 10";
        Query query = QueryFactory.create(sparqlQueryString);
        QueryExecution queryExecution = QueryExecutionFactory.sparqlService("http://sparql.wikipathways.org", query);
        try {
            ResultSet resultSet = queryExecution.execSelect();
            // Print each solution as a tab-separated subject, predicate, object row.
            while (resultSet.hasNext()) {
                QuerySolution solution = resultSet.next();
                System.out.print(solution.get("s"));
                System.out.print("\t" + solution.get("p"));
                System.out.println("\t" + solution.get("o"));
            }
        } finally {
            // Release the resources held by the query execution.
            queryExecution.close();
        }
    }
}
PHP
For PHP, we recommend ARC2: Easy RDF and SPARQL for LAMP systems.
R
library(rrdf)
sparql.remote(
"http://sparql.wikipathways.org/",
"SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
)
Bioclipse
The code below works in both the JavaScript and the Groovy console:
rdf.sparqlRemote(
"http://sparql.wikipathways.org/",
"SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
)
SPARQL from the command line
For quick and easy querying, we recommend using curl (Linux and OS X):
curl -F "query=SELECT * WHERE {?s ?p ?o} LIMIT 10" http://sparql.wikipathways.org
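Python
For scripted use, a comparable request can be sketched in Python with only the standard library. This is a minimal sketch under some assumptions: the function names `build_request` and `run_select` are illustrative, the query is sent as a GET request with a `query` parameter (the curl call above posts it as form data instead), and the endpoint is assumed to return the SPARQL JSON results format when asked for it.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://sparql.wikipathways.org/"  # endpoint URL taken from this page


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that carries the query in the 'query' URL parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the endpoint honours this Accept header and answers in JSON.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_select(query, endpoint=ENDPOINT, timeout=30):
    """Send a SELECT query and return the list of result bindings."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # In the SPARQL JSON results format, rows are listed under results/bindings.
    return payload["results"]["bindings"]
```

Calling `run_select("SELECT * WHERE {?s ?p ?o} LIMIT 10")` would then return one dictionary per result row, keyed by variable name, provided the endpoint is reachable.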

