As mentioned in the Creating a Database section, you can supply data files at database creation time to bulk load data. This page discusses adding data to an existing Stardog database. If you have too much data to load into a database, consider querying the data from a Virtual Graph.
Stardog supports the following data formats:
Stardog supports loading data from compressed files directly: there's no need to uncompress files before loading. Loading compressed data is the recommended way to load large input files. Stardog supports GZIP, BZIP2 and ZIP compressions natively.
A file provided to Stardog for loading is treated as compressed if the file name ends with .gz or .bz2. The RDF format of the file is determined by the penultimate extension. For example, if a file named test.ttl.gz is used as input, Stardog performs GZIP decompression during loading and parses the file with the Turtle parser. All the formats supported by Stardog can be used with compression.
If Stardog does not recognize the format or compression for the file being added, you can specify them explicitly using the --format and --compression options when adding data via the CLI.
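The naming rule above can be sketched as a small shell function. This is only an illustration of the rule, not Stardog's implementation: `describe_input` is a hypothetical helper, the output labels are made up for this sketch, and only the extensions that appear on this page are mapped.

```shell
# Hypothetical helper (not part of Stardog): mirrors the naming rule described
# above. A trailing .gz or .bz2 marks compression; the extension before it
# (or the last extension, if the file is uncompressed) names the RDF format.
describe_input() (
  name=$1
  compression=none
  case $name in
    *.gz)  compression=gzip;  name=${name%.gz} ;;
    *.bz2) compression=bzip2; name=${name%.bz2} ;;
  esac
  # Only the extensions used on this page are mapped; labels are illustrative.
  case $name in
    *.ttl)  format=turtle ;;
    *.nt)   format=n-triples ;;
    *.trig) format=trig ;;
    *)      format=unrecognized ;;
  esac
  printf 'compression=%s format=%s\n' "$compression" "$format"
)

describe_input test.ttl.gz
describe_input data.nt
describe_input notes.txt
```

A file that comes back as `unrecognized` here is the kind of input for which the --format and --compression options would be needed.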
The ZIP support works differently since zipped files can contain many files. When an input file name ends with .zip, Stardog performs ZIP decompression and tries to load all the files inside the ZIP file. The RDF format of the files inside the zip is determined by their file names. If there is an unrecognized file extension (e.g. .txt), then that file will be skipped.
ZIP files can only be added via the CLI or Stardog Studio. When adding data via HTTP, the Java API, or another client, you must first unzip the files on the client side.
Use the data add CLI command to add data to a database in a single commit.
The add operation is atomic -- if multiple files are being added to the database and there is an error adding one or more of the files, the entire operation will be rolled back.
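There are 2 required arguments for the data add command:
1. The database name or a connection string (e.g. http://remote-server:5820/myDatabase).
   - If only a database name is given, the server URL is taken from the value of -Dstardog.default.cli.server provided at startup.
   - If that value is not set, http://localhost:5820 is used.
   - If the server URL has no explicit port, 5820 is used.
   - To use a secure connection, specify https.
2. The files to add. A named graph is specified with @, and all the files specified after a named graph (and before the next graph) are added into that named graph.
   - Files that already exist on the server can be loaded with the --server-side flag. The client will just send the file path to the server.
To load a Turtle file (file.ttl) into the default graph of a database: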
stardog data add --format turtle myDatabase file.ttl
To load data in an N-Triples file to one specific named graph (urn:g1) on a remote server:
stardog data add --named-graph urn:g1 -- http://remote-server:5820/myDatabase path/to/data.nt
To load data into multiple named graphs:
stardog data add myDatabase input0.ttl @urn:g1 input1.ttl input2.ttl @urn:g2 input3.ttl @ input4.ttl
In the above example:
- input0.ttl is loaded into the default graph
- input1.ttl and input2.ttl are loaded into urn:g1
- input3.ttl is loaded into urn:g2
- input4.ttl is loaded into the default graph (a bare @ tells the CLI to switch back to the default graph)
To load data that exists on the server side:
stardog data add --server-side --named-graph urn:g1 -- http://remote-server:5820/myDatabase path/on/server/data.nt
Use the --remove-all flag to delete all data in the target database prior to adding data to it.
stardog data add --remove-all myDatabase data.nt
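The way `@` markers assign files to graphs in the multiple-named-graph example can be sketched with a short shell function. `plan_graphs` is a hypothetical helper written for this page. It only prints the assignment and does not call Stardog.

```shell
# Hypothetical helper: prints the graph each file argument would land in,
# following the "@graph" convention of the data add examples.
plan_graphs() (
  graph=default
  for arg in "$@"; do
    case $arg in
      @)  graph=default ;;       # a bare @ switches back to the default graph
      @*) graph=${arg#@} ;;      # @urn:g1 selects the named graph urn:g1
      *)  printf '%s -> %s\n' "$arg" "$graph" ;;
    esac
  done
)

plan_graphs input0.ttl @urn:g1 input1.ttl input2.ttl @urn:g2 input3.ttl @ input4.ttl
```

Running a dry plan like this before a large load can help catch a misplaced `@` marker.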
There are 2 main ways to upload data in Stardog Studio:
+ button to open a new tab
All HTTP requests that are mutative (add or remove) must include a Content-Type header set to the MIME type of the request body. Valid values are the MIME types for N-Triples, TriG, Turtle, N-Quads, JSON-LD, and RDF/XML:
| Data Format | MIME Type |
|---|---|
| N-Triples | application/n-triples |
| TriG | application/trig |
| Turtle | application/x-turtle or text/turtle |
| N-Quads | application/n-quads |
| JSON-LD | application/ld+json |
| RDF/XML | application/rdf+xml |
When adding compressed data, you must additionally provide a Content-Encoding header set to the compression format.
| Compression Format | Content Encoding |
|---|---|
| GZIP | gzip |
| BZIP2 | bzip2 |
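As a rough sketch, the two tables above can be turned into a helper that picks the headers for a given file. `http_headers_for` is a hypothetical function. The MIME types and encodings come from the tables, while the mapping from file extensions (.nq, .jsonld, .rdf and so on) is an assumption based on common naming conventions.

```shell
# Hypothetical helper: prints the HTTP headers for a file, one per line.
# MIME types and encodings come from the tables above; the file extensions
# are conventional and assumed here.
http_headers_for() (
  name=$1
  encoding=
  case $name in
    *.gz)  encoding=gzip;  name=${name%.gz} ;;
    *.bz2) encoding=bzip2; name=${name%.bz2} ;;
  esac
  case $name in
    *.nt)     mime=application/n-triples ;;
    *.trig)   mime=application/trig ;;
    *.ttl)    mime=text/turtle ;;
    *.nq)     mime=application/n-quads ;;
    *.jsonld) mime=application/ld+json ;;
    *.rdf)    mime=application/rdf+xml ;;
    *) echo "unrecognized RDF extension: $1" >&2; exit 1 ;;
  esac
  printf 'Content-Type: %s\n' "$mime"
  if [ -n "$encoding" ]; then
    printf 'Content-Encoding: %s\n' "$encoding"
  fi
)

http_headers_for data.ttl.gz
```

Each printed line can be passed to curl as the value of a separate `-H` option.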
The following examples show how to add data within a transaction.
1. Begin a transaction and store its ID in tx for use in steps 2 and 3.
tx=`curl -u username:password -X POST http://localhost:5820/myDatabase/transaction/begin`
2. Add data within the transaction. When adding data to the default graph, the graph-uri query parameter is optional; it is shown here for clarity.
If a format that carries context (named graph) information is used, such as TriG, the graph-uri parameter overrides the contexts specified in the data file. For example, if graph-uri is set to some:graph and a TriG file containing 10 different named graphs is added, all triples in those 10 graphs are added to some:graph. Omit the graph-uri parameter if you want to use the contexts specified in the file.
curl -u username:password -X POST "http://localhost:5820/myDatabase/${tx}/add?graph-uri=default" \
-H "Content-Type: application/trig" \
--data-binary @/path/to/data.trig
3. Commit the transaction once the data has been added.
curl -u username:password -X POST "http://localhost:5820/myDatabase/transaction/commit/${tx}"
1. Begin a transaction and store its ID in tx for use in steps 2 and 3.
tx=`curl -u username:password -X POST http://localhost:5820/mydb/transaction/begin`
2. Add data within the transaction. The graph-uri query parameter names the graph the data will be added to.
# add data within the transaction
curl -u username:password -X POST "http://localhost:5820/mydb/${tx}/add?graph-uri=urn:graph" \
-H "Content-Type: text/turtle" \
-H "Content-Encoding: gzip" \
--data-binary @/path/to/data.ttl.gz
3. Commit the transaction once the data has been added.
curl -u username:password -X POST "http://localhost:5820/mydb/transaction/commit/${tx}"
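The begin, add and commit steps above can be wrapped in one function that rolls back if the add fails. This is a sketch under assumptions: `add_in_tx` and its argument order are hypothetical, and the rollback URL is assumed to mirror the commit URL shown above, so check it against the HTTP API reference before relying on it.

```shell
# Sketch only: wraps the begin / add / commit steps and rolls back on failure.
# add_in_tx, its argument order, and the rollback URL are assumptions.
add_in_tx() (
  base=$1 db=$2 auth=$3 mime=$4 file=$5
  # -f makes curl exit non-zero on HTTP errors; -sS hides the progress meter
  tx=$(curl -fsS -u "$auth" -X POST "$base/$db/transaction/begin") || exit 1
  if curl -fsS -u "$auth" -X POST "$base/$db/$tx/add" \
       -H "Content-Type: $mime" --data-binary "@$file"
  then
    curl -fsS -u "$auth" -X POST "$base/$db/transaction/commit/$tx"
  else
    # Assumed endpoint: same shape as commit, with "rollback" in its place.
    curl -fsS -u "$auth" -X POST "$base/$db/transaction/rollback/$tx"
    exit 1
  fi
)

# Usage (placeholder values):
# add_in_tx http://localhost:5820 myDatabase username:password text/turtle /path/to/data.ttl
```

The function body runs in a subshell, so its variables do not leak into the calling shell and `exit 1` only ends the function.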
POST /{db}
The Graph Store HTTP Protocol can be used to merge the RDF contained in the request into the specified graph. This can be used to create a new named graph or add data to an existing one. A new transaction is begun automatically and committed when the data has finished loading.
curl -u username:password "http://localhost:5820/myDatabase?graph=urn:graph" -X POST \
-H "Content-Type: text/turtle" \
-H "Content-Encoding: gzip" \
--data-binary @foo.ttl.gz
For more information, see the HTTP API and the W3C SPARQL 1.1 Graph Store HTTP Protocol.
PUT /{db}
The Graph Store HTTP Protocol can also be used to overwrite and replace a named graph with the content specified in the request. If no graph exists, a new graph will be created. A new transaction is begun automatically and committed when the data has finished loading.
curl -u username:password "http://localhost:5820/myDatabase?graph=urn:graph" -X PUT \
-H "Content-Type: text/turtle" \
-H "Content-Encoding: gzip" \
--data-binary @foo.ttl.gz
For more information, see the HTTP API and the W3C SPARQL 1.1 Graph Store HTTP Protocol.
import com.complexible.stardog.api.Connection;
import com.complexible.stardog.api.ConnectionConfiguration;
import com.stardog.stark.Resource;
import com.stardog.stark.Statement;
import com.stardog.stark.Values;
import com.stardog.stark.io.RDFFormats;

import java.nio.file.Paths;
import java.util.Collection;
import java.util.Collections;

public class AddingData {
    public static void main(String[] args) {
        try (Connection aConn = ConnectionConfiguration
                .to("myDatabase")                    // the name of the db to connect to
                .server("http://localhost:5820")     // server url
                .credentials("username", "password") // credentials to use while connecting
                .connect()) {
            // begin transaction
            aConn.begin();

            // add data that exists on the client to a database (local or remote)
            aConn.add()
                 .io()
                 .format(RDFFormats.TURTLE)
                 .file(Paths.get("/path/to/data.ttl"));

            // create some data and add it to a named graph
            Collection<Statement> aGraph = Collections.singleton(
                Values.statement(Values.iri("urn:subj"),
                                 Values.iri("urn:pred"),
                                 Values.iri("urn:obj")));
            Resource aContext = Values.iri("urn:test:context");
            aConn.add().graph(aGraph, aContext);

            // commit transaction
            aConn.commit();
        }
    }
}
You must always enclose changes to a database within a transaction begin and commit or rollback. Changes are local until the transaction is committed or until you try to perform a query operation to inspect the state of the database within the transaction.
By default, RDF added will go into the default context unless specified otherwise. As shown, you can use Adder directly to add statements and graphs to the database. To add data from a file or input stream, use the io, format, and file (or stream) chain of method invocations.
For more information, take a look at Programming with Stardog.
You can use SPARQL's LOAD operation to load a file from S3 into your database. To do so:
Create an entry in your password file:
s3:s3host:port:bucket:accessKey:secretKey
Execute the LOAD query:
LOAD <s3://s3host/bucketName/path/to/file/to/load>
LOAD <s3://s3host/bucketName/path/to/file/to/load> INTO GRAPH <my:graph>
Typically s3host will be set to s3.amazonaws.com but if you are using a different S3 provider you can use the corresponding host address.
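As a small illustration, the LOAD statement can be assembled from its parts in the shell. `s3_load_query` is a hypothetical helper, and the host, bucket and path values below are the placeholders used in the examples above.

```shell
# Hypothetical helper: builds the LOAD statement for an S3 object.
# An optional fourth argument names the target graph.
s3_load_query() (
  host=$1 bucket=$2 path=$3 graph=${4:-}
  query="LOAD <s3://$host/$bucket/$path>"
  if [ -n "$graph" ]; then
    query="$query INTO GRAPH <$graph>"
  fi
  printf '%s\n' "$query"
)

s3_load_query s3.amazonaws.com bucketName path/to/file/to/load my:graph

# One possible way to run it, assuming the stardog CLI is on the PATH:
# stardog query execute myDatabase "$(s3_load_query s3.amazonaws.com bucketName path/to/file/to/load)"
```

Leaving out the fourth argument produces the plain LOAD form, which targets the default graph.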