Datasets

Model objects representing datasets.

Dataset object

class renku.core.models.datasets.Dataset(*, commit=None, client=None, path=None, project=None, parent=None, creator=NOTHING, id=None, label=None, date_published=None, description=None, identifier=NOTHING, in_language=None, keywords=NOTHING, license=None, name=None, url=None, version=None, created=NOTHING, files=NOTHING, tags=NOTHING, same_as=None, short_name=None)

Represent a dataset.

Type:

["prov:Entity", "schema:Dataset", "wfprov:Artifact"]

Context:

{
  "schema": "http://schema.org/",
  "@version": 1.1,
  "prov": "http://www.w3.org/ns/prov#",
  "wfprov": "http://purl.org/wf4ever/wfprov#",
  "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
  "path": "prov:atLocation",
  "_id": "@id",
  "_label": "rdfs:label",
  "_project": {
    "@id": "schema:isPartOf",
    "@context": {
      "schema": "http://schema.org/",
      "prov": "http://www.w3.org/ns/prov#",
      "@version": 1.1,
      "name": "schema:name",
      "created": "schema:dateCreated",
      "updated": "schema:dateUpdated",
      "version": "schema:schemaVersion",
      "creator": {
        "@id": "schema:creator",
        "@context": {
          "schema": "http://schema.org/",
          "prov": "http://www.w3.org/ns/prov#",
          "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
          "@version": 1.1,
          "name": "schema:name",
          "email": "schema:email",
          "label": "rdfs:label",
          "affiliation": "schema:affiliation",
          "alternate_name": "schema:alternateName",
          "_id": "@id"
        }
      },
      "_id": "@id"
    }
  },
  "creator": {
    "@id": "schema:creator",
    "@context": {
      "schema": "http://schema.org/",
      "prov": "http://www.w3.org/ns/prov#",
      "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
      "@version": 1.1,
      "name": "schema:name",
      "email": "schema:email",
      "label": "rdfs:label",
      "affiliation": "schema:affiliation",
      "alternate_name": "schema:alternateName",
      "_id": "@id"
    }
  },
  "date_published": "schema:datePublished",
  "description": "schema:description",
  "identifier": "schema:identifier",
  "in_language": {
    "@id": "schema:inLanguage",
    "@context": {
      "schema": "http://schema.org/",
      "@version": 1.1,
      "alternate_name": "schema:alternateName",
      "name": "schema:name"
    }
  },
  "keywords": "schema:keywords",
  "license": "schema:license",
  "name": "schema:name",
  "url": "schema:url",
  "version": "schema:version",
  "created": "schema:dateCreated",
  "files": {
    "@id": "schema:hasPart",
    "@context": {
      "schema": "http://schema.org/",
      "renku": "https://swissdatasciencecenter.github.io/renku-ontology#",
      "@version": 1.1,
      "prov": "http://www.w3.org/ns/prov#",
      "wfprov": "http://purl.org/wf4ever/wfprov#",
      "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
      "path": "prov:atLocation",
      "_id": "@id",
      "_label": "rdfs:label",
      "_project": {
        "@id": "schema:isPartOf",
        "@context": {
          "schema": "http://schema.org/",
          "prov": "http://www.w3.org/ns/prov#",
          "@version": 1.1,
          "name": "schema:name",
          "created": "schema:dateCreated",
          "updated": "schema:dateUpdated",
          "version": "schema:schemaVersion",
          "creator": {
            "@id": "schema:creator",
            "@context": {
              "schema": "http://schema.org/",
              "prov": "http://www.w3.org/ns/prov#",
              "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
              "@version": 1.1,
              "name": "schema:name",
              "email": "schema:email",
              "label": "rdfs:label",
              "affiliation": "schema:affiliation",
              "alternate_name": "schema:alternateName",
              "_id": "@id"
            }
          },
          "_id": "@id"
        }
      },
      "creator": {
        "@id": "schema:creator",
        "@context": {
          "schema": "http://schema.org/",
          "prov": "http://www.w3.org/ns/prov#",
          "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
          "@version": 1.1,
          "name": "schema:name",
          "email": "schema:email",
          "label": "rdfs:label",
          "affiliation": "schema:affiliation",
          "alternate_name": "schema:alternateName",
          "_id": "@id"
        }
      },
      "added": "schema:dateCreated",
      "name": "schema:name",
      "url": "schema:url",
      "based_on": "schema:isBasedOn",
      "external": "renku:external"
    }
  },
  "tags": {
    "@id": "schema:subjectOf",
    "@context": {
      "schema": "http://schema.org/",
      "@version": 1.1,
      "name": "schema:name",
      "description": "schema:description",
      "commit": "schema:location",
      "created": "schema:startDate",
      "dataset": "schema:about",
      "_id": "@id"
    }
  },
  "same_as": {
    "@id": "schema:sameAs",
    "@context": {
      "schema": "http://schema.org/",
      "@version": 1.1,
      "url": "schema:url",
      "_id": "@id"
    }
  },
  "short_name": "schema:alternateName"
}

as_jsonld()

Create JSON-LD.
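
The context above pairs each attribute name with a compact IRI such as schema:name. As a rough illustration of how such a term resolves to a full IRI, the sketch below copies a few prefixes and terms from the Dataset context. The helper expand_term is hypothetical and not part of renku; a real JSON-LD processor handles many more cases (nested contexts, keywords such as @id).

```python
# Prefixes copied from the Dataset context above.
PREFIXES = {
    "schema": "http://schema.org/",
    "prov": "http://www.w3.org/ns/prov#",
    "wfprov": "http://purl.org/wf4ever/wfprov#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# A few simple term definitions copied from the same context.
TERMS = {
    "name": "schema:name",
    "path": "prov:atLocation",
    "short_name": "schema:alternateName",
    "created": "schema:dateCreated",
}


def expand_term(term):
    """Expand an attribute name to a full IRI (hypothetical helper)."""
    compact = TERMS[term]
    prefix, _, suffix = compact.partition(":")
    return PREFIXES[prefix] + suffix


# Map every attribute name in TERMS to its expanded IRI.
iris = {term: expand_term(term) for term in TERMS}
```

Here iris["short_name"] resolves to the schema.org alternateName property, which is why short_name appears under that key in the serialized JSON-LD.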

asjsonld()

Create JSON-LD with the original source data.

contains_any(files)

Check if files are already within a dataset.
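
A membership check of this kind can be sketched as follows. This is only an illustration on plain path strings: the standalone function and its dataset_paths argument are assumptions, and the real method works on the dataset's own files container.

```python
def contains_any(dataset_paths, files):
    """Return True if at least one candidate path is already tracked.

    Hypothetical stand-in operating on plain strings.
    """
    known = set(dataset_paths)
    return any(path in known for path in files)


tracked = ["data/a.csv", "data/b.csv"]
overlap = contains_any(tracked, ["data/b.csv", "data/c.csv"])
no_overlap = contains_any(tracked, ["data/c.csv"])
```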

creators_csv

Comma-separated list of creators associated with dataset.
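
The CSV-style properties (creators_csv, keywords_csv, tags_csv) flatten a list attribute into one string. A minimal sketch of the idea, using hypothetical stand-in classes and assuming that names are joined with a bare comma (the separator used by the library may differ):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Creator:
    """Hypothetical stand-in for a creator record."""
    name: str
    email: str


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for Dataset, keeping only the creator list."""
    creator: List[Creator] = field(default_factory=list)

    @property
    def creators_csv(self):
        # Assumption: creator names are joined with a bare comma.
        return ",".join(c.name for c in self.creator)


ds = DatasetSketch(
    creator=[
        Creator("Ada", "ada@example.org"),
        Creator("Alan", "alan@example.org"),
    ]
)
print(ds.creators_csv)
```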

creators_full_csv

Comma-separated list of creators with full identity.

data_dir

Directory where dataset files are stored.

default_id()

Configure calculated ID.

default_label()

Generate a default label.

default_reference()

Create a default reference path.

editable

Subset of attributes which user can edit.

entities

Yield itself.

find_file(filename, return_index=False)

Find a file in files container.
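
The return_index flag suggests that the lookup can yield either the matching entry or its position. The sketch below illustrates that pattern on a list of dictionaries. The standalone function, the "path" key, and the None result for a miss are all assumptions, not the library's confirmed behavior.

```python
def find_file(files, filename, return_index=False):
    """Look up the entry whose path equals filename (hypothetical helper).

    Returns the entry, or its index when return_index is True.
    Returns None when nothing matches (assumed behavior).
    """
    for index, file_ in enumerate(files):
        if file_["path"] == filename:
            return index if return_index else file_
    return None


files = [{"path": "data/a.csv"}, {"path": "data/b.csv"}]
entry = find_file(files, "data/b.csv")
position = find_file(files, "data/b.csv", return_index=True)
missing = find_file(files, "data/z.csv")
```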

find_files(paths)

Return all paths that are in files container.

classmethod from_jsonld(data, client=None, commit=None, schema_class=None)

Create an instance from JSON-LD data.

classmethod from_revision(client, path, revision='HEAD', parent=None, find_previous=True, **kwargs)

Return dependency from given path and revision.

classmethod from_yaml(path, client=None, commit=None)

Return an instance from a YAML file.

keywords_csv

Comma-separated list of keywords associated with dataset.

parent

Return the parent object.

rename_files(rename)

Rename files using the path mapping function.
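
The rename argument is described as a path mapping function, that is, a callable that takes an old path and returns a new one. A minimal sketch of that idea on plain strings follows; both functions are hypothetical, and the real method updates the dataset's files rather than returning a list.

```python
from pathlib import PurePosixPath


def rename_files(paths, rename):
    """Apply a path mapping function to every path (hypothetical helper)."""
    return [rename(path) for path in paths]


def move_to_archive(path):
    """Example mapping: keep the file name, place it under 'archive/'."""
    return str(PurePosixPath("archive") / PurePosixPath(path).name)


renamed = rename_files(["data/a.csv", "data/b.csv"], move_to_archive)
```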

set_client(client)

Set the client on this entity.

short_id

Shorter version of identifier.

short_name_validator(attribute, value)

Validate short_name.

submodules

Proxy to client submodules.

tags_csv

Comma-separated list of tags associated with dataset.

to_yaml()

Write an instance to the referenced YAML file.

uid

UUID part of identifier.

Unlink a file from dataset.

Parameters: file_path – Relative path used as key inside files container.

update_files(files)

Update files with collection of DatasetFile objects.

update_metadata(other_dataset)

Update instance attributes with the attributes of another dataset.

Parameters: other_dataset (Dataset)
Returns: self
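
Returning self allows the call to be chained or assigned in one expression. The sketch below shows the general pattern with a hypothetical stand-in class. Which attributes are actually copied is an assumption here (a fixed tuple named EDITABLE), not the library's confirmed list.

```python
class DatasetSketch:
    """Hypothetical stand-in for Dataset with three metadata fields."""

    # Assumption: only these attributes are copied.
    EDITABLE = ("name", "description", "keywords")

    def __init__(self, name=None, description=None, keywords=None):
        self.name = name
        self.description = description
        self.keywords = keywords or []

    def update_metadata(self, other_dataset):
        """Copy the selected attributes from other_dataset and return self."""
        for attribute in self.EDITABLE:
            setattr(self, attribute, getattr(other_dataset, attribute))
        return self


target = DatasetSketch(name="old")
source = DatasetSketch(name="new", description="Updated", keywords=["x"])
result = target.update_metadata(source)
```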

Dataset file

Manage files in the dataset.

class renku.core.models.datasets.DatasetFile(*, commit=None, client=None, path=None, id=None, label=NOTHING, project=None, parent=None, creator=NOTHING, added=NOTHING, checksum=None, filename=NOTHING, name=None, filesize=None, filetype=None, url=None, based_on=None, external=False)

Represent a file in a dataset.

Type:

["prov:Entity", "schema:DigitalDocument", "wfprov:Artifact"]

Context:

{
  "schema": "http://schema.org/",
  "renku": "https://swissdatasciencecenter.github.io/renku-ontology#",
  "@version": 1.1,
  "prov": "http://www.w3.org/ns/prov#",
  "wfprov": "http://purl.org/wf4ever/wfprov#",
  "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
  "path": "prov:atLocation",
  "_id": "@id",
  "_label": "rdfs:label",
  "_project": {
    "@id": "schema:isPartOf",
    "@context": {
      "schema": "http://schema.org/",
      "prov": "http://www.w3.org/ns/prov#",
      "@version": 1.1,
      "name": "schema:name",
      "created": "schema:dateCreated",
      "updated": "schema:dateUpdated",
      "version": "schema:schemaVersion",
      "creator": {
        "@id": "schema:creator",
        "@context": {
          "schema": "http://schema.org/",
          "prov": "http://www.w3.org/ns/prov#",
          "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
          "@version": 1.1,
          "name": "schema:name",
          "email": "schema:email",
          "label": "rdfs:label",
          "affiliation": "schema:affiliation",
          "alternate_name": "schema:alternateName",
          "_id": "@id"
        }
      },
      "_id": "@id"
    }
  },
  "creator": {
    "@id": "schema:creator",
    "@context": {
      "schema": "http://schema.org/",
      "prov": "http://www.w3.org/ns/prov#",
      "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
      "@version": 1.1,
      "name": "schema:name",
      "email": "schema:email",
      "label": "rdfs:label",
      "affiliation": "schema:affiliation",
      "alternate_name": "schema:alternateName",
      "_id": "@id"
    }
  },
  "added": "schema:dateCreated",
  "name": "schema:name",
  "url": "schema:url",
  "based_on": "schema:isBasedOn",
  "external": "renku:external"
}

as_jsonld()

Create JSON-LD.

asjsonld()

Create JSON-LD with the original source data.

creators_csv

Comma-separated list of creators associated with dataset.

creators_full_csv

Comma-separated list of creators with full identity.

default_filename()

Generate default filename based on path.
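
One plausible reading of "based on path" is that the default is the final path component. The sketch below assumes exactly that; the standalone function is hypothetical and the library may derive the name differently.

```python
from pathlib import PurePosixPath


def default_filename(path):
    """Return the last component of a path (assumed default)."""
    return PurePosixPath(path).name


filename = default_filename("data/raw/measurements.csv")
```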

default_id()

Configure calculated ID.

default_label()

Generate a default label.

default_reference()

Create a default reference path.

entities

Yield itself.

classmethod from_jsonld(data)

Create an instance from JSON-LD data.

classmethod from_revision(client, path, revision='HEAD', parent=None, find_previous=True, **kwargs)

Return dependency from given path and revision.

classmethod from_yaml(path, client=None, commit=None)

Return an instance from a YAML file.

full_path

Return full path in the current reference frame.

parent

Return the parent object.

set_client(client)

Set the client on this entity.

size_in_mb

Return file size in megabytes.
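
As a worked example of the conversion, the sketch below assumes that filesize is stored in bytes, that a megabyte means 10^6 bytes, and that a missing size yields None. All three points are assumptions; the library may use a different convention (for example 1024 * 1024 bytes).

```python
def size_in_mb(filesize):
    """Convert a size in bytes to decimal megabytes (hypothetical helper)."""
    if filesize is None:
        # No recorded size, so there is nothing to convert.
        return None
    return filesize / 1_000_000


size = size_in_mb(2_500_000)
unknown = size_in_mb(None)
```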

submodules

Proxy to client submodules.

to_yaml()

Store an instance to the referenced YAML file.