
Models representing datasets.

class renku.domain_model.dataset.AnnotationJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

Schema for Annotations.

class renku.domain_model.dataset.Dataset(*args, **kwargs)[source]

Bases: renku.infrastructure.persistent.Persistent

Represent a dataset.


Add new files or update existing files.


Remove all files.


Return a clone of this dataset.

property creators_csv

Comma-separated list of creators associated with dataset.

property creators_full_csv

Comma-separated list of creators with full identity.
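The two CSV properties can be pictured as simple joins over the dataset's creators. The following is a minimal sketch, not the library's implementation: `Person`, `DatasetSketch`, and the `Name <email>` layout of the "full identity" are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical stand-in for a dataset creator."""

    name: str
    email: Optional[str] = None

    @property
    def full_identity(self) -> str:
        # Assumed format "Name <email>"; the real format may differ.
        return f"{self.name} <{self.email}>" if self.email else self.name


@dataclass
class DatasetSketch:
    """Hypothetical, heavily reduced model of a dataset."""

    creators: List[Person] = field(default_factory=list)

    @property
    def creators_csv(self) -> str:
        # Join only the creator names.
        return ", ".join(c.name for c in self.creators)

    @property
    def creators_full_csv(self) -> str:
        # Join the longer identity string of each creator.
        return ", ".join(c.full_identity for c in self.creators)
```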

derive_from(dataset, creator, identifier=None, date_created=None)[source]

Make self a derivative of dataset and update related fields.

property files

Return list of existing files.


Find a file in the dataset using its relative path.

static generate_id(identifier)[source]

Generate an identifier for Dataset.


Return whether the dataset has a correct derived_from.


Return true if dataset is removed.

property keywords_csv

Comma-separated list of keywords associated with dataset.


Mark the dataset as removed.


Replace the dataset’s identifier and update relevant fields.

NOTE: Call this only for newly created or newly imported datasets that don’t have a mutability chain, because it sets initial_identifier.

Mark a file as removed using its relative path.

update_files_from(current_dataset, date=None)[source]

Check the files of current_dataset to reuse existing entries and mark removed files.
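One way to read "reuse existing entries and mark removed files" is as a merge between the new file list and the files of the current dataset. The sketch below is an assumption about that behaviour, not the actual method: `FileSketch`, `merge_files`, and matching on path plus checksum are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class FileSketch:
    """Hypothetical stand-in for a file entry in a dataset."""

    path: str
    checksum: str
    date_removed: Optional[datetime] = None


def merge_files(
    new_files: List[FileSketch],
    current_files: List[FileSketch],
    date: Optional[datetime] = None,
) -> List[FileSketch]:
    """Reuse unchanged entries and keep vanished ones, marked as removed."""
    date = date or datetime.now(timezone.utc)
    # Index the files that are still present in the current dataset.
    current = {f.path: f for f in current_files if f.date_removed is None}
    merged = []
    for new in new_files:
        existing = current.pop(new.path, None)
        unchanged = existing is not None and existing.checksum == new.checksum
        # Reuse the existing entry when the content did not change.
        merged.append(existing if unchanged else new)
    # Anything left in `current` is absent from the new list: mark it removed.
    merged.extend(replace(f, date_removed=date) for f in current.values())
    return merged
```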


Update metadata.

update_metadata_from(other, exclude=None)[source]

Update metadata from another dataset.

class renku.domain_model.dataset.DatasetCreatorsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

Schema for the dataset creators.

class renku.domain_model.dataset.DatasetDetailsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

Serialize a dataset to a response object.

class renku.domain_model.dataset.DatasetFile(*, entity, based_on=None, date_added=None, date_removed=None, id=None, is_external=False, source=None)[source]

Bases: renku.infrastructure.immutable.Slots

A file in a dataset.


Return a clone of this object.

classmethod from_dataset_file(other)[source]

Return a copy with a different id.

classmethod from_path(client, path, source=None, based_on=None)[source]

Return an instance from a path.

static generate_id()[source]

Generate an identifier for DatasetFile.

NOTE: The ID should not rely on Entity properties, because the same Entity can be added and removed multiple times, and each occurrence should be represented by a different DatasetFile.
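The note above implies an identifier that is independent of the entity's path or checksum. A minimal sketch of that idea uses a random UUID; the `/dataset-files/` prefix is an assumed layout chosen for illustration, not the format the library necessarily produces.

```python
import uuid


def generate_id() -> str:
    """Return a fresh identifier that does not depend on any entity data."""
    # Assumed URL-like layout; only the random component matters here.
    return f"/dataset-files/{uuid.uuid4().hex}"
```

Two calls return different values, so adding the same entity twice yields two distinguishable file records.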


Compare content.

NOTE: id is generated randomly and should not be included in this comparison.
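Because the id is random, a content comparison has to skip it. The sketch below shows one way to do that with a frozen dataclass; `FileSketch`, its fields, and the method name `is_equal_to` are hypothetical and serve only to illustrate the rule.

```python
import uuid
from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class FileSketch:
    """Hypothetical file record with a randomly generated id."""

    path: str
    checksum: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def is_equal_to(self, other: "FileSketch") -> bool:
        # Compare every field except the random id.
        return all(
            getattr(self, f.name) == getattr(other, f.name)
            for f in fields(self)
            if f.name != "id"
        )
```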


Return true if the file is removed and should not be accessed.


Create a new instance and mark it as removed.
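Since the class is built on immutable slots, removal produces a new object instead of mutating the existing one. A minimal sketch of that pattern follows, assuming a hypothetical `FileSketch` whose removal state is carried by `date_removed`; the method names are illustrative.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FileSketch:
    """Hypothetical immutable file record."""

    path: str
    date_removed: Optional[datetime] = None

    def is_removed(self) -> bool:
        # A record counts as removed once it carries a removal date.
        return self.date_removed is not None

    def remove(self, date: Optional[datetime] = None) -> "FileSketch":
        # Return a modified copy; the original instance stays untouched.
        return replace(self, date_removed=date or datetime.now(timezone.utc))
```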

class renku.domain_model.dataset.DatasetFileDetailsJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

Serialize dataset files to a response object.

class renku.domain_model.dataset.DatasetTag(*args, **kwargs)[source]

Bases: renku.infrastructure.persistent.Persistent

Represents a Tag of an instance of a dataset.

static generate_id(dataset_id, name)[source]

Define default value for id field.

class renku.domain_model.dataset.ImageObject(*, content_url, id, position)[source]

Bases: renku.infrastructure.immutable.Slots

Represents an ImageObject.

static generate_id(dataset_id, position)[source]

Generate @id field.

property is_absolute

Whether content_url is an absolute URL rather than a relative one.
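A common way to tell absolute from relative URLs is to check for a network location after parsing. The function below is a sketch under that assumption and is written as a free function for brevity; the property itself may apply a different rule.

```python
from urllib.parse import urlparse


def is_absolute(content_url: str) -> bool:
    """Return True if the URL names a host, False for a relative path."""
    # A relative path such as "images/0.png" has an empty network location.
    return bool(urlparse(content_url).netloc)
```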

class renku.domain_model.dataset.ImageObjectJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

ImageObject json schema.

class renku.domain_model.dataset.ImageObjectRequestJson(*, only=None, exclude=(), many=False, context=None, load_only=(), dump_only=(), partial=False, unknown=None)[source]

Bases: marshmallow.schema.Schema

JSON schema for ImageObject requests.

class renku.domain_model.dataset.Language(name, alternate_name=None, id=None)[source]

Bases: renku.infrastructure.immutable.Immutable

Represent a language of an object.

static generate_id(name)[source]

Generate @id field.

class renku.domain_model.dataset.RemoteEntity(*, checksum, id=None, path, url)[source]

Bases: renku.infrastructure.immutable.Slots

Reference to an Entity in a remote repository.

static generate_id(checksum, path, url)[source]

Generate an id.

class renku.domain_model.dataset.Url(*, id=None, url=None, url_str=None, url_id=None)[source]

Bases: object

Represents a schema URL reference.

static generate_id(url_str, url_id)[source]

Generate an identifier for Url.

property value

Return the URL value as a string.

renku.domain_model.dataset.generate_default_name(dataset_title, dataset_version=None)[source]

Get dataset name.

renku.domain_model.dataset.get_dataset_data_dir(client, dataset)[source]

Return default data directory for a dataset.


Check if name is a valid slug.
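A slug check is usually a pattern match. The sketch below assumes one possible rule (alphanumeric segments separated by single `.`, `_`, or `-` characters); both the function name `is_valid_slug` and the pattern are hypothetical, and the rule enforced by the library may differ.

```python
import re

# Assumed rule: alphanumeric runs joined by single '.', '_' or '-' separators.
_SLUG_PATTERN = re.compile(r"[a-zA-Z0-9]+(?:[._-][a-zA-Z0-9]+)*")


def is_valid_slug(name: str) -> bool:
    """Return True if the whole name matches the assumed slug pattern."""
    return bool(_SLUG_PATTERN.fullmatch(name))
```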