Core Business Logic

renku.core contains the business logic of renku-python. Functionality is split into subfolders based on topic, such as dataset or workflow.

Command Builder

Most renku commands require context (database/git/etc.) to be set up for them. The command builder pattern makes this easy by wrapping commands in factory methods.

Renku Command Builder.

class renku.command.command_builder.Command[source]

Base renku command builder.

__init__ of Command.

add_injection_pre_hook(order, hook)[source]

Add a pre-execution hook for dependency injection.

Parameters
  • order (int) – Determines the order of executed hooks, lower numbers get executed first.

  • hook (Callable) – The hook to add.

add_post_hook(order, hook)[source]

Add a post-execution hook.

Parameters
  • order (int) – Determines the order of executed hooks, higher numbers get executed first.

  • hook (Callable) – The hook to add.

add_pre_hook(order, hook)[source]

Add a pre-execution hook.

Parameters
  • order (int) – Determines the order of executed hooks, lower numbers get executed first.

  • hook (Callable) – The hook to add.

build()[source]

Build (finalize) the command.

Returns

Finalized command that cannot be modified.

Return type

Command

command(operation)[source]

Set the wrapped command.

Parameters

operation (Callable) – The function to wrap in the command builder.

Returns

This command.

Return type

Command

execute(*args, **kwargs)[source]

Execute the wrapped operation.

First executes the pre_hooks in ascending order, passing a read/write context between them, then calls the wrapped operation. The result of the operation is then passed to all the post_hooks, in descending order. Finally, the result is returned, or the error if there was one.

Returns

Result of execution of command.

Return type

CommandResult
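The hook ordering described for execute can be illustrated with a small stand-alone sketch. MiniCommand below is a hypothetical stand-in, not renku's Command class: it assumes hooks are stored as (order, hook) pairs, that pre-hooks receive a shared context dict, that post-hooks receive the result, and that every builder method returns the builder for chaining.

```python
from typing import Any, Callable, Dict, List, Tuple


class MiniCommand:
    """Hypothetical stand-in that mimics the documented hook ordering."""

    def __init__(self) -> None:
        self._operation: Callable[..., Any] = lambda *args, **kwargs: None
        self._pre_hooks: List[Tuple[int, Callable[[Dict[str, Any]], None]]] = []
        self._post_hooks: List[Tuple[int, Callable[[Any], None]]] = []
        self._finalized = False

    def _check_not_finalized(self) -> None:
        # A finalized (built) command must not be modified any more.
        if self._finalized:
            raise RuntimeError("Cannot modify a finalized command.")

    def command(self, operation: Callable[..., Any]) -> "MiniCommand":
        self._check_not_finalized()
        self._operation = operation
        return self

    def add_pre_hook(self, order: int, hook: Callable[[Dict[str, Any]], None]) -> "MiniCommand":
        self._check_not_finalized()
        self._pre_hooks.append((order, hook))
        return self

    def add_post_hook(self, order: int, hook: Callable[[Any], None]) -> "MiniCommand":
        self._check_not_finalized()
        self._post_hooks.append((order, hook))
        return self

    def build(self) -> "MiniCommand":
        self._finalized = True
        return self

    def execute(self, *args: Any, **kwargs: Any) -> Any:
        context: Dict[str, Any] = {}
        # Pre-hooks: lower order numbers run first.
        for _, hook in sorted(self._pre_hooks, key=lambda item: item[0]):
            hook(context)
        result = self._operation(*args, **kwargs)
        # Post-hooks: higher order numbers run first.
        for _, hook in sorted(self._post_hooks, key=lambda item: item[0], reverse=True):
            hook(result)
        return result


calls: List[str] = []
cmd = (
    MiniCommand()
    .command(lambda x: x * 2)
    .add_pre_hook(20, lambda ctx: calls.append("pre-20"))
    .add_pre_hook(10, lambda ctx: calls.append("pre-10"))
    .add_post_hook(10, lambda res: calls.append("post-10"))
    .add_post_hook(20, lambda res: calls.append("post-20"))
    .build()
)
result = cmd.execute(21)
print(calls, result)
```

The sketch omits error handling and the CommandResult wrapper; it only shows why a hook registered with a low order number runs early before the operation and late after it.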

property finalized

Whether this builder is still being constructed or has been finalized.

lock_dataset()[source]

Acquire a lock for a dataset.

lock_project()[source]

Acquire a lock for the whole project.

require_clean()[source]

Check that the repository is clean.

require_migration()[source]

Check if a migration is needed.

track_std_streams()[source]

Whether to track STD streams or not.

Returns

This command.

Return type

Command

with_commit(message=None, commit_if_empty=False, raise_if_empty=False, commit_only=None, skip_staging=False)[source]

Create a commit.

Parameters
  • message (str, optional) – The commit message. Auto-generated if left empty (Default value = None).

  • commit_if_empty (bool, optional) – Whether to commit if there are no modified files (Default value = False).

  • raise_if_empty (bool, optional) – Whether to raise an exception if there are no modified files (Default value = False).

  • commit_only (bool, optional) – Only commit the supplied paths (Default value = None).

with_communicator(communicator)[source]

Create a communicator.

Parameters

communicator (CommunicationCallback) – Communicator to use for writing to user.

with_database(write=False, path=None, create=False)[source]

Provide an object database connection.

Parameters
  • write (bool, optional) – Whether or not to persist changes to the database (Default value = False).

  • path (str, optional) – Location of the database (Default value = None).

  • create (bool, optional) – Whether the database should be created if it doesn’t exist (Default value = False).

with_git_isolation()[source]

Whether to run in git isolation or not.

working_directory(directory)[source]

Set the working directory for the command.

Parameters

directory (str) – The working directory to work in.

Returns

This command.

Return type

Command

JSON-LD Schemes

Schema classes used to serialize domain models to JSON-LD.
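As a rough illustration of what serializing a domain model to JSON-LD means, the following sketch maps a small object to a JSON-LD dictionary using only the standard library. PersonSketch, person_to_jsonld, and the schema.org property IRIs are assumptions made for this example; the schema classes listed below are built on calamus and may use a different mapping.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

SCHEMA = "http://schema.org/"


@dataclass
class PersonSketch:
    """Hypothetical domain model; not renku's Person class."""

    id: str
    name: str
    email: str


def person_to_jsonld(person: PersonSketch) -> Dict[str, Any]:
    """Map each attribute to a property IRI and add the JSON-LD keywords."""
    return {
        "@id": person.id,
        "@type": [SCHEMA + "Person"],
        SCHEMA + "name": person.name,
        SCHEMA + "email": person.email,
    }


document = person_to_jsonld(
    PersonSketch(id="mailto:jane@example.com", name="Jane Doe", email="jane@example.com")
)
print(json.dumps(document, indent=2))
```

A schema class plays the role of person_to_jsonld here: its Meta.model names the domain class, and its fields declare which IRI each attribute is written to.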

Activity JSON-LD schema.

class renku.command.schema.activity.ActivitySchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Activity schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.activity.Activity

class renku.command.schema.activity.AssociationSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Association schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.activity.Association

class renku.command.schema.activity.GenerationSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Generation schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.activity.Generation

class renku.command.schema.activity.ParameterValueSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

ParameterValue schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.parameter.ParameterValue

class renku.command.schema.activity.UsageSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Usage schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.activity.Usage

Agents JSON-LD schemes.

class renku.command.schema.agent.PersonSchema(*args, **kwargs)[source]

Bases: calamus.schema.JsonLDSchema

Person schema.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.agent.Person

class renku.command.schema.agent.SoftwareAgentSchema(*args, **kwargs)[source]

Bases: calamus.schema.JsonLDSchema

SoftwareAgent schema.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.agent.SoftwareAgent

Annotation JSON-LD schema.

class renku.command.schema.annotation.AnnotationSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Annotation schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.provenance.annotation.Annotation

Classes for integration with Calamus.

class renku.command.schema.calamus.DateTimeList(*args, **kwargs)[source]

Bases: calamus.fields.DateTime

A DateTime field that might be a list when deserializing.

Create an instance.

class renku.command.schema.calamus.JsonLDSchema(*args, **kwargs)[source]

Bases: calamus.schema.JsonLDSchema

Base schema class for Renku.

Create an instance.

class renku.command.schema.calamus.Nested(*args, **kwargs)[source]

Bases: calamus.fields.Nested

Nested field that passes along client and commit info.

Init method.

property schema

The nested calamus.Schema object.

This method was copied from marshmallow and modified to support multiple different nested schemes.

class renku.command.schema.calamus.StringList(*args, **kwargs)[source]

Bases: calamus.fields.String

A String field that might be a list when deserializing.

Create an instance.

class renku.command.schema.calamus.Uri(*args, **kwargs)[source]

Bases: calamus.fields._JsonLDField, marshmallow.fields.String, marshmallow.fields.Dict

A Dict/String field.

Create an instance.

Represent a group of run templates.

class renku.command.schema.composite_plan.CompositePlanSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

CompositePlan schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.composite_plan.CompositePlan

Datasets JSON-LD schemes.

class renku.command.schema.dataset.DatasetFileSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

DatasetFile schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.DatasetFile

class renku.command.schema.dataset.DatasetSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Dataset schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.Dataset

class renku.command.schema.dataset.DatasetTagSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

DatasetTag schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.DatasetTag

class renku.command.schema.dataset.ImageObjectSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

ImageObject schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.ImageObject

class renku.command.schema.dataset.LanguageSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Language schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.Language

class renku.command.schema.dataset.RemoteEntitySchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

RemoteEntity schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.RemoteEntity

class renku.command.schema.dataset.UrlSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Url schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.dataset.Url

renku.command.schema.dataset.dump_dataset_as_jsonld(dataset)[source]

Return JSON-LD representation of a dataset.

Parameters

dataset (Dataset) – The dataset to convert.

Returns

JSON-LD data of dataset.

Return type

dict

Entities JSON-LD schemes.

class renku.command.schema.entity.CollectionSchema(*args, **kwargs)[source]

Bases: renku.command.schema.entity.EntitySchema

Collection schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.entity.Collection

class renku.command.schema.entity.EntitySchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Entity Schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.entity.Entity

Parameters JSON-LD schemes.

class renku.command.schema.parameter.CommandInputSchema(*args, **kwargs)[source]

Bases: renku.command.schema.parameter.CommandParameterBaseSchema

CommandInput schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.CommandInput

class renku.command.schema.parameter.CommandOutputSchema(*args, **kwargs)[source]

Bases: renku.command.schema.parameter.CommandParameterBaseSchema

CommandOutput schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.CommandOutput

class renku.command.schema.parameter.CommandParameterBaseSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

CommandParameterBase schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.CommandParameterBase

class renku.command.schema.parameter.CommandParameterSchema(*args, **kwargs)[source]

Bases: renku.command.schema.parameter.CommandParameterBaseSchema

CommandParameter schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.CommandParameter

class renku.command.schema.parameter.MappedIOStreamSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

MappedIOStream schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.MappedIOStream

class renku.command.schema.parameter.ParameterLinkSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

ParameterLink schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.ParameterLink

class renku.command.schema.parameter.ParameterMappingSchema(*args, **kwargs)[source]

Bases: renku.command.schema.parameter.CommandParameterBaseSchema

ParameterMapping schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.parameter.ParameterMapping

Represent run templates.

class renku.command.schema.plan.PlanSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Plan schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.workflow.plan.Plan

Project JSON-LD schema.

class renku.command.schema.project.ProjectSchema(*args, **kwargs)[source]

Bases: renku.command.schema.calamus.JsonLDSchema

Project Schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.domain_model.project.Project

Datasets

Dataset business logic.

renku.core.dataset.dataset.create_dataset(name, client_dispatcher, title=None, description=None, creators=None, keywords=None, images=None, update_provenance=True, custom_metadata=None)[source]

Create a dataset.

Parameters
  • name (str) – Name of the dataset

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • title (Optional[str], optional) – Dataset title (Default value = None).

  • description (Optional[str], optional) – Dataset description (Default value = None).

  • creators (Optional[List[Person]], optional) – Dataset creators (Default value = None).

  • keywords (Optional[List[str]], optional) – Dataset keywords (Default value = None).

  • images (Optional[List[ImageRequestModel]], optional) – Dataset images (Default value = None).

  • update_provenance (bool, optional) – Whether to add this dataset to dataset provenance (Default value = True).

  • custom_metadata (Optional[Dict[str, Any]], optional) – Custom JSON-LD metadata (Default value = None).

Returns

The created dataset.

Return type

Dataset

renku.core.dataset.dataset.edit_dataset(name, title, description, creators, client_dispatcher, keywords=None, images=None, skip_image_update=False, custom_metadata=None)[source]

Edit dataset metadata.

Parameters
  • name (str) – Name of the dataset to edit

  • title (str) – New title for the dataset.

  • description (str) – New description for the dataset.

  • creators (List[Person]) – New creators for the dataset.

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • keywords (List[str], optional) – New keywords for dataset (Default value = None).

  • images (List[ImageRequestModel], optional) – New images for dataset (Default value = None).

  • skip_image_update (bool, optional) – Whether or not to skip updating dataset images (Default value = False).

  • custom_metadata (Dict, optional) – Custom JSON-LD metadata (Default value = None).

Returns

True if updates were performed.

Return type

bool

renku.core.dataset.dataset.export_dataset(name, provider_name, publish, tag, client_dispatcher, **kwargs)[source]

Export data to 3rd party provider.

Parameters
  • name – Name of dataset to export.

  • provider_name – Provider to use for export.

  • publish – Whether to export as proper version or draft.

  • tag – Dataset tag from which to export.

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

Remove matching files from a dataset.

Parameters
  • name – Dataset name.

  • include – Include filter for files.

  • exclude – Exclude filter for files.

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • yes – Whether to skip user confirmation or not (Default value = False).

Returns

List of files that were removed.

Return type

List[DynamicProxy]

renku.core.dataset.dataset.filter_dataset_files(client_dispatcher, dataset_gateway, names=None, creators=None, include=None, exclude=None, ignore=None, immutable=False)[source]

Filter dataset files by specified filters.

Parameters
  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • dataset_gateway (IDatasetGateway) – Injected dataset gateway.

  • names – Filter by specified dataset names. (Default value = None).

  • creators – Filter by creators. (Default value = None).

  • include – Include files matching file pattern. (Default value = None).

  • exclude – Exclude files matching file pattern. (Default value = None).

  • ignore – Ignored datasets. (Default value = None).

  • immutable – Return immutable copies of dataset objects. (Default value = False).

Returns

List of filtered files sorted by date added.

Return type

List[DynamicProxy]
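The include and exclude filters can be pictured with the following sketch, which works on plain path strings. It assumes shell-style patterns matched with fnmatchcase; the matching rules that filter_dataset_files actually applies are not specified here and may differ, and filter_paths is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def filter_paths(
    paths: Iterable[str],
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Keep paths that match an include pattern (if any) and no exclude pattern."""
    selected = []
    for path in paths:
        if include and not any(fnmatchcase(path, pattern) for pattern in include):
            continue
        if exclude and any(fnmatchcase(path, pattern) for pattern in exclude):
            continue
        selected.append(path)
    return selected


paths = ["data/a.csv", "data/b.txt", "data/raw/c.csv"]
kept = filter_paths(paths, include=["*.csv"], exclude=["data/raw/*"])
print(kept)
```

Leaving both filters as None keeps every path, which mirrors the documented defaults.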

renku.core.dataset.dataset.import_dataset(uri, client_dispatcher, database_dispatcher, name='', extract=False, yes=False, previous_dataset=None, delete=False, gitlab_token=None)[source]

Import data from a 3rd party provider or another renku project.

Parameters
  • uri – DOI or URL of dataset to import.

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • database_dispatcher (IDatabaseDispatcher) – Injected database dispatcher.

  • name – Name to give imported dataset (Default value = “”).

  • extract – Whether to extract compressed dataset data (Default value = False).

  • yes – Whether to skip user confirmation (Default value = False).

  • previous_dataset – Previously imported dataset version (Default value = None).

  • delete – Whether to delete files that don’t exist anymore (Default value = False).

  • gitlab_token – Gitlab OAuth2 token (Default value = None).

renku.core.dataset.dataset.list_dataset_files(client_dispatcher, datasets=None, creators=None, include=None, exclude=None)[source]

List dataset files.

Parameters
  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • datasets – Datasets to list files for (Default value = None).

  • creators – Creators to filter by (Default value = None).

  • include – Include filters for file paths (Default value = None).

  • exclude – Exclude filters for file paths (Default value = None).

Returns

Filtered dataset files.

Return type

List[DynamicProxy]

renku.core.dataset.dataset.list_datasets()[source]

List all datasets.

renku.core.dataset.dataset.move_files(client_dispatcher, dataset_gateway, files, to_dataset_name=None)[source]

Move files and their metadata from one or more datasets to a target dataset.

Parameters
  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • dataset_gateway (IDatasetGateway) – Injected dataset gateway.

  • files (Dict[Path, Path]) – Files to move

  • to_dataset_name (Optional[str], optional) – Target dataset (Default value = None)

renku.core.dataset.dataset.remove_dataset(name)[source]

Delete a dataset.

Parameters

name – Name of dataset to delete.

renku.core.dataset.dataset.search_datasets(name)[source]

Get all the datasets whose name starts with the given string.

Parameters

name (str) – Beginning of dataset name to search for.

Returns

List of found dataset names.

Return type

List[str]
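The prefix search amounts to the following sketch over a plain list of names. search_names is a hypothetical helper; the real function reads the names from the project's datasets.

```python
from typing import Iterable, List


def search_names(names: Iterable[str], prefix: str) -> List[str]:
    """Return the names that start with the given prefix, keeping input order."""
    return [name for name in names if name.startswith(prefix)]


found = search_names(["flights", "flights-2019", "weather"], "fli")
print(found)
```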

renku.core.dataset.dataset.set_dataset_images(client, dataset, images)[source]

Set a dataset’s images.

Parameters
  • client ("LocalClient") – The LocalClient.

  • dataset (Dataset) – The dataset to set images on.

  • images (List[ImageRequestModel]) – The images to set.

Returns

True if images were set/modified.

renku.core.dataset.dataset.show_dataset(name)[source]

Show detailed dataset information.

Parameters

name – Name of dataset to show details for.

Returns

JSON dictionary of dataset details.

Return type

dict

renku.core.dataset.dataset.update_dataset_custom_metadata(dataset, custom_metadata)[source]

Update custom metadata on a dataset.

Parameters
  • dataset (Dataset) – The dataset to update.

  • custom_metadata (Dict) – Custom JSON-LD metadata to set.

renku.core.dataset.dataset.update_dataset_git_files(client_dispatcher, files, ref, delete, dry_run)[source]

Update files and dataset metadata according to their remotes.

Parameters
  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • files (List[DynamicProxy]) – List of files to be updated.

  • ref (str) – Reference to use for update.

  • delete (bool, optional) – Indicates whether to delete files or not (Default value = False).

  • dry_run (bool) – Whether to perform update or only print changes.

Returns

Tuple of updated and deleted file records.

Return type

Tuple[List[DynamicProxy], List[DynamicProxy]]

renku.core.dataset.dataset.update_dataset_local_files(client_dispatcher, records)[source]

Update files metadata from the git history.

Returns

Tuple of updated and deleted file records.

Return type

Tuple[List[DynamicProxy], List[DynamicProxy]]

renku.core.dataset.dataset.update_datasets(names, creators, include, exclude, ref, delete, no_external, update_all, dry_run, client_dispatcher, dataset_gateway)[source]

Update dataset files.

Parameters
  • names – Names of datasets to update.

  • creators – Creators to filter dataset files by.

  • include – Include filter for paths to update.

  • exclude – Exclude filter for paths to update.

  • ref – Git reference to use for update.

  • delete – Whether to delete files that don’t exist on remote anymore.

  • no_external – Whether to exclude external files from the update.

  • update_all – Whether to update all datasets.

  • dry_run – Whether to return a preview of what would be updated.

  • client_dispatcher (IClientDispatcher) – Injected client dispatcher.

  • dataset_gateway (IDatasetGateway) – Injected dataset gateway.

renku.core.dataset.dataset.update_external_files(client, records, dry_run)[source]

Update files linked to external storage.

Parameters
  • client ("LocalClient") – The LocalClient.

  • records (List[DynamicProxy]) – File records to update.

  • dry_run (bool) – Whether to return a preview of what would be updated.

Dataset add business logic.

class renku.core.dataset.dataset_add.AddAction(value)[source]

Bases: enum.Enum

Types of action when adding a file to a dataset.

renku.core.dataset.dataset_add.add_data_to_dataset(dataset_name, urls, client_dispatcher, database_dispatcher, force=False, create=False, overwrite=False, sources=None, destination='', ref=None, external=False, extract=False, all_at_once=False, destination_names=None, repository=None, clear_files_before=False, total_size=None, with_metadata=None)[source]

Import the data into the data directory.

renku.core.dataset.dataset_add.move_files_to_dataset(client, files)[source]

Copy/Move files into a dataset’s directory.

Dataset constants.

renku.core.dataset.constant.DATASET_IMAGES = 'dataset_images'

Directory for dataset images.

renku.core.dataset.constant.POINTERS = 'pointers'

Directory for storing external pointer files.

renku.core.dataset.constant.renku_dataset_images_path(client)[source]

Return a Path instance of Renku dataset metadata folder.

renku.core.dataset.constant.renku_pointers_path(client)[source]

Return a Path instance of Renku pointer files folder.

Dataset context managers.

class renku.core.dataset.context.DatasetContext(name, create=False, commit_database=False, creator=None)[source]

Bases: object

Dataset context manager for metadata changes.

Datasets Provenance.

class renku.core.dataset.datasets_provenance.DatasetsProvenance[source]

Bases: object

A set of datasets.

add_or_update(dataset, date=None, creator=None)[source]

Add/update a dataset according to its new content.

NOTE: This function always mutates the dataset.

add_tag(dataset, tag)[source]

Add a tag to a dataset.

property datasets

Return an iterator of datasets.

get_all_tags(dataset)[source]

Return the list of all tags for a dataset.

get_by_id(id, immutable=False)[source]

Return a dataset by its id.

get_by_name(name, immutable=False, strict=False)[source]

Return a dataset by its name.

get_previous_version(dataset)[source]

Return the previous version of a dataset if any.

get_provenance_tails()[source]

Return the provenance for all datasets.

remove(dataset, date=None, creator=None)[source]

Remove a dataset.

remove_tag(dataset, tag)[source]

Remove a tag from a dataset.

update_during_migration(dataset, commit_sha, date=None, tags=None, remove=False, replace=False, preserve_identifiers=False)[source]

Add, update, remove, or replace a dataset in migration.

Pointer file business logic.

renku.core.dataset.pointer_file.create_external_file(client, target, path, checksum=None)[source]

Create a new external file.

renku.core.dataset.pointer_file.create_pointer_file(client, target, checksum=None)[source]

Create a new pointer file.

renku.core.dataset.pointer_file.get_pointer_file(client_path, path)[source]

Return pointer file from an external file.

renku.core.dataset.pointer_file.is_external_file_updated(client_path, path)[source]

Check if an update to an external file is available.

renku.core.dataset.pointer_file.update_external_file(client, path, checksum)[source]

Delete existing external file and create a new one.

Renku management dataset request models.

class renku.core.dataset.request_model.ImageRequestModel(content_url, position, mirror_locally=False, safe_image_paths=None)[source]

Bases: object

Model for passing image information to dataset use-cases.

to_image_object(dataset, client_dispatcher)[source]

Convert request model to ImageObject.

Tag management for dataset.

renku.core.dataset.tag.add_dataset_tag(dataset_name, tag, description='', force=False)[source]

Adds a new tag to a dataset.

Validates whether the tag already exists and that it follows the same rules as Docker tags. See https://docs.docker.com/engine/reference/commandline/tag/ for documentation of the Docker tag syntax.

Raises

errors.ParameterError – If tag is too long or contains invalid characters.
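The Docker tag rules referenced above (ASCII letters, digits, underscores, periods and dashes; no leading period or dash; at most 128 characters) can be checked with a single regular expression. The helper below is a sketch under that reading of the rules; is_valid_tag is a hypothetical name, and the actual check in add_dataset_tag may be written differently.

```python
import re

# First character: letter, digit or underscore (no leading "." or "-").
# Up to 127 further characters from the allowed set, 128 in total.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag follows Docker-style tag syntax."""
    return _TAG_PATTERN.fullmatch(tag) is not None


for candidate in ["v1.0.0", "-draft", "has space", "a" * 129]:
    print(repr(candidate[:12]), is_valid_tag(candidate))
```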

renku.core.dataset.tag.list_dataset_tags(dataset_name, format)[source]

List all tags for a dataset.

renku.core.dataset.tag.prompt_access_token(exporter)[source]

Prompt user for an access token for a provider.

Returns

The new access token

renku.core.dataset.tag.prompt_tag_selection(tags)[source]

Prompt user to choose a tag or <HEAD>.

renku.core.dataset.tag.remove_dataset_tags(dataset_name, tags)[source]

Removes tags from a dataset.

Dataset Providers

Providers for dataset import and export.

API for providers.

class renku.core.dataset.providers.api.ExporterApi[source]

Bases: abc.ABC

Interface defining exporter methods.

abstract access_token_url()[source]

Endpoint for creation of access token.

abstract export(publish, **kwargs)[source]

Execute export process.

abstract set_access_token(access_token)[source]

Set access token.

class renku.core.dataset.providers.api.ProviderApi[source]

Bases: abc.ABC

Interface defining provider methods.

static export_parameters()[source]

Returns parameters that can be set for export.

abstract find_record(uri, **kwargs)[source]

Find record by URI.

abstract get_exporter(dataset, access_token)[source]

Get export manager.

property is_git_based

True if provider is a git repository.

set_parameters(**kwargs)[source]

Set and validate required parameters for a provider.

abstract static supports(uri)[source]

Whether or not this provider supports a given URI.

static supports_export()[source]

Whether this provider supports dataset export.

property supports_images

True if provider supports dataset images.

static supports_import()[source]

Whether this provider supports dataset import.
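A static supports(uri) method lets calling code pick a provider for a URI without instantiating anything. The sketch below shows that dispatch pattern with a reduced interface. ProviderSketch, DoiLikeProvider, ExampleHostProvider, find_provider, and the host data.example.org are all hypothetical and only illustrate the idea; they are not part of renku.

```python
import abc
from typing import List, Optional, Type
from urllib.parse import urlparse


class ProviderSketch(abc.ABC):
    """Hypothetical, reduced version of a provider interface."""

    @staticmethod
    @abc.abstractmethod
    def supports(uri: str) -> bool:
        """Whether this provider can handle the given URI."""

    @staticmethod
    def supports_import() -> bool:
        # Assumed default for this sketch: providers opt in explicitly.
        return False


class DoiLikeProvider(ProviderSketch):
    @staticmethod
    def supports(uri: str) -> bool:
        return uri.lower().startswith("doi:") or urlparse(uri).netloc == "doi.org"

    @staticmethod
    def supports_import() -> bool:
        return True


class ExampleHostProvider(ProviderSketch):
    @staticmethod
    def supports(uri: str) -> bool:
        return urlparse(uri).netloc == "data.example.org"


PROVIDERS: List[Type[ProviderSketch]] = [DoiLikeProvider, ExampleHostProvider]


def find_provider(uri: str, providers: List[Type[ProviderSketch]]) -> Optional[Type[ProviderSketch]]:
    """Return the first provider class that claims to support the URI."""
    for provider in providers:
        if provider.supports(uri):
            return provider
    return None


chosen = find_provider("https://data.example.org/record/1", PROVIDERS)
print(chosen.__name__ if chosen else None)
```

Because the first match wins in this sketch, more specific providers would be listed before generic ones.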

class renku.core.dataset.providers.api.ProviderRecordSerializerApi(uri)[source]

Bases: abc.ABC

Interface defining provider record serializer methods.

abstract as_dataset(client)[source]

Deserialize this record to a ProviderDataset.

property files_info

Return list of dataset file proxies.

This is only valid after a call to as_dataset.

abstract is_last_version(uri)[source]

Check if record is at last possible version.

property latest_uri

Get URI of the latest version.

property url

Return url of this record.

property version

Get record version.

Dataverse API integration.

class renku.core.dataset.providers.dataverse.DataverseExporter(*, dataset, access_token, server_url=None, dataverse_name=None)[source]

Bases: renku.core.dataset.providers.api.ExporterApi

Dataverse export manager.

Method generated by attrs for class DataverseExporter.

access_token_url()[source]

Endpoint for creation of access token.

export(publish, client=None, **kwargs)[source]

Execute export process.

set_access_token(access_token)[source]

Set access token.

class renku.core.dataset.providers.dataverse.DataverseFileSerializer(*, id=None, identifier=None, name=None, file_format=None, content_size=None, description=None, content_url=None, parent_url=None, type=None)[source]

Bases: object

Dataverse record file.

Method generated by attrs for class DataverseFileSerializer.

property remote_url

Get remote URL as urllib.parse.ParseResult.

class renku.core.dataset.providers.dataverse.DataverseProvider(is_doi=False, server_url=None, dataverse_name=None)[source]

Bases: renku.core.dataset.providers.api.ProviderApi

Dataverse API provider.

Method generated by attrs for class DataverseProvider.

static export_parameters()[source]

Returns parameters that can be set for export.

find_record(uri, **kwargs)[source]

Retrieves a record from Dataverse.

Parameters

uri – DOI or URL.

Returns

The found record

Return type

DataverseRecordSerializer

get_exporter(dataset, access_token)[source]

Create export manager for given dataset.

static record_id(uri)[source]

Extract record id from URI.

set_parameters(client_dispatcher, *, dataverse_server, dataverse_name, **kwargs)[source]

Set and validate required parameters for a provider.

static supports(uri)[source]

Check if provider supports a given URI.

static supports_export()[source]

Whether this provider supports dataset export.

static supports_import()[source]

Whether this provider supports dataset import.

class renku.core.dataset.providers.dataverse.DataverseRecordSerializer(uri, json)[source]

Bases: renku.core.dataset.providers.api.ProviderRecordSerializerApi

Dataverse record serializer.

as_dataset(client)[source]

Deserialize DataverseRecordSerializer to Dataset.

property files

Get all file metadata entries.

get_files()[source]

Get Dataverse files metadata as DataverseFileSerializer.

is_last_version(uri)[source]

Check if record is at last possible version.

property latest_uri

Get URI of latest version.

property version

Get the major and minor version of this dataset.

renku.core.dataset.providers.dataverse.check_dataverse_doi(doi)[source]

Check if a DOI points to a dataverse dataset.

renku.core.dataset.providers.dataverse.check_dataverse_uri(url)[source]

Check if a URL points to a dataverse instance.

renku.core.dataset.providers.dataverse.make_file_url(file_id, base_url)[source]

Create URL to access file by ID.

renku.core.dataset.providers.dataverse.make_records_url(record_id, base_url)[source]

Create URL to access record by ID.

renku.core.dataset.providers.dataverse.make_versions_url(record_id, base_url)[source]

Create URL to access the versions of a record.

Dataverse metadata templates.

DOI API integration.

class renku.core.dataset.providers.doi.DOIMetadataSerializer(id, doi, url, abstract=None, author=None, categories=None, container_title=None, contributor=None, copyright=None, issued=None, language=None, publisher=None, title=None, type=None, version=None)[source]

Bases: renku.core.dataset.providers.api.ProviderRecordSerializerApi

Response from doi.org for DOI metadata.

as_dataset(client)[source]

Deserialize this record to a ProviderDataset.

is_last_version(uri)[source]

Check if record is at last possible version.

property latest_uri

Get URI of the latest version.

property version

Get record version.

class renku.core.dataset.providers.doi.DOIProvider(headers=None, timeout=3)[source]

Bases: renku.core.dataset.providers.api.ProviderApi

doi.org registry API provider.

find_record(uri, client=None, **kwargs)[source]

Finds DOI record.

get_exporter(dataset, secret)[source]

Implements interface ProviderApi.

static supports(uri)[source]

Whether or not this provider supports a given URI.

renku.core.dataset.providers.doi.make_doi_url(doi)[source]

Create URL to access DOI metadata.

Models for providers.

class renku.core.dataset.providers.models.ProviderDataset(*args, **kwargs)[source]

Bases: renku.domain_model.dataset.Dataset

A Dataset that is imported from a provider.

property files

Return list of existing files.

classmethod from_dataset(dataset)[source]

Create an instance from a Dataset.

classmethod from_jsonld(data, schema_class=None)[source]

Create an instance from JSON-LD data.

class renku.core.dataset.providers.models.ProviderDatasetFile(source, filename, checksum, size_in_mb, filetype, path)[source]

Bases: object

Store metadata for dataset files that will be downloaded from a provider.

class renku.core.dataset.providers.models.ProviderDatasetSchema(*args, **kwargs)[source]

Bases: renku.command.schema.dataset.DatasetSchema

ProviderDataset schema.

Create an instance.

class Meta[source]

Bases: object

Meta class.

model

alias of renku.core.dataset.providers.models.ProviderDataset

OLOS API integration.

class renku.core.dataset.providers.olos.OLOSExporter(*, dataset, access_token, server_url=None)[source]

Bases: renku.core.dataset.providers.api.ExporterApi

OLOS export manager.

Method generated by attrs for class OLOSExporter.

access_token_url()[source]

Endpoint for creation of access token.

export(publish, client=None, **kwargs)[source]

Execute export process.

set_access_token(access_token)[source]

Set access token.

class renku.core.dataset.providers.olos.OLOSProvider(server_url=None)[source]

Bases: renku.core.dataset.providers.api.ProviderApi

Provider for OLOS integration.

Method generated by attrs for class OLOSProvider.

static export_parameters()[source]

Returns parameters that can be set for export.

find_record(uri, client=None, **kwargs)[source]

Find record by URI.

get_exporter(dataset, access_token)[source]

Create export manager for given dataset.

set_parameters(client_dispatcher, *, dlcm_server=None, **kwargs)[source]

Set and validate required parameters for a provider.

static supports(uri)[source]

Check if provider supports a given URI for importing.

static supports_export()[source]

Whether this provider supports dataset export.

Renku dataset provider.

class renku.core.dataset.providers.renku.RenkuProvider(is_doi=False, accept='application/json', authorization_header=None, uri='', gitlab_token=None, renku_token=None)[source]

Bases: renku.core.dataset.providers.api.ProviderApi

Renku API provider.

Method generated by attrs for class RenkuProvider.

find_record(uri, **kwargs)[source]

Retrieves a dataset from Renku.

Parameters

uri – URL to search for.

Returns

Serializer containing record data.

Return type

RenkuRecordSerializer

get_exporter(dataset, access_token)[source]

Create export manager for given dataset.

property is_git_based

True if provider is git-based.

static supports(uri)[source]

Whether or not this provider supports a given URI.

property supports_images

True if provider is a git repository.

static supports_import()[source]

Whether this provider supports dataset import.

class renku.core.dataset.providers.renku.RenkuRecordSerializer(uri, identifier, name, latest_version_uri, project_url_ssh, project_url_http, gitlab_token, renku_token)[source]

Bases: renku.core.dataset.providers.api.ProviderRecordSerializerApi

Renku record serializer.

Create a RenkuRecordSerializer from a Dataset.

as_dataset(client)[source]

Return encapsulated dataset instance.

property datadir_exists

Whether the dataset data directory exists (might be missing in git if empty).

import_images(dataset, client_dispatcher)[source]

Add images from remote dataset.

is_last_version(uri)[source]

Check if dataset is at last possible version.

property latest_uri

Get URI of the latest version.

property project_url

URL of the Renku project in GitLab.

property repository

The cloned repository that contains the dataset.

property version

Get record version.

Zenodo API integration.

class renku.core.dataset.providers.zenodo.ZenodoDeposition(exporter, id=None)[source]

Bases: object

Zenodo record for a deposit.

Method generated by attrs for class ZenodoDeposition.

attach_metadata(dataset, tag)[source]

Attach metadata to deposition on Zenodo.

property attach_metadata_url

Return URL for attaching metadata.

property deposit_at

Return deposit at URL.

property new_deposit_url

Return URL for creating new deposit.

new_deposition()[source]

Create new deposition on Zenodo.

publish_deposition(secret)[source]

Publish existing deposition.

property publish_url

Returns publish URL.

property published_at

Return published at URL.

upload_file(filepath, path_in_repo)[source]

Upload and attach a file to existing deposition on Zenodo.

property upload_file_url

Return URL for uploading file.

class renku.core.dataset.providers.zenodo.ZenodoExporter(dataset, access_token)[source]

Bases: renku.core.dataset.providers.api.ExporterApi

Zenodo export manager.

Method generated by attrs for class ZenodoExporter.

access_token_url()[source]

Endpoint for creation of access token.

dataset_to_request()[source]

Prepare dataset metadata for request.

property default_params

Create request default parameters.

export(publish, tag=None, client=None, **kwargs)[source]

Execute entire export process.

set_access_token(access_token)[source]

Set access token.

property zenodo_url

Returns correct Zenodo URL based on environment.

class renku.core.dataset.providers.zenodo.ZenodoFileSerializer(*, id=None, checksum=None, links=None, filename=None, filesize=None)[source]

Bases: object

Zenodo record file.

Method generated by attrs for class ZenodoFileSerializer.

property remote_url

Get remote URL as urllib.parse.ParseResult.

property type

Get file type.

class renku.core.dataset.providers.zenodo.ZenodoMetadataSerializer(*, access_right=None, communities=None, contributors=None, creators=None, description=None, doi=None, extras=None, grants=None, image_type=None, journal_issue=None, journal_pages=None, journal_title=None, journal_volume=None, keywords=None, language=None, license=None, notes=None, prereserve_doi=None, publication_date=None, publication_type=None, references=None, related_identifiers=None, title=None, upload_type=None, version=None)[source]

Bases: object

Zenodo metadata.

Method generated by attrs for class ZenodoMetadataSerializer.

class renku.core.dataset.providers.zenodo.ZenodoProvider(is_doi=False)[source]

Bases: renku.core.dataset.providers.api.ProviderApi

Zenodo registry API provider.

Method generated by attrs for class ZenodoProvider.

find_record(uri, client=None, **kwargs)[source]

Retrieves a record from Zenodo.

Parameters
  • uri – DOI or URL.

  • client – The LocalClient (Default value = None).

Returns

Record found.

Return type

ZenodoRecordSerializer

find_record_by_doi(doi)[source]

Resolve the DOI and make a record for the retrieved record id.

get_exporter(dataset, access_token)[source]

Create export manager for given dataset.

static record_id(uri)[source]

Extract record id from URI.

static supports(uri)[source]

Whether or not this provider supports a given URI.

static supports_export()[source]

Whether this provider supports dataset export.

static supports_import()[source]

Whether this provider supports dataset import.
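Each provider exposes a static supports(uri) check, so a caller can pick a provider by asking each candidate in turn. The sketch below only illustrates that pattern and is not Renku's implementation: FakeZenodoProvider, FakeDOIProvider, pick_provider and the matching rules inside supports are all hypothetical.

```python
from typing import List, Optional
from urllib.parse import urlparse


class FakeZenodoProvider:
    """Hypothetical stand-in; the matching rule below is an assumption."""

    @staticmethod
    def supports(uri: str) -> bool:
        return "zenodo.org" in urlparse(uri).netloc


class FakeDOIProvider:
    """Hypothetical stand-in; the matching rule below is an assumption."""

    @staticmethod
    def supports(uri: str) -> bool:
        return uri.startswith("doi:") or urlparse(uri).netloc == "doi.org"


def pick_provider(uri: str, providers: List[type]) -> Optional[type]:
    """Return the first provider class whose supports() accepts the URI."""
    for provider in providers:
        if provider.supports(uri):
            return provider
    return None  # no provider claims this URI


providers = [FakeZenodoProvider, FakeDOIProvider]
zenodo_match = pick_provider("https://zenodo.org/record/123", providers)
doi_match = pick_provider("doi:10.1234/abc", providers)
no_match = pick_provider("https://example.com/data.csv", providers)
```

Because supports is static, the check needs no provider instance and no network access.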

class renku.core.dataset.providers.zenodo.ZenodoRecordSerializer(uri, *, conceptdoi=None, conceptrecid=None, created=None, doi=None, doi_url=None, files=None, id=None, links=None, metadata=None, modified=None, owner=None, record_id=None, state=None, submitted=None, title=None)[source]

Bases: renku.core.dataset.providers.api.ProviderRecordSerializerApi

Zenodo record.

as_dataset(client)[source]

Deserialize ZenodoRecordSerializer to Dataset.

get_files()[source]

Get Zenodo files metadata as ZenodoFileSerializer.

get_jsonld()[source]

Get record metadata as jsonld.

is_last_version(uri)[source]

Check if this record is the latest version.

property latest_uri

Get URI of latest version.

property version

Get record version.

renku.core.dataset.providers.zenodo.make_records_url(record_id)[source]

Create URL to access record by ID.

Parameters

record_id – The id of the record.

Returns

Full URL for the record.

Return type

str

Workflows

Renku workflow commands.

renku.core.workflow.activity.add_activity_if_recent(activity, activities)[source]

Add activity to activities if it’s not in the set or is the latest executed instance.

Remove existing activities that were executed earlier.
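The rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Renku's code: FakeActivity, its plan_id and ended_at fields, and the idea that two activities are instances of the same thing when they share a plan_id are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Set


@dataclass(frozen=True)
class FakeActivity:
    """Hypothetical stand-in for an Activity; not the real Renku model."""

    plan_id: str  # assumed key identifying "the same" activity
    ended_at: datetime  # assumed execution timestamp


def add_if_recent(activity: FakeActivity, activities: Set[FakeActivity]) -> None:
    """Keep only the most recently executed activity per plan_id."""
    same_plan = {a for a in activities if a.plan_id == activity.plan_id}
    if any(a.ended_at >= activity.ended_at for a in same_plan):
        return  # an equally recent or newer instance is already present
    activities.difference_update(same_plan)  # drop earlier executions
    activities.add(activity)


activities: Set[FakeActivity] = set()
add_if_recent(FakeActivity("train", datetime(2022, 1, 1)), activities)
add_if_recent(FakeActivity("train", datetime(2022, 3, 1)), activities)
add_if_recent(FakeActivity("train", datetime(2022, 2, 1)), activities)
```

After the three calls only the March execution remains: it replaced the January one, and the February one was rejected as older.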

renku.core.workflow.activity.create_activity_graph(activities, remove_overridden_parents=True, with_inputs_outputs=False)[source]

Create a dependency DAG from activities.

renku.core.workflow.activity.filter_overridden_activities(activities)[source]

Filter out overridden activities from a list of activities.

renku.core.workflow.activity.get_activities_until_paths(paths, sources, activity_gateway, client_dispatcher, revision=None)[source]

Get all current activities leading to paths, from sources.

renku.core.workflow.activity.get_all_modified_and_deleted_activities_and_entities(repository, activity_gateway)[source]

Return latest activities with at least one modified or deleted input along with the modified/deleted input entity.

An activity can be repeated if more than one of its inputs are modified.

Parameters
  • repository – The current Repository.

  • activity_gateway (IActivityGateway) – The injected Activity gateway.

Returns

Tuple of modified and deleted activities and entities.

Return type

Tuple[Set[Tuple[Activity, Entity]], Set[Tuple[Activity, Entity]]]

renku.core.workflow.activity.get_downstream_generating_activities(starting_activities, paths, ignore_deleted, client_path, activity_gateway)[source]

Return activities downstream of passed activities that generate at least one path in paths.

Parameters
  • starting_activities (Set[Activity]) – Activities to use as starting/upstream nodes.

  • paths (List[str]) – Optional generated paths to end downstream chains at.

  • ignore_deleted (bool) – Whether to ignore deleted generations.

  • client_path (Path) – Path to project’s root directory.

  • activity_gateway (IActivityGateway) – The injected Activity gateway.

Returns

All activities and their downstream activities.

Return type

Set[Activity]

renku.core.workflow.activity.get_modified_activities(activities, repository)[source]

Get lists of activities that have modified/deleted usage entities.

renku.core.workflow.activity.is_activity_valid(activity, plan_gateway)[source]

Return whether this plan is current and has not been deleted.

Parameters
  • activity (Activity) – The Activity whose Plan should be checked.

  • plan_gateway (IPlanGateway) – The injected Plan gateway.

Returns

True if the activity’s Plan is still valid, False otherwise.

Return type

bool

renku.core.workflow.activity.sort_activities(activities, remove_overridden_parents=True)[source]

Return a sorted list of activities based on their dependencies and execution order.
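Sorting by dependencies amounts to a topological sort of the dependency DAG. The sketch below shows the idea with the standard library's graphlib (Python 3.9+). It is not how Renku implements sort_activities, and the activity names are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the activities in its set.
dependencies = {
    "train": {"preprocess"},
    "evaluate": {"train"},
    "preprocess": set(),
}

# static_order() yields every node only after all of its predecessors.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Here the dependencies form a single chain, so there is exactly one valid order: preprocess, then train, then evaluate.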

Build an execution graph for a workflow.

class renku.core.workflow.concrete_execution_graph.ExecutionGraph(workflows, virtual_links=False)[source]

Bases: object

Represents an execution graph for one or more workflow steps.

calculate_concrete_execution_graph(virtual_links=False)[source]

Create an execution DAG between Plans showing dependencies between them.

Resolve ParameterLinks involving ParameterMappings to the underlying actual parameters and, potentially, also virtual links determined by parameter values.

property cycles

Get potential cycles in execution graph.
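To illustrate what a cycle in an execution graph looks like, the sketch below builds a small cyclic dependency map and lets the standard library's graphlib (Python 3.9+) report it. The node names are hypothetical, and this is not the mechanism ExecutionGraph itself uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical steps where a needs b, b needs c, and c needs a again.
graph = {"a": {"b"}, "b": {"c"}, "c": {"a"}}

try:
    list(TopologicalSorter(graph).static_order())
    cycle = None
except CycleError as error:
    # The second element of args holds the nodes of the detected cycle;
    # its first and last entries are the same node.
    cycle = error.args[1]

print(cycle)
```

A graph with such a cycle has no valid execution order, so it has to be rejected before execution.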

property workflow_graph

Return a subgraph with only workflows and their dependencies.

Represent a PlanFactory for tracking workflows.

class renku.core.workflow.plan_factory.PlanFactory(command_line, explicit_inputs=None, explicit_outputs=None, explicit_parameters=None, directory=None, working_dir=None, no_input_detection=False, no_output_detection=False, success_codes=None, stdin=None, stdout=None, stderr=None)[source]

Bases: object

Factory for creating a plan from a command line call.

add_command_input(default_value, prefix=None, position=None, postfix=None, name=None, encoding_format=None)[source]

Create a CommandInput.

add_command_output(default_value, prefix=None, position=None, postfix=None, encoding_format=None, name=None, id=None, mapped_to=None)[source]

Create a CommandOutput.

add_command_output_from_input(input, name)[source]

Create a CommandOutput from an input.

add_command_output_from_parameter(parameter, name)[source]

Create a CommandOutput from a parameter.

add_command_parameter(default_value, prefix=None, position=None, name=None)[source]

Create a CommandParameter.

add_explicit_inputs()[source]

Add explicit inputs.

add_explicit_parameters()[source]

Add explicit parameters.

add_indirect_inputs()[source]

Read indirect inputs list and add them to explicit inputs.

add_indirect_outputs()[source]

Read indirect outputs list and add them to explicit outputs.

add_inputs_and_parameters(*arguments)[source]

Yield command input parameters.

add_outputs(candidates)[source]

Yield detected output and changed command input parameter.

get_stream_mapping_for_value(value)[source]

Return a stream mapping if value is a path mapped to a stream.

guess_type(value, ignore_filenames=None)[source]

Return new value and CWL parameter type.

iter_input_files(basedir)[source]

Yield tuples with input id and path.

split_command_and_args()[source]

Return tuple with command and args from command line arguments.

to_plan(project_gateway, name=None, description=None, keywords=None)[source]

Return an instance of Plan based on this factory.

watch(client_dispatcher, no_output=False)[source]

Watch a Renku repository for changes to detect outputs.

renku.core.workflow.plan_factory.add_indirect_parameter(working_dir, name, value)[source]

Add a parameter to indirect parameters.

renku.core.workflow.plan_factory.add_to_files_list(file_list_path, name, path)[source]

Add a name and path pair to a files list.

renku.core.workflow.plan_factory.delete_indirect_files_list(working_dir)[source]

Remove indirect inputs, outputs, and parameters list.

renku.core.workflow.plan_factory.get_indirect_inputs_path(client_path)[source]

Return path to file that contains indirect inputs list.

renku.core.workflow.plan_factory.get_indirect_outputs_path(client_path)[source]

Return path to file that contains indirect outputs list.

renku.core.workflow.plan_factory.get_indirect_parameters_path(client_path)[source]

Return path to file that contains indirect parameters list.

renku.core.workflow.plan_factory.read_files_list(files_list)[source]

Read a YAML files list containing name:path pairs.

renku.core.workflow.plan_factory.read_indirect_parameters(working_dir)[source]

Read and return indirect parameters.

Represent the Common Workflow Language types.

class renku.core.workflow.types.Directory(path, listing=None)[source]

Represent a directory.

class renku.core.workflow.types.File(path, mime_type=None)[source]

Represent a file.

Resolution of Workflow execution values precedence.

class renku.core.workflow.value_resolution.CompositePlanValueResolver(plan, values=None)[source]

Bases: renku.core.workflow.value_resolution.ValueResolver

Value resolution class for a CompositePlan.

Applies values and default_values to a nested workflow.

Order of precedence is as follows (from lowest to highest):
  • Default value on a parameter

  • Default value on a mapping to the parameter

  • Value passed to a mapping to the parameter

  • Value passed to the parameter

  • Value propagated to a parameter from the source of a ParameterLink
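The precedence order can be read as "the last source that provides a value wins". The following sketch expresses that reading. The resolve function and its argument names are hypothetical, and treating None as "not provided" is an assumption of the example rather than documented Renku behaviour.

```python
from typing import Any, Optional


def resolve(
    parameter_default: Optional[Any] = None,
    mapping_default: Optional[Any] = None,
    mapping_value: Optional[Any] = None,
    parameter_value: Optional[Any] = None,
    linked_value: Optional[Any] = None,
) -> Optional[Any]:
    """Return the value from the highest-precedence source that is set."""
    # Ordered from lowest to highest precedence, as in the list above.
    candidates = [
        parameter_default,
        mapping_default,
        mapping_value,
        parameter_value,
        linked_value,
    ]
    result = None
    for candidate in candidates:  # later entries override earlier ones
        if candidate is not None:
            result = candidate
    return result


from_mapping = resolve(parameter_default=1, mapping_value=3)
from_parameter = resolve(parameter_default=1, mapping_default=2, mapping_value=3, parameter_value=4)
from_link = resolve(parameter_default=1, parameter_value=4, linked_value=5)
```

In this sketch a value passed to a mapping beats any default, a value passed directly to the parameter beats the mapping, and a value propagated through a ParameterLink beats everything else.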

apply()[source]

Applies values and default_values to a CompositePlan.

Returns

A CompositePlan with values applied.

class renku.core.workflow.value_resolution.PlanValueResolver(plan, values)[source]

Bases: renku.core.workflow.value_resolution.ValueResolver

Value resolution class for a Plan.

Applies values and default_values to a workflow.

apply()[source]

Applies values and default_values to a Plan.

Returns

A Plan with values applied.

class renku.core.workflow.value_resolution.ValueResolver(plan, values)[source]

Bases: abc.ABC

Value resolution class for an AbstractPlan.

abstract apply()[source]

Applies values and default_values to a potentially nested workflow.

Returns

The AbstractPlan with the user provided values set.

Return type

AbstractPlan

static get(plan, values)[source]

Factory method to obtain the specific ValueResolver for a workflow.

Parameters
  • plan (AbstractPlan) – a workflow.

  • values (Dict[str, Any]) – user defined dictionary of runtime values for the provided workflow.

Returns

A ValueResolver object.

Return type

ValueResolver
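A static factory method on an abstract base class typically inspects its argument and returns the matching subclass. The sketch below shows that pattern with stand-in classes. FakePlan, FakeCompositePlan, Resolver and its subclasses are hypothetical, and the isinstance dispatch is an assumption about how such a factory could choose.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class FakePlan:
    """Hypothetical stand-in for a single-step Plan."""


class FakeCompositePlan:
    """Hypothetical stand-in for a CompositePlan."""


class Resolver(ABC):
    """Hypothetical base class mirroring the documented get/apply shape."""

    def __init__(self, plan: Any, values: Dict[str, Any]) -> None:
        self.plan = plan
        self.values = values

    @abstractmethod
    def apply(self) -> Any:
        """Return the plan with values applied."""

    @staticmethod
    def get(plan: Any, values: Dict[str, Any]) -> "Resolver":
        """Pick the concrete resolver that matches the type of plan."""
        if isinstance(plan, FakeCompositePlan):
            return CompositeResolver(plan, values)
        return SimpleResolver(plan, values)


class SimpleResolver(Resolver):
    def apply(self) -> Any:
        return self.plan  # a real resolver would set values here


class CompositeResolver(Resolver):
    def apply(self) -> Any:
        return self.plan  # a real resolver would also follow mappings and links


simple = Resolver.get(FakePlan(), {"n": 1})
composite = Resolver.get(FakeCompositePlan(), {})
```

Callers then only need the base class: they call get(plan, values) followed by apply(), without knowing which kind of plan they hold.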

Apply values from parameter links.

Parameters

workflow (CompositePlan) – The workflow whose links values should be applied on.

Sessions

Docker based interactive session provider.

class renku.core.session.docker.DockerSessionProvider[source]

Bases: renku.domain_model.session.ISessionProvider

A docker based interactive session provider.

build_image(image_descriptor, image_name, config)[source]

Builds the container image.

docker_client()[source]

Get the docker client.

Note

This is not a @property, even though it should be, because pluggy will call it in that case in unrelated parts of the code that will

Raises

errors.DockerError – Exception when docker is not available.

Returns

The docker client.

find_image(image_name, config)[source]

Find the given container image.

session_list(project_name, config)[source]

Lists all sessions currently run by the given session provider.

Returns

a list of sessions.

Return type

list

session_provider()[source]

Supported session provider.

Returns

a tuple of self and provider name.

session_start(image_name, project_name, config, client, cpu_request=None, mem_request=None, disk_request=None, gpu_request=None)[source]

Creates an interactive session.

Returns

a unique id for the created interactive session.

Return type

str

session_stop(project_name, session_name, stop_all)[source]

Stops all or a given interactive session.

session_url(session_name)[source]

Get the URL of the interactive session.

Interactive session business logic.

renku.core.session.session.session_list(config_path, provider=None)[source]

List interactive sessions.

renku.core.session.session.session_open(session_name, provider=None)[source]

Open interactive session in the browser.

renku.core.session.session.session_start(provider, config_path, client_dispatcher, image_name=None, cpu_request=None, mem_request=None, disk_request=None, gpu_request=None)[source]

Start interactive session.

renku.core.session.session.session_stop(session_name, stop_all=False, provider=None)[source]

Stop interactive session.

Templates

Template management.

class renku.core.template.template.EmbeddedTemplates(path, source, reference, version)[source]

Bases: renku.domain_model.template.TemplatesSource

Represent templates that are bundled with Renku.

For embedded templates, source is “renku”. In the old versioning scheme, version is set to the installed Renku version and reference is not set. In the new scheme, both version and reference are set to the template version.

classmethod fetch(source, reference)[source]

Fetch embedded Renku templates.

get_all_references(id)[source]

Return all available references for a template id.

get_latest_reference_and_version(id, reference, version)[source]

Return latest reference and version number of a template.

get_template(id, reference)[source]

Return a template at a specific reference.

class renku.core.template.template.FileAction(value)[source]

Bases: enum.IntEnum

Types of operation when copying a template to a project.

class renku.core.template.template.RepositoryTemplates(path, source, reference, version, repository)[source]

Bases: renku.domain_model.template.TemplatesSource

Represent a local/remote template repository.

A template repository is checked out at a specific Git reference if one is provided. However, it’s still possible to get available versions of templates.

classmethod fetch(source, reference)[source]

Fetch a template repository.

get_all_references(id)[source]

Return a list of git tags that are valid SemVer and include a template id.
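Filtering tags for valid SemVer can be sketched with a regular expression. This example covers only the SemVer half of the description: it does not check whether a tag includes a given template id, it uses a simplified pattern, and semver_tags is a hypothetical helper, not a Renku function.

```python
import re
from typing import List

# Simplified SemVer pattern: MAJOR.MINOR.PATCH without leading zeros,
# followed by optional pre-release and build metadata parts.
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$"
)


def semver_tags(tags: List[str]) -> List[str]:
    """Return only the tags that look like valid SemVer versions."""
    return [tag for tag in tags if SEMVER.match(tag)]


tags = ["0.3.1", "v0.3.1", "1.0.0-rc.1", "latest", "01.2.3"]
valid = semver_tags(tags)
print(valid)
```

Tags with a v prefix, free-form names and versions with leading zeros are dropped; plain and pre-release versions are kept.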

get_latest_reference_and_version(id, reference, version)[source]

Return latest reference and version number of a template.

get_template(id, reference)[source]

Return a template at a specific reference.

class renku.core.template.template.TemplateAction(value)[source]

Bases: enum.Enum

Types of template rendering.

renku.core.template.template.copy_template_to_client(rendered_template, client, project, actions, cleanup=True)[source]

Update project files and metadata from a template.

renku.core.template.template.fetch_templates_source(source, reference)[source]

Fetch a template.

renku.core.template.template.get_file_actions(rendered_template, template_action, client, interactive)[source]

Render a template regarding files in a project.

renku.core.template.template.get_sorted_actions(actions)[source]

Return a sorted actions list.

renku.core.template.template.is_renku_template(source)[source]

Return whether the template comes from Renku.

renku.core.template.template.read_template_checksum(client)[source]

Read templates checksum file for a project.

renku.core.template.template.set_template_parameters(template, template_metadata, input_parameters, interactive=False)[source]

Set and verify template parameters’ values in the template_metadata.

renku.core.template.template.write_template_checksum(client, checksums)[source]

Write templates checksum file for a project.

Template use cases.

renku.core.template.usecase.check_for_template_update(client)[source]

Check if the project can be updated to a newer version of the project template.

renku.core.template.usecase.list_templates(source, reference)[source]

Return available templates from a source.

renku.core.template.usecase.select_template(templates_source, id=None)[source]

Select a template from a template source.

renku.core.template.usecase.set_template(source, reference, id, force, interactive, input_parameters, dry_run, client_dispatcher)[source]

Set template for a project.

renku.core.template.usecase.show_template(source, reference, id, client_dispatcher)[source]

Show template details.

renku.core.template.usecase.update_template(force, interactive, dry_run, client_dispatcher)[source]

Update project’s template if possible. Return True if updated.

Errors

Errors that can be raised by renku.core.

Renku exceptions.

exception renku.core.errors.AuthenticationError[source]

Bases: renku.core.errors.RenkuException

Raise when there is a problem with authentication.

exception renku.core.errors.ChildWorkflowNotFoundError(child, workflow)[source]

Bases: renku.core.errors.RenkuException

Raised when a child workflow cannot be found in a workflow.

Embed exception and build a custom message.

exception renku.core.errors.CommandFinalizedError[source]

Bases: renku.core.errors.RenkuException

Raised when trying to modify a finalized command builder.

exception renku.core.errors.CommandNotFinalizedError[source]

Bases: renku.core.errors.RenkuException

Raised when a non-finalized command is executed.

exception renku.core.errors.CommitMessageEmpty[source]

Bases: renku.core.errors.RenkuException

Raise when the commit message is empty.

Build a custom message.

exception renku.core.errors.CommitProcessingError[source]

Bases: renku.core.errors.RenkuException

Raised when a commit couldn’t be processed during graph build.

exception renku.core.errors.ConfigurationError[source]

Bases: renku.core.errors.RenkuException

Raise in case of misconfiguration; use GitConfigurationError for git-related configuration errors.

exception renku.core.errors.DatasetExistsError[source]

Bases: renku.core.errors.RenkuException

Raise when trying to create an existing dataset.

exception renku.core.errors.DatasetFileExists[source]

Bases: renku.core.errors.RenkuException

Raise when file is already in dataset.

Build a custom message.

exception renku.core.errors.DatasetImageError[source]

Bases: renku.core.errors.RenkuException

Raised when a local dataset image is not accessible.

exception renku.core.errors.DatasetNotFound(*, name=None, message=None)[source]

Bases: renku.core.errors.RenkuException

Raise when dataset is not found.

Build a custom message.

exception renku.core.errors.DirtyRenkuDirectory(repository)[source]

Bases: renku.core.errors.RenkuException

Raise when a directory in the renku repository is dirty.

Build a custom message.

exception renku.core.errors.DirtyRepository(repository)[source]

Bases: renku.core.errors.RenkuException

Raise when trying to work with dirty repository.

Build a custom message.

exception renku.core.errors.DockerError(reason)[source]

Bases: renku.core.errors.RenkuException

Raised when an error occurs while executing a docker command.

Embed exception and build a custom message.

exception renku.core.errors.DockerfileUpdateError[source]

Bases: renku.core.errors.RenkuException

Raised when the renku version in the Dockerfile couldn’t be updated.

exception renku.core.errors.ExportError[source]

Bases: renku.core.errors.RenkuException

Raised when a dataset cannot be exported.

exception renku.core.errors.ExternalFileNotFound(path)[source]

Bases: renku.core.errors.RenkuException

Raise when an external file is not found.

Build a custom message.

exception renku.core.errors.ExternalStorageDisabled(repository)[source]

Bases: renku.core.errors.RenkuException

Raise when trying to use a disabled repository storage API.

Build a custom message.

exception renku.core.errors.ExternalStorageNotInstalled(repository)[source]

Bases: renku.core.errors.RenkuException

Raise when LFS is required but not found or installed in the repository.

Build a custom message.

exception renku.core.errors.FailedMerge(repository, branch, merge_args)[source]

Bases: renku.core.errors.RenkuException

Raise when automatic merge failed.

Build a custom message.

exception renku.core.errors.GitCommandError(message='Git command failed.', command=None, stdout=None, stderr=None, status=None)[source]

Bases: renku.core.errors.GitError

Raised when a Git command fails.

Build a custom message.

exception renku.core.errors.GitCommitNotFoundError[source]

Bases: renku.core.errors.GitError

Raised when a commit cannot be found in a Repository.

exception renku.core.errors.GitConfigurationError[source]

Bases: renku.core.errors.GitError

Raised when a git configuration cannot be accessed.

exception renku.core.errors.GitError[source]

Bases: renku.core.errors.RenkuException

Raised when a Git operation fails.

exception renku.core.errors.GitLFSError[source]

Bases: renku.core.errors.RenkuException

Raised when a Git LFS operation fails.

exception renku.core.errors.GitMissingEmail(message=None)[source]

Bases: renku.core.errors.GitConfigurationError

Raise when the email is not configured.

Build a custom message.

exception renku.core.errors.GitMissingUsername(message=None)[source]

Bases: renku.core.errors.GitConfigurationError

Raise when the username is not configured.

Build a custom message.

exception renku.core.errors.GitReferenceNotFoundError[source]

Bases: renku.core.errors.GitError

Raised when a branch or a reference cannot be found.

exception renku.core.errors.GitRemoteNotFoundError[source]

Bases: renku.core.errors.GitError

Raised when a remote cannot be found.

exception renku.core.errors.GraphCycleError(cycles)[source]

Bases: renku.core.errors.RenkuException

Raised when cycles are detected in a workflow graph.

Embed exception and build a custom message.

exception renku.core.errors.IncompatibleParametersError(a=None, b=None)[source]

Bases: renku.core.errors.ParameterError

Raise in case of incompatible parameters/flags.

Build a custom message.

exception renku.core.errors.InvalidAccessToken[source]

Bases: renku.core.errors.RenkuException

Raise when access token is incorrect.

Build a custom message.

exception renku.core.errors.InvalidFileOperation[source]

Bases: renku.core.errors.RenkuException

Raise when trying to perform invalid file operation.

exception renku.core.errors.InvalidGitURL[source]

Bases: renku.core.errors.GitError

Raise when a Git URL is not valid.

exception renku.core.errors.InvalidInputPath[source]

Bases: renku.core.errors.RenkuException

Raise when input path does not exist or is not in the repository.

exception renku.core.errors.InvalidOutputPath[source]

Bases: renku.core.errors.RenkuException

Raise when trying to work with an invalid output path.

exception renku.core.errors.InvalidSuccessCode(returncode, success_codes=None)[source]

Bases: renku.core.errors.RenkuException

Raise when the exit code is neither 0 nor one of the redefined success codes.

Build a custom message.

exception renku.core.errors.InvalidTemplateError[source]

Bases: renku.core.errors.TemplateError

Raised when using a non-valid template.

exception renku.core.errors.MappingExistsError(existing_mappings)[source]

Bases: renku.core.errors.RenkuException

Raised when a parameter mapping exists already.

Embed exception and build a custom message.

exception renku.core.errors.MappingNotFoundError(mapping, workflow)[source]

Bases: renku.core.errors.RenkuException

Raised when a parameter mapping does not exist.

Embed exception and build a custom message.

exception renku.core.errors.MigrationError[source]

Bases: renku.core.errors.RenkuException

Raised when something went wrong during migrations.

exception renku.core.errors.MigrationRequired[source]

Bases: renku.core.errors.RenkuException

Raise when migration is required.

Build a custom message.

exception renku.core.errors.NodeNotFoundError[source]

Bases: renku.core.errors.RenkuException

Raised when Node.js is not installed on the system.

Build a custom message.

exception renku.core.errors.NotFound[source]

Bases: renku.core.errors.RenkuException

Raise when an object is not found in KG.

exception renku.core.errors.NothingToCommit[source]

Bases: renku.core.errors.RenkuException

Raise when there is nothing to commit.

Build a custom message.

exception renku.core.errors.NothingToExecuteError[source]

Bases: renku.core.errors.RenkuException

Raised when a rerun/update command does not execute any workflows.

exception renku.core.errors.ObjectNotFoundError(filename)[source]

Bases: renku.core.errors.RenkuException

Raised when an object is not found in the storage.

Embed exception and build a custom message.

exception renku.core.errors.OperationError[source]

Bases: renku.core.errors.RenkuException

Raised when an operation at runtime raises an error.

exception renku.core.errors.OutputsNotFound(repository, inputs)[source]

Bases: renku.core.errors.RenkuException

Raise when there are not any detected outputs in the repository.

Build a custom message.

exception renku.core.errors.ParameterError(message, param_hint=None)[source]

Bases: renku.core.errors.RenkuException

Raise in case of invalid parameter.

Build a custom message.

exception renku.core.errors.ParameterLinkError(reason)[source]

Bases: renku.core.errors.RenkuException

Raised when a parameter link cannot be created.

Embed exception and build a custom message.

exception renku.core.errors.ParameterNotFoundError(parameter, workflow)[source]

Bases: renku.core.errors.RenkuException

Raised when a parameter reference cannot be resolved to a parameter.

Embed exception and build a custom message.

exception renku.core.errors.ProjectNotFound[source]

Bases: renku.core.errors.RenkuException

Raise when one or more projects couldn’t be found in the KG.

exception renku.core.errors.ProjectNotSupported[source]

Bases: renku.core.errors.RenkuException

Raise when project version is newer than the supported version.

Build a custom message.

exception renku.core.errors.ProtectedFiles(ignored)[source]

Bases: renku.core.errors.RenkuException

Raise when trying to work with protected files.

Build a custom message.

exception renku.core.errors.RenkuException[source]

Bases: Exception

A base class for all Renku-related exceptions.

You can catch all errors raised by Renku SDK by using except RenkuException:.
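A minimal sketch of this catch-all pattern. The classes below are local stand-ins that mirror a few entries of the hierarchy documented here, so the snippet runs without Renku installed; `describe_failure`, `missing_template` and `failing_operation` are hypothetical helpers, not part of Renku.

```python
class RenkuException(Exception):
    """Stand-in for renku.core.errors.RenkuException."""


class TemplateError(RenkuException):
    """Stand-in for renku.core.errors.TemplateError."""


class TemplateNotFoundError(TemplateError):
    """Stand-in for renku.core.errors.TemplateNotFoundError."""


class OperationError(RenkuException):
    """Stand-in for renku.core.errors.OperationError."""


def describe_failure(operation):
    """Run ``operation`` and report which handler caught its error."""
    try:
        operation()
    except TemplateError as error:
        # More specific handlers must come before the base class.
        return f"template problem: {error}"
    except RenkuException as error:
        # Catch-all for every other Renku error.
        return f"renku problem: {error}"
    return "ok"


def missing_template():
    raise TemplateNotFoundError("template not found")


def failing_operation():
    raise OperationError("operation failed")
```

Because every exception in this module derives from RenkuException, the last handler also covers error types added later.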

exception renku.core.errors.RenkuImportError(exp, msg)[source]

Bases: renku.core.errors.RenkuException

Raised when a dataset cannot be imported.

Embed exception and build a custom message.

exception renku.core.errors.RenkuSaveError[source]

Bases: renku.core.errors.RenkuException

Raised when renku save doesn’t work.

exception renku.core.errors.RequestError[source]

Bases: renku.core.errors.RenkuException

Raise when a requests call fails.

exception renku.core.errors.SHACLValidationError[source]

Bases: renku.core.errors.RenkuException

Raised when SHACL validation of the graph fails.

exception renku.core.errors.TemplateError[source]

Bases: renku.core.errors.RenkuException

Base class for template-related exceptions.

exception renku.core.errors.TemplateMissingReferenceError[source]

Bases: renku.core.errors.TemplateError

Raised when using a non-valid template.

exception renku.core.errors.TemplateNotFoundError[source]

Bases: renku.core.errors.TemplateError

Raised when a template cannot be found in a template source or at a specific reference.

exception renku.core.errors.TemplateUpdateError[source]

Bases: renku.core.errors.TemplateError

Raised when a project couldn’t be updated from its template.

exception renku.core.errors.TerminalSizeError[source]

Bases: renku.core.errors.RenkuException

Raised when terminal is too small for a command.

exception renku.core.errors.UninitializedProject(repo_path)[source]

Bases: renku.core.errors.RenkuException

Raise when a project does not seem to have been initialized yet.

Build a custom message.

exception renku.core.errors.UnmodifiedOutputs(repository, unmodified)[source]

Bases: renku.core.errors.RenkuException

Raise when there are unmodified outputs in the repository.

Build a custom message.

exception renku.core.errors.UrlSchemeNotSupported[source]

Bases: renku.core.errors.RenkuException

Raised when adding data from unsupported URL schemes.

exception renku.core.errors.UsageError[source]

Bases: renku.core.errors.RenkuException

Raise in case of unintended usage of certain function calls.

exception renku.core.errors.WorkflowExecuteError(fail_reason=None)[source]

Bases: renku.core.errors.RenkuException

Raised when a workflow execution fails.

Build a custom message.

exception renku.core.errors.WorkflowRerunError(workflow_file)[source]

Bases: renku.core.errors.RenkuException

Raised when a workflow re-execution fails.

Build a custom message.

Utilities

Communicator classes for printing output.

class renku.core.util.communication.CommunicationCallback[source]

Bases: object

Base communication callback class.

confirm(msg, abort=False, warning=False, default=False)[source]

Get confirmation for an action.

echo(msg, end='\n')[source]

Write a message.

error(msg)[source]

Write an error message.

finalize_progress(name)[source]

End a progress tracker.

has_prompt()[source]

Return True if communicator provides a direct prompt to users.

info(msg)[source]

Write an info message.

prompt(msg, type=None, default=None, **kwargs)[source]

Show a message prompt.

start_progress(name, total, **kwargs)[source]

Create a new progress tracker.

update_progress(name, amount)[source]

Update a progress tracker.

warn(msg)[source]

Write a warning message.

renku.core.util.communication.confirm(msg, abort=False, warning=False, default=False)[source]

Get confirmation for an action from all listeners.

renku.core.util.communication.disable()[source]

Disable all outputs; by default everything is enabled.

renku.core.util.communication.enable()[source]

Enable all outputs.

renku.core.util.communication.error(msg)[source]

Write an error message to all listeners.

renku.core.util.communication.finalize_progress(name)[source]

End a progress tracker on all listeners.

renku.core.util.communication.has_prompt()[source]

Return True if communicator provides a direct prompt to users.

renku.core.util.communication.info(msg)[source]

Write an info message to all listeners.

renku.core.util.communication.prompt(msg, type=None, default=None, **kwargs)[source]

Show a message prompt.

renku.core.util.communication.start_progress(name, total, **kwargs)[source]

Start a progress tracker on all listeners.

renku.core.util.communication.subscribe(listener)[source]

Subscribe a communication listener.

renku.core.util.communication.unsubscribe(listener)[source]

Unsubscribe a communication listener.

renku.core.util.communication.update_progress(name, amount)[source]

Update a progress tracker on all listeners.

renku.core.util.communication.warn(msg)[source]

Write a warning message to all listeners.

Implement various context managers.

class renku.core.util.contexts.Isolation(**kwargs)[source]

Bases: contextlib.ExitStack

Isolate execution.

Create a context manager.

renku.core.util.contexts.chdir(path)[source]

Change the current working directory.
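A context manager of this kind is usually written so that the previous directory is restored even when the body raises. The following is a sketch of that idea using only the standard library; `chdir_sketch` is a hypothetical name and is not Renku's implementation.

```python
import contextlib
import os


@contextlib.contextmanager
def chdir_sketch(path):
    """Temporarily switch the working directory to ``path``."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always return to the original directory, even on errors.
        os.chdir(previous)
```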

renku.core.util.contexts.click_context(path, command)[source]

Provide a click context with repo path injected.

renku.core.util.contexts.measure(message='TOTAL')[source]

Measure execution time of enclosing code block.

class renku.core.util.contexts.redirect_stdin(new_target)[source]

Bases: contextlib.ContextDecorator

Implement missing redirect stdin based on contextlib.py.

Keep the original stream.

Renku datetime utilities.

renku.core.util.datetime8601.fix_datetime(value)[source]

Fix timezone of non-aware datetime objects and remove microseconds.
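As an illustration of what such a normalization can look like, the sketch below attaches a timezone to naive values and truncates microseconds. It assumes naive datetimes are meant as local time, which the description above does not state; `fix_datetime_sketch` is a hypothetical name.

```python
from datetime import datetime, timezone


def fix_datetime_sketch(value):
    """Make ``value`` timezone-aware and drop its microseconds."""
    if value.tzinfo is None:
        # Assumption: a naive datetime is interpreted as local time.
        value = value.astimezone()
    return value.replace(microsecond=0)
```

Aware values keep their original timezone; only the microseconds are removed.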

renku.core.util.datetime8601.local_now()[source]

Return current datetime in local timezone.

renku.core.util.datetime8601.match_iso8601(string, pos=0, endpos=9223372036854775807)

Matches zero or more characters at the beginning of the string.

renku.core.util.datetime8601.parse_date(value)[source]

Convert date to datetime.

renku.core.util.datetime8601.validate_iso8601(str_val)[source]

Check if datetime string is in ISO8601 format.

Helper utilities for handling DOIs.

renku.core.util.doi.extract_doi(uri)[source]

Return the DOI in a string if there is one.
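A rough sketch of DOI extraction with a regular expression. The pattern is a simplified assumption ("10.", a 4 to 9 digit registrant code, a slash, and a suffix) and may differ from the one Renku uses; `extract_doi_sketch` is a hypothetical name.

```python
import re

# Assumption: simplified pattern for modern DOIs, not Renku's own expression.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_doi_sketch(uri):
    """Return the first DOI-like substring of ``uri`` or None."""
    match = DOI_PATTERN.search(uri)
    return match.group(0) if match else None
```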

renku.core.util.doi.is_doi(uri)[source]

Check if URI is DOI.

Helper utilities for handling file size strings.

renku.core.util.file_size.bytes_to_unit(size_in_bytes, unit)[source]

Return size in the provided unit.

renku.core.util.file_size.parse_file_size(size_str)[source]

Parse a human readable file size to bytes.
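The two helpers above are inverse conversions. A minimal sketch of both directions, assuming decimal (power-of-1000) units and the spellings listed in `UNITS`; the real functions may accept other units or use binary multiples, and the names below are hypothetical.

```python
import re

# Assumption: decimal units; the accepted spellings are illustrative only.
UNITS = {"b": 1, "kb": 1000, "mb": 1000**2, "gb": 1000**3, "tb": 1000**4}


def parse_file_size_sketch(size_str):
    """Convert a string such as '10 MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([a-zA-Z]*)\s*", size_str)
    if not match:
        raise ValueError(f"Invalid file size: {size_str!r}")
    number, unit = match.groups()
    unit = unit.lower() or "b"  # a bare number is taken as bytes
    if unit not in UNITS:
        raise ValueError(f"Unknown unit: {unit!r}")
    return int(float(number) * UNITS[unit])


def bytes_to_unit_sketch(size_in_bytes, unit):
    """Convert a byte count to the given unit."""
    return size_in_bytes / UNITS[unit.lower()]
```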

Git utility functions.

renku.core.util.git.check_global_git_user_is_configured()[source]

Check that git user information is configured.

renku.core.util.git.clone_renku_repository(url, path, gitlab_token=None, deployment_hostname=None, depth=None, install_githooks=False, install_lfs=True, skip_smudge=True, recursive=True, progress=None, config=None, raise_git_except=False, checkout_revision=None, use_renku_credentials=False, reuse_existing_repository=False)[source]

Clone a Renku Repository.

Parameters
  • url (str) – The Git URL to clone.

  • path (Union[Path, str]) – The path to clone into.

  • gitlab_token – The gitlab OAuth2 token (Default value = None).

  • deployment_hostname – The hostname of the current renku deployment (Default value = None).

  • depth (Optional[int], optional) – The clone depth, number of commits from HEAD (Default value = None).

  • install_githooks – Whether to install git hooks (Default value = False).

  • install_lfs – Whether to install Git LFS (Default value = True).

  • skip_smudge – Whether to skip pulling files from Git LFS (Default value = True).

  • recursive – Whether to clone recursively (Default value = True).

  • progress – The GitProgress object (Default value = None).

  • config (Optional[dict], optional) – Set configuration for the project (Default value = None).

  • raise_git_except – Whether to raise git exceptions (Default value = False).

  • checkout_revision – The revision to checkout after clone (Default value = None).

  • use_renku_credentials (bool, optional) – Whether to use Renku provided credentials (Default value = False).

  • reuse_existing_repository (bool, optional) – Whether to clone over an existing repository (Default value = False).

Returns

The cloned repository.

renku.core.util.git.clone_repository(url, path=None, install_githooks=True, install_lfs=True, skip_smudge=True, recursive=True, depth=None, progress=None, config=None, raise_git_except=False, checkout_revision=None, no_checkout=False, clean=False, clone_options=None)[source]

Clone a Git repository and install Git hooks and LFS.

Parameters
  • url – The Git URL to clone.

  • path (Union[Path, str], optional) – The path to clone into (Default value = None).

  • install_githooks – Whether to install git hooks (Default value = True).

  • install_lfs – Whether to install Git LFS (Default value = True).

  • skip_smudge – Whether to skip pulling files from Git LFS (Default value = True).

  • recursive – Whether to clone recursively (Default value = True).

  • depth – The clone depth, number of commits from HEAD (Default value = None).

  • progress – The GitProgress object (Default value = None).

  • config (Optional[dict], optional) – Set configuration for the project (Default value = None).

  • raise_git_except – Whether to raise git exceptions (Default value = False).

  • checkout_revision – The revision to checkout after clone (Default value = None).

  • no_checkout (bool, optional) – Whether to skip the checkout after cloning (Default value = False).

  • clean (bool, optional) – Whether to require the target folder to be clean (Default value = False).

  • clone_options (List[str], optional) – Additional clone options (Default value = None).

Returns

The cloned repository.

renku.core.util.git.commit_changes(*paths, repository, message=None)[source]

Commit paths to the repository.

Parameters
  • *paths (Union[Path, str]) – The paths to commit.

  • repository (Repository) – The repository to commit to.

  • message – The commit message (Default value = None).

Raises

errors.GitError – If paths couldn’t be committed.

Returns

List of paths that were committed.

renku.core.util.git.create_backup_remote(repository, remote_name, url)[source]

Create a backup for remote_name and set its URL to url.

Parameters
  • repository (Repository) – The current repository.

  • remote_name (str) – The name of the backup remote.

  • url (str) – The remote URL.

Returns

Tuple of backup remote name, whether it existed already and the created remote if successful.

Return type

Tuple[str, bool, Optional[Remote]]

renku.core.util.git.default_path(path='.')[source]

Return default repository path.

renku.core.util.git.get_cache_directory_for_repository(client, url)[source]

Return a path to client’s cache directory.

Parameters
  • client – The LocalClient.

  • url – The repository URL.

Returns

The path of the cache.

Return type

Path

renku.core.util.git.get_committer_agent(commit)[source]

Return committer SoftwareAgent.

Parameters

commit (Commit) – The commit to check.

Returns

The agent responsible for the commit.

Return type

SoftwareAgent

renku.core.util.git.get_entity_from_revision(repository, path, revision=None, bypass_cache=False)[source]

Return an Entity instance from given path and revision.

Parameters
  • repository (Repository) – The current repository.

  • path (Union[Path, str]) – The path of the entity.

  • revision (str, optional) – The revision to check at (Default value = None).

  • bypass_cache (bool) – Whether to ignore cached entries and get information from disk (Default value = False).

Returns

The Entity for the given path and revision.

Return type

Entity

renku.core.util.git.get_full_repository_path(url)[source]

Extract hostname/path of a git repository from its URL.

Parameters

url (Optional[str]) – The URL.

Returns

The hostname plus path extracted from the URL.

renku.core.util.git.get_git_progress_instance()[source]

Return a GitProgress object.

renku.core.util.git.get_git_user(repository)[source]

Return git user.

Parameters

repository (Optional[Repository]) – The Git repository.

Returns

The person associated with the repository.

Return type

Optional[Person]

renku.core.util.git.get_hook_path(repository, name)[source]

Return path to the given named hook in the given repository.

Parameters
  • repository – The current Git repository.

  • name (str) – The name of the hook.

Returns

Path to the hook.

Return type

Path

renku.core.util.git.get_oauth_url(url, gitlab_token)[source]

Format URL with a username and password.

Parameters
  • url – The URL to format.

  • gitlab_token – The Gitlab OAuth2 Token.

Returns

The URL with credentials added.

renku.core.util.git.get_remote(repository, *, name=None, url=None)[source]

Return repository’s remote using its name or url or return default remote if any.

Parameters
  • repository (Optional[Repository]) – The Git repository.

  • name (str, optional) – The name of the remote (Default value = None).

  • url (str, optional) – The remote URL (Default value = None).

Returns

The remote, if found.

Return type

Optional[Remote]

renku.core.util.git.get_renku_repo_url(remote_url, deployment_hostname=None, access_token=None)[source]

Return a repo url that can be authenticated by renku.

Parameters
  • remote_url – The repository URL.

  • deployment_hostname – The host name used by this deployment (Default value = None).

  • access_token – The OAuth2 access token (Default value = None).

Returns

The Renku repository URL with credentials.

renku.core.util.git.get_repository_name(url)[source]

Extract name of a git repository from its URL.

Parameters

url (str) – The URL to get the repository name from.

Returns

The repository name.

Return type

str

renku.core.util.git.have_same_remote(url1, url2)[source]

Check if two Git URLs point to the same remote repository, ignoring protocol and credentials.

Parameters
  • url1 – The first URL.

  • url2 – The second URL.

Returns

True if both URLs point to the same repository.

Return type

bool
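One way to compare remotes while ignoring protocol and credentials is to reduce each URL to a (host, path) pair first. The sketch below does this with the standard library; treating a trailing ".git" as insignificant and lower-casing the host are assumptions, and the function names are hypothetical.

```python
from urllib.parse import urlparse


def normalized_remote(url):
    """Return (host, path) without scheme, credentials or '.git'."""
    if "://" not in url and "@" in url and ":" in url:
        # scp-like syntax: git@host:owner/repo.git
        host, _, path = url.partition("@")[2].partition(":")
    else:
        parsed = urlparse(url)
        host, path = parsed.hostname or "", parsed.path
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[:-4]  # assumption: '.git' suffix is not significant
    return host.lower(), path


def have_same_remote_sketch(url1, url2):
    """Compare two Git URLs by host and repository path only."""
    return normalized_remote(url1) == normalized_remote(url2)
```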

renku.core.util.git.is_path_safe(path)[source]

Check if the path should be used in output.

Parameters

path (Union[Path, str]) – The path to check.

Returns

True if the path is safe else False.

Return type

bool

renku.core.util.git.is_valid_git_repository(repository)[source]

Return whether this is a Git repository with a valid HEAD.

Parameters

repository (Optional[Repository]) – The repository to check.

Returns

Whether or not this is a valid Git repository.

Return type

bool

renku.core.util.git.parse_git_url(url)[source]

Return parsed git url.

Parameters

url (Optional[str]) – The URL to parse.

Raises

errors.InvalidGitURL – If url is empty.

Returns

The parsed GitURL.

Return type

GitURL

renku.core.util.git.push_changes(repository, remote=None, reset=True)[source]

Push to a remote branch. If the remote branch is protected a new remote branch will be created and pushed to.

Parameters
  • repository (Repository) – The current repository.

  • remote (str, optional) – The remote to push to (Default value = None).

  • reset (bool, optional) – Whether to reset active branch to its upstream branch, used if changes get pushed to a temporary branch (Default value = True).

Raises

errors.GitError – If there’s no remote or the push fails.

Returns

Name of the branch that was pushed to.

Return type

str

renku.core.util.git.run_command(command, *paths, separator=None, **kwargs)[source]

Execute a command in batches, splitting paths so that each argument list stays within OS limits.

Parameters
  • command – A list or tuple containing command and its arguments.

  • *paths – Paths to run on.

  • separator – Separator for paths if they need to be passed as string. (Default value = None)

Raises

errors.GitError – If a Git subcommand failed.

Returns

Result of last invocation.
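The batching idea can be sketched as a generator that groups paths until a length budget is reached. The budget `max_length` and the name `chunk_paths` are hypothetical; how Renku determines the actual OS limit is not described here.

```python
def chunk_paths(paths, max_length):
    """Yield lists of paths whose combined length stays within max_length."""
    batch, size = [], 0
    for path in paths:
        cost = len(path) + 1  # one extra character for the separator
        if batch and size + cost > max_length:
            yield batch
            batch, size = [], 0
        # A single over-long path still gets its own batch.
        batch.append(path)
        size += cost
    if batch:
        yield batch
```

Each yielded batch would then be appended to the command for one invocation, and the result of the last invocation returned.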

Helper functions for metadata management/parsing.

renku.core.util.metadata.construct_creator(creator, ignore_email)[source]

Parse input and return an instance of Person.

renku.core.util.metadata.construct_creators(creators, ignore_email=False)[source]

Parse input and return a list of Person.

renku.core.util.metadata.get_renku_version(client)[source]

Return project’s Renku version from its Dockerfile.

renku.core.util.metadata.is_external_file(path, client_path)[source]

Checks if a path is an external file.

renku.core.util.metadata.read_renku_version_from_dockerfile(path)[source]

Read RENKU_VERSION from the content of path if a valid version is available.

OS utility functions.

Return True if paths are equal or one is the parent of the other.

renku.core.util.os.delete_file(filepath, ignore_errors=True, follow_symlinks=False)[source]

Remove a file/symlink and its pointer file (for external files).

renku.core.util.os.get_absolute_path(path, base=None)[source]

Return absolute normalized path without resolving symlinks.

renku.core.util.os.get_files(path)[source]

Return all files from a starting file/directory.

renku.core.util.os.get_relative_path(path, base, strict=False)[source]

Return a relative path to the base if path is within base without resolving symlinks.

renku.core.util.os.get_relative_path_to_cwd(path)[source]

Get a relative path to current working directory.

renku.core.util.os.get_relative_paths(base, paths)[source]

Return a list of paths relative to a base path.

renku.core.util.os.get_safe_relative_path(path, base)[source]

Return a relative path to the base and check path is within base with all symlinks resolved.

NOTE: This is used to prevent path traversal attack.
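A sketch of the usual defence against path traversal: resolve both paths fully, then require the result to lie under the base. Raising ValueError is an assumption made for this example (the exception Renku raises is not stated above), and `safe_relative_path_sketch` is a hypothetical name.

```python
from pathlib import Path


def safe_relative_path_sketch(path, base):
    """Return ``path`` relative to ``base`` or raise if it escapes ``base``."""
    base = Path(base).resolve()
    # Resolving removes '..' segments and follows symlinks.
    candidate = (base / path).resolve()
    try:
        return candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Path '{path}' is outside of '{base}'") from None
```

Resolving before comparing matters: a purely textual prefix check would accept a path such as "data/../../secret".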

renku.core.util.os.hash_file(path)[source]

Calculate the sha256 hash of a file.

renku.core.util.os.hash_str(content)[source]

Calculate the sha256 hash of a string.
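Both hashes can be computed with the standard hashlib module. The sketch below reads files in chunks so large files do not need to fit in memory; the chunk size, the UTF-8 encoding and the function names are assumptions for illustration.

```python
import hashlib


def hash_str_sketch(content):
    """Return the hex SHA-256 digest of a string (encoded as UTF-8)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def hash_file_sketch(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```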

renku.core.util.os.is_ascii(data)[source]

Check if provided string contains only ascii characters.

renku.core.util.os.is_path_empty(path)[source]

Return True if path contains no files.

Parameters

path – The target path.

renku.core.util.os.is_subpath(path, base)[source]

Return True if path is within base.

renku.core.util.os.normalize_to_ascii(input_string, sep='-')[source]

Convert a string to only contain ASCII characters, with non-ASCII substring replaced with sep.

renku.core.util.os.safe_read_yaml(file)[source]

Parse a YAML file.

Returns

In case of success a dictionary of the YAML’s content, otherwise raises a ParameterError exception.

Utility for working with HTTP session.

This module provides some wrapper functions around requests library. It sets a timeout and converts exception types whenever needed. Use this module instead of requests.

renku.core.util.requests.check_response(response)[source]

Check for expected response status code.

renku.core.util.requests.delete(url, headers=None)[source]

Send a DELETE request.

renku.core.util.requests.download_file(base_directory, url, filename, extract, chunk_size=16384)[source]

Download a URL to a given location.

renku.core.util.requests.get(url, headers=None)[source]

Send a GET request.

renku.core.util.requests.get_filename_from_headers(response)[source]

Extract filename from content-disposition headers if available.

renku.core.util.requests.get_redirect_url(url)[source]

Return redirect URL if any; otherwise, return the original URL.

renku.core.util.requests.head(url, *, allow_redirects=False, headers=None)[source]

Send a HEAD request.

renku.core.util.requests.post(url, *, data=None, files=None, headers=None, json=None, params=None)[source]

Send a POST request.

renku.core.util.requests.put(url, *, data=None, files=None, headers=None, params=None)[source]

Send a PUT request.

Helper utilities for interacting with remote source code management tools.

renku.core.util.scm.shorten_message(message, line_length=100, body_length=65000)[source]

Wraps and shortens a commit message.

Parameters
  • message (str) – message to adjust.

  • line_length (int, optional) – maximum line length before wrapping. 0 for infinite (Default value = 100).

  • body_length (int, optional) – maximum body length before cut. 0 for infinite (Default value = 65000).

Raises

ParameterError – If line_length or body_length < 0

Returns

message wrapped and trimmed.
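The wrapping and trimming described above can be approximated with the standard textwrap module. This is a sketch only: it raises ValueError where the real function raises ParameterError, trims by plain truncation, and `shorten_message_sketch` is a hypothetical name.

```python
import textwrap


def shorten_message_sketch(message, line_length=100, body_length=65000):
    """Wrap long lines and cut the message; 0 disables either limit."""
    if line_length < 0 or body_length < 0:
        # Stand-in for ParameterError in the real function.
        raise ValueError("line_length and body_length must be >= 0")
    if line_length > 0:
        lines = []
        for line in message.splitlines():
            # textwrap.wrap returns [] for empty lines; keep them as blanks.
            lines.extend(textwrap.wrap(line, width=line_length) or [""])
        message = "\n".join(lines)
    if body_length > 0 and len(message) > body_length:
        message = message[:body_length]
    return message
```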

JSON-LD SHACL validations.

renku.core.util.shacl.validate_graph(graph, shacl_path=None, format='nquads')[source]

Validate the current graph with a SHACL schema.

Uses default schema if not supplied.

Helper utilities for handling URLs.

renku.core.util.urls.check_url(url)[source]

Check if a url is local/remote and if it contains a git repository.

renku.core.util.urls.get_host(client)[source]

Return the hostname for the resource URIs.

Default is localhost. If RENKU_DOMAIN is set, it overrides the host from remote.

renku.core.util.urls.get_path(url)[source]

Return path part of a url.

renku.core.util.urls.get_slug(name, invalid_chars=None, lowercase=True)[source]

Create a slug from name.

renku.core.util.urls.parse_authentication_endpoint(endpoint, client_dispatcher, use_remote=False)[source]

Return a parsed url.

If an endpoint is provided then use it, otherwise, look for a configured endpoint. If no configured endpoint exists then try to use project’s remote url.

renku.core.util.urls.provider_check(url)[source]

Check additional provider related operations.

renku.core.util.urls.remove_credentials(url)[source]

Remove username and password from a URL.
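Stripping credentials amounts to rebuilding the network location from host and port only. A minimal sketch with urllib.parse, assuming URLs with an explicit scheme; `remove_credentials_sketch` is a hypothetical name and edge cases such as IPv6 hosts are not handled.

```python
from urllib.parse import urlparse, urlunparse


def remove_credentials_sketch(url):
    """Return ``url`` without the 'user:password@' part."""
    parsed = urlparse(url)
    if not parsed.hostname:
        # Local paths and other host-less values are returned unchanged.
        return url
    netloc = parsed.hostname
    if parsed.port:
        netloc = f"{netloc}:{parsed.port}"
    return urlunparse(parsed._replace(netloc=netloc))
```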

renku.core.util.urls.url_to_string(url)[source]

Convert url from list or ParseResult to string.

General utility functions.

renku.core.util.util.to_semantic_version(value)[source]

Convert value to SemVer.

renku.core.util.util.to_string(value)[source]

Return a string representation of value and return an empty string for None.

Helper utilities for handling UUIDs.

renku.core.util.uuid.is_uuid(value)[source]

Check if value is UUID4.

Copied from https://stackoverflow.com/questions/19989481/
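A common way to perform this check with the standard uuid module is to parse the value as version 4 and compare the result with the input. The sketch below follows that approach under the assumption that only the canonical hyphenated form should be accepted; `is_uuid4_sketch` is a hypothetical name.

```python
import uuid


def is_uuid4_sketch(value):
    """Return True if ``value`` is a canonical UUID version 4 string."""
    text = str(value)
    try:
        # version=4 forces the version bits, so other versions change here.
        parsed = uuid.UUID(text, version=4)
    except ValueError:
        return False
    return str(parsed) == text.lower()
```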

Support JSON-LD context in models.

class renku.core.util.yaml.NoDatesSafeLoader(stream)[source]

Bases: yaml.cyaml.CSafeLoader

Used to safely load basic python objects but ignore datetime strings.

classmethod remove_implicit_resolver(tag_to_remove)[source]

Remove implicit resolvers for a particular tag.

Takes care not to modify resolvers in super classes.

We want to load datetimes as strings, not dates, because we go on to serialize as json which doesn’t have the advanced types of yaml, and leads to incompatibilities down the track.

renku.core.util.yaml.load_yaml(data)[source]

Load YAML data and return its content as a dict.

renku.core.util.yaml.read_yaml(path)[source]

Load YAML file and return its content as a dict.

renku.core.util.yaml.write_yaml(path, data)[source]

Store data to a YAML file.

Repository

Client for handling a local repository.

class renku.core.management.repository.PathMixin(path=<function default_path>)[source]

Define a default path attribute.

Method generated by attrs for class PathMixin.

class renku.core.management.repository.RepositoryApiMixin(renku_home='.renku', parent=None, remote_cache=NOTHING, *, data_dir='data')[source]

Client for handling a local repository.

Method generated by attrs for class RepositoryApiMixin.

DATABASE_PATH = 'metadata'

Directory for metadata storage.

DOCKERFILE = 'Dockerfile'

Name of the Dockerfile in the repository.

LOCK_SUFFIX = '.lock'

Default suffix for Renku lock file.

data_dir

Define a name of the folder for storing datasets.

property database_path

Path to the metadata storage directory.

property docker_path

Path to the Dockerfile.

get_in_submodules(commit, path)[source]

Resolve filename in submodules.

has_graph_files()[source]

Return true if database exists.

has_template_checksum()[source]

Return whether the project has a template checksum file.

init_repository(force=False, user=None, initial_branch=None)[source]

Initialize an empty Renku repository.

is_project_set()[source]

Return whether a project is set for the client.

is_protected_path(path)[source]

Checks if a path is a protected path.

property latest_agent

Returns latest agent version used in the repository.

property lock

Create a Renku config lock.

parent

Store a pointer to the parent repository.

property project

Return the Project instance.

property remote

Return host, owner and name of the remote if it exists.

renku_home

Define a name of the Renku folder (Default value = ‘.renku’).

renku_path

Store a Path instance of the Renku folder.

property template_checksums

Return a Path instance to the template checksums file.

property transaction_id

Get a transaction id for the current client to be used for grouping git commits.

with_metadata(project_gateway, database_gateway, read_only=False, name=None, description=None, keywords=None, custom_metadata=None)[source]

Yield an editable metadata object.

Git Internals

Wrap Git client.

class renku.core.management.git.GitCore[source]

Wrap Git client.

Method generated by attrs for class GitCore.

property candidate_paths

Return all paths in the index and untracked files.

commit(commit_only=None, commit_empty=True, raise_if_empty=False, commit_message=None, abbreviate_message=True, skip_dirty_checks=False)[source]

Automatic commit.

property dirty_paths

Get paths of dirty files in the repository.

ensure_clean(ignore_std_streams=False)[source]

Make sure the repository is clean.

ensure_unstaged(path)[source]

Ensure that path is not part of git staged files.

ensure_untracked(path)[source]

Ensure that path is not part of git untracked files.

find_ignored_paths(*paths)[source]

Return ignored paths matching .gitignore file.

property modified_paths

Return paths of modified files.

remove_unmodified(paths, autocommit=True)[source]

Remove unmodified paths and return their names.

setup_credential_helper()[source]

Set up the Git credential helper to cache credentials if it is not already set.

worktree(path=None, branch_name=None, commit=None, merge_args=('--ff-only',))[source]

Create new worktree.

renku.core.management.git.finalize_commit(client, diff_before, commit_only=None, commit_empty=True, raise_if_empty=False, commit_message=None, abbreviate_message=True, skip_staging=False)[source]

Commit modified/added paths.

renku.core.management.git.finalize_worktree(client, isolation, path, branch_name, delete, new_branch, merge_args=('--ff-only',), exception=None)[source]

Cleanup and merge a previously created Git worktree.

renku.core.management.git.get_mapped_std_streams(lookup_paths, streams=('stdin', 'stdout', 'stderr'))[source]

Get a mapping of standard streams to given paths.

renku.core.management.git.prepare_commit(client, commit_only=None, skip_dirty_checks=False, skip_staging=False)[source]

Gather information about repo needed for committing later on.

renku.core.management.git.prepare_worktree(original_client, path=None, branch_name=None, commit=None)[source]

Set up a Git worktree to provide isolation.

Git utilities.

class renku.domain_model.git.GitURL(href, path=None, scheme='ssh', hostname='localhost', username=None, password=None, port=None, owner=None, name=None, slug=None, regex=None)[source]

Parser for common Git URLs.

Method generated by attrs for class GitURL.

property image

Return image name.

property instance_url

Get the url of the git instance.

classmethod parse(href)[source]

Derive URI components.

renku.domain_model.git.filter_repo_name(repo_name)[source]

Remove the .git extension from the repo name.