Renku Command Line¶
The base command for interacting with the Renku platform.
renku (base command)¶
To list the available commands, either run
renku with no parameters or run:
$ renku help
Usage: renku [OPTIONS] COMMAND [ARGS]...

  Check common Renku commands used in various situations.

Options:
  --version                       Print version number.
  --global-config-path            Print global application's config path.
  --install-completion            Install completion for the current shell.
  --path <path>                   Location of a Renku repository.
                                  [default: (dynamic)]
  --external-storage / -S, --no-external-storage
                                  Use an external file storage service.
  -h, --help                      Show this message and exit.

Commands:
  # [...]
Depending on your system, you may find the configuration files used by Renku command line in a different folder. By default, the following rules are used:
If in doubt where to look for the configuration file, you can display its path with the --global-config-path option.
Create an empty Renku project or reinitialize an existing one.
Start a Renku project¶
If you have an existing directory which you want to turn into a Renku project, you can type:
$ cd ~/my_project
$ renku init
or:
$ renku init ~/my_project
This creates a new subdirectory named
.renku that contains all the
necessary files for managing the project configuration.
If the provided directory does not exist, it will be created.
Use a different template¶
Renku is installed together with a specific set of templates you can select when you initialize a project. You can check them by typing:
$ renku init --list-templates
INDEX  ID      DESCRIPTION                      PARAMETERS
-----  ------  -------------------------------  -----------------------------
    1  python  The simplest Python-based [...]  description: project des[...]
    2  R       R-based renku project with[...]  description: project des[...]
If you know which template you are going to use, you can provide either its ID with
--template-id or its index number.
You can use a newer version of the templates or even create your own and
provide it to the
init command by specifying the target template repository with
--template-source (both local paths and remote URLs are supported) and
--template-ref (branch, tag or commit).
You can take inspiration from the official Renku template repository:
$ renku init --template-ref master --template-source \
    https://github.com/SwissDataScienceCenter/renku-project-template
Fetching template from https://github.com/SwissDataScienceCenter/renku-project-template@master ... OK

INDEX  ID              DESCRIPTION                 PARAMETERS
-----  --------------  --------------------------  ----------------------
    1  python-minimal  Basic Python Project:[...]  description: proj[...]
    2  R-minimal       Basic R Project: The [...]  description: proj[...]

Please choose a template by typing the index:
Provide parameters¶
Some templates require parameters to properly initialize a new project. You
can check them by listing the templates with --list-templates.
To provide parameters, use the
--parameter option and provide each parameter as a "name"="value" pair:
$ renku init --template-id python-minimal --parameter \
    "description"="my new shiny project"
Initializing new Renku repository... OK
If you don’t provide the required parameters through the
--parameter option, you will be asked to provide them. Empty values are allowed
and passed to the template initialization function.
Every project requires a
name that can either be provided using
--name or automatically taken from the target folder. The name is
also considered a special parameter, so it is automatically added
to the list of forwarded parameters.
Update an existing project¶
There are situations when the required structure of a Renku project needs
to be recreated or you have an existing Git repository. You can handle
these situations by adding the --force option:
$ git init .
$ echo "# Example\nThis is a README." > README.md
$ git add README.md
$ git commit -m 'Example readme file'
# renku init would fail because there is a git repository
$ renku init --force
You can also enable the external storage system for output files, if it was not installed previously.
$ renku init --force --external-storage
Clone a Renku project.
Cloning a Renku project¶
To clone a Renku project use the
renku clone command. This command is preferred over
git clone because it sets up the required Git hooks and enables Git-LFS.
$ renku clone <repository-url> <destination-directory>
It creates a new directory with the same name as the project. You can change the directory name by passing another name on the command line.
renku clone pulls data from Git-LFS after cloning. If you don’t
need the LFS data, pass the
--no-pull-data option to skip this step.
To move a project to another Renku deployment you need to create a new empty project in the target deployment and push both the repository and Git-LFS objects to the new remote. Refer to Git documentation for more details.
$ git lfs fetch --all
$ git remote remove origin
$ git remote add origin <new-repository-url>
$ git push --mirror origin
Get and set Renku repository or global options.
You can set various Renku configuration options, for example the image registry URL, with a command like:
$ renku config set interactive.default_url "/tree"
By default, configuration is stored locally in the project’s directory. Use the
--global option to store configuration for all projects in your home directory.
To remove a specific key from configuration use:
$ renku config remove interactive.default_url
By default, only local configuration is searched for removal. Use the --global
option to remove a global configuration value.
You can display all configuration values with:
$ renku config show
[renku "interactive"]
default_url = /lab
Both local and global configuration files are read. Values in local
configuration take precedence over global values. Use the
--global flag to read only the global configuration.
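The precedence rule can be sketched in Python with the standard configparser module. This only illustrates "local overrides global" and is not Renku's actual implementation; the function name merged_value and the sample file contents are hypothetical.

```python
import configparser

# Hypothetical contents of a global and a local configuration file,
# written in the INI-like layout that `renku config show` prints.
GLOBAL_CONFIG = '[renku "interactive"]\ndefault_url = /tree\nimage = global/image\n'
LOCAL_CONFIG = '[renku "interactive"]\ndefault_url = /lab\n'


def merged_value(key, global_text=GLOBAL_CONFIG, local_text=LOCAL_CONFIG):
    """Return the value for 'section.option', letting local values win."""
    section, option = key.split(".", 1)
    parser = configparser.ConfigParser()
    parser.read_string(global_text)  # read global values first
    parser.read_string(local_text)   # values read later override earlier ones
    return parser.get(f'renku "{section}"', option, fallback=None)
```

With these sample files, interactive.default_url resolves to the local value while interactive.image falls back to the global one.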
You can provide a KEY to display only its value:
$ renku config show interactive.default_url
default_url = /lab
Available configuration values¶
The following values are available for the
renku config command:
|show_lfs_message||Whether to show messages about files being added to Git LFS or not|
|lfs_threshold||Threshold file size below which files are not added to Git LFS|
|zenodo.access_token||Access token for Zenodo API|
|dataverse.access_token||Access token for Dataverse API|
|dataverse.server_url||URL for the Dataverse API server to use|
|interactive.default_url||URL for interactive environments|
|interactive.cpu_request||CPU quota for environments|
|interactive.mem_request||Memory quota for environments|
|interactive.gpu_request||GPU quota for environments|
|interactive.lfs_auto_fetch||Whether to automatically fetch LFS files on environment startup|
|interactive.image||Pinned Docker image for environments|
Renku CLI commands for handling of datasets.
Creating an empty dataset inside a Renku project:
$ renku dataset create my-dataset Creating a dataset ... OK
You can pass the following options to this command to set various metadata for the dataset.
|-t, --title||A human-readable title for the dataset.|
|-d, --description||Dataset’s description.|
|-c, --creator||Creator’s name, email, and an optional affiliation. Accepted format is ‘Forename Surname <email> [affiliation]’. Pass multiple times for a list of creators.|
|-k, --keyword||Dataset’s keywords. Pass multiple times for a list of keywords.|
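The creator format accepted by the creator option can be illustrated with a small parser. This is a hedged sketch of the ‘Forename Surname <email> [affiliation]’ layout only; the regular expression and the parse_creator helper are hypothetical and Renku's own validation may differ.

```python
import re

# Pattern for 'Forename Surname <email> [affiliation]'; the affiliation is optional.
CREATOR_RE = re.compile(
    r"^\s*(?P<name>[^<>\[\]]+?)\s*"
    r"<(?P<email>[^<>\s]+)>\s*"
    r"(?:\[(?P<affiliation>[^\[\]]+)\])?\s*$"
)


def parse_creator(text):
    """Split a creator string into name, email and optional affiliation."""
    match = CREATOR_RE.match(text)
    if match is None:
        raise ValueError(f"not in 'Forename Surname <email> [affiliation]' format: {text!r}")
    return match.groupdict()
```

A string without the angle-bracketed email is rejected, and a missing affiliation is returned as None.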
Editing a dataset’s metadata:
Use the edit subcommand to change metadata of a dataset. You can edit the same
set of metadata as the create command by passing the options described in the
table above.
$ renku dataset edit my-dataset --title 'New title'
Successfully updated: title.
Listing all datasets:
$ renku dataset ls
ID        NAME           TITLE          VERSION
--------  -------------  -------------  ---------
0ad1cb9a  some-dataset   Some Dataset
9436e36c  my-dataset     My Dataset
You can select which columns to display by using
--columns to pass a
comma-separated list of column names:
$ renku dataset ls --columns id,name,date_created,creators
ID        NAME           CREATED              CREATORS
--------  -------------  -------------------  ---------
0ad1cb9a  some-dataset   2020-03-19 16:39:46  sam
9436e36c  my-dataset     2020-02-28 16:48:09  sam
Displayed results are sorted based on the value of the first column.
To inspect the state of the dataset on a given commit, use the
--revision flag:
$ renku dataset ls --revision=1103a42bd3006c94ef2af5d6a5e03a335f071215
ID        NAME                 TITLE               VERSION
a1fd8ce2  201901_us_flights_1  2019-01 US Flights  1
c2d80abe  ds1                  ds1
Showing dataset details:
$ renku dataset show some-dataset
Name: some-dataset
Created: 2020-12-09 13:52:06.640778+00:00
Creator(s): John Doe<email@example.com> [SDSC]
Keywords: Dataset, Data
Title: Some Dataset
Description: Just some dataset
Deleting a dataset:
$ renku dataset rm some-dataset
OK
Working with data¶
Adding data to the dataset:
$ renku dataset add my-dataset http://data-url
This will copy the contents of
data-url to the dataset and add it
to the dataset metadata.
You can create a dataset when you add data to it for the first time by passing the
--create flag to the add command:
$ renku dataset add --create new-dataset http://data-url
To add data from a git repository, you can specify it via https or git+ssh URL schemes. For example,
$ renku dataset add my-dataset git+ssh://host.io/namespace/project.git
Sometimes you want to add just specific paths within the parent project.
In this case, use the --source flag:
$ renku dataset add my-dataset --source path/within/repo/to/datafile \
    git+ssh://host.io/namespace/project.git
The command above will result in a structure like
data/
  my-dataset/
    datafile
You can use shell-like wildcards (e.g. *, **, ?) when specifying paths to be added. Put wildcard patterns in quotes to prevent your shell from expanding them.
$ renku dataset add my-dataset --source 'path/**/datafile' \
    git+ssh://host.io/namespace/project.git
You can use the
--destination or -d flag to set the location where the new
data is copied to. This location will be under the dataset’s data directory and
will be created if it does not exist. You will get an error message if the
destination exists and is a file.
$ renku dataset add my-dataset \
    --source path/within/repo/to/datafile \
    --destination new-dir/new-subdir \
    git+ssh://host.io/namespace/project.git
data/
  my-dataset/
    new-dir/
      new-subdir/
        datafile
To add a specific version of files, use the
--ref option for selecting a
branch, commit, or tag. The value passed to this option must be a valid
reference in the remote Git repository.
Adding external data to the dataset:
Sometimes you might want to add data to your dataset without copying the
actual files to your repository. This is useful for example when external data
is too large to store locally. The external data must exist (i.e. be mounted)
on your filesystem. Renku creates a symbolic link to your data and you can use this
symbolic link in renku commands as a normal file. To add an external file, pass
-e when adding local data to a dataset:
$ renku dataset add my-dataset -e /path/to/external/file
Updating a dataset:
After adding files from a remote Git repository or importing a dataset from a
provider like Dataverse or Zenodo, you can check for updates in those files by running the
renku dataset update command. For Git repositories, this command
checks all remote files and copies over new content if there is any. It does
not delete files from the local dataset if they are deleted from the remote Git
repository; to force the deletion, use the
--delete argument. You can update to a
specific branch, commit, or tag by passing the --ref option.
For datasets from providers like Dataverse or Zenodo, the whole dataset is
updated to ensure consistency between the remote and local versions. Due to
this limitation, the --include and
--exclude flags are not compatible
with those datasets. Modifying those datasets locally will prevent them from being updated.
You can limit the scope of updated files by specifying dataset names, using --include and
--exclude to filter based on file names, or using
--creators to filter based on creators. For example, the following command
updates only CSV files from my-dataset:
$ renku dataset update -I '*.csv' my-dataset
Note that putting glob patterns in quotes is needed to tell the Unix shell not to expand them.
External data are not updated automatically because they require a checksum
calculation which can take a long time when data is large. To update external
data, pass -e to the update command:
$ renku dataset update -e
Tagging a dataset:
A dataset can be tagged with an arbitrary tag to refer to the dataset at that point in time. A tag can be added like this:
$ renku dataset tag my-dataset 1.0 -d "Version 1.0 tag"
A list of all tags can be seen by running:
$ renku dataset ls-tags my-dataset
CREATED              NAME    DESCRIPTION      DATASET     COMMIT
-------------------  ------  ---------------  ----------  ----------------
2020-09-19 17:29:13  1.0     Version 1.0 tag  my-dataset  6c19a8d31545b...
A tag can be removed with:
$ renku dataset rm-tags my-dataset 1.0
Importing data from other Renku projects:
To import all data files and their metadata from another Renku dataset use:
$ renku dataset import \
    https://renkulab.io/projects/<username>/<project>/datasets/<dataset-id>
or
$ renku dataset import \
    https://renkulab.io/datasets/<dataset-id>
You can get the link to a dataset from the UI or you can construct it by knowing the dataset’s ID.
Importing data from an external provider:
$ renku dataset import 10.5281/zenodo.3352150
This will import the dataset with the DOI (Digital Object Identifier)
10.5281/zenodo.3352150 and make it locally available.
Dataverse and Zenodo are supported, with DOIs (e.g.
doi:10.5281/zenodo.3352150) and full URLs (e.g.
http://zenodo.org/record/3352150). A tag with the remote version of the
dataset is automatically created.
Exporting data to an external provider:
$ renku dataset export my-dataset zenodo
This will export the dataset my-dataset to
zenodo.org as a draft,
allowing for publication later on. If the dataset has any tags set, you
can choose whether the repository HEAD version or one of the tags should be
exported. The remote version will be set to the local tag that is being exported.
To export to a Dataverse provider you must pass the Dataverse server’s URL and the name of the parent dataverse where the dataset will be exported to. The server’s URL is stored in your Renku settings, so you don’t need to pass it every time.
Listing all files in the project associated with a dataset.
$ renku dataset ls-files
DATASET NAME         ADDED                PATH                           LFS
-------------------  -------------------  -----------------------------  ----
my-dataset           2020-02-28 16:48:09  data/my-dataset/add-me         *
my-dataset           2020-02-28 16:49:02  data/my-dataset/weather/file1  *
my-dataset           2020-02-28 16:49:02  data/my-dataset/weather/file2
my-dataset           2020-02-28 16:49:02  data/my-dataset/weather/file3  *
You can select which columns to display by using
--columns to pass a
comma-separated list of column names:
$ renku dataset ls-files --columns name,creators,path
DATASET NAME         CREATORS   PATH
-------------------  ---------  -----------------------------
my-dataset           sam        data/my-dataset/add-me
my-dataset           sam        data/my-dataset/weather/file1
my-dataset           sam        data/my-dataset/weather/file2
my-dataset           sam        data/my-dataset/weather/file3
Displayed results are sorted based on the value of the first column.
Sometimes you want to filter the files. For this, use the --include and --exclude flags:
$ renku dataset ls-files --include "file*" --exclude "file3"
DATASET NAME         ADDED                PATH                           LFS
-------------------  -------------------  -----------------------------  ----
my-dataset           2020-02-28 16:49:02  data/my-dataset/weather/file1  *
my-dataset           2020-02-28 16:49:02  data/my-dataset/weather/file2  *
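The effect of combining include and exclude patterns can be sketched in Python. This is a minimal illustration that assumes patterns are matched against the file name only; Renku's real matching rules may differ, and filter_files is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def filter_files(paths, include=(), exclude=()):
    """Keep paths whose file name matches an include pattern and no exclude pattern."""
    selected = []
    for path in paths:
        name = PurePosixPath(path).name
        # With no include patterns every file is a candidate.
        if include and not any(fnmatchcase(name, pattern) for pattern in include):
            continue
        if any(fnmatchcase(name, pattern) for pattern in exclude):
            continue
        selected.append(path)
    return selected
```

Applied to the four files of my-dataset with include "file*" and exclude "file3", only weather/file1 and weather/file2 remain.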
Unlink a file from a dataset:
$ renku dataset unlink my-dataset --include file1
OK
Unlink all files within a directory from a dataset:
$ renku dataset unlink my-dataset --include "weather/*"
OK
Unlink all files from a dataset:
$ renku dataset unlink my-dataset
Warning: You are about to remove following from "my-dataset" dataset.
.../my-dataset/weather/file1
.../my-dataset/weather/file2
.../my-dataset/weather/file3
Do you wish to continue? [y/N]:
The unlink command does not delete files,
only the dataset record.
Track provenance of data created by executing programs.
Capture command line execution¶
Tracking execution of your command line script is done by simply adding the
renku run command before the actual command. This will enable detection of:
- arguments (flags),
- string and integer options,
- input files or directories if linked to existing paths in the repository,
- output files or directories if modified or created while running the command.
If there were uncommitted changes in the repository, then the
renku run command fails. See git status for details.
Input and output paths can only be detected if they are passed as arguments to renku run.
Circular dependencies are not supported for
renku run. See
Circular Dependencies for more details.
When using output redirection in
renku run on Windows (with
> file or 2> file), all Renku errors and messages are redirected
as well and
renku run produces no output on the terminal. On Linux,
this is detected by Renku and only the output of the command to be run is
actually redirected. Renku-specific messages such as errors get printed to
the terminal as usual and don’t get redirected.
Detecting input paths¶
Any path passed as an argument to
renku run, which was not changed during
the execution, is identified as an input path. The identification only works if
the path associated with the argument matches an existing file or directory
in the repository.
The detection might not work as expected if:
- a file is modified during the execution. In this case it will be stored as an output;
- a path is not passed as an argument to renku run.
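The detection rule for command-line arguments can be sketched as follows. This is a simplified illustration, not Renku's implementation: classify_arguments is a hypothetical helper, and the sets of repository paths and changed paths are assumed to be known already.

```python
def classify_arguments(arguments, repo_paths, changed_paths):
    """Split command-line arguments into detected inputs and outputs."""
    inputs, outputs = [], []
    for arg in arguments:
        if arg in changed_paths:
            # Modified or created during the execution: stored as an output.
            outputs.append(arg)
        elif arg in repo_paths:
            # Existing path that was left untouched: identified as an input.
            inputs.append(arg)
        # Anything else (flags, plain strings) is not a path and is skipped here.
    return inputs, outputs
```

A file that exists in the repository but is modified during the run ends up in the outputs list, which mirrors the first caveat above.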
Specifying auxiliary inputs (--input)¶
You can specify extra inputs to your program explicitly by using the
--input option. This is useful for specifying hidden dependencies
that don’t appear on the command line. Explicit inputs must exist before running the
renku run command. This option is not a replacement for
the arguments that are passed on the command line. Files or directories
specified with this option will not be passed as input arguments to the program.
Disabling input detection¶
Input path detection can be disabled by passing the corresponding flag to
renku run. In this case, only the directories/files that are
passed as explicit input are considered to be file inputs. Those passed via
command arguments are ignored unless they are in the explicit inputs list.
This only affects files and directories; command options and flags are
still treated as inputs.
Detecting output paths¶
Any path modified or created during the execution will be added as an output.
Because the output path detection is based on the Git repository state after
the execution of
renku run command, it is good to have a basic
understanding of the underlying principles and limitations of tracking
files in Git.
Git tracks not only the paths in a repository, but also the content stored in those paths. Therefore:
- a recreated file with the same content is not considered an output file, but instead is kept as an input;
- file moves are detected based on their content and can cause problems;
- directories cannot be empty.
When in doubt whether the outputs will be detected, remove all outputs using
git rm <path> followed by
git commit before running the
renku run command.
Command does not produce any files (--no-output)¶
If the program does not produce any outputs, the execution ends with an error:
Error: There are not any detected outputs in the repository.
You can specify the
--no-output option to force tracking of such executions.
Specifying outputs explicitly (--output)¶
You can specify expected outputs of your program explicitly by using the
--output option. These outputs must exist after the execution of the
renku run command. However, they do not need to be modified by the command.
Disabling output detection¶
Output path detection can be disabled by passing the corresponding flag to
renku run. When disabled, only the directories/files that are
passed as explicit output are considered to be outputs and those passed via
command arguments are ignored.
Detecting standard streams¶
Often a program expects its input on the standard input stream. This is detected
and recorded in the tool specification when the program is invoked by
renku run cat < A.
Similarly, both redirects to standard output and standard error output can be done when invoking a command:
$ renku run grep "test" B > C 2> D
Detecting inputs and outputs from pipes
(|) is not supported.
Specifying inputs and outputs programmatically¶
Sometimes the lists of inputs and outputs are not known before execution of the program. For example, a program might accept a date range as input and access all files within that range during its execution.
To address this issue, the program can dump a list of input and output files
that it is accessing in
outputs.txt. Each line in these
files is expected to be the path to an input or output file within the
project’s directory. When the program is finished, Renku looks for
these two files and adds their content to the lists of explicit
inputs and outputs. Renku then deletes these two files.
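A program could record the files it touches along these lines. This is a hedged sketch: dump_file_lists is a hypothetical helper, and the file name inputs.txt is an assumption, since only outputs.txt is named in the text above.

```python
from pathlib import Path


def dump_file_lists(directory, inputs, outputs):
    """Write one path per line into the two list files inside `directory`."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # "inputs.txt" is an assumed name; only "outputs.txt" appears in the text.
    (directory / "inputs.txt").write_text("".join(f"{path}\n" for path in inputs))
    (directory / "outputs.txt").write_text("".join(f"{path}\n" for path in outputs))
    return directory
```

At the end of its run, a program would call something like dump_file_lists(".renku/tmp", accessed_files, written_files), where the two lists hold paths relative to the project directory.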
By default, Renku looks for these two files in the
.renku/tmp directory. You
can change this default location by setting an
environment variable. When set, it points to a sub-directory within the
.renku/tmp directory where Renku looks for the two files.
All Unix commands return a number between 0 and 255, which is called the “exit code”. If other numbers are returned, they are treated modulo 256 (-10 is equivalent to 246, 257 is equivalent to 1). Exit code 0 represents success and a non-zero exit code indicates a failure.
Therefore the command specified after
renku run is expected to return
exit code 0. If the command returns a different exit code, you can specify it with the --success-code option:
$ renku run --success-code=1 --no-output fail
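The modulo arithmetic and the idea of a configurable success code can be checked with a few lines of Python. The helper names normalize_exit_code and is_success are hypothetical and only illustrate the rule described above.

```python
def normalize_exit_code(code):
    """Map any integer to the 0..255 range used for exit codes."""
    return code % 256


def is_success(code, success_codes=(0,)):
    """Treat a run as successful if its exit code is one of the accepted codes."""
    return normalize_exit_code(code) in success_codes
```

Here normalize_exit_code(-10) gives 246 and normalize_exit_code(257) gives 1, matching the equivalences quoted in the text, and is_success(1, success_codes=(1,)) mirrors the --success-code=1 example.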
Circular dependencies are not supported in
renku run. This means you cannot
use the same file or directory as both an input and an output in the same step,
for instance reading from a file as input and then appending to it is not
allowed. Since renku records all steps of an analysis workflow in a dependency
graph and it allows you to update outputs when an input changes, this would
lead to problems with circular dependencies. An update command would change the
input again, leading to renku seeing it as a changed input, which would run
update again, and so on, without ever stopping.
Due to this, the renku dependency graph has to be acyclic. So instead of appending to an input file or writing an output file to the same directory that was used as an input directory, create new files or write to other directories, respectively.
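Why the graph must stay acyclic can be illustrated with a small check built on Python's standard graphlib module (Python 3.9 or newer). This is a sketch under assumptions, not Renku's code: check_steps is a hypothetical helper and each step is modelled as a pair of input and output path sets.

```python
from graphlib import CycleError, TopologicalSorter


def check_steps(steps):
    """Return files in dependency order, or raise ValueError on a cycle.

    `steps` is a list of (inputs, outputs) pairs of path sets.
    """
    graph = {}
    for inputs, outputs in steps:
        overlap = set(inputs) & set(outputs)
        if overlap:
            # Same path read and written within a single step.
            raise ValueError(f"circular dependency within one step: {sorted(overlap)}")
        for out in outputs:
            graph.setdefault(out, set()).update(inputs)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"circular dependency across steps: {exc.args[1]}") from exc
```

A step that appends to its own input is rejected immediately, and so is a pair of steps that write each other's inputs; writing to a new file instead keeps the order computable.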
Show provenance of data created by executing programs.
Unlike the traditional file history format, which shows previous revisions of the file, this format presents tool inputs together with their revision identifiers.
A * character shows which lineage a specific file belongs to.
An @ character in the graph lineage means that the corresponding file does
not have any inputs and the history starts there.
When called without file names,
renku log shows the history of most
recently created files. With the
--revision <refname> option the output is
shown as it was in the specified revision.
renku log B
- Show the history of file B since its last creation or modification.
renku log --revision HEAD~5
- Show the history of files that were created or modified 5 commits ago.
renku log --revision e3f0bd5a D E
- Show the history of files D and E as they looked in the commit e3f0bd5a.
Following formats supported when specified with
You can generate a PNG of the full history of all files in the repository using the dot program.
$ FILES=$(git ls-files --no-empty-directory --recurse-submodules)
$ renku log --format dot $FILES | dot -Tpng > /tmp/graph.png
$ open /tmp/graph.png
The --strict option forces the output to be validated against the Renku
SHACL schema, causing the command to fail if the generated output is not
valid, as well as printing detailed information on all the issues found.
--strict option is only supported for the
nt output formats.
Show status of data files created in the repository.
Inspecting a repository¶
Displays paths of outputs which were generated from newer input files and paths of files that have been used in different versions.
The first paths are what need to be recreated by running renku update.
See more in the section about renku update.
The paths mentioned in the output are made relative to the current directory
if you are working in a subdirectory (this is on purpose, to help
cutting and pasting to other commands). They also contain first 8 characters
of the corresponding commit identifier after the
# (hash). If the file was
imported from another repository, the short name of is shown together with the
Update outdated files created by the “run” command.
Recreating outdated files¶
The information about dependencies for each file in the repository is generated from information stored in the underlying Git repository.
A minimal dependency graph is generated for each outdated file stored in the repository. It means that only the necessary steps will be executed and the workflow used to orchestrate these steps is stored in the repository.
Assume that the following history for the file H exists.
      C---D---E
     /         \
A---B---F---G---H
The first example shows the situation when
D is modified and files E and
H become outdated.
      C--*D*--(E)
     /          \
A---B---F---G---(H)

** - modified
() - needs update
In this situation, you can do effectively two things:
Recreate a single file by running
$ renku update E
Update all files by simply running
$ renku update --all
If there were uncommitted changes then the command fails. Check git status to see details.
In the next example, files A and
B are modified, hence the majority
of dependent files must be recreated.
         (C)--(D)--(E)
        /             \
*A*--*B*--(F)--(G)--(H)
To avoid excessive recreation of the large portion of files which could have
been affected by a simple change of an input file, consider specifying a single
file (e.g. renku update G). See also renku status.
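Which files become outdated after a modification can be worked out by following the edges of the history graph. The sketch below encodes the example history by hand and is only an illustration; EDGES and outdated are hypothetical names and Renku derives this information from the Git history instead.

```python
from collections import deque

# Edges of the example history: each key is used to produce the listed files.
EDGES = {
    "A": ["B"], "B": ["C", "F"], "C": ["D"], "D": ["E"],
    "E": ["H"], "F": ["G"], "G": ["H"],
}


def outdated(modified, edges=EDGES):
    """Return every file downstream of the modified files."""
    stale, queue = set(), deque(modified)
    while queue:
        for child in edges.get(queue.popleft(), []):
            if child not in stale and child not in modified:
                stale.add(child)
                queue.append(child)
    return stale
```

Modifying D marks only E and H, while modifying A and B marks C, D, E, F, G and H, which matches the two diagrams.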
If a tool produces multiple output files, these outputs need to be always updated together.
               (B)
              /
*A*--[step 1]--(C)
              \
               (D)
An attempt to update a single file would fail with the following error.
$ renku update C
Error: There are missing output siblings:

     B
     D

Include the files above in the command or use --with-siblings option.
The following commands will produce the same result.
$ renku update --with-siblings C
$ renku update B C D
Recreate files created by the “run” command.
Assume you have run a step 2 that uses a stochastic algorithm, so each run
will be slightly different. The goal is to regenerate output C several
times to compare the output. In this situation it is not possible to simply
call renku update since the input file
A has not been modified
after the execution of step 2.
A-[step 1]-B-[step 2*]-C
Recreate a specific output file by running:
$ renku rerun C
If you would like to recreate a file which was one of several produced by a tool, then these files must be recreated as well. See the explanation in updating siblings.
Remove a file, a directory, or a symlink.
Removing a file that belongs to a dataset will update its metadata. It also will attempt to update tracking information for files stored in an external storage (using Git LFS).
Move or rename a file, a directory, or a symlink.
Moving a file that belongs to a dataset will update its metadata to include its
new path and commit. Moreover, tracking information in an external storage
(e.g. Git LFS) will be updated. Move operation fails if a destination already
exists in the repo; use the
--force flag to overwrite it.
If you want to move files to another dataset use
--to-dataset along with the
destination dataset’s name. This removes source paths from all datasets’
metadata that include them (if any) and adds them to the destination dataset’s metadata.
The following command moves data/src and README to the data/dst
directory and adds them to
target-dataset’s metadata. If the source files
belong to one or more datasets then they will be removed from their metadata.
$ renku mv data/src README data/dst --to-dataset target-dataset
Manage the set of CWL files created by renku commands.
$ renku workflow ls
26be2e8d66f74130a087642768f2cef0_rerun.yaml:
199c4b9d462f4b27a4513e5e55f76eb2_cat.yaml:
9bea2eccf9624de387d9b06e61eec0b6_rerun.yaml:
b681b4e229764ceda161f6551370af12_update.yaml:
25d0805243e3468d92a3786df782a2c4_rerun.yaml:
Each *.yaml file corresponds to a renku run/update/rerun execution.
You can export the workflow to create a file as Common Workflow Language by using:
$ renku workflow set-name create output_file
baseCommand:
- cat
class: CommandLineTool
cwlVersion: v1.0
id: 22943eca-fa4c-4f3b-a92d-f6ac7badc0d2
inputs:
- default:
    class: File
    path: /home/user/project/intermediate
  id: inputs_1
  inputBinding:
    position: 1
  type: File
- default:
    class: File
    path: /home/user/project/intermediate2
  id: inputs_2
  inputBinding:
    position: 2
  type: File
outputs:
- id: output_stdout
  streamable: false
  type: stdout
requirements:
  InitialWorkDirRequirement:
    listing:
    - entry: $(inputs.inputs_1)
      entryname: intermediate
      writable: false
    - entry: $(inputs.inputs_2)
      entryname: intermediate2
      writable: false
stdout: output_file
You can use
--revision to specify the revision of the output file to
generate the workflow for. You can also export to a file directly with
Convenience method to save local changes and push them to a remote server.
If you have local modifications to files, you can save them using renku save:
$ renku save
Username for 'https://renkulab.io': my.user
Password for 'https://firstname.lastname@example.org':
Successfully saved:
    file1
    file2
OK
The username and password for renku save are your GitLab username and password, not your RenkuLab login!
You can additionally supply a message that describes the changes that you
made by using the
--message parameter followed by your message.
$ renku save -m "Updated file1 and 2."
Successfully saved:
    file1
    file2
OK
If no remote server has been configured, you can specify one by using the
--destination parameter. Otherwise you will get an error.
$ renku save
Error: No remote has been set up for the current branch
$ renku save -d https://renkulab.io/gitlab/my.user/my-project.git
Successfully saved:
    file1
    file2
OK
You can also specify which paths to save:
$ renku save file1
Successfully saved:
    file1
OK
Show information about objects in current repository.
In situations when multiple outputs have been generated by a single
renku run command, the siblings can be discovered by running the
renku show siblings PATH command.
Assume that the following graph represents relations in the repository.
      D---E---G
     /     \
A---B---C   F
Then the following outputs would be shown.
$ renku show siblings C
C
D
$ renku show siblings G
F
G
$ renku show siblings A
A
$ renku show siblings C G
C
D
---
F
G
$ renku show siblings
A
---
B
---
C
D
---
E
---
F
G
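The sibling relation in this example can be reproduced with a small lookup. This is an illustrative sketch only: STEP_OUTPUTS lists by hand the outputs that the example steps produce together, and siblings is a hypothetical helper rather than Renku's implementation.

```python
# Outputs generated together by one step, following the example graph:
# one step produces C and D, another produces F and G.
STEP_OUTPUTS = [{"C", "D"}, {"F", "G"}]


def siblings(path, step_outputs=STEP_OUTPUTS):
    """Return the sorted group of outputs generated together with `path`."""
    for outputs in step_outputs:
        if path in outputs:
            return sorted(outputs)
    # A path with no co-generated outputs is its own only sibling.
    return [path]
```

Calling siblings("C") returns C and D, siblings("G") returns F and G, and siblings("A") returns just A, as in the console output.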
You can use the
--flat flag to output a flat list, as well as
--verbose flag to also output commit information.
Input and output files¶
You can list input and output files generated in the repository by running
renku show inputs and
renku show outputs commands. Alternatively,
you can check if all paths specified as arguments are input or output files, respectively.
$ renku run wc < source.txt > result.wc
$ renku show inputs
source.txt
$ renku show outputs
result.wc
$ renku show outputs source.txt
$ echo $? # last command finished with an error code
1
You can use the
--verbose flag to print detailed information
in a tabular format.
$ renku show inputs -v
PATH        COMMIT   USAGE TIME           WORKFLOW
----------  -------  -------------------  -------------------...-----------
source.txt  6d10e05  2020-09-14 23:47:17  .renku/workflow/388...d8_head.yaml
Manage an external storage.
Pulling files from git LFS¶
LFS works by checking small pointer files into git and saving the actual contents of a file in LFS. If instead of your file content, you see something like this, it means the file is stored in git LFS and its contents are not currently available locally (they are not pulled):
version https://git-lfs.github.com/spec/v1
oid sha256:42b5c7fb2acd54f6d3cd930f18fee3bdcb20598764ca93bdfb38d7989c054bcf
size 12
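Whether a file's content is such a pointer can be checked with a short parser. This is a minimal sketch that only looks at the three fields shown above; parse_lfs_pointer is a hypothetical helper and not part of Renku or Git LFS.

```python
def parse_lfs_pointer(text):
    """Return the pointer fields as a dict, or None if `text` is not a pointer."""
    fields = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" ")
        if not separator:
            return None  # every pointer line is "key value"
        fields[key] = value
    if not fields.get("version", "").startswith("https://git-lfs.github.com/spec/"):
        return None
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return fields
```

For the pointer shown above the function returns the version, oid and size fields; for ordinary file content it returns None, which means the real content is already available locally.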
You can manually pull contents of file(s) you want with:
$ renku storage pull file1 file2
Removing local content of files stored in git LFS¶
If you want to restore a file back to its pointer file state, for instance to free up space locally, you can run:
$ renku storage clean file1 file2
This removes any data cached locally for files tracked in git LFS.
Migrate large files to git LFS¶
If you accidentally checked a large file into git or are moving a non-LFS renku repo to git LFS, you can use the following command to migrate the files to LFS:
$ renku storage migrate --all
This will move all files that are bigger than the renku lfs_threshold config value and are not excluded by .renkulfsignore into git LFS.
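The selection rule can be sketched as a size check combined with ignore patterns. This is a simplification under assumptions: files_to_migrate is a hypothetical helper, file sizes are passed in as a dictionary, and fnmatch-style patterns stand in for the real .renkulfsignore syntax.

```python
from fnmatch import fnmatchcase


def files_to_migrate(sizes, threshold, ignore_patterns=()):
    """Return paths larger than `threshold` bytes that no ignore pattern matches."""
    return sorted(
        path
        for path, size in sizes.items()
        if size > threshold
        and not any(fnmatchcase(path, pattern) for pattern in ignore_patterns)
    )
```

Files at or below the threshold and files matched by an ignore pattern are left where they are.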
To only migrate specific files, you can also pass their paths to the command like:
$ renku storage migrate big_file other_big_file
Check your system and repository for potential problems.
Migrate project to the latest Renku version.
Install and uninstall Git hooks.
Prevent modifications of output files¶
The commit hooks are enabled by default to prevent output files from being modified manually.
$ renku init
$ renku run echo hello > greeting.txt
$ edit greeting.txt
$ git commit greeting.txt
You are trying to update some output files.

Modified outputs:
  greeting.txt

If you are sure, use "git commit --no-verify".
Renku is not bug-free and you can help us find bugs.
You can quickly open an issue on GitHub with a traceback and minimal system information when you hit an unhandled exception in the CLI.
Ahhhhhhhh! You have found a bug. 🐞

1. Open an issue by typing "open";
2. Print human-readable information by typing "print";
3. See the full traceback without submitting details (default: "ignore").

Please select an action by typing its name (open, print, ignore) [ignore]:
When using renku as a hosted service, the Sentry integration can be enabled
to help developers iterate faster by showing them where bugs happen, how often,
and who is affected.
- Install Sentry-SDK with
python -m pip install sentry-sdk;
- Set environment variable
User information might be sent to help resolve the problem. If you are not using your own Sentry instance, you should inform users that you are sending possibly sensitive information to a third-party service.