Transferring Files To and From a Remote Host

Transfer files between the local filesystem and a remote host’s filesystem using Rsync via SSH.

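The following example downloads a file from the remote host’s filesystem at /home/ubuntu/remote_unprocessed_file.png to the local filesystem at unprocessed_file.png, converts it to grayscale, and uploads the result to the remote host at /home/ubuntu/remote_processed_file.png. Both transfers use Rsync via SSH.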

Prerequisites

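Upload a color image file to the remote host at /home/ubuntu/remote_unprocessed_file.png. Make note of the address of the remote host and of the path to the private key used for SSH access.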

Procedure

Make sure to replace the private_key, host_address, and username values with your own.

  1. Define an Rsync strategy with the remote host and private key path to be used for SSH.

[1]:
import covalent as ct
from pathlib import Path
from typing import List, Tuple
from skimage import io, color

private_key = "/path/to/private/key"
host_address = "123.remote.host.address.com"
username = "ubuntu"

unprocessed_filename = "unprocessed_file.png"
processed_filename = "processed_file.png"

unprocessed_filepath = str(Path(unprocessed_filename).resolve())
processed_filepath = str(Path(processed_filename).resolve())

remote_source_path = f"/home/{username}/remote_{unprocessed_filename}"
remote_dest_path = f"/home/{username}/remote_{processed_filename}"

rsync_strategy = ct.fs_strategies.Rsync(user=username, host=host_address, private_key_path=private_key)
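
For orientation, a strategy like the one above corresponds roughly to running rsync with SSH as its remote shell. The following is a minimal sketch in plain Python that builds such a command line. The helper name `build_rsync_command` is hypothetical and not part of Covalent, and the exact arguments Covalent passes to rsync may differ.

```python
import shlex
from typing import List


def build_rsync_command(
    user: str,
    host: str,
    private_key_path: str,
    remote_path: str,
    local_path: str,
    from_remote: bool = True,
) -> List[str]:
    """Build an rsync-over-SSH argument list (illustrative helper, not part of Covalent)."""
    # -e tells rsync which remote shell to use; ssh -i selects the identity file.
    ssh_cmd = f"ssh -i {shlex.quote(private_key_path)}"
    remote_spec = f"{user}@{host}:{remote_path}"
    # rsync always takes the source first and the destination second.
    source, dest = (remote_spec, local_path) if from_remote else (local_path, remote_spec)
    return ["rsync", "-e", ssh_cmd, source, dest]


cmd = build_rsync_command(
    user="ubuntu",
    host="123.remote.host.address.com",
    private_key_path="/path/to/private/key",
    remote_path="/home/ubuntu/remote_unprocessed_file.png",
    local_path="unprocessed_file.png",
)
print(shlex.join(cmd))  # shell-quoted form of the command, for inspection only
```

Running the printed command by hand from a terminal can be a quick way to confirm that the key, username, and host address work before dispatching a workflow.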

  2. Generate the FileTransfer objects using TransferFromRemote and TransferToRemote factories.

[2]:
ft_1 = ct.fs.TransferFromRemote(remote_source_path, unprocessed_filepath, strategy=rsync_strategy)
ft_2 = ct.fs.TransferToRemote(remote_dest_path, processed_filepath, strategy=rsync_strategy)

The Covalent Transfer* factories assign the stage at which each file transfer takes place. TransferFromRemote runs before the electron is executed so that the electron can process the downloaded file. Conversely, TransferToRemote runs after the electron has created the outgoing file.

Note that TransferToRemote is the only case in which the destination path is passed first, then the source. The FileTransfer object generated from it adheres to the (<source_file_path>, <dest_file_path>) convention.
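
The argument order and stage assignment described above can be pictured with a small stand-in written in plain Python. This is only a sketch of the convention: `TransferSpec`, `transfer_from_remote`, and `transfer_to_remote` are hypothetical names, not Covalent classes or functions, and they perform no actual transfer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferSpec:
    """Stand-in for a file transfer record (hypothetical, not Covalent's FileTransfer)."""
    source: str
    dest: str
    stage: str  # "before" or "after" the task runs


def transfer_from_remote(remote_path: str, local_path: str) -> TransferSpec:
    # Download: the source (remote) is given first, and the copy runs before the task.
    return TransferSpec(source=remote_path, dest=local_path, stage="before")


def transfer_to_remote(remote_path: str, local_path: str) -> TransferSpec:
    # Upload: the destination (remote) is given first, but the record
    # still stores (source, dest), and the copy runs after the task.
    return TransferSpec(source=local_path, dest=remote_path, stage="after")


download = transfer_from_remote("/home/ubuntu/remote_unprocessed_file.png", "unprocessed_file.png")
upload = transfer_to_remote("/home/ubuntu/remote_processed_file.png", "processed_file.png")

print(download)
print(upload)
```

In both cases the remote path is the first argument, which is why the upload factory ends up taking its destination first.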

  3. Define an electron, passing the Covalent FileTransfer objects to the files keyword argument in the decorator.

[3]:
@ct.electron(files=[ft_1, ft_2]) # ft_1 is done before the electron is executed; ft_2 is done after.
def to_grayscale(files: List[Tuple[str, str]] = None):  # each entry is a (source, destination) pair

    # Get the downloaded file's path
    image_path = files[0][1] # destination filepath of first file transfer, downloaded before executing this electron

    # Convert the image to grayscale
    img = io.imread(image_path)[:, :, :3] # limiting image to 3 channels
    gray_img = color.rgb2gray(img)

    # Save the grayscale image to the upload file path
    gray_image_path = files[1][0] # source filepath of second file transfer, to be uploaded
    io.imsave(gray_image_path, gray_img, mode="L")
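
To make the indexing into files concrete, the following sketch simulates the same lifecycle with local file copies standing in for rsync: the "before" transfer runs, then the task, then the "after" transfer. The names `run_with_transfers` and `to_upper` are hypothetical, and this is not how Covalent is implemented; it only mirrors the ordering and the (source, destination) layout described above.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List, Tuple


def run_with_transfers(
    task: Callable[[List[Tuple[str, str]]], None],
    before: List[Tuple[str, str]],
    after: List[Tuple[str, str]],
) -> None:
    """Copy `before` pairs, run the task, then copy `after` pairs (illustrative only)."""
    files = before + after  # same ordering as the list given to the decorator
    for source, dest in before:
        shutil.copyfile(source, dest)
    task(files)
    for source, dest in after:
        shutil.copyfile(source, dest)


def to_upper(files: List[Tuple[str, str]]) -> None:
    downloaded = files[0][1]  # destination of the first transfer
    outgoing = files[1][0]    # source of the second transfer
    Path(outgoing).write_text(Path(downloaded).read_text().upper())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    remote_in, local_in = root / "remote_in.txt", root / "local_in.txt"
    local_out, remote_out = root / "local_out.txt", root / "remote_out.txt"
    remote_in.write_text("hello")

    run_with_transfers(
        to_upper,
        before=[(str(remote_in), str(local_in))],
        after=[(str(local_out), str(remote_out))],
    )
    result = remote_out.read_text()

print(result)
```

The task only ever touches local paths: it reads the destination of the incoming transfer and writes the source of the outgoing one.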

  4. Create and dispatch a lattice to run the electron.

[4]:
@ct.lattice
def process_remote_data():
    return to_grayscale()

dispatch_id = ct.dispatch(process_remote_data)()
status = ct.get_result(dispatch_id, wait=True).status
print(status)
COMPLETED

Notes:

  • Both transfers are performed by rsync over SSH, authenticated with the private key passed to the Rsync strategy.

  • In a typical real-world scenario, transfers like these stage input data before a task runs and move the data generated by the workflow to a remote host afterward.

See Also

Transferring Local Files During Workflows

Transferring Files To and From an S3 Bucket

Transferring Files To and From Azure Blob Storage

Transferring Files To and From Google Cloud Storage