List Existing Files with Python

You can list all the files that exist in a deepset Cloud workspace.

Prerequisites

  1. Install the SDK.
  2. Generate an API key to connect to a deepset Cloud workspace.

List Files Script Examples

List All Files in a Workspace

Here's a basic script that lists all files in a workspace, first with the synchronous client and then with the asynchronous client:

from deepset_cloud_sdk.workflows.sync_client.files import list_files

for file_batch in list_files(
    api_key="<deepset Cloud API key>",
    workspace_name="<your_workspace>",
    batch_size=10,
):
    for file in file_batch:  # Each batch is a list of up to 10 files
        print(file.name)



from deepset_cloud_sdk.workflows.async_client.files import list_files

async def my_async_context() -> None:
    async for file_batch in list_files(
        api_key="<deepset Cloud API key>",
        workspace_name="<your_workspace>",
        batch_size=10,
    ):
        for file in file_batch:
            print(file.name)
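
The asynchronous version above only defines a coroutine, so it still needs an event loop to run. As a minimal sketch, the snippet below drives a batch-yielding async generator with asyncio.run() and flattens the batches into one list of names. FakeFile and fake_list_files are hypothetical stand-ins (not part of the SDK) so that the sketch runs without credentials; in real code you would presumably iterate over the SDK's async list_files() in the same way.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class FakeFile:
    """Hypothetical stand-in for an SDK file object; only `name` is modeled."""
    name: str


async def fake_list_files(batch_size: int = 10) -> AsyncIterator[List[FakeFile]]:
    """Hypothetical stand-in for the async list_files(): yields files in batches."""
    files = [FakeFile(f"doc_{i}.txt") for i in range(25)]
    for start in range(0, len(files), batch_size):
        yield files[start:start + batch_size]


async def collect_file_names() -> List[str]:
    """Flatten every batch into one list of file names."""
    names: List[str] = []
    async for file_batch in fake_list_files(batch_size=10):
        names.extend(file.name for file in file_batch)
    return names


if __name__ == "__main__":
    # asyncio.run() starts an event loop and runs the coroutine to completion.
    print(len(asyncio.run(collect_file_names())))
```

The last batch can be shorter than batch_size, which is why the sketch extends a flat list instead of assuming a fixed batch length.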

List Files by Name

Here's an example of how to use the name parameter of list_files() to list files in a workspace by name. The name is matched with fuzzy search, so files with similar names can also be returned:

from deepset_cloud_sdk.workflows.sync_client.files import list_files

for file_batch in list_files(
    api_key="<deepset Cloud API key>",
    workspace_name="<your_workspace>",
    name="specific_filename.txt",  # Uses fuzzy search for file names
    batch_size=10,
):
    for file in file_batch:
        print(file.name)


from deepset_cloud_sdk.workflows.async_client.files import list_files

async def my_async_context() -> None:
    async for file_batch in list_files(
        api_key="<deepset Cloud API key>",
        workspace_name="<your_workspace>",
        name="specific_filename.txt",  # Uses fuzzy search for file names
    ):
        for file in file_batch:
            print(file.name)
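
Because the name parameter uses fuzzy search, the results may include files whose names only resemble the query. If you need exact matches, one option is to filter the batches on the client side. This is a minimal sketch, assuming each file object exposes a name attribute as in the examples above; exact_name_matches is a hypothetical helper, not part of the SDK, and SimpleNamespace objects stand in for real files.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Iterator, List


def exact_name_matches(batches: Iterable[List[Any]], wanted: str) -> Iterator[Any]:
    """Yield only the files whose name equals `wanted` exactly."""
    for file_batch in batches:
        for file in file_batch:
            if file.name == wanted:
                yield file


# Stand-in batches shaped like the batches a fuzzy name search might return.
batches = [
    [
        SimpleNamespace(name="specific_filename.txt"),
        SimpleNamespace(name="specific_filename_v2.txt"),
    ],
    [SimpleNamespace(name="unspecific_filename.txt")],
]
matches = [f.name for f in exact_name_matches(batches, "specific_filename.txt")]
print(matches)
```

With the synchronous client, you would pass the list_files(...) call itself as the batches argument.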


List Files by Filter

This example uses an OData filter to list files in a workspace:

from deepset_cloud_sdk.workflows.sync_client.files import list_files

for file_batch in list_files(
    api_key="<deepset Cloud API key>",
    workspace_name="<your_workspace>",
    odata_filter="modified gt 2023-01-01T00:00:00Z",
):
    for file in file_batch:
        print(file.name)

from deepset_cloud_sdk.workflows.async_client.files import list_files

async def my_async_context() -> None:
    async for file_batch in list_files(
        api_key="<deepset Cloud API key>",
        workspace_name="<your_workspace>",
        odata_filter="modified gt 2023-01-01T00:00:00Z",
    ):
        for file in file_batch:
            print(file.name)
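
The timestamp in the filter above is a UTC value in ISO 8601 form. If you compute the cutoff at runtime, a small helper can build the filter string for you. This is a sketch under assumptions: modified_after_filter is a hypothetical helper, not part of the SDK, and the field name modified and the gt operator are simply copied from the example above.

```python
from datetime import datetime, timezone


def modified_after_filter(cutoff: datetime) -> str:
    """Build an OData filter string for files modified after `cutoff`."""
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    # Normalize to UTC and format like 2023-01-01T00:00:00Z.
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"modified gt {stamp}"


odata_filter = modified_after_filter(datetime(2023, 1, 1, tzinfo=timezone.utc))
print(odata_filter)  # modified gt 2023-01-01T00:00:00Z
```

The resulting string can then be passed as the odata_filter argument of list_files().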