# Cloud Storage Sources

> Load content from S3, GCS, SharePoint, GitHub, and Azure Blob into a knowledge base.

Register cloud storage providers on a `Knowledge` instance through the `content_sources` parameter. Each provider config exposes `.file()` and `.folder()` methods that create content references, which you pass to `knowledge.insert()` as `remote_content`.

```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.remote_content import S3Config

# vector_db and contents_db are assumed to be configured elsewhere
knowledge = Knowledge(
    vector_db=vector_db,
    contents_db=contents_db,
    content_sources=[
        S3Config(
            id="company-docs",
            name="Company Documents",
            bucket_name="my-docs-bucket",
            region="us-east-1",
        ),
    ],
)

# Insert a single file
knowledge.insert(
    name="Q4 Report",
    remote_content=knowledge.content_sources[0].file("reports/q4-2025.pdf"),
)

# Insert an entire folder
knowledge.insert(
    name="Engineering Specs",
    remote_content=knowledge.content_sources[0].folder("specs/"),
)
```

## Supported Providers

| Provider             | Config Class       | Install                                                                                                         |
| -------------------- | ------------------ | --------------------------------------------------------------------------------------------------------------- |
| Amazon S3            | `S3Config`         | `pip install boto3`                                                                                             |
| Google Cloud Storage | `GcsConfig`        | `pip install google-cloud-storage`                                                                              |
| SharePoint           | `SharePointConfig` | `pip install msal requests`                                                                                     |
| GitHub               | `GitHubConfig`     | `pip install requests`                                                                                          |
| Azure Blob Storage   | `AzureBlobConfig`  | `pip install azure-identity azure-storage-blob` (`azure-identity` is needed only for Service Principal authentication) |

All configs are importable from `agno.knowledge.remote_content`.

## Provider Configuration

### S3Config

```python
from agno.knowledge.remote_content import S3Config

s3 = S3Config(
    id="s3-docs",
    name="S3 Documents",
    bucket_name="my-bucket",
    region="us-east-1",
    aws_access_key_id="...",       # optional, falls back to default credential chain
    aws_secret_access_key="...",   # optional, falls back to default credential chain
    prefix="documents/",           # optional, default prefix for browsing
)
```

| Field                   | Type            | Default  | Description                                             |
| ----------------------- | --------------- | -------- | ------------------------------------------------------- |
| `id`                    | `str`           | required | Unique identifier for this source                       |
| `name`                  | `str`           | required | Display name                                            |
| `bucket_name`           | `str`           | required | S3 bucket name                                          |
| `region`                | `Optional[str]` | `None`   | AWS region                                              |
| `aws_access_key_id`     | `Optional[str]` | `None`   | AWS access key. Falls back to default credential chain. |
| `aws_secret_access_key` | `Optional[str]` | `None`   | AWS secret key. Falls back to default credential chain. |
| `prefix`                | `Optional[str]` | `None`   | Default prefix for browsing and listing                 |

### GcsConfig

```python
from agno.knowledge.remote_content import GcsConfig

gcs = GcsConfig(
    id="gcs-docs",
    name="GCS Documents",
    bucket_name="my-gcs-bucket",
    project="my-gcp-project",
)
```

| Field              | Type            | Default  | Description                  |
| ------------------ | --------------- | -------- | ---------------------------- |
| `id`               | `str`           | required | Unique identifier            |
| `name`             | `str`           | required | Display name                 |
| `bucket_name`      | `str`           | required | GCS bucket name              |
| `project`          | `Optional[str]` | `None`   | GCP project ID               |
| `credentials_path` | `Optional[str]` | `None`   | Path to GCP credentials file |
| `prefix`           | `Optional[str]` | `None`   | Default prefix               |

### GitHubConfig

```python
from agno.knowledge.remote_content import GitHubConfig

github = GitHubConfig(
    id="my-repo",
    name="My Repository",
    repo="owner/repo",
    token="ghp_...",
    branch="main",
)
```

| Field    | Type            | Default  | Description                                         |
| -------- | --------------- | -------- | --------------------------------------------------- |
| `id`     | `str`           | required | Unique identifier                                   |
| `name`   | `str`           | required | Display name                                        |
| `repo`   | `str`           | required | Repository in `owner/repo` format                   |
| `token`  | `Optional[str]` | `None`   | GitHub personal access token (needs Contents: read) |
| `branch` | `Optional[str]` | `None`   | Branch name                                         |
| `path`   | `Optional[str]` | `None`   | Default path filter                                 |
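
Because `repo` must be in `owner/repo` format, it can help to validate the value before constructing a `GitHubConfig`. The following is a minimal sketch: `split_repo` is a hypothetical helper, not part of Agno, and the check is deliberately loose (exactly one `/`, with non-empty text on both sides) rather than a full implementation of GitHub's naming rules.

```python
def split_repo(repo: str) -> tuple[str, str]:
    """Split an 'owner/repo' string into (owner, repo), or raise ValueError.

    Hypothetical helper for illustration; not part of Agno.
    """
    owner, sep, name = repo.strip().partition("/")
    # Require exactly one slash with non-empty text on both sides.
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/repo', got {repo!r}")
    return owner, name


print(split_repo("owner/repo"))  # ('owner', 'repo')
```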

### SharePointConfig

```python
from agno.knowledge.remote_content import SharePointConfig

sharepoint = SharePointConfig(
    id="sharepoint-docs",
    name="SharePoint Documents",
    tenant_id="...",
    client_id="...",
    client_secret="...",
    hostname="contoso.sharepoint.com",
    site_path="/sites/Engineering",
)
```

| Field           | Type            | Default  | Description                            |
| --------------- | --------------- | -------- | -------------------------------------- |
| `id`            | `str`           | required | Unique identifier                      |
| `name`          | `str`           | required | Display name                           |
| `tenant_id`     | `str`           | required | Azure AD tenant ID                     |
| `client_id`     | `str`           | required | Azure AD application client ID         |
| `client_secret` | `str`           | required | Azure AD application client secret     |
| `hostname`      | `str`           | required | SharePoint hostname                    |
| `site_path`     | `Optional[str]` | `None`   | Site path (e.g., `/sites/Engineering`) |
| `site_id`       | `Optional[str]` | `None`   | Full site ID                           |
| `folder_path`   | `Optional[str]` | `None`   | Default folder path                    |

### AzureBlobConfig

`AzureBlobConfig` supports two authentication methods: a **Service Principal** (Azure AD client credentials) or a **SAS** (Shared Access Signature) token. Provide one or the other, not both.

<Tabs>
  <Tab title="Service Principal">
    ```python
    from agno.knowledge.remote_content import AzureBlobConfig

    azure = AzureBlobConfig(
        id="azure-docs",
        name="Azure Blob Documents",
        tenant_id="...",
        client_id="...",
        client_secret="...",
        storage_account="mystorageaccount",
        container="documents",
    )
    ```
  </Tab>

  <Tab title="SAS Token">
    ```python
    from agno.knowledge.remote_content import AzureBlobConfig

    azure = AzureBlobConfig(
        id="azure-docs",
        name="Azure Blob Documents",
        sas_token="sv=2022-11-02&ss=b&srt=sco&sp=rl&se=...",
        storage_account="mystorageaccount",
        container="documents",
    )
    ```
  </Tab>
</Tabs>

| Field             | Type            | Default  | Description                                                 |
| ----------------- | --------------- | -------- | ----------------------------------------------------------- |
| `id`              | `str`           | required | Unique identifier                                           |
| `name`            | `str`           | required | Display name                                                |
| `tenant_id`       | `Optional[str]` | `None`   | Azure AD tenant ID (Service Principal auth)                 |
| `client_id`       | `Optional[str]` | `None`   | Azure AD application client ID (Service Principal auth)     |
| `client_secret`   | `Optional[str]` | `None`   | Azure AD application client secret (Service Principal auth) |
| `sas_token`       | `Optional[str]` | `None`   | SAS token string (SAS token auth)                           |
| `storage_account` | `str`           | required | Azure storage account name                                  |
| `container`       | `str`           | required | Blob container name                                         |
| `prefix`          | `Optional[str]` | `None`   | Default prefix                                              |

Access requires the Storage Blob Data Reader (or Storage Blob Data Contributor) role on the storage account.
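
Since the two authentication methods are mutually exclusive, one option is to check the credentials before building the config. The sketch below is illustrative only: `azure_auth_kwargs` is a hypothetical helper, not part of Agno, and whether `AzureBlobConfig` performs its own validation is not covered here.

```python
from typing import Optional


def azure_auth_kwargs(
    tenant_id: Optional[str] = None,
    client_id: Optional[str] = None,
    client_secret: Optional[str] = None,
    sas_token: Optional[str] = None,
) -> dict:
    """Return auth keyword arguments for exactly one authentication method.

    Hypothetical helper for illustration; not part of Agno.
    """
    sp_fields = {
        "tenant_id": tenant_id,
        "client_id": client_id,
        "client_secret": client_secret,
    }
    has_sp = any(value is not None for value in sp_fields.values())
    has_sas = sas_token is not None

    if has_sp and has_sas:
        raise ValueError("provide Service Principal credentials or a SAS token, not both")
    if not has_sp and not has_sas:
        raise ValueError("no credentials provided")
    if has_sas:
        return {"sas_token": sas_token}

    # Service Principal auth needs all three fields.
    missing = [key for key, value in sp_fields.items() if value is None]
    if missing:
        raise ValueError(f"missing Service Principal fields: {', '.join(missing)}")
    return sp_fields


print(azure_auth_kwargs(sas_token="sv=..."))  # {'sas_token': 'sv=...'}
```

The returned dictionary could then be unpacked into the constructor alongside the required fields (`id`, `name`, `storage_account`, `container`).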

## Inserting Content

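Call `.file()` or `.folder()` on a config to create a content reference, then pass it to `knowledge.insert()` as `remote_content`. In the examples below, the GitHub and SharePoint calls also pass per-call overrides (`branch` and `site_path`).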

```python
# Single file
knowledge.insert(
    name="Architecture Doc",
    remote_content=s3.file("docs/architecture.pdf"),
)

# Entire folder
knowledge.insert(
    name="All Specs",
    remote_content=gcs.folder("specs/"),
)

# GitHub file from a specific branch
knowledge.insert(
    name="README",
    remote_content=github.file("README.md", branch="develop"),
)

# SharePoint file from a specific site
knowledge.insert(
    name="Policy",
    remote_content=sharepoint.file("Shared Documents/policy.pdf", site_path="/sites/HR"),
)
```
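
When several individual files come from the same source, the calls above can be wrapped in a loop. The sketch below is one way to do that under stated assumptions: `insert_files` is a hypothetical helper, not part of Agno, and deriving the display name from the file name is just an illustrative choice.

```python
from pathlib import PurePosixPath
from typing import Any, Iterable


def insert_files(knowledge: Any, source: Any, paths: Iterable[str]) -> list[str]:
    """Insert each path from one source and return the names that were used.

    Hypothetical helper; relies only on `source.file()` and
    `knowledge.insert(name=..., remote_content=...)` as shown above.
    """
    names = []
    for path in paths:
        # Assumption: use the file name without its extension as the display name.
        name = PurePosixPath(path).stem
        knowledge.insert(name=name, remote_content=source.file(path))
        names.append(name)
    return names
```

For example, `insert_files(knowledge, s3, ["docs/architecture.pdf", "reports/q4-2025.pdf"])` would insert both files and return `["architecture", "q4-2025"]`.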

## Browsing S3 Files

`S3Config` supports paginated file listing with `list_files()`. This is useful for building file pickers or exploring bucket contents before ingesting.

```python
result = s3.list_files(prefix="reports/", limit=50, page=1)

for folder in result.folders:
    print(f"Folder: {folder['name']}")

for file in result.files:
    print(f"File: {file['name']} ({file['size']} bytes)")

print(f"Page {result.page} of {result.total_pages}")
```

| Parameter   | Type            | Default | Description                                          |
| ----------- | --------------- | ------- | ---------------------------------------------------- |
| `prefix`    | `Optional[str]` | `None`  | Path prefix filter. Overrides the config's `prefix`. |
| `delimiter` | `str`           | `"/"`   | Folder delimiter                                     |
| `limit`     | `int`           | `100`   | Files per page (1-1000)                              |
| `page`      | `int`           | `1`     | Page number (1-indexed)                              |

An async variant, `alist_files()`, is available with the same signature.
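
To walk every page rather than a single one, the `page` and `total_pages` values can drive a simple loop. This is a minimal sketch, assuming the result object exposes `files` and `total_pages` as in the example above; `iter_all_files` is a hypothetical helper, not part of Agno.

```python
from typing import Any, Callable, Iterator


def iter_all_files(list_page: Callable[..., Any], prefix: str, limit: int = 100) -> Iterator[Any]:
    """Yield every file under `prefix` by requesting pages until the last one.

    `list_page` is any callable that accepts the `list_files()` parameters
    shown above, for example `s3.list_files`. Hypothetical helper; not part of Agno.
    """
    page = 1
    while True:
        result = list_page(prefix=prefix, limit=limit, page=page)
        yield from result.files
        # Stop after the final page, or immediately if the listing is empty.
        if page >= result.total_pages:
            break
        page += 1
```

A call such as `for file in iter_all_files(s3.list_files, "reports/", limit=50): ...` would then visit each file once. Passing the listing function as an argument keeps the helper independent of any particular config instance.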

## Multiple Sources

Register multiple providers on a single `Knowledge` instance by passing several configs in `content_sources`.

```python
knowledge = Knowledge(
    vector_db=vector_db,
    contents_db=contents_db,
    content_sources=[s3, gcs, github, sharepoint, azure],
)

# Insert from different sources
knowledge.insert(name="S3 Doc", remote_content=s3.file("doc.pdf"))
knowledge.insert(name="GitHub Doc", remote_content=github.file("README.md"))
```
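
With several sources registered, looking a config up by its `id` can be less fragile than indexing into `knowledge.content_sources` by position. The sketch below assumes each config exposes its `id` field as an attribute; `get_source` is a hypothetical helper, not part of Agno.

```python
from typing import Any, Iterable


def get_source(sources: Iterable[Any], source_id: str) -> Any:
    """Return the first source whose `id` matches, or raise KeyError.

    Hypothetical helper; assumes each config exposes its `id` field as an attribute.
    """
    for source in sources:
        if getattr(source, "id", None) == source_id:
            return source
    raise KeyError(f"no content source with id {source_id!r}")
```

For instance, `get_source(knowledge.content_sources, "s3-docs").file("doc.pdf")` would build a reference from the S3 config regardless of registration order.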

## Using Sources Through AgentOS

When the Knowledge instance is attached to [AgentOS](/agent-os/overview), every config registered in `content_sources` is exposed through the HTTP API. Discover them with `GET /knowledge/config` (under `remote_content_sources`), upload with `POST /knowledge/remote-content` using the config's `id`, and (for S3) browse files with `GET /knowledge/{knowledge_id}/sources/{source_id}/files`.

See [Remote Content](/agent-os/knowledge/remote-content) for the full API workflow, `source_params` overrides, and per-source behavior.
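
As a small illustration of the S3 browsing route, the path template above can be filled in with a helper like the one below. This is only a sketch: `files_endpoint` is a hypothetical helper, and the base URL and IDs in the example call are placeholders, not values defined by AgentOS.

```python
from urllib.parse import quote


def files_endpoint(base_url: str, knowledge_id: str, source_id: str) -> str:
    """Build the URL for the S3 file-browsing endpoint.

    Hypothetical helper; `base_url` is wherever your AgentOS instance is served.
    """
    # Percent-encode the IDs so characters such as "/" or spaces cannot alter the path.
    return (
        f"{base_url.rstrip('/')}/knowledge/{quote(knowledge_id, safe='')}"
        f"/sources/{quote(source_id, safe='')}/files"
    )


# Placeholder base URL and IDs, for illustration only.
print(files_endpoint("http://localhost:8000", "kb-1", "s3-docs"))
# http://localhost:8000/knowledge/kb-1/sources/s3-docs/files
```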

## Next Steps

| Task                       | Guide                                                |
| -------------------------- | ---------------------------------------------------- |
| Ingest via the AgentOS API | [Remote Content](/agent-os/knowledge/remote-content) |
| Content types overview     | [Content Types](/knowledge/concepts/content-types)   |
| Filter search results      | [Filtering](/knowledge/concepts/filters/overview)    |
| Set up a vector database   | [Vector Databases](/knowledge/concepts/vector-db)    |
