Google Drive
The Google Drive source reads files from Google Workspace shared drives and user My Drives. Each file becomes a record in the pipeline — its contents are uploaded to a new agent session for processing. Authentication uses a Google Cloud service account.
Configuration
A Drive source lists one or more scopes. Each scope is a starting point: a shared
drive or a single user's My Drive, each of which can be narrowed to a folder within
it.
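As an illustration, the source object can be assembled from the fields documented in this section. This is a minimal sketch: every value is a placeholder, and where the object sits inside a larger pipeline definition is not shown because it is not specified here.

```python
import json

# Placeholder values only. The real client_email and private_key come from
# the service account's JSON key file.
source = {
    "type": "google_drive",
    "client_email": "ingest@example-project.iam.gserviceaccount.com",
    "private_key": (
        "-----BEGIN PRIVATE KEY-----\n"
        "<key material>\n"
        "-----END PRIVATE KEY-----\n"
    ),
    "scopes": [
        # An entire shared drive, addressed by its root folder URL.
        {
            "type": "shared",
            "url": "https://drive.google.com/drive/folders/<drive_id>",
        },
        # Every accessible file in one user's My Drive.
        {"type": "my_drive", "subject_email": "alice@example.com"},
        # One folder subtree inside another user's My Drive.
        {
            "type": "my_drive",
            "subject_email": "bob@example.com",
            "url": "https://drive.google.com/drive/folders/<folder_id>",
        },
    ],
}

# json.dumps escapes the newlines embedded in the PEM key as \n.
print(json.dumps(source, indent=2))
```

Mixing both scope types in one source is allowed by the description above; each entry is handled according to its own type.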
Fields
| Field | Required | Description |
|---|---|---|
| type | Yes | google_drive. |
| scopes | Yes | One or more Drive starting points to ingest. At least one. See Scopes. |
| client_email | Yes | The service account's email address (the client_email field of the service account JSON key). |
| private_key | Yes | The service account's PEM-formatted RSA private key (the private_key field of the JSON key, including the -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY----- markers and embedded newlines). Encrypted at rest and never returned in responses. |
Scopes
Each entry in scopes is discriminated by its type.
Shared drive (shared)
Ingest from a Google Workspace shared drive. No domain-wide delegation is required:
the service account (client_email) only needs to be a member of the drive or
folder.
| Field | Required | Description |
|---|---|---|
| type | Yes | shared. |
| url | Yes | URL of the folder to ingest. Use a shared drive's root URL (https://drive.google.com/drive/folders/<drive_id>) to enumerate the entire drive, or any subfolder URL to scope ingestion to that subtree. |
My Drive (my_drive)
Ingest from a single user's My Drive via domain-wide delegation. The service account impersonates the named user.
| Field | Required | Description |
|---|---|---|
| type | Yes | my_drive. |
| subject_email | Yes | The user whose My Drive the service account impersonates. |
| url | No | A folder URL within the user's My Drive to narrow ingestion to that subtree. If omitted, every accessible file in the user's My Drive is enumerated. |
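The required and optional fields of the two scope types can be checked before a configuration is submitted. The sketch below is a hypothetical client-side check, not the service's own validation: `folder_id_from_url` and `validate_scope` are illustrative names, and the assumption is that a folder URL carries its ID in the path segment after `folders`.

```python
from typing import Optional
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "folders" not in parts:
        raise ValueError(f"not a Drive folder URL: {url!r}")
    index = parts.index("folders") + 1
    if index >= len(parts):
        raise ValueError(f"folder URL has no ID: {url!r}")
    return parts[index]


def validate_scope(scope: dict) -> dict:
    """Check one scopes entry and return a normalized view of it."""
    kind = scope.get("type")
    if kind == "shared":
        # url is required for shared scopes.
        if not scope.get("url"):
            raise ValueError("shared scope requires url")
        return {"type": "shared", "folder_id": folder_id_from_url(scope["url"])}
    if kind == "my_drive":
        # subject_email is required; url is optional.
        if not scope.get("subject_email"):
            raise ValueError("my_drive scope requires subject_email")
        url: Optional[str] = scope.get("url")
        return {
            "type": "my_drive",
            "subject_email": scope["subject_email"],
            "folder_id": folder_id_from_url(url) if url else None,
        }
    raise ValueError(f"unknown scope type: {kind!r}")
```

A `folder_id` of `None` on a my_drive scope stands for the whole My Drive, matching the behavior described when url is omitted.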
How records are fetched
Each scope starts from its configured folder (or the shared drive root) and walks every descendant subfolder. Trashed files and shortcuts are skipped.
Google Workspace files are exported on download: Docs, Sheets, and Slides become
their Office equivalents (.docx, .xlsx, .pptx). Other Workspace types, such
as Forms and Drawings, can't be exported and are sent to the
dead letter queue.
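The per-file decision described above can be sketched as a small classifier over the Drive file resource. The MIME type strings are the ones Drive and Office use; the action names (`skip`, `descend`, `export`, `download`, `dead_letter`) and the function name `plan_fetch` are assumptions made for illustration, not part of the connector.

```python
from typing import Optional, Tuple

FOLDER = "application/vnd.google-apps.folder"
SHORTCUT = "application/vnd.google-apps.shortcut"
WORKSPACE_PREFIX = "application/vnd.google-apps."

# Workspace MIME type -> (file extension, Office MIME type to export as).
EXPORTS = {
    "application/vnd.google-apps.document": (
        ".docx",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ),
    "application/vnd.google-apps.spreadsheet": (
        ".xlsx",
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ),
    "application/vnd.google-apps.presentation": (
        ".pptx",
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ),
}


def plan_fetch(file: dict) -> Tuple[str, Optional[str]]:
    """Return (action, export MIME type) for one Drive file resource."""
    mime = file.get("mimeType", "")
    if file.get("trashed") or mime == SHORTCUT:
        return ("skip", None)          # trashed files and shortcuts
    if mime == FOLDER:
        return ("descend", None)       # walk into the subfolder
    if mime in EXPORTS:
        return ("export", EXPORTS[mime][1])
    if mime.startswith(WORKSPACE_PREFIX):
        return ("dead_letter", None)   # e.g. Forms, Drawings
    return ("download", None)          # ordinary binary file
```

Checking the export table before the generic Workspace prefix keeps Docs, Sheets, and Slides out of the dead letter branch.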
Source metadata
Each record carries source metadata that the connector resolves at fetch time.
system_metadata captures these Drive fields when present:
| Key | Description |
|---|---|
| name | The file name. |
| mime_type | The file's Drive MIME type. |
| size | File size in bytes. |
| modified_at | Last modified time (RFC 3339). |
| created_at | Creation time (RFC 3339). |
| md5_checksum | MD5 checksum, when Drive provides one. |
| web_view_link | Link to open the file in the Drive UI. |
| parents | Ids of the file's parent folders. |
| drive_id | Id of the shared drive, for shared-drive files. |
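To make the "when present" behavior concrete, the sketch below maps a Drive API v3 file resource onto the keys in the table. The left-hand names are Drive API field names; the function name and the conversion of size to an integer (the API returns it as a string) are assumptions for illustration.

```python
# Drive API v3 file field -> system_metadata key from the table above.
FIELD_MAP = {
    "name": "name",
    "mimeType": "mime_type",
    "size": "size",
    "modifiedTime": "modified_at",
    "createdTime": "created_at",
    "md5Checksum": "md5_checksum",
    "webViewLink": "web_view_link",
    "parents": "parents",
    "driveId": "drive_id",
}


def to_system_metadata(drive_file: dict) -> dict:
    """Copy only the fields Drive actually returned for this file."""
    meta = {}
    for api_field, key in FIELD_MAP.items():
        if api_field not in drive_file:
            continue  # absent fields are left out rather than set to null
        value = drive_file[api_field]
        # Assumption: size is stored as a number of bytes, not a string.
        meta[key] = int(value) if key == "size" else value
    return meta
```

A native Google Doc, for instance, has no md5Checksum, so its record would carry no md5_checksum key under this sketch.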
user_metadata is empty for Drive.
acl_metadata holds the file's effective permissions in the source-independent
ACL metadata shape. Drive models
groups, so every group_* bucket is populated (empty arrays when there are no
group grants). Permissions are cumulative down the folder tree, so a file's ACL
includes grants inherited from its parent folders. Entries in the user and group
buckets are email addresses. The buckets map to Drive roles as follows:
| Bucket | Drive grant |
|---|---|
| owners | the file's owner(s) |
| editors | writer, organizer, fileOrganizer roles ("Editor" in the UI) |
| commenters | commenter role |
| readers | reader role |
| group_editors, group_commenters, group_readers | group-email grants by role |
| public_access | the "anyone with the link" permission |
| org_wide_access | a workspace-domain grant |
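The bucket mapping in the table can be sketched as a function over Drive permission resources, which carry a `type` (user, group, domain, anyone), a `role`, and an `emailAddress` for user and group grants. This is an illustrative sketch only: the exact ACL metadata shape is defined elsewhere, so representing public_access and org_wide_access as booleans is an assumption, as is the function name `build_acl`.

```python
# Drive permission role -> user bucket, per the table above.
ROLE_TO_BUCKET = {
    "owner": "owners",
    "organizer": "editors",
    "fileOrganizer": "editors",
    "writer": "editors",
    "commenter": "commenters",
    "reader": "readers",
}


def build_acl(permissions: list) -> dict:
    """Fold a file's effective Drive permissions into ACL buckets."""
    acl = {
        "owners": [], "editors": [], "commenters": [], "readers": [],
        # Group buckets are always present, even when empty.
        "group_editors": [], "group_commenters": [], "group_readers": [],
        # Assumption: the two access flags are booleans.
        "public_access": False, "org_wide_access": False,
    }
    for perm in permissions:
        kind = perm.get("type")
        bucket = ROLE_TO_BUCKET.get(perm.get("role"))
        email = perm.get("emailAddress")
        if kind == "anyone":
            acl["public_access"] = True
        elif kind == "domain":
            acl["org_wide_access"] = True
        elif bucket is None or email is None:
            continue  # unknown role or grant without an address
        elif kind == "user":
            acl[bucket].append(email)
        elif kind == "group" and bucket != "owners":
            acl["group_" + bucket].append(email)
    return acl
```

Because permissions are cumulative down the folder tree, the input list is expected to already include grants inherited from parent folders.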
Incremental sync
When sync_mode is incremental (the default), the pipeline tracks a watermark
based on each file's modified_at system metadata. On the next run, only files
modified since the last successful run are reprocessed, and unchanged files are
skipped. See Sync mode.
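A watermark comparison of this kind can be sketched as follows. How the pipeline stores and advances its watermark is not specified here, so this sketch assumes the watermark is the latest modified_at seen so far and that a file is reprocessed only when its modified_at is strictly later; the function names are illustrative.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def parse_rfc3339(ts: str) -> datetime:
    """Parse an RFC 3339 timestamp such as 2024-01-15T10:30:00.000Z."""
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def select_changed(
    files: List[dict], watermark: Optional[datetime]
) -> Tuple[List[dict], Optional[datetime]]:
    """Return the files to reprocess and the watermark for the next run."""
    changed = [
        f for f in files
        if watermark is None or parse_rfc3339(f["modified_at"]) > watermark
    ]
    seen = [parse_rfc3339(f["modified_at"]) for f in files]
    if watermark is not None:
        seen.append(watermark)
    # With no files and no prior watermark, there is nothing to track yet.
    next_watermark = max(seen) if seen else None
    return changed, next_watermark
```

On a first run the watermark is `None`, so every file is selected; the returned watermark would be persisted only after the run succeeds.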
Permissions
Set up access according to the scope types you use:
- Shared drive scopes: add the service account email (client_email) as a member of the shared drive, or of the specific folder, with at least Viewer access.
- My Drive scopes: a Workspace administrator must authorize the service account's client ID for domain-wide delegation with the https://www.googleapis.com/auth/drive.readonly OAuth scope. The service account then impersonates each subject_email.
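The difference between the two setups shows up in the token request. In Google's OAuth 2.0 service account flow, the account signs a JWT whose claims name the requested scope, and impersonation is expressed by adding a `sub` claim for the delegated user. The sketch below builds only the unsigned claim set; whether the connector issues its requests exactly this way, and whether shared drive scopes request the same read-only scope, are assumptions.

```python
import time
from typing import Optional

TOKEN_URI = "https://oauth2.googleapis.com/token"
DRIVE_READONLY = "https://www.googleapis.com/auth/drive.readonly"


def build_claims(
    client_email: str,
    subject_email: Optional[str] = None,
    now: Optional[int] = None,
    lifetime: int = 3600,
) -> dict:
    """Build the JWT claim set for a service account token request."""
    issued = int(time.time()) if now is None else now
    claims = {
        "iss": client_email,       # the service account itself
        "scope": DRIVE_READONLY,
        "aud": TOKEN_URI,
        "iat": issued,
        "exp": issued + lifetime,  # Google allows at most one hour
    }
    if subject_email:
        # Domain-wide delegation: act as this user (my_drive scopes).
        claims["sub"] = subject_email
    return claims
```

If the administrator has not authorized the client ID for this scope, a request that carries `sub` is rejected, which is why the delegation step is required only for My Drive scopes.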