> ## Documentation Index
> Fetch the complete documentation index at: https://docs.evidence.studio/llms.txt
> Use this file to discover all available pages before exploring further.

# BigQuery

> Query your BigQuery datasets directly from Evidence without syncing data.

The BigQuery direct connector lets Evidence run queries live against your BigQuery project.

<Note>
  This is the **direct** connector. If you'd like to extract data (to join data across multiple sources), use the [managed BigQuery connector](/data-sources/bigquery) instead.
</Note>

## Connecting

You'll create a service account in GCP, give it permission to run BigQuery jobs and read your datasets, then upload its JSON key to Evidence.

<Note>
  The steps below use the `gcloud` and `bq` CLIs, but every step has an equivalent in the [GCP Console](https://console.cloud.google.com)
</Note>

### Prerequisites

* The `gcloud` CLI, authenticated as a user with **IAM Admin** and **BigQuery Admin** roles on the target project (or **Owner**).
* A GCP project that contains the datasets you want to query.

In the snippets below, replace `<project>` with your GCP project ID.

### 1. Create a service account for Evidence

```bash theme={null}
gcloud iam service-accounts create evidence-bq-primary \
  --project=<project> \
  --display-name="Evidence BigQuery primary"
```

### 2. Grant it BigQuery access

#### Job user

```bash theme={null}
gcloud projects add-iam-policy-binding <project> \
  --member="serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```

#### Data viewer

```bash theme={null}
gcloud projects add-iam-policy-binding <project> \
  --member="serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
```

### 3. Generate a JSON key

```bash theme={null}
gcloud iam service-accounts keys create <some/local/path>/evidence-bq-primary.json \
  --iam-account="evidence-bq-primary@<project>.iam.gserviceaccount.com"
```

### 4. Configure the connector in Evidence

<Steps>
  <Step title="Open the connectors page">
    Go to **Connectors** in the sidebar. In the **Warehouse** card at the top, select **BigQuery**.
  </Step>

  <Step title="Upload the JSON key">
    Click the **Service account JSON** upload area and choose `evidence-bq-primary.json`.
  </Step>

  <Step title="Choose datasets">
    Add the datasets you want to use with Evidence in **Visible datasets** (comma-separated).
  </Step>

  <Step title="Test and save">
    Click **Test Connection**. Once all steps pass, click **Save**.
  </Step>
</Steps>

## Configuration reference

<Tabs>
  <Tab title="Studio">
    #### Credentials

    <ResponseField name="Project ID" type="string" required>
      GCP project that owns the BigQuery datasets you want to query.
    </ResponseField>

    <ResponseField name="Service account key (JSON)" type="{ client_email: string; private_key: string; project_id?: string }">
      Inline service-account key JSON.
    </ResponseField>

    <ResponseField name="Location" type="string">
      Default query location (e.g. US, EU, us-central1).
    </ResponseField>

    <ResponseField name="Default dataset" type="string">
      Dataset used when queries reference unqualified table names.
    </ResponseField>

    #### Visibility

    <ResponseField name="Datasets" type="string[]" required>
      Allowlist of datasets exposed to the editor and schema browser.
    </ResponseField>

    #### Row-level security

    <ResponseField name="Impersonation roles" type="{ name: string; serviceAccountEmail: string }[]">
      Service accounts the primary SA can impersonate for RLS. Each must have iam.serviceAccountTokenCreator granted to the primary SA. Default: `[]`.
    </ResponseField>
  </Tab>

  <Tab title="CLI (connection.yaml)">
    ```yaml theme={null}
    type: bigquery
    project: my-gcp-project
    keyfile: ./evidence-bq-primary.json
    # dataset: analytics
    datasets:
      - analytics
      - reporting
    # roles:
    #   - name: customer_a
    #     serviceAccountEmail: customer-a@my-gcp-project.iam.gserviceaccount.com
    ```

    #### Credentials

    *Provide exactly one of `keyfile_json`, `keyfile`.*

    <ResponseField name="project" type="string" required>
      GCP project that owns the BigQuery datasets you want to query.
    </ResponseField>

    <ResponseField name="keyfile_json" type="{ client_email: string; private_key: string; project_id?: string }">
      Inline service-account key JSON.
    </ResponseField>

    <ResponseField name="keyfile" type="string">
      Path to a service-account key JSON file, resolved relative to connection.yaml.
    </ResponseField>

    <ResponseField name="location" type="string">
      Default query location (e.g. US, EU, us-central1).
    </ResponseField>

    <ResponseField name="dataset" type="string">
      Dataset used when queries reference unqualified table names.
    </ResponseField>

    #### Visibility

    <ResponseField name="datasets" type="string[]" required>
      Allowlist of datasets exposed to the editor and schema browser.
    </ResponseField>

    #### Row-level security

    <ResponseField name="roles" type="{ name: string; serviceAccountEmail: string }[]">
      Service accounts the primary SA can impersonate for RLS. Each must have iam.serviceAccountTokenCreator granted to the primary SA. Default: `[]`.
    </ResponseField>
  </Tab>
</Tabs>

## Row-Level Security

Evidence supports row-level security on the BigQuery direct connector via **service account impersonation**.

### How it works

You create **one BigQuery service account per RLS role** (per customer, per tenant, per region — whatever your access boundary is) and attach BigQuery row-access policies to those service accounts.

Evidence's primary SA is granted permission to impersonate each role SA, and at query time chooses which service account to use based on the user's assigned role in Evidence.

### Setup

The example below creates two roles, `customer_a` and `customer_b`, with
filters on a `client_id` column. Substitute your own role names, table, and
filter logic.

<Steps>
  <Step title="Create a role service account per RLS role">
    ```bash theme={null}
    gcloud iam service-accounts create customer-a \
      --project=<project> \
      --display-name="Evidence RLS role: customer_a"
    ```

    Repeat for each role.
  </Step>

  <Step title="Let the primary SA impersonate each role SA">
    ```bash theme={null}
    gcloud iam service-accounts add-iam-policy-binding \
      customer-a@<project>.iam.gserviceaccount.com \
      --member="serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
      --role="roles/iam.serviceAccountTokenCreator"
    ```

    This grants the impersonation right *on the role SA as a resource* — the
    primary SA can mint tokens for this specific role SA only, not project-wide.

    Repeat for each role.
  </Step>

  <Step title="Grant each role SA BigQuery access">
    ```bash theme={null}
    # jobUser
    gcloud projects add-iam-policy-binding <project> \
      --member="serviceAccount:customer-a@<project>.iam.gserviceaccount.com" \
      --role="roles/bigquery.jobUser" \
      --condition=None
    ```

    ```bash theme={null}
    # dataViewer
    gcloud projects add-iam-policy-binding <project> \
      --member="serviceAccount:customer-a@<project>.iam.gserviceaccount.com" \
      --role="roles/bigquery.dataViewer" \
      --condition=None
    ```

    Repeat for each role.
  </Step>

  <Step title="Attach row-access policies">
    ```sql theme={null}
    CREATE OR REPLACE ROW ACCESS POLICY customer_a_filter
    ON `<project>.<dataset>.<table>`
    GRANT TO ('serviceAccount:customer-a@<project>.iam.gserviceaccount.com')
    FILTER USING (client_id = 'customer_a');
    ```

    Repeat per role. The `FILTER USING` clause is plain SQL — you can use any
    column, comparison, or join.
  </Step>

  <Step title="Decide on a fallback for unassigned users">
    Once any policy exists on a table, **every** principal querying it needs
    a matching policy or they see zero rows — including Evidence's primary SA.

    To preserve the existing behaviour for users without a role assignment
    (they see all rows), add a fallback policy:

    ```sql theme={null}
    CREATE OR REPLACE ROW ACCESS POLICY primary_fallback
    ON `<project>.<dataset>.<table>`
    GRANT TO ('serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com')
    FILTER USING (TRUE);
    ```

    Skip this step if you'd rather unassigned users see nothing. That's the
    more secure default but you'll need every viewer to have a role assigned.
  </Step>

  <Step title="Register the roles in Evidence">
    In **Settings → Access Rules → Roles**, add one row per role:

    | Service account                                | Role label   |
    | ---------------------------------------------- | ------------ |
    | `customer-a@<project>.iam.gserviceaccount.com` | `customer_a` |
    | `customer-b@<project>.iam.gserviceaccount.com` | `customer_b` |
  </Step>

  <Step title="Assign roles to users">
    In **Settings → Access Rules → Roles** and assign a role to each user.
  </Step>
</Steps>

## Security Considerations

Evidence executes queries against your BigQuery instance. You should take pragmatic steps to protect your instance against misuse, whether accidental or malicious.

1. [Scope data viewer to specific datasets](#1-scope-data-viewer-to-specific-datasets)
2. [Keep the service account scoped](#2-keep-the-service-account-scoped)
3. [Cap cost per query](#3-cap-cost-per-query)
4. [Audit logs and alerts](#4-audit-logs-and-alerts)

### 1. Scope data viewer to specific datasets

The setup above grants `bigquery.dataViewer` project-wide, which also gives read access to every other dataset in the project and to `INFORMATION_SCHEMA`. To narrow this, grant `dataViewer` per-dataset instead:

```bash theme={null}
bq add-iam-policy-binding \
  --member="serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer" \
  <project>:<dataset>
```

Then remove the project-wide grant:

```bash theme={null}
gcloud projects remove-iam-policy-binding <project> \
  --member="serviceAccount:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"
```

Repeat for each role SA you've set up for RLS.

### 2. Keep the service account scoped

The setup grants only `bigquery.jobUser` and `bigquery.dataViewer`. Don't add more — broader roles like `bigquery.user`, `bigquery.admin`, or `bigquery.connectionUser` enable behaviour Evidence doesn't need and unlock features like `EXTERNAL_QUERY` and remote functions. Audit the SA's IAM bindings with:

```bash theme={null}
gcloud projects get-iam-policy <project> \
  --flatten="bindings[].members" \
  --filter="bindings.members:evidence-bq-primary@<project>.iam.gserviceaccount.com" \
  --format="table(bindings.role)"
```

### 3. Cap cost per query

Set a project-level ceiling on bytes billed per job so a runaway query can't drain budget.

You can also set [per-query](https://cloud.google.com/bigquery/docs/best-practices-costs#limit-query-costs) `maximum_bytes_billed` on the query itself or [custom quotas](https://cloud.google.com/bigquery/docs/custom-quotas) per user.

### 4. Audit logs and alerts

Enable [BigQuery audit logs](https://cloud.google.com/bigquery/docs/reference/auditlogs) and route them to a sink you monitor. E.g. alert on:

* `EXPORT DATA` jobs (data leaving the warehouse)
* DDL or DCL statements run by Evidence's SAs (they shouldn't be running any)
* Unusually large scans by a single SA
