Getting Started

Bulk CSV Downloads provides our entire data catalog as compressed CSV flat files, delivered through AWS S3. The full export is approximately 1 TB compressed, and S3 gives you high-throughput parallel downloads to efficiently backfill your systems with our data.

We will share access to the export using AWS Security Token Service (STS), which grants you a temporary, limited-privilege credential to access the files in S3.

Prerequisites

Please send us:

  • AWS Account ID

  • Confirmation that your IAM user or role is allowed to call sts:AssumeRole

Note: You must use an IAM user or role when downloading files. The AWS account root user will not work, because AWS does not allow the root user to assume roles.

We will then send you the following values, which you use to assume the role and access the data:

  • RoleArn

  • ExternalId

First install the AWS CLI.

In ~/.aws/config, add a gridstatus profile. Using a named profile will allow the CLI to handle credential refresh automatically.

[profile gridstatus]
role_arn = <RoleArn>
external_id = <ExternalId>
source_profile = default

s3 =
  max_concurrent_requests = 20
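
A typo in the profile tends to surface later as an opaque access error, so it can help to check the file before running any S3 command. The following is a minimal sketch using Python's standard configparser module. The function name missing_profile_keys is hypothetical, and the AWS CLI's own parser may treat edge cases differently, so treat this as a sanity check rather than a validator.

```python
import configparser
from typing import List

# Keys the gridstatus profile shown above is expected to define.
REQUIRED_KEYS = ("role_arn", "external_id", "source_profile")


def missing_profile_keys(config_text: str, profile: str = "gridstatus") -> List[str]:
    """Return the required keys that are absent or empty in the named profile.

    Hypothetical helper: it only checks that the keys exist, not that the
    values are valid or that the role can actually be assumed.
    """
    # Disable interpolation so a literal '%' in a value cannot raise an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)

    # Named profiles in ~/.aws/config use the section header "[profile <name>]".
    section = f"profile {profile}"
    if not parser.has_section(section):
        return list(REQUIRED_KEYS)

    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(section, key, fallback="").strip()
    ]
```

To use it, read the contents of ~/.aws/config into a string and pass it to missing_profile_keys. An empty list means all three keys are present.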

Verify it works by listing the available datasets with the gridstatus profile, for example with aws s3 ls and the --profile gridstatus option.

If you see dataset folders listed, your credentials are working. See Example Usage for more commands.
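
The same check can be scripted. The sketch below assumes boto3 is installed and that the gridstatus profile exists in ~/.aws/config. The bucket name is not stated on this page, so it is left as a required argument, and the function names are hypothetical.

```python
from typing import Any, Dict, List


def top_level_prefixes(page: Dict[str, Any]) -> List[str]:
    """Extract folder-like prefixes from one ListObjectsV2 response page."""
    return [entry["Prefix"] for entry in page.get("CommonPrefixes", [])]


def list_datasets(bucket: str, profile: str = "gridstatus") -> List[str]:
    """List top-level "folders" in the bucket using the named profile.

    The bucket name is an assumption supplied by the caller; use the one
    shared with you along with RoleArn and ExternalId.
    """
    import boto3  # imported lazily so the helper above has no dependencies

    # The named profile lets botocore assume the role and refresh credentials.
    session = boto3.Session(profile_name=profile)
    s3 = session.client("s3")

    # Delimiter="/" groups keys by their first path segment.
    paginator = s3.get_paginator("list_objects_v2")
    prefixes: List[str] = []
    for page in paginator.paginate(Bucket=bucket, Delimiter="/"):
        prefixes.extend(top_level_prefixes(page))
    return prefixes
```

If list_datasets returns a non-empty list of prefixes, the role assumption and S3 permissions are working.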

Other Options

  • Python with s3fs - Use S3FileSystem with assume_role_arn and assume_role_kwargs to download files.

  • Python with boto3 - Use RefreshableCredentials via STS AssumeRole to list and download objects with concurrent transfers.
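
As an illustration of the boto3 option, here is a rough sketch of wiring STS AssumeRole into botocore's RefreshableCredentials so that long downloads survive credential expiry. It assumes boto3 is installed and that default credentials for your own IAM user or role are available. The function names and the session name are hypothetical, and assigning to the botocore session's private _credentials attribute is a widely used but unofficial pattern.

```python
from datetime import datetime
from typing import Any, Dict


def to_refresh_metadata(credentials: Dict[str, Any]) -> Dict[str, str]:
    """Convert the Credentials part of an AssumeRole response into the
    metadata dict that RefreshableCredentials.create_from_metadata expects."""
    expiration = credentials["Expiration"]
    if isinstance(expiration, datetime):
        # boto3 returns a datetime; botocore accepts an ISO 8601 string.
        expiration = expiration.isoformat()
    return {
        "access_key": credentials["AccessKeyId"],
        "secret_key": credentials["SecretAccessKey"],
        "token": credentials["SessionToken"],
        "expiry_time": str(expiration),
    }


def make_refreshing_session(role_arn: str, external_id: str):
    """Return a boto3 Session whose credentials re-assume the role when needed."""
    import boto3  # imported lazily so the helper above has no dependencies
    from botocore.credentials import RefreshableCredentials
    from botocore.session import get_session

    sts = boto3.client("sts")

    def refresh() -> Dict[str, str]:
        response = sts.assume_role(
            RoleArn=role_arn,
            ExternalId=external_id,
            RoleSessionName="gridstatus-bulk-download",  # hypothetical name
        )
        return to_refresh_metadata(response["Credentials"])

    credentials = RefreshableCredentials.create_from_metadata(
        metadata=refresh(),
        refresh_using=refresh,
        method="sts-assume-role",
    )
    botocore_session = get_session()
    botocore_session._credentials = credentials  # private attribute, unofficial
    return boto3.Session(botocore_session=botocore_session)
```

A client created with make_refreshing_session(role_arn, external_id).client("s3") can then list and download objects. For concurrent transfers, download_file accepts a Config argument built from boto3.s3.transfer.TransferConfig, for example with max_concurrency set to match the CLI setting above.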
