Humm syncs files from your S3 bucket into your warehouse via Airbyte, then queries the synced tables through your warehouse integration. Use this connector when source data already lands in S3 and you want it available for analysis alongside your other business data.

Source Type

Sync

Humm copies matching files from S3 into your warehouse on a schedule. When you ask a question, Humm queries the synced tables rather than reading directly from the bucket.
See Choosing a Source Type for help deciding.

Supported Files

S3 data sync supports one configured stream per connector. A stream is a set of files with the same structure that should become one table. Supported file formats:
  • CSV
  • JSON Lines (.jsonl)
  • Parquet
  • Avro
Use file glob patterns, written relative to the bucket root, to choose which files belong in the stream. For example:
  • exports/usage/*.csv
  • events/**/*.jsonl
  • orders/*.parquet|archive/orders/*.parquet
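To check which object keys a pattern would pick up before configuring the connector, a rough local approximation can help. The sketch below assumes common glob semantics (`*` stays within one path segment, `**/` spans zero or more directories) and assumes the `|` in the last example separates alternative patterns. The function names `glob_to_regex` and `matches_stream` are hypothetical, and the connector's own matcher may differ in edge cases such as character classes, which this sketch treats as literal text.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate one glob into a regex (assumed semantics, not the connector's code)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")  # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append(r"[^/]")  # exactly one non-slash character
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # literal character
            i += 1
    return re.compile("".join(parts))


def matches_stream(key: str, globs: str) -> bool:
    """Return True if the object key matches any of the pipe-separated globs."""
    return any(glob_to_regex(g).fullmatch(key) for g in globs.split("|"))


if __name__ == "__main__":
    for key in ["exports/usage/jan.csv", "exports/usage/2024/jan.csv"]:
        print(key, matches_stream(key, "exports/usage/*.csv"))
```

Under these assumptions, a single `*` does not descend into subfolders, so nested exports need a `**` pattern or an additional glob.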

Credentials

AWS Access Keys

To connect S3 data sync, provide:
  • Bucket: The S3 bucket name
  • AWS Region: The bucket region, such as us-east-1
  • AWS Access Key ID: Access key for an IAM user with read access
  • AWS Secret Access Key: Secret key for that IAM user
  • Output Stream Name: The table name for the synced files
  • File Globs: One or more file patterns to sync
  • File Format: CSV, JSON Lines, Parquet, or Avro
Humm currently supports access-key authentication for this connector. IAM role authentication is not available in the S3 data sync setup.

Required AWS Permissions

Create an IAM user whose access key has read-only access to the bucket and prefixes you want to sync. Minimum permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::your-bucket-name"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::your-bucket-name/path-prefix/*"
    }
  ]
}
If your file glob reads from multiple prefixes, include each prefix in the s3:GetObject resources. If you need to sync files across the whole bucket, use arn:aws:s3:::your-bucket-name/*.
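When several prefixes are involved, generating the policy can be less error-prone than editing it by hand. The following is a minimal sketch, not an official Humm tool: `build_read_policy` is a hypothetical helper that emits the same two statements as the policy above, with one `s3:GetObject` resource per prefix, and falls back to the whole bucket when no prefixes are given.

```python
import json


def build_read_policy(bucket: str, prefixes: list[str]) -> dict:
    """Build a read-only policy document for the given bucket and prefixes."""
    if prefixes:
        # One object-level resource per prefix, e.g. arn:aws:s3:::bucket/orders/*
        resources = [f"arn:aws:s3:::{bucket}/{p.strip('/')}/*" for p in prefixes]
    else:
        # No prefixes given: allow reads across the whole bucket.
        resources = [f"arn:aws:s3:::{bucket}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": resources,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_read_policy("your-bucket-name", ["orders", "archive/orders"])
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor. The bucket name and prefixes here are placeholders.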

Sync Behavior

  • Data is refreshed on a schedule after setup
  • The connector reads files matching the configured glob patterns
  • Incremental sync relies on file modification history, so later runs pick up new or changed files
  • Humm reads the synced warehouse tables after the sync completes
  • Schema can be inferred from files, or you can provide an input schema during setup
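The incremental behavior in the list above can be pictured with a small model. This is an illustrative sketch only, assuming the sync keeps a record of each file's last-modified time and re-reads a file when it is new or has a newer timestamp. The function `files_to_sync` and both dictionaries are hypothetical and do not reflect the connector's actual state format.

```python
from datetime import datetime, timezone


def files_to_sync(
    listing: dict[str, datetime], history: dict[str, datetime]
) -> list[str]:
    """Return keys that are new, or modified since they were last synced.

    listing: current bucket contents, object key -> last-modified time
    history: previously synced files, object key -> last-modified time seen
    """
    return sorted(
        key
        for key, modified in listing.items()
        if key not in history or modified > history[key]
    )


if __name__ == "__main__":
    jan1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    jan2 = datetime(2024, 1, 2, tzinfo=timezone.utc)
    listing = {"orders/a.parquet": jan1, "orders/b.parquet": jan2}
    history = {"orders/a.parquet": jan1, "orders/b.parquet": jan1}
    print(files_to_sync(listing, history))
```

In this model, an unchanged file is skipped on later runs, while overwriting a file in place causes it to be read again.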

Best For

  • Product or usage exports already written to S3
  • Batch exports from internal systems
  • Partner or vendor files delivered to your bucket
  • Joining file-based operational data with warehouse, CRM, billing, and support data