Docker Hub connector
Set up the Docker Hub connector in Kaivo: authentication, configuration, the one BigQuery table it syncs, and answers to common questions.
Written By Lauri Raivio
Last updated About 1 hour ago
Kaivo is a fully managed data platform that syncs your Docker Hub data into a Google BigQuery warehouse and keeps it up to date automatically. There is no pipeline to build and no infrastructure to run, so you can spend your time analysing your data from Docker Hub instead of moving it.
What is the Docker Hub connector?
The Docker Hub connector syncs your Docker Hub data into BigQuery with Kaivo, so you can track image pulls and repository activity.
Getting started with the Docker Hub connector
- Sign up for Kaivo and create a workspace.
- Connect your Docker Hub account.
- Choose which tables to sync.
- Wait for the initial sync to finish.
- Query your data in BigQuery or your favourite AI or BI tool.
Authenticating Docker Hub
Follow the steps below to connect your Docker Hub account.
Configuring the Docker Hub connector
When you set up the connector, you provide:
Tables and columns synced from Docker Hub
Kaivo syncs one table from Docker Hub into a dedicated dataset in your BigQuery warehouse.
How the Docker Hub sync works
After the first load, Kaivo keeps your BigQuery warehouse up to date for you. Where Docker Hub supports it, each sync pulls only new and changed records, which keeps it fast; otherwise it refreshes the whole table. Every record keeps its original ID, so a changed record updates its existing row and you won't get duplicate rows.
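The two sync behaviours described above can be sketched in a few lines of Python. This is an illustration only, not Kaivo's implementation: the function names, the `id` key, the `pull_count` field, and the sample records are all assumptions made for the example.

```python
from typing import Dict, Iterable

Record = Dict[str, object]


def incremental_sync(table: Dict[str, Record], changed: Iterable[Record]) -> Dict[str, Record]:
    """Upsert new and changed records into the table, keyed by record ID."""
    merged = dict(table)  # copy so the caller's table is not mutated
    for record in changed:
        # An existing ID overwrites its row; an unseen ID adds a new row.
        merged[str(record["id"])] = record
    return merged


def full_refresh(source: Iterable[Record]) -> Dict[str, Record]:
    """Rebuild the table from scratch so it matches the source exactly."""
    return {str(record["id"]): record for record in source}


# Hypothetical sample data, not real Docker Hub output.
warehouse = {"repo-1": {"id": "repo-1", "pull_count": 100}}
changes = [
    {"id": "repo-1", "pull_count": 140},  # changed record
    {"id": "repo-2", "pull_count": 5},    # new record
]
warehouse = incremental_sync(warehouse, changes)
print(len(warehouse))                     # 2
print(warehouse["repo-1"]["pull_count"])  # 140
```

Because rows are keyed by ID, syncing the same change twice leaves the table unchanged, which is why repeated syncs do not produce duplicates in this model.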
Frequently asked questions
How long does the initial sync take for Docker Hub?
It depends on how much history is in your Docker Hub account. Most initial syncs finish within minutes, while large accounts can take a few hours. After that, syncs only fetch new and changed records, so they're much faster.
Can I sync only some tables or columns?
Yes. You pick which tables to sync when you set up the connection and can change the selection later. Tables you don't select are never copied to your warehouse.
What happens when Docker Hub's schema changes?
New fields are never added automatically. You choose which fields to sync, so data you haven't selected (sensitive personal data, for example) never lands in your warehouse. When a new field appears, it becomes available for you to add. What happens to removed or renamed fields depends on a table's sync mode: full-refresh tables always match what's currently in Docker Hub, so dropped fields disappear, while incremental tables keep their existing columns and history, so an old field stays and newly added fields fill in over time.
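As a rough model of the rules in this answer, the sketch below computes which columns a table would have after a sync. It is a simplification under stated assumptions: the mode names `full_refresh` and `incremental`, the function name, and the field names (`legacy_flag`, `star_count`) are hypothetical and do not describe Kaivo's internals or the actual Docker Hub schema.

```python
from typing import List, Set


def columns_after_sync(
    current_columns: List[str],
    source_fields: List[str],
    selected_fields: Set[str],
    mode: str,
) -> List[str]:
    """Return the table's columns after a sync, given the sync mode."""
    # Only fields the user has selected are ever synced.
    wanted = [f for f in source_fields if f in selected_fields]
    if mode == "full_refresh":
        # The table mirrors the source, so fields removed upstream disappear.
        return wanted
    if mode == "incremental":
        # Existing columns and their history stay; selected new fields are appended.
        return current_columns + [f for f in wanted if f not in current_columns]
    raise ValueError(f"unknown sync mode: {mode}")


current = ["id", "name", "legacy_flag"]   # legacy_flag was removed upstream
source = ["id", "name", "star_count"]     # star_count is new upstream

# New field not selected yet: it never lands in the warehouse.
print(columns_after_sync(current, source, {"id", "name"}, "full_refresh"))
# ['id', 'name']
print(columns_after_sync(current, source, {"id", "name"}, "incremental"))
# ['id', 'name', 'legacy_flag']

# After the user opts in to the new field:
print(columns_after_sync(current, source, {"id", "name", "star_count"}, "incremental"))
# ['id', 'name', 'legacy_flag', 'star_count']
```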
How do I handle GDPR or data deletion requests?
Your data lives in your own Kaivo-managed BigQuery warehouse, so the most direct option is to delete or anonymise specific records right in BigQuery. If you delete data in Docker Hub instead, full-refresh tables drop it on the next sync, while incremental tables keep it, so you would remove the row in BigQuery or ask us to run a full refresh. To remove everything, delete the Docker Hub connector in Kaivo and all of its synced data is deleted with it.
Common use cases for Docker Hub data
Pull tracking
Use Docker Hub data to monitor image pulls and repository usage over time.
Repository view
Report on your Docker Hub repositories alongside your other data.
Adoption trends
Track how usage of your published images changes over time.
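For the pull-tracking and adoption-trend use cases, one common pattern is to turn a running pull total into pulls gained per period. The sketch below assumes you have dated snapshots of a cumulative pull count for a repository; the snapshot shape and the sample numbers are hypothetical, so check the synced table's actual columns before adapting it.

```python
from typing import Dict, List, Tuple

Snapshot = Tuple[str, int]  # (ISO date, cumulative pull count)


def pulls_per_period(snapshots: List[Snapshot]) -> Dict[str, int]:
    """Convert cumulative pull counts into pulls gained between consecutive snapshots."""
    ordered = sorted(snapshots)  # ISO dates sort chronologically as strings
    deltas: Dict[str, int] = {}
    for (_, previous), (day, current) in zip(ordered, ordered[1:]):
        # Clamp at zero in case a counter is ever reset or corrected.
        deltas[day] = max(current - previous, 0)
    return deltas


# Hypothetical snapshots, deliberately out of order.
snapshots = [("2024-01-03", 1500), ("2024-01-01", 1000), ("2024-01-02", 1200)]
print(pulls_per_period(snapshots))
# {'2024-01-02': 200, '2024-01-03': 300}
```

The first snapshot has no earlier value to compare against, so it produces no delta; the same calculation could equally be written as a window function in BigQuery SQL.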
Use Docker Hub data in your AI and BI tools
Once Docker Hub data lands in your Kaivo-managed BigQuery warehouse, you can explore it with AI tools or any BI tool that connects to BigQuery. Here's how the most common destinations work with Docker Hub data.
Claude
Use Kaivo's MCP server to give Claude secure, workspace-scoped access to your data. Setup guide
Power BI
Microsoft's BI tool with a native BigQuery connector. Supports direct query and scheduled refresh. Setup guide
Data Studio
Free Google BI tool with native BigQuery support. One-click connection to your Kaivo warehouse; great for SMB teams on Google Workspace. Setup guide
Tableau
The premium analytics standard, with native BigQuery integration. Setup guide
Google Sheets
Use Connected Sheets to query BigQuery directly from a spreadsheet, with no SQL. Setup guide
Excel
Connect via Power Query's BigQuery connector. Setup guide
Metabase
Open-source BI tool with strong BigQuery support. Setup guide
See our pricing page for Docker Hub connector pricing and plan details.