Connecting to Google BigQuery
BigQuery is Google's fully managed, petabyte-scale, low-cost analytics data warehouse.
Step 1: Create BigQuery credentials
To connect Silota to your BigQuery database, you'll need:
- a Service Account
- a P12 Auth File made for the service account
- your Project ID
Additionally, you'll have to make sure the Service Account has at least
1a. Create Service Account
1b. P12 Auth file
1c. Project ID
Step 2: Upload credentials to Silota
Step 3: Test that it all works
Run this sample query:
SELECT name, year, COUNT(*) FROM [bigquery-public-data:usa_names.usa_1910_2013] WHERE name IN ('Maggie', 'Bart') GROUP BY name, year;
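The sample query above uses BigQuery's legacy SQL table syntax (square brackets, with a colon after the project name). If your connection runs queries in the standard SQL dialect instead, which is an assumption about your setup and not something this guide specifies, an equivalent query might look like the sketch below. The `row_count` alias is illustrative only and does not come from the original query.

```sql
-- Standard SQL sketch of the sample query (assumes the standard SQL dialect is in use).
-- Table references use backticks and dots instead of [project:dataset.table].
SELECT name, year, COUNT(*) AS row_count  -- row_count is a hypothetical alias
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE name IN ('Maggie', 'Bart')
GROUP BY name, year;
```

If the credentials are working, either form should return one row for each name and year combination found in the table.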
Sample BigQuery Datasets
BigQuery hosts public datasets that are free to query and explore. Examples include:
- GitHub Data – a 3TB+ dataset comprising the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.
- Hacker News Data – this dataset contains all stories and comments from Hacker News from its launch in 2006. Each story contains a story ID, the author that made the post, when it was written, and the number of points the story received.
- Stack Overflow Data – this BigQuery dataset includes an archive of Stack Overflow content, including posts, votes, tags, and badges. This dataset is updated to mirror the Stack Overflow content on the Internet Archive, and is also available through the Stack Exchange Data Explorer.
- Get the full list here