Streamlining Cloud Asset Management: Automating the Export of Google Cloud Organization Asset Inventory to BigQuery and Visualization with Looker

--

In an increasingly cloud-centric IT landscape, maintaining a comprehensive and up-to-date view of your cloud assets is not just beneficial; it’s essential for effective governance, security, compliance, and cost management. Google Cloud Platform (GCP) offers a suite of services that, when combined, provide a powerful solution for automated asset management. However, to leverage asset inventory data for analytics, audits, or compliance, it helps to export it to a service like BigQuery.

This article outlines a complete solution for automatically exporting your Google Cloud Organization’s Asset Inventory to BigQuery for storage and analysis, followed by leveraging Looker for intuitive, powerful dashboarding and data visualization.

Introduction

The integration between Google Cloud’s Asset Inventory and BigQuery enables organizations to perform comprehensive analyses of their cloud assets over time. By setting up an automated pipeline to export asset data into BigQuery, organizations can enhance visibility, improve security posture, and make informed decisions based on historical data trends. This setup involves a few key GCP services and requires minimal ongoing management once configured.

Prerequisites

Before starting, ensure you have the following:

  • An active Google Cloud Platform (GCP) account.
  • Permissions to manage IAM roles, Cloud Functions, Cloud Scheduler jobs, and BigQuery datasets within your GCP project.
  • A ServiceAccount (e.g., ‘asset-inventory-exporter’) that will be used by our Cloud Function to interact with the Cloud Asset Inventory API and BigQuery securely.

Implementation Overview

1. Architecture

Here’s the sample pipeline that we will be implementing throughout this guide:

Export Asset Inventory Pipeline

2. Enable Required GCP Services

Activate the necessary APIs in your GCP Project:

  • Cloud Asset API
  • Cloud Functions API
  • Cloud Scheduler API
  • BigQuery API

These services form the backbone of our automation and visualization pipeline.

3. Prepare the BigQuery Dataset

BigQuery serves as the repository for your exported asset data. Create a dataset by:

  1. Navigating to the BigQuery section of the GCP Console.
  2. Clicking on “Create Dataset” and specifying your preferences.
Cloud Asset Inventory — BQ Dataset

After configuring your preferences, your dataset is now ready to receive data from the Google Cloud Asset Inventory export process.
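If you prefer to script this step instead of using the Console, the dataset can also be created with the BigQuery Python client. The snippet below is a minimal sketch; the project ID, dataset ID, and location are placeholders you would replace with your own values.

from google.cloud import bigquery

# Placeholders: replace with your own project and dataset IDs.
client = bigquery.Client(project="your-project-id")
dataset = bigquery.Dataset("your-project-id.asset_inventory")
dataset.location = "US"  # pick the location that matches your data-residency requirements

# exists_ok=True makes the call idempotent if the dataset was already created.
dataset = client.create_dataset(dataset, exists_ok=True)
print(f"Dataset {dataset.dataset_id} is ready to receive asset exports")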

Here are some best practices for BigQuery Dataset Configuration:

  • Access Control: After creating your dataset, consider configuring access control settings to manage who can view, query, and modify the dataset and tables. Use BigQuery IAM roles like ‘bigquery.dataViewer’, ‘bigquery.dataEditor’, and ‘bigquery.dataOwner’ to grant appropriate levels of access to different users or ServiceAccounts, including the one used by Looker BI/Studio for analytics.
  • Cost Management: Be mindful of the queries executed against your dataset. BigQuery charges for data processed by queries. Designing efficient queries and managing access carefully can help control costs.
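One simple way to keep an eye on query costs is a dry run, which reports how many bytes a query would scan without actually executing it. The sketch below assumes the placeholder project and dataset names used above:

from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")  # placeholder project ID
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

# A dry run returns immediately with the number of bytes the query would process.
query = """
    SELECT asset_type, COUNT(*) AS asset_count
    FROM `your-project-id.asset_inventory.*`
    GROUP BY asset_type
"""
job = client.query(query, job_config=job_config)
print(f"This query would process {job.total_bytes_processed / 1e9:.2f} GB")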

4. Configure IAM Roles & Permissions

A clear understanding of the required IAM roles and permissions is essential. These roles and permissions ensure that each component of the pipeline has the necessary access to perform its tasks without compromising security.

At Organization Level:

  1. Cloud Asset Viewer (‘roles/cloudasset.viewer’): This role should be assigned to the ServiceAccount used by the Cloud Function. It allows the Cloud Function to access and export the Asset Inventory data. The role provides read-only access to Cloud Asset Inventory, which is sufficient for exporting data.
  2. Organization Viewer (‘roles/resourcemanager.organizationViewer’): This role might also be necessary for the ServiceAccount if you plan to export assets from the entire organization, as it allows the account to view the organization’s resources and structure.
SA IAM Permissions at GCP Org Level

At Service Project Level:

  1. Cloud Functions Developer (‘roles/cloudfunctions.developer’): This role enables the deployment and invocation of Cloud Functions. It’s necessary for the ServiceAccount or the user who deploys the Cloud Function. This role allows for creating, updating, and deleting Cloud Functions within the project.
  2. Pub/Sub Publisher (‘roles/pubsub.publisher’): This role should be assigned to the Cloud Scheduler job. It allows Cloud Scheduler to publish messages to the specified Pub/Sub topic, which in turn triggers the Cloud Function.

At the BigQuery Dataset Level:

  1. BigQuery Data Editor (‘roles/bigquery.dataEditor’): This role should be assigned to the ServiceAccount used by the Cloud Function so that it can insert data into the BigQuery dataset. It allows for creating, updating, and deleting tables and data within the dataset.
  2. BigQuery User (‘roles/bigquery.user’): This role is necessary for users who need to run queries and load data into BigQuery but do not need to modify or delete tables.
SA IAM Permissions at Service Project
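These roles can be granted through the Console or gcloud, or programmatically. As an illustration, here is a minimal sketch of adding the Cloud Asset Viewer binding at the organization level with the Resource Manager Python client; the organization ID, project ID, and ServiceAccount email are placeholders.

from google.cloud import resourcemanager_v3
from google.iam.v1 import iam_policy_pb2, policy_pb2

org_client = resourcemanager_v3.OrganizationsClient()
resource = "organizations/123456789012"  # placeholder organization ID
member = "serviceAccount:asset-inventory-exporter@your-project-id.iam.gserviceaccount.com"  # placeholder

# Read the current org-level IAM policy, append the Cloud Asset Viewer binding, write it back.
policy = org_client.get_iam_policy(
    request=iam_policy_pb2.GetIamPolicyRequest(resource=resource)
)
policy.bindings.append(
    policy_pb2.Binding(role="roles/cloudasset.viewer", members=[member])
)
org_client.set_iam_policy(
    request=iam_policy_pb2.SetIamPolicyRequest(resource=resource, policy=policy)
)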

5. Develop and Deploy the Cloud Function

Now, we’ll create a Cloud Function to export our asset inventory data to BigQuery. We will develop with these specifications:

  • Runtime: Python 3.12 (or any supported language runtime).
  • Trigger: Cloud Pub/Sub, with a topic (e.g., ‘asset-export-trigger’).
  • Functionality: Utilizes Google Cloud Client Libraries to fetch Asset Inventory for the Organization and insert it into BigQuery.

Here’s the sample Python Function:

  • Add the code to your ‘main.py’ file
  • Modify ‘organization_id’, ‘project_id’, and ‘dataset_id’ as needed
from google.cloud import asset_v1
import datetime


def export_assets_to_bigquery(event, context):
    # Use the correct Google Cloud Project ID and BigQuery Dataset ID
    # Construct the BigQuery destination path (no 'bq://' prefix)
    bigquery_destination = "projects/<<project_id>>/datasets/<<dataset_id>>"

    # Generate a unique table name based on the current UTC timestamp
    table_id = f'asset_inventory_{datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S")}'

    # Cloud Asset API client initialization
    asset_client = asset_v1.AssetServiceClient()

    # Prepare the output configuration for exporting assets to BigQuery
    output_config = asset_v1.OutputConfig(
        bigquery_destination=asset_v1.BigQueryDestination(
            dataset=bigquery_destination,
            table=table_id,
            force=True,  # Overwrites the table if it exists
        )
    )

    # Formulate the request to export assets
    request = asset_v1.ExportAssetsRequest(
        parent="organizations/<<organization_id>>",  # Replace with your organization's ID
        content_type=asset_v1.ContentType.RESOURCE,
        output_config=output_config,
    )

    # Export the assets and wait for the long-running operation to complete
    try:
        operation = asset_client.export_assets(request=request)
        operation.result()  # Waiting for the operation to complete
        print(f"Successfully exported assets to BigQuery table: {table_id}")
    except Exception as e:
        print(f"Failed to export assets: {e}")

Note: Do not use the ‘bq://’ prefix for the API request

Error Handling: I added a basic try-except block around the export operation to catch and log any exceptions that may occur during the export process. This helps in debugging potential issues with the export operation.

Before deploying the Cloud Function, ensure its dependency on the Google Cloud client library is declared in the ‘requirements.txt’ file:

Library Definition for Asset Inventory API call
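A minimal ‘requirements.txt’ for the function above could look like this (pin a specific version if your release process requires it):

# Cloud Asset Inventory client library used by main.py
google-cloud-asset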
Once deployed, this Cloud Function should export assets to the specified BigQuery dataset each time it is triggered.
Cloud Function Configuration

This function triggers the export process, creating a new, timestamped table in your BigQuery dataset on each execution.

Test and Monitor

After deploying, test your Cloud Function by publishing a message to your Pub/Sub topic. Monitor the execution in the Cloud Functions logs to ensure it completes successfully and that data appears in your BigQuery dataset.
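You can publish the test message from the Console, or with a short script like the following sketch (the project and topic names are the placeholders used throughout this guide):

from google.cloud import pubsub_v1

# Placeholders: replace with your project ID and the trigger topic name.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-project-id", "asset-export-trigger")

# The function ignores the message body, so any payload will trigger an export.
future = publisher.publish(topic_path, b"manual test trigger")
print(f"Published test message with ID: {future.result()}")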

Here are sample Cloud Function logs showing successful calls to the Cloud Asset API, along with the BigQuery dataset and the tables created after a successful run of the Export Asset Inventory Cloud Function.

Cloud Function Logs
BigQuery Dataset with Inventory Tables

6. Schedule Daily Exports with Cloud Scheduler

To automate the daily execution of our Cloud Function, we use Cloud Scheduler. Before scheduling a job, make sure the Pub/Sub topic already exists in the service project.

  1. In the Cloud Console, go to the Cloud Scheduler page and click “Create Job”.
  2. Set a name, frequency (e.g., ‘0 0 * * *’ for daily at midnight), and the target as Pub/Sub.
  3. Choose the Pub/Sub topic linked to your Cloud Function and configure the payload as shown below.

  4. Click “Create” to schedule your job.

Cloud Scheduler Configuration
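If you would rather create the job programmatically than through the Console, the google-cloud-scheduler client offers an equivalent path. The sketch below is illustrative; the project ID, region, job name, and topic are placeholders:

from google.cloud import scheduler_v1

client = scheduler_v1.CloudSchedulerClient()
parent = "projects/your-project-id/locations/us-central1"  # placeholder project and region

job = scheduler_v1.Job(
    name=f"{parent}/jobs/daily-asset-export",
    schedule="0 0 * * *",  # every day at midnight
    time_zone="Etc/UTC",
    pubsub_target=scheduler_v1.PubsubTarget(
        topic_name="projects/your-project-id/topics/asset-export-trigger",
        data=b"scheduled asset export",
    ),
)

response = client.create_job(parent=parent, job=job)
print(f"Created scheduler job: {response.name}")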

7. Monitoring and Validation

After setting up the Cloud Scheduler job, it’s crucial to monitor the Cloud Function’s executions and BigQuery to ensure data is being exported correctly. You can use Cloud Logging to monitor the Cloud Function and set up alerts for any errors or issues encountered during execution, and use BigQuery’s monitoring features to keep track of the exported data.

8. Configuring BigQuery in Looker Studio

  • In Looker Studio, add BigQuery as a data source by specifying the project and dataset where your inventory data is stored.
Adding BQ as a Data Source and selecting the BQ Table
  • Once connected, you can create SQL-based queries or use Looker’s visual builder to design your reports/dashboards based on the Asset Inventory data in BigQuery.

9. Dashboard Creation

Utilize Looker’s suite of visualization tools to create dashboards that provide insights into your cloud asset inventory. These dashboards can include various metrics and visualizations, such as asset count over time, distribution of assets by type or location, and any other analytics that are relevant to your organization’s needs.

Sample Looker Studio Dashboard with Exported Inventory Data
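The aggregations behind such charts can also be prototyped directly against an exported table before wiring them into Looker Studio. The sketch below counts assets by type; the table name is a placeholder, since each export creates a new timestamped table:

from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")  # placeholder project ID

# Count exported assets by type in one export table (table name is a placeholder).
query = """
    SELECT asset_type, COUNT(*) AS asset_count
    FROM `your-project-id.asset_inventory.asset_inventory_20240101_000000`
    GROUP BY asset_type
    ORDER BY asset_count DESC
"""
for row in client.query(query).result():
    print(f"{row.asset_type}: {row.asset_count}")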

Conclusion

Automating the export of Google Cloud Asset Inventory into BigQuery simplifies the management and analytics of cloud resources. By following the steps outlined in this guide, organizations can enhance their cloud governance and analytics capabilities, leveraging the power of GCP’s serverless solutions to maintain an up-to-date view of their cloud assets in a highly scalable and efficient manner.


Written by Aravind Kumar Ramaiah Kothandaramaiah

Google Cloud Architect & Data Engineer with 16 yrs of IT experience.
