Integrating Firebase with Google Big Query for Realtime Mobile App Analysis

Introduction

Well-designed mobile and web apps incorporate platforms such as Google Analytics, Amplitude, Mixpanel and similar services. These services are important tools for achieving one or more tactical objectives:

  • Tracking unique users, launch frequency, clickstream behavior and feature usage helps in planning which features to invest in and which to retire.
  • Understanding user geographic location, language preference and mobile OS usage can help guide application architecture and cloud service configuration decisions.
  • Crash reports and in-app performance metrics help monitor software quality and resolve emerging quality issues.

These objectives are all important, but these data often live in disconnected silos inaccessible to key stakeholders, and may exclude qualitative information that could be used to develop better customer relationships.

Solution Business Objective

Customer and application usage data has many stakeholders: customer service, sales, marketing, data science, and so on. Wouldn't it be ideal if we could bring together information about how users experience our applications in one data source all these stakeholders could easily access?

The remainder of this article presents an architectural pattern to bring together various data streams relating to mobile applications into a central repository capable of supporting a range of business decisions, with a specific focus on capturing in-app data streams often missed by app analytics platforms.

At the end of the article:

  • We'll have a cloud data warehouse, and the ability to stream any data created by a mobile app into the data warehouse using just a few lines of application code.
  • We'll have set the table to integrate additional data sources--such as the aforementioned Google Analytics and other clickstream-based analytical tools--into the same data warehouse.

Solution Architecture

This solution architecture provides an integrated data warehouse for app data using the following cloud-native features offered by Google Cloud Platform:

  • Google Big Query (BQ) - a scalable cloud data warehouse. BQ can be accessed by a range of front-end tools--from spreadsheets to data science platforms. Our overarching objective will be to land as much app-generated data in BQ as possible to support the widest range of decisions from a single, easily accessed data source.
  • Google Firebase - a mobile backend as a service (MBAAS) platform geared toward supporting authentication and data storage for mobile and web apps. Developers use Firebase (or another MBAAS) to accelerate development. Often, mobile apps are already using Firebase, and we can leverage its more advanced features as part of a data consolidation strategy.
  • Firebase Cloud Functions - serverless web APIs used to glue together application layers. We'll use functions to stream realtime data from the application layer back to the data analytics layer.
  • Google Sheets, Power BI, Tableau, R Studio, Microsoft Excel, et al. Front-end applications that can easily access Google Big Query as a data source. Business users can use the applications they already know to access data we consolidate via this architecture.

The completed solution will take the shape illustrated in this diagram:

The Pattern is Portable (e.g. AWS and Azure)

While the architecture in this article focuses on using Google Cloud Platform (GCP) to provide the building blocks for the solution, the pattern can be applied with other ecosystems.

In short, if the pattern is a good fit for the scenario, any of the major cloud platforms has the building blocks to assemble this solution.

Software Stack

Google Cloud Platform (GCP) organizes applications into projects. Firebase also organizes applications into projects. How do we manage these two projects?

In this case, we'll have one unified GCP/Firebase project.

Firebase is basically a mobile/web app developer-friendly skin over GCP. Firebase Cloud Storage, Firebase Cloud Functions and most other Firebase services are GCP resources under the covers. It's not wrong to think of Firebase as an easier-to-use "GCP Lite" for mobile developers who need only a narrow slice of everything GCP has to offer. Firebase has a well-earned reputation for having a very gentle learning curve for mobile developers, but the full range of GCP services is always accessible underneath when it's needed.

In this solution architecture, we'll use GCP as the base of the cloud architecture (where Big Query lives), and Firebase as the middle layer (authentication, app data storage). The top layer is the app itself, which connects to Firebase directly, and to GCP indirectly.

Firebase Function

In this article we will create a Firebase Cloud Function to allow mobile apps to send qualitative information directly to the data warehouse (Google Big Query). This function is the glue that ties the mobile front-end to the data analytics backend.

Firebase Cloud Function Graphic

What's a cloud function?

A cloud function is essentially a block of code, exposed as a web API endpoint, that we can deploy without provisioning a server instance for it to run on. This can have several advantages over conventional API deployment:

  • Nearly infinite scalability. The function is running on a server somewhere, but it's not a server we have to provision, fund or administer. GCP will dynamically install our function (and its dependencies) on its own Node.js server(s)--as many as it needs to maintain the function's SLA.
  • Cost savings (usually). Rather than paying to maintain fixed server capacity to host a cloud function, we pay for function invocations. If we have no invocations in the middle of the night, we pay nothing. If we need 10,000 invocations per second for one hour during the Super Bowl, GCP scales the function up to that level--then back down when demand goes down.

Cloud functions are basically "fee-per-use" cloud web servers.

While many (probably most) mobile apps using Firebase don't leverage cloud functions, calling one from a mobile app takes just a few lines of code. The complexity of transmitting data to GCP is encapsulated in the cloud function, so the mobile developers aren't burdened by that complexity at all.

Implementation

Step 1 - Create the Cloud Project

The easiest way to build this architecture is to create the GCP project first, and then create the Firebase project--while logged in with the same Google ID.

Docs: creating a GCP Project

When the Firebase project is created (after the GCP project), Firebase sees that a GCP user is creating a Firebase project, and offers to create the Firebase project as an addition to an existing GCP project--we want this. The GCP/Firebase unified project makes it easy to leverage GCP data resources from Firebase.

Docs: Adding Firebase to an Existing Google Cloud Project

Step 2 - Connect the Mobile App to Firebase

I won't go into detail on connecting a mobile (iOS or Android) app to a Firebase project, since the Google documentation on this topic is excellent and covers the setup better than I could. This architecture builds on mobile apps already set up with Firebase.

Docs: Adding Firebase to an iOS App

Docs: Adding Firebase to an Android App

At the end of this step, the core platform projects are created, and the front-end applications are linked to the backend.

Step 3 - Create a Google Big Query Data Set and Table

Google Big Query (BQ) is a serverless, highly scalable database technology geared toward SQL-based data analysis. There are no servers or clusters to provision, and it scales itself according to the data added to it.

To support the initial objectives of this architecture, we create a Big Query Data Set containing a single Big Query Table, where the Firebase function will write application events.

We merely need to make a note of the names of the data set and table that the Firebase Cloud Function will write to, and be sure to use the same column names and data types in the cloud function as were defined in the Big Query table.
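The data set and table can be created in the BQ web console, but they can also be created from code. The following is a minimal sketch using the @google-cloud/bigquery Node.js client. The column types and the helper name ensureFeedbackTable are assumptions made for illustration; only the column names and the data set and table names come from the examples in this article.

```javascript
// Names mirror the data set and table used by the cloud function in Step 4.
const DATASET_NAME = "app_dataset";
const TABLE_NAME = "user_feedback_table";

// Column names the cloud function must match exactly.
// The types and modes are assumptions for illustration.
const FEEDBACK_SCHEMA = [
  {name: "timestamp", type: "TIMESTAMP", mode: "REQUIRED"},
  {name: "subject", type: "STRING"},
  {name: "userFeedback", type: "STRING"},
  {name: "starRating", type: "INTEGER"},
];

// Hypothetical helper: creates the data set and table if they are missing.
// `bigquery` is a client from the @google-cloud/bigquery package, e.g.:
//   const {BigQuery} = require("@google-cloud/bigquery");
//   ensureFeedbackTable(new BigQuery());
async function ensureFeedbackTable(bigquery) {
  const dataset = bigquery.dataset(DATASET_NAME);
  const [datasetExists] = await dataset.exists();
  if (!datasetExists) {
    await bigquery.createDataset(DATASET_NAME);
  }

  const table = dataset.table(TABLE_NAME);
  const [tableExists] = await table.exists();
  if (!tableExists) {
    await dataset.createTable(TABLE_NAME, {schema: {fields: FEEDBACK_SCHEMA}});
  }
  return table;
}
```

Keeping the schema in one constant gives the cloud function and the table definition a single place to agree on column names.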

Step 4 - Create a Firebase Cloud Function

To illustrate the potential to integrate the Firebase mobile backend layer with the data warehouse (Big Query), we'll add a function to Firebase that the app can call any time it should stream some event or user data directly to the data warehouse.

Note: in this architecture, we're using Google's @google-cloud/bigquery Node.js package to add events to Big Query. The insert call used below streams rows into the table as they arrive, which is priced differently from batch loading. There are multiple ways to add data to Big Query, and the data ingestion costs will be different depending on which technique is used. Refer to the Big Query Data Ingestion Pricing Documentation to select the most appropriate data ingestion technology for your workload.

Create the Function

Firebase Cloud Functions are authored by creating a local Node.js development environment using the Firebase CLI, and then publishing the function(s) to the Firebase project using the CLI when they're finished.

This process is quite similar to publishing a Node.js web site to a web server, and most API developers will find the process familiar and straightforward.

The following is a (very simplified!) cloud function that accepts data from a mobile app, and writes the content to Big Query.

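const functions = require("firebase-functions");
const {BigQuery} = require("@google-cloud/bigquery");

// Create the client once so warm function instances can reuse it.
const bigquery = new BigQuery();

// These names must match the data set and table created in Step 3.
const datasetName = "app_dataset";
const tableName = "user_feedback_table";

exports.saveDataToGoogleBigQuery = functions.https.onCall((data, context) => {
  const timestamp = new Date();
  const table = bigquery.dataset(datasetName).table(tableName);

  // With {raw: true} the row is sent as-is, so it is wrapped in "json".
  const row = {
    json: {
      timestamp: timestamp,
      subject: data.subject,
      userFeedback: data.userFeedback,
      starRating: data.starRating,
    },
  };

  return table
      .insert(row, {raw: true})
      .then(() => ({success: true}))
      .catch((err) => {
        // Log the details and report a failure to the caller, rather than
        // returning the error object as if the call had succeeded.
        console.error("Big Query insert failed", err);
        throw new functions.https.HttpsError("internal", "Unable to save data.");
      });
});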

Once deployed, the function is available to call via the Firebase API.
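The simplified function above trusts whatever the app sends. In practice it is usually worth validating the payload before writing it to the data warehouse. The sketch below shows one way to do that; buildFeedbackRow is a hypothetical helper, and the 1 to 5 rating range is an assumption about the app's rating control rather than anything Big Query requires.

```javascript
// Hypothetical helper: validates the payload sent by the app and shapes it
// into the row format expected by table.insert(row, {raw: true}).
function buildFeedbackRow(data, now = new Date()) {
  if (data === null || typeof data !== "object") {
    throw new TypeError("payload must be an object");
  }
  const {subject, userFeedback, starRating} = data;
  if (typeof subject !== "string" || subject.trim() === "") {
    throw new TypeError("subject must be a non-empty string");
  }
  if (typeof userFeedback !== "string") {
    throw new TypeError("userFeedback must be a string");
  }
  // Assumption: the app collects a whole-number rating from 1 to 5.
  if (!Number.isInteger(starRating) || starRating < 1 || starRating > 5) {
    throw new RangeError("starRating must be an integer from 1 to 5");
  }
  return {
    json: {
      timestamp: now.toISOString(),
      subject: subject.trim(),
      userFeedback: userFeedback,
      starRating: starRating,
    },
  };
}
```

Inside the cloud function, a validation failure could be caught and rethrown as a functions.https.HttpsError with the "invalid-argument" code, so the app receives a clear error instead of a silently rejected row.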

Step 5 - Call The Function

App developers need only add the Firebase Functions SDK dependency, and then add code to call the cloud function. The following is an example function call from an iOS application:

    import FirebaseFunctions

    // Sends the form content to the cloud function deployed in Step 4.
    func sendDataToBigQuery(formContent: FormContent) {
        Functions.functions().httpsCallable("saveDataToGoogleBigQuery").call(
            [
                "subject": formContent.subject,
                "userFeedback": formContent.userFeedback,
                "starRating": formContent.starRating
            ]
        ) { result, error in
            if let err = error {
                // Show the failure reason to the user.
                sendResultText = err.localizedDescription
            } else if let result = result?.data {
                // Show whatever the cloud function returned.
                sendResultText = "\(result)"
            }
        }
    }
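
The same callable function can be invoked from a web front end. The following is a rough JavaScript equivalent of the iOS example, assuming the Firebase Web SDK's httpsCallable API. The wrapper name, the shape of formContent and the returned object are hypothetical.

```javascript
// `callable` is the function returned by the Firebase Web SDK, e.g.:
//   import {getFunctions, httpsCallable} from "firebase/functions";
//   const callable = httpsCallable(getFunctions(), "saveDataToGoogleBigQuery");
// `formContent` is a hypothetical object holding the form's values.
async function sendDataToBigQuery(callable, formContent) {
  try {
    const result = await callable({
      subject: formContent.subject,
      userFeedback: formContent.userFeedback,
      starRating: formContent.starRating,
    });
    // The value returned by the cloud function arrives in result.data.
    return {ok: true, data: result.data};
  } catch (err) {
    // Errors raised by the function or the network reject the promise.
    return {ok: false, message: err.message};
  }
}
```

Passing the callable in as a parameter keeps the wrapper independent of how the Firebase app was initialized, which also makes it straightforward to exercise with a stub.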

Querying the Data

As the mobile apps call the Firebase cloud function, data is added to the Google Big Query table in realtime. Users can query that data using SQL from the BQ web console:

    SELECT subject, userFeedback, starRating 
    FROM `project-identifier.app_dataset.user_feedback_table`
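
The same table can also be queried programmatically. The sketch below runs an aggregate query through the @google-cloud/bigquery Node.js client. The report itself (average rating per subject), the function name and the minRating parameter are illustrative assumptions, and the table path reuses the placeholder from the query above.

```javascript
// Hypothetical report: response count and average star rating per subject.
const RATING_BY_SUBJECT_SQL = `
  SELECT subject, COUNT(*) AS responses, AVG(starRating) AS averageRating
  FROM \`project-identifier.app_dataset.user_feedback_table\`
  WHERE starRating >= @minRating
  GROUP BY subject
  ORDER BY averageRating DESC`;

// `bigquery` is a client from the @google-cloud/bigquery package, e.g.:
//   const {BigQuery} = require("@google-cloud/bigquery");
//   ratingBySubject(new BigQuery()).then(console.log);
async function ratingBySubject(bigquery, minRating = 1) {
  // Named parameters (@minRating) keep user-supplied values out of the SQL text.
  const [rows] = await bigquery.query({
    query: RATING_BY_SUBJECT_SQL,
    params: {minRating: minRating},
  });
  return rows;
}
```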

Query using End-User Apps

Big Query data isn't limited to access within Google's own portal tools, however. Many end-user analytical applications support BQ. Follow the links below to read documentation on how to integrate each of these tools with Big Query data.

Incorporating Additional Data Sources

The above implementation focused on creating the opportunity to stream enriched telemetry from our app code into the data warehouse. This data will typically be additive to clickstream and usage statistics collected by services such as Google Analytics and Amplitude.

Wouldn't it be great to have that data in the data warehouse also? The answer is situational--but I'd argue the answer is often yes.

Fortunately, the paths to aggregate other data sources into a cloud data warehouse are many, and most desired data sources will have a solution.

For example:

Summary

Customers spend a tremendous amount of time interacting with mobile applications, and those interactions create business information.  When this insightful information is delivered to business stakeholders in a timely and accessible format--using tools they're probably already using--they're empowered to make better decisions and provide better services to customers.

This article presented some possibilities to extend current mobile architectures, and integrate them with strategic data platform investments.  These ideas are applicable to each of the leading cloud infrastructure platforms (AWS, Azure, GCP), and can be implemented cost-effectively.