Data products design guidelines

What is a data product, and why should it be worth your attention?

DJ Patil defines a data product as:

“Products that advance end goals through the use of data”.

At first glance, this definition seems quite broad. After all, almost every web product uses data in some way to help achieve an end goal. So is everything on the Internet now a data product?

Here I would draw a very important distinction: the difference between a product that happens to use data to advance its end goal, and a product whose primary objective is to use data to advance an end goal.

Data products, in the sense that they deserve a category of their own, are products whose primary objective centers on data.

Define data products

Why be so pedantic about the definition? Well, my personal opinion is that a data product, whether it is a full customer-facing product or a partial back-end product, has a different set of characteristics from other technology products.

Many standard product development rules still apply (address customer needs, learn from feedback, prioritize ruthlessly, and so on), but there are subtleties that make thinking about data products a little different.

The definition above helps us decide whether we can think about a product as we normally would, or whether we need to consider aspects of product development that are better suited to the world of data.

Some examples

With this definition in hand, let’s look at a few examples.

Is Medium a data product? By our definition, Medium is not a data product; it uses data, but its primary goal is “…to build a better publishing platform, one that allows anyone to contribute their stories and ideas to the world and helps the great ones rise to the top”. While data plays a key role in this mission, it is not the main driving force behind it, nor is it the goal. For Medium, data is a means to an end.

If we dig a little deeper into Medium, we find products within it whose purpose is defined by data. Medium’s search function is a data product. Its goal is to surface relevant articles for readers, and data is the key to that.

What about Medium’s article feed? Yes. Data, again, plays a key role in deciding what to display to the reader.

Let’s take another example: is Gmail a data product? No, Gmail is an email service whose primary goal is to allow asynchronous written communication between individuals. However, the Gmail feature that sorts our email into important and unimportant is a data product. Its main goal is to classify emails, which relies mainly on natural language processing.

Is Instagram a data product? No, but if you treat its features as discrete products, many of them are data products, e.g. tagging, search and discovery.

Is Google Analytics a data product? Yes, its main goal is to bring users a quantitative understanding of online behavior. The data here is at the center of the interaction with the user, and unlike the other products mentioned so far, its use is explicit.

Types of data products

Obviously, there are many different types of data products. Even after narrowing the field to the products that fit our definition, there is still considerable variation between them, and this variation brings further nuances to product development.

We can group these data products into 5 broad categories: raw data, derived data, algorithms, decision support, and automated decision-making.

These product types are listed roughly in order of increasing complexity. More specifically, they are listed in order of increasing internal complexity, and they should become less complex on the user’s side.

In other words, the more computation, decision-making, or “thinking” the data product itself has, the less thinking the user needs.

Often (but not exclusively), raw data, derived data and algorithms have technical users. In most cases they tend to be internal products within an organization, although counter-examples include ad exchanges and API suites. Decision support and automated decision-making products tend to have a more balanced mix of technical and non-technical users, although for any given product the user group tends to be one or the other.

Raw data. Starting with raw data, we collect and provide usable data (maybe we are doing some small processing or cleaning steps). The user can then choose to use the appropriate data, but most of the work is done by the user.

Derived data. When we provide derived data to the user, we do some of the processing on our side. For customer data, we might add further attributes, such as assigning each customer to a customer segment, or adding the likelihood that they will click on an ad or buy a product from a certain category.

Algorithm. Next we have algorithms, or algorithms as a service. We are given some data, we run it through an algorithm, whether machine learning or something else, and we return information or insights. Google Images is a good example: a user uploads an image and receives a set of images that are the same as or similar to the uploaded one. Behind the scenes, the product extracts features, classifies the image and matches it against stored images, returning the most similar ones.

Decision support. Here, we want to give users information that helps them make decisions, but we do not make the decisions ourselves. Analytics dashboards such as Google Analytics, Flurry or WGSN fall into this category. We do most of the work on our side; our aim is to give users relevant information in an easily digestible format so they can make better decisions. In the case of Google Analytics, that could mean changing editorial strategy, fixing leaks in the conversion funnel, or doubling down on a given product strategy. The important thing to remember is that while we make the design decisions about data collection, the derivation of new data, what data to display and how to display it, interpreting the data remains the user’s responsibility. They control the decision to act (or not) on that data.

Automated decision-making. Here we outsource all of the intelligence within a given domain. Netflix’s recommendations or Spotify’s Discover Weekly are common examples. Self-driving cars and drones are more physical manifestations of this closed decision-making loop.

We allow the algorithm to do the work and present the user with the final output (sometimes with an explanation of why the AI chose that option, other times completely opaquely).
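The progression across the five types can be sketched with a toy example. Everything in it is hypothetical: the customer records, the segment rule, the purchase-rate score standing in for a real model, and the 0.1 threshold are assumptions made purely for illustration. The point is only to show how more of the "thinking" moves from the user into the product at each step.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    visits: int
    purchases: int


# Raw data: records are handed over as collected; the user does the work.
RAW = [
    Customer("c1", visits=40, purchases=8),
    Customer("c2", visits=5, purchases=0),
    Customer("c3", visits=20, purchases=1),
]


def derive(c: Customer) -> dict:
    """Derived data: attach an attribute computed on our side (made-up rule)."""
    segment = "loyal" if c.purchases >= 5 else "casual"
    return {"customer_id": c.customer_id, "segment": segment}


def purchase_rate(c: Customer) -> float:
    """Algorithm: data in, score out (a stand-in for a real model)."""
    return c.purchases / c.visits if c.visits else 0.0


def decision_support(customers: list[Customer]) -> list[tuple[str, float]]:
    """Decision support: rank and present; the user still decides what to do."""
    scored = [(c.customer_id, round(purchase_rate(c), 2)) for c in customers]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def automated_decision(customers: list[Customer], threshold: float = 0.1) -> list[str]:
    """Automated decision-making: the product itself picks who gets an offer."""
    return [c.customer_id for c in customers if purchase_rate(c) >= threshold]
```

With the sample records, `decision_support(RAW)` returns a ranked list that a person would still have to interpret, while `automated_decision(RAW)` returns only the final choice.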

Data interaction

So far, we have discussed the functional types of data products.

Each of these data products can be presented to our users in a variety of ways, with clear implications for their design. What are these interfaces or interactions?

API. For an API, we assume a technical user. We should still follow good product practice and ensure that the API is intuitive to use, well documented, meets the needs of its users, and is pleasant to work with.

Dashboards and visualizations. For dashboards and visualizations, we assume some statistical literacy or ability to work with numbers. At the extreme, we can do a lot of the heavy lifting for our users and work hard to ensure that we present only the most relevant information in an easy-to-understand format. By choosing what information to display, we influence the decision, but the interpretation and the decision itself remain in the hands (or mind) of the user.

Web elements. For the past 5 years or so, the least technical interface through which data products reach users has been the web element. More recently, these interfaces have expanded to include voice, robotics and augmented reality, among others. While the design details of these new interfaces are distinctly different, there is considerable overlap, as they all revolve around showing the user the outcome of a decision, and perhaps conveying why or how the AI reached it.

Find out what we’re building

Plotting the types of data products against the possible interfaces, we get a matrix of orange dots, each dot representing a different kind of data product.

Data product matrix – different products require different approaches

Each element in the matrix requires design considerations that can make a big difference, both in terms of what users need and in the design process we use to achieve our goals.
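One way to make the matrix concrete is to enumerate it. The sketch below uses the five product types and three interfaces named in this article; the `technicality` score is an assumed, deliberately crude heuristic (position along each axis added together), not a measure proposed by the article.

```python
from enum import Enum
from itertools import product


class ProductType(Enum):
    RAW_DATA = 1
    DERIVED_DATA = 2
    ALGORITHM = 3
    DECISION_SUPPORT = 4
    AUTOMATED_DECISION_MAKING = 5


class Interface(Enum):
    API = 1
    DASHBOARD = 2
    WEB_ELEMENT = 3


# Every cell of the matrix: 5 product types x 3 interfaces.
MATRIX = list(product(ProductType, Interface))


def technicality(cell):
    """Assumed heuristic: a lower score means a more engineering-driven product."""
    product_type, interface = cell
    return product_type.value + interface.value


# The two ends of the diagonal through the matrix.
most_technical = min(MATRIX, key=technicality)
least_technical = max(MATRIX, key=technicality)
```

Under this heuristic the two extremes are the raw data API and the automated decision-making web element, which are the corners of the matrix that sit furthest apart.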

Moving diagonally from the top-left circle (Raw Data, API) to the bottom-right circle (Automated Decision-Making, Web Elements) is a move away from technical, engineering-driven products toward more typical software products (i.e. the kind that product managers and designers are more intuitively familiar with, and that often feature in books, magazines and articles).

Difficulties & Methods

In my experience, teams run into the biggest problems when they apply a Human-Centred Design (HCD) approach to the more technical data products. Of course, this is not to say that engineers are not people. Most are, and those that aren’t often bear an uncanny resemblance. However, HCD is a holistic approach to product development, and it works best when designers understand user motivations and behaviors. For technical data products, product boundaries are often artificially constrained by functional organizational considerations, and product and UX teams are often ill-equipped a) to understand the complexities of technical user behavior, and b) to explore those complexities.

So it is naive to assume that the out-of-the-box Design Thinking or Lean methodology we have been reading about will simply work.

However, this is not a cause for panic.

Although the output of user research can look very different from that of a consumer-facing product or a more typical SaaS product, and the definition of KPIs can stray into the technical, both Design Thinking and Lean are malleable enough to let us adapt our approach to this new domain.

My advice, when applying these methods to data products, is to ensure that the problem space is defined in terms of the end users, and not only the immediate users of the data output. Most likely, this means expanding the team to include adjacent products and their managers.

Likewise, if the user is a technical user, we should adapt to that context. Empathizing with users who have engineering problems may mean that we have to open an IDE and write code.

Watercress, developed using an HCD-Lean hybrid. Photo by Marcus Spisk