Taxonomies and metadata are used to categorize and structure information in digital environments to help users find content. Taxonomies are an important part of information architecture and user experience design. The task of creating and maintaining taxonomies sometimes falls to UX writers and content designers, so it’s good to know what they’re all about.
Let’s get a few definitions out of the way:
A taxonomy organizes and structures information. There are generally relationships between the terms of a taxonomy.
A controlled vocabulary is a type of taxonomy. It’s a controlled list of words or terms for a specific purpose, like organizing a digital library of content. Controlled vocabularies are used to ensure accuracy and consistency in the application of terms to create a frictionless user experience.
Metadata are pieces of information that describe aspects of a digital asset so that asset can be found.
Amazon’s browse tree is a taxonomy that helps customers intuitively navigate the Amazon website for successful online product discovery:
Amazon’s intuitive icons and labels help users to shop by category of book:
Controlled vocabularies can be used to manage image, video or audio libraries. For example Getty Images keywords help customers find visual content:
Metadata might sound like something complicated or mysterious but it’s not. The good news for UX writers is metadata is all about language. Metadata describes a digital asset so that asset can be found and used. For example, song and album titles, as well as writing credits on Spotify are all examples of metadata:
Hashtags are another simple everyday example of metadata. Whenever you add a hashtag to a Tweet or an Instagram post, you’re using metadata to help other users find your content.
If you download this image of a dog from Unsplash to your device, it might be hard for you to find it in a couple of weeks unless you remember the exact location where you saved it. The file will likely have some basic data attached to it, like a meaningless filename, the image’s dimensions, and the date and location where it was captured. But that’s it. There’s no descriptive information attached to the image to grab hold of. Without metadata, digital assets such as images, videos, and audio files are just floating in digital space.
You can use metadata to organize and manage digital collections. Think of your favorite podcast platform or streaming provider. When you search for content on these sites, you are searching on little pieces of metadata linked to digital assets.
Online retailers also use metadata to manage their collections. When you shop online for clothes and filter items by different facets such as color, size and fit you are interacting with the metadata of products. This is an example of controlled vocabularies at work in the form of facets of information. Metadata makes filtering on facets possible. Below is an example of filtering on the style facet for jeans on Boden’s website:
From a business perspective good taxonomy and metadata management means getting the most out of your assets. Poorly managed assets can be a time suck for your users, employees and company. Companies often waste time and money on recreating assets that can’t be found. Customers who can’t find what they’re looking for will shop elsewhere.
5 key things you need to know about taxonomies and metadata:
1. Consistency and structure
Taxonomies and metadata are structured and organized to create a consistent and comprehensive user experience. When you create metadata for a collection of digital assets you need to be consistent in your approach. Say you’re tagging a bunch of images from a collection of pastel scenic shots. It may seem really obvious, but you need to make sure you apply the tags pastel and scenic to all of the relevant images, not just some.
Photo by Harli Marten on Unsplash
Creating relationships between the terms of a taxonomy or controlled vocabulary makes your metadata useful. There are a number of different types of relationships that can exist between terms. A taxonomy can be hierarchical. This means that there are broader terms and narrower terms, where the broader terms are parents of the narrower child terms. The important thing to remember when creating this kind of relationship is to ensure that all of the narrower terms are part of the broader term. For example, all labradors are dogs. Not all dogs are labradors:
Broader term: Dogs
Narrower term: Labradors
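If it helps to see the broader and narrower relationship as data, here is a minimal Python sketch. It assumes each narrower term has exactly one broader term. The Poodles and Animals terms and the function names are made up for illustration; only Dogs and Labradors come from the example above.

```python
from __future__ import annotations

# Hypothetical taxonomy: each narrower term points to its one broader term.
BROADER = {
    "Labradors": "Dogs",
    "Poodles": "Dogs",   # illustrative term, not from the article
    "Dogs": "Animals",   # illustrative term, not from the article
}


def ancestors(term: str) -> list[str]:
    """Return every broader term above `term`, nearest first."""
    chain = []
    while term in BROADER:
        term = BROADER[term]
        chain.append(term)
    return chain


def is_a(narrower: str, broader: str) -> bool:
    """True if `broader` sits somewhere above `narrower` in the hierarchy."""
    return broader in ancestors(narrower)


print(is_a("Labradors", "Dogs"))  # all labradors are dogs
print(is_a("Dogs", "Labradors"))  # not all dogs are labradors
```

The check only walks upward, which is why the relationship holds in one direction and not the other.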
A taxonomy can be polyhierarchical. This means that a product or object can live beneath two or more different parent categories. Amazon’s taxonomy is an example of a polyhierarchical taxonomy. For example, batteries live under multiple categories:
Polyhierarchical taxonomies are a good idea for consumer products where shoppers are likely to browse in a variety of ways.
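A polyhierarchy can be sketched in Python by letting a term list more than one parent. The category names below are invented for illustration and are not Amazon’s actual browse tree; the function name is also hypothetical.

```python
from __future__ import annotations

# Hypothetical categories: a term may have several parents.
PARENTS = {
    "Batteries": ["Electronics Accessories", "Household Supplies"],
    "Electronics Accessories": ["Electronics"],
    "Household Supplies": ["Home"],
}


def browse_paths(term: str) -> list[list[str]]:
    """Return every route from a top-level category down to `term`."""
    parents = PARENTS.get(term, [])
    if not parents:
        # A term with no parents is a top-level category.
        return [[term]]
    return [
        path + [term]
        for parent in parents
        for path in browse_paths(parent)
    ]


for path in browse_paths("Batteries"):
    print(" > ".join(path))
```

Each printed line is one way a shopper could browse to the same product category.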
2. Inclusivity and accessibility
When creating metadata you need to think about all the different words people might use for the same thing. For a product image of a sweater you must consider the other words people might use for a sweater. This is important if you have a global user base. A user searching in UK English might call a sweater a jumper. Another user might call it a sweatshirt or a pullover. Doing research can help you build a list of synonyms and variants and decide on what the customer-facing label should be. Google Trends is a really helpful tool for understanding what people call things in different regions. You might also want to account for spelling variants and common misspellings to help your users find relevant content.
Another consideration when creating metadata for a global audience is that the metadata will need to be translated for localized versions of your product. Linguists and language experts will be able to advise you on any potential problems with translating specific words.
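One simple way to put a synonym list to work is to map every variant to a single preferred label. The Python sketch below assumes sweater is the preferred label. The misspelling "sweter" and the function names are hypothetical, added only to illustrate the idea.

```python
from __future__ import annotations

# Preferred label -> synonyms, regional variants and misspellings.
SYNONYMS = {
    "sweater": ["jumper", "pullover", "sweatshirt", "sweter"],  # "sweter" is a made-up misspelling
}


def build_lookup(synonyms: dict[str, list[str]]) -> dict[str, str]:
    """Flatten the synonym lists so each variant maps to its preferred label."""
    lookup = {}
    for preferred, variants in synonyms.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def preferred_label(query: str) -> str:
    """Return the preferred label for a search term, or the cleaned term if unknown."""
    cleaned = query.strip().casefold()
    return LOOKUP.get(cleaned, cleaned)


print(preferred_label("Jumper"))
print(preferred_label("scarf"))
```

A search for jumper, pullover or sweatshirt would then land on the same set of products as a search for sweater.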
3. Make it meaningful
Words have meanings. Some words have multiple meanings. Take the word Orange for example. It can mean orange the fruit or orange the color. It can also be used to refer to different locations like Orange County, California. There are even rivers, lakes and parks called Orange. Messy, huh? Ambiguity is an inherent aspect of language but you want to remove ambiguity for your users. Ambiguity causes confusion. You don’t want confused users. Aim to provide clarity within the name, such as: Orange – Fruit, Color Orange or Orange County – California.
4. Research your metadata
Use site analytics and search data to better understand what users are searching for and how they’re searching. Digging into the context of searches will give you a clear idea of the intent behind your users’ searches. Understanding what users call things can help you label terms and categories.
Language is constantly evolving. Every day we use words like selfie and photobomb that weren’t used until a few years ago. Language around climate change has evolved too, and people are now more inclined to use the phrase climate crisis. Pay attention to the words you read and hear people using around you and incorporate these into your metadata where relevant. Use conversation mining techniques and user research to get a better understanding of your audience’s use of language where possible.
5. Managing metadata at scale
Managing metadata across large content sets can be challenging but the basic principles above underpin the management of all metadata whether it’s a small set of tags for your blog or a controlled vocabulary for a vast library of millions of assets. Taxonomy and metadata management software platforms are available to license from companies like PoolParty, Synaptica, and others. Some companies opt to create their own bespoke software solutions to best fit their specific metadata needs.
Automation and machine learning are increasingly being used to help manage metadata across large collections. Boolean rules and queries can be used to manage metadata at scale. Boolean strings using AND, OR and NOT can help humans to categorize large volumes of content automatically or manually. For example, if you have a website and want to automate the classification of blog posts focusing on UX writing, you could use Boolean logic. You could automatically add the tag UX writing to any blog posts that contain (UX OR User Experience OR Interaction Design) AND (Writing OR Content Design) AND NOT (Technical Writing OR Marketing Writing). This is a simple example but the same logic can be used at scale to manage large volumes of content.
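The Boolean rule above could be sketched in Python roughly like this. It assumes case-insensitive, whole-phrase matching on the post text, which is one reasonable reading of the rule rather than how any particular taxonomy tool works. The function and variable names are hypothetical.

```python
from __future__ import annotations

import re

# The three groups from the rule: (A OR ...) AND (B OR ...) AND NOT (C OR ...).
TOPIC_TERMS = ["UX", "User Experience", "Interaction Design"]
CRAFT_TERMS = ["Writing", "Content Design"]
EXCLUDED_TERMS = ["Technical Writing", "Marketing Writing"]


def contains_any(text: str, phrases: list[str]) -> bool:
    """True if any phrase appears in the text as whole words, ignoring case."""
    return any(
        re.search(r"\b" + re.escape(phrase) + r"\b", text, re.IGNORECASE)
        for phrase in phrases
    )


def should_tag_ux_writing(text: str) -> bool:
    """Apply the rule: topic AND craft AND NOT excluded."""
    return (
        contains_any(text, TOPIC_TERMS)
        and contains_any(text, CRAFT_TERMS)
        and not contains_any(text, EXCLUDED_TERMS)
    )


posts = [
    "A guide to UX writing for onboarding flows",
    "UX tips for technical writing teams",
    "Interaction design patterns for dashboards",
]
for post in posts:
    print(post, "->", should_tag_ux_writing(post))
```

Matching on whole words keeps a short term like UX from firing inside an unrelated word, which matters once a rule like this runs over thousands of posts.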
Beyond tags there are other important metadata fields associated with digital assets that help people to manage them. These metadata fields can contain all kinds of info like who the creator is, who owns the copyright, any other important licensing information, as well as format information like the size of the file and audio or video length.
Taxonomies and metadata might seem a little intimidating but when you break them down they’re not that complicated. A lot of the skills needed for creating and managing taxonomies and metadata are the same skills needed to be a UX writer: communication skills, research and analysis skills, attention to detail, and empathy for the people using your product. Above all you need a fundamental understanding of your users’ language. And you’ve got that down!
To learn more about metadata and information architecture check out The Accidental Taxonomist by Heather Hedden and the Polar Bear book – Information Architecture: For the Web and Beyond by Louis Rosenfeld, Peter Morville and Jorge Arango.