<img height="1" width="1" style="display:none" src="https://www.facebook.com/tr?id=661978321021605&amp;ev=PageView&amp;noscript=1">

Building vs. Buying a PIM: Part 3 - Data Distribution

Karl Hughes

Karl Hughes

January 21
  •  
8 minute read
PIM data distribution

One of the most critical pieces of infrastructure for a scalable e-commerce platform is its product information manager (PIM). It acts as the source of truth for product data, allows team members to add and modify product information, and makes this product information available to various distribution channels.

This post is the third in a three-part series on the technical decisions that go into building a PIM (catch up on part one here and part two here). In this post, I’ll focus on how you can make your product data available to your distribution network.

In modern e-commerce businesses, it’s common for retailers to maintain a website, mobile application, third-party marketplace integrations, and possibly physical stores as well. As your product catalog grows, ensuring that each of these channels has the correct information becomes a bigger challenge, but that’s where a PIM can help.

In this post, I’ll share some of the major technical considerations your team should make. Whether your team builds its own PIM in-house or uses one from a third party like Fabric, engineers must understand the channels, output formats, and technical considerations that go into building a PIM distribution system.

 

Distribution Channels and Methods

While your PIM stores product data, variations, and images, customers don’t make purchases through your PIM directly. They will access your catalog through one or more distribution channels that use data from your PIM, and you need to be consistent across each of them.

“In order to retain customers, the online shopping experience as well as customer service, needs to match that of the in-store experience. Shoppers are unforgiving when it comes to disruption, and in an age where they are spoilt by choice, a seamless experience online needs to be guaranteed and consistent.” –Fahad Alamoudi, Marketing Automation Manager at Transiris Corporation

 

PIM Data Distribution: Channels

While consistency is essential to prevent confusion and unmet expectations, each channel you use will have unique distribution requirements. This makes the engineering task of creating an interface for your PIM quite challenging. Before you start building the distribution arm of your PIM, consider the channels you need to support and how your team will get data from your PIM to the channel:

PIM Distribution Channels: Physical and Digital

  • Online stores: Most online stores built as web applications use a direct connection to a product database or an HTTP API to request product data. Unless you use a third-party storefront, you’ll have the most control over this channel.
  • Mobile applications: If mobile purchases are common or e-commerce is a component of a larger offering, you might need to support a distinct mobile application. These are typically backed by an API as well.
  • Third-party marketplaces: Many online sellers want to reach customers where they already shop, so they integrate with marketplaces like Amazon or eBay. Most third-party platforms offer API access, but some of the less sophisticated might require you to upload your product data manually as a CSV or XML file.
  • Social media: Instagram and Facebook are becoming more important for B2C retailers as advertising and sales channels. Strong visuals are important here, so make sure your PIM’s data includes relevant images and headlines for social selling.
  • Print resources: Those of us in online retail tend to forget that direct mail is still a large and growing market. If you need to support print catalogs or direct mail, make sure you know how data needs to be sent to your printing partners.
  • Point of sale (POS): Retailers with physical locations or who distribute their products through physical stores must ensure their PIM supports the format required by their point of sale systems.
  • Distributors: Finally, if you contract with a third-party distributor or warehouse provider, they’ll likely need data from your PIM to track and catalog products. These providers might be able to use your API, or they may require you to upload product lists manually.

PIM Data Distribution: Methods

Once you know all the channels your PIM needs to support, you need to figure out how product data will be kept in sync between your PIM and the distributor. Typically, your team will use a manual, automated, or hybrid approach to data synchronization, but the method varies depending on the channel.

  • Manual: Your team must manually add and update products individually. This is common in distribution channels that are not connected to the internet (as in legacy POS systems).
  • Automated: Your engineers set up a data pipeline that automatically feeds product updates to the distribution channel as they happen. For example, your web application could make API calls directly to your PIM to get up-to-date product information on the fly.
  • Hybrid: Some distribution channels require you to upload data in a specific format that your PIM might not support. In this case, you might export data from your PIM, manually adjust it in a spreadsheet, and then send it to the distribution channel.

While it’s great if you can fully automate product distribution, you don’t always get that option, and automation comes with limitations. For example, third-party marketplaces can always change their feed requirements. When this happens, your engineering team will have to update any scripts they maintain to keep the two systems in sync.

 

PIM Output Formats (the Firehose)

Now that you’ve decided which distribution channels you need to support and how you can get data to each, you’re ready to start designing your PIM’s distribution system. Depending on whether you need to manually or programmatically distribute data from your PIM, you’ll likely have to support one or more of the following output formats: product feed, API, and manual data export.

Feed, API, and Data Export

Product Feed

A product feed is a single file or list of files hosted on the internet that contains all of your PIM’s product data. You can create a single feed for all your partners and distribution channels or create a unique version of your feed for each of them.

When I worked in the textbook industry a few years ago, every one of our partners used an XML feed to send us product data. Unfortunately, each of these feeds was unique and had its own quirks based on the retailer’s software stack. This made the engineering task of pulling data and changes from the feeds quite cumbersome.

If you’re supplying a product feed to partners, they might give you their own technical requirements that you’ll have to abide by. You’ll likely need to apply some kind of adapter pattern to your application to handle this.

Application Programming Interface (API)

APIs are an integral part of most web and mobile applications, so while implementation patterns vary (REST, SOAP, GraphQL, etc.), it’s very likely that your PIM will need an API of some form. Of course, building a robust API requires a lot of work, and it doesn’t stop when you release your PIM:

“While developing and testing your API plays a big part in the process, the real work does not end here. You need to continue providing support, even before the code is deployed to production.” –Theofanis Despoudis, Senior Software Engineer, Teckro

Your PIM’s API is likely to be just one part of a suite of services required to power your e-commerce store. You’ll also need an order management system, pricing API, and a service mesh to coordinate between them.

Data Export

Finally, the most widely supported yet time-intensive method of marshaling data from your PIM to your distribution channels is via manual data exports. While most PIMs should support this method, it provides limited value for large retailers.

“[Manual export] is generally only useful for test purposes. However, if your product feed contains fewer than 50 items or if it doesn’t change very often, you may consider manual upload to be a good, simple method.” –Hannah Augur

Typically, data exports allow users to download a CSV or XML file that contains some or all of your product data. If you have a lot of products, you probably need to give users a way to filter your catalog before they export the file because working with 10,000+ line spreadsheets can be troublesome.

Since most PIMs must support at least two of the above methods, the engineering challenge quickly balloons. This is one of the reasons that using a third-party PIM like Fabric can save you a lot of time and headache. Fabric gives you the ability to share data with both online and offline channels using exports or its robust e-commerce API.

 

Technical Considerations

Whether you support a product feed, an API, manual data exports, or all three, there are a few specific things software engineers should consider. While these are always important elements in software development, each of them has unique implications when building a PIM.

Documentation

Creating good docs for your APIs and data feeds is hard. Keeping them up to date as your product catalog evolves is even harder.

“Outdated docs are the biggest pet peeve of API users… Developers often write about updates several days after rolling them out, sometimes limiting themselves to a few sentences because there’s no established process for docs updates and it’s no one’s responsibility.” –Altexsoft

What makes this documentation especially hard when building a PIM is that not all of your users will be developers. Your documentation might be used by integration partners, warehouse workers, interns, or data scientists. This further complicates the process of writing strong documentation for each of your PIM’s output formats.

Reporting, Tracking, and Logging

All software needs logging to ensure that developers can quickly respond to issues, but a PIM also requires internal reporting and usage tracking. This helps your team see which product data is being accessed most frequently and by which channels so they can update it accordingly.

Caching

Because speed is so important in distributing your product data, you’ll likely need to implement some form of caching. I’ve typically seen flat file caches in e-commerce product feeds, but APIs might need more complex caching mechanisms that handle various input parameters. Of course, the trade-off when caching is that the data could be out of date, so you’ll also need to consider cache invalidation.

Maintenance

Every engineering team I’ve ever worked with underestimates the cost of building software. I’ve mentioned this in every part in this series because it accounts for up to 90 percent of the cost of running software.

“When we first started doing e-commerce integrations, we thought all the work was in getting the integration set up… In reality we were missing what would consume as much if not more time: maintaining the integration.” - Nick McHenry, OneShop Retail

If you’re going to build your own PIM and distribute products to multiple channels, you have to realize that each of these channels means a new integration cost that just starts the day your software goes live. You’ll be paying to maintain each channel for years to come.

 

Conclusion

Now that you’ve seen how to store your product data, how to let your team manage it, and how to make your product data available, you’re ready to decide whether building a PIM is truly worth the investment.

If you have an engineering team on hand and you’re ready to incur the cost of building and maintaining a significant new software product, then building a PIM in-house will give you a lot of flexibility. You’ll be able to pick the pieces that you need, optimize them for the performance you expect, and customize data feeds for each of your distribution channels. On the other hand, you’ll likely spend months developing the product.

Using an off-the-shelf PIM from a SaaS provider like Fabric can save you months of development time and untold monthly maintenance costs. With Fabric, you get a robust data storage and optimization system, an interface for your team to edit products manually or programmatically, and an API so you can distribute your product data to any number of channels. While you won’t get to build your data model from scratch, Fabric supports product families, categories, and taxonomies so you can ensure your catalog scales as your product lines evolve.


Karl Hughes

Karl Hughes

January 21

Follow me on:

Have questions?