
RegExp – Ninja Level Feed Optimization

Posted on April 17, 2014 by Mateusz Miodek

Those of you who are already using DataFeedWatch might have noticed the word “regexp” in the mapping options. In this article I will explain how RegExp can be used in our app, but let’s first clarify what exactly RegExp is.

A regular expression (RegExp for short) is a special text string that describes a search pattern. You can think of regular expressions as wildcards on steroids. You are probably familiar with wildcard notations such as *.txt to find all text files in a file manager. RegExp works on the same principle but can do so much more.

It takes some practice to get the hang of RegExp, but once mastered it comes in very handy. For those who are interested in learning regular expressions I can recommend this tutorial.

It is also a good idea to test your RegExp before deploying it. There are many online tools out there to do just that. The tool I use is called Rubular.

So let’s move on to real life examples to see how RegExp can be helpful when it comes to feed optimization.

Example 1

Imagine you need to create a field ‘color’ for your Google Shopping feed. You do not have a field for color in your store, but you know that all the titles of your products end with a color name (e.g. Adidas Mens Snova Glide 5 Running Shoes Green).

The best way to deal with this situation is to map color from name and use an additional replace rule with RegExp like this:

[Image: regexp1]

What it does is:

  1. divide each name into two groups:
    group 1 – everything except the last word, represented by (.*) where
    .* => any character, repeated any number of times
    group 2 – the last word, represented by (\s[^\s]+) where
    \s => any whitespace character
    [^\s]+ => any character except whitespace, appearing at least once
  2. replace the existing value, which can be described as (.*)(\s[^\s]+), with a new value, which is group 2 (written as $2 in RegExp notation)

The outcome of this mapping for “Adidas Mens Snova Glide 5 Running Shoes Green” would be “Green”.
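
For readers who want to try this rule outside the app, here is a minimal Python sketch of the same replacement. It is an illustration only, not how DataFeedWatch implements the rule: the function name `color_from_title` is hypothetical, and Python writes the `$2` reference as `\2`. Note that group 2 also captures the whitespace before the last word, so the sketch strips it.

```python
import re

# Pattern described above: group 1 is everything before the last word,
# group 2 is one whitespace character followed by the last word.
LAST_WORD = re.compile(r"(.*)(\s[^\s]+)")


def color_from_title(title: str) -> str:
    """Hypothetical helper: keep only the last word of a product title."""
    # r"\2" is Python's spelling of $2. Group 2 still carries the leading
    # whitespace, so it is removed with strip().
    return LAST_WORD.sub(r"\2", title).strip()


print(color_from_title("Adidas Mens Snova Glide 5 Running Shoes Green"))
```

A title that consists of a single word contains no whitespace, so the pattern does not match and the title comes back unchanged.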

Example 2

Imagine you create a price field for a channel that accepts 2 decimal places (e.g. 12.45) and your prices have 4 (12.4500). Again, a replace rule with RegExp comes in handy. To fix the format we need to set it like this:

[Image: regexp2]

Similarly to the previous example this rule:

  1. divides each price into 2 groups:
    group 1 – everything except the last two decimal digits, represented by ([0-9]+\.[0-9]{2}) where
    [0-9]+ => any whole number
    \. => a literal dot character
    [0-9]{2} => any two digits
    group 2 – the last two decimal digits, represented by ([0-9]{2})
  2. replaces the existing value, which can be described as ([0-9]+\.[0-9]{2})([0-9]{2}), with a new value, which is group 1 ($1)

The outcome of this mapping for 12.4500 is 12.45.

Be advised that this mapping does not round the price to two decimal places; it simply cuts off the last two digits.
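
The difference between cutting off and rounding is easy to see in a small Python sketch. This is only an illustration under assumptions: the helper names `truncate_price` and `round_price` are hypothetical, Python writes `$1` as `\1`, and the rounding variant uses the standard `decimal` module rather than anything from the app.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# Pattern described above: group 1 is the whole number, the dot and the first
# two decimal digits; group 2 is the last two decimal digits.
PRICE = re.compile(r"([0-9]+\.[0-9]{2})([0-9]{2})")


def truncate_price(price: str) -> str:
    """Hypothetical helper: keep group 1 only, dropping the last two digits."""
    return PRICE.sub(r"\1", price)


def round_price(price: str) -> str:
    """Hypothetical helper: round to two decimal places, for comparison."""
    return str(Decimal(price).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


print(truncate_price("12.4500"))  # 12.45
print(truncate_price("12.4599"))  # 12.45 (digits are cut, not rounded)
print(round_price("12.4599"))     # 12.46
```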

Example 3

Let’s say you want to set product_type for Google Shopping as a main category of your products (e.g. Car parts) but in your system you have only the whole category paths (e.g. Car parts > BMW > 320i > 2013).

What you need to do here is remove everything starting from “ >”. The rule that covers this would look like this:

[Image: regexp3]
where
\s>.* => a whitespace character followed by “>” followed by any character appearing any number of times

The outcome of this mapping for “Car parts > BMW > 320i > 2013” would be “Car parts”.
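
The same idea can be sketched in Python, assuming the rule replaces the matched text with an empty value. The function name `main_category` is hypothetical and the snippet is not taken from the app.

```python
import re

# A whitespace character, then ">", then everything up to the end of the value.
AFTER_FIRST_SEPARATOR = re.compile(r"\s>.*")


def main_category(category_path: str) -> str:
    """Hypothetical helper: drop everything from the first ' >' onwards."""
    return AFTER_FIRST_SEPARATOR.sub("", category_path)


print(main_category("Car parts > BMW > 320i > 2013"))
```

A value without any “ >” separator does not match the pattern and is returned as it is.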

Example 4

For the last example imagine a channel that requires UPCs, but in your system not all products have UPCs and the UPCs that you do have do not all have a proper format (12-digit). If you send a feed with products for which UPCs are empty or improper, the whole feed could be rejected. What you need to do is exclude those products. This can be achieved with a single exclude rule using guess what … RegExp.

[Image: regexp4]

What we do here is include only products for which the UPC is exactly a 12-digit number. In other words, include products only if the UPC matches this regexp: ^[0-9]{12}$
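
Outside the app, the same filter could look like the Python sketch below. The product list and the field names `id` and `upc` are made up for illustration, and `has_valid_upc` is a hypothetical helper. `fullmatch` is used so that a value with a trailing newline is rejected as well.

```python
import re

# The pattern from the rule above: exactly 12 digits and nothing else.
UPC = re.compile(r"^[0-9]{12}$")


def has_valid_upc(upc) -> bool:
    """Hypothetical helper: True only for a 12-digit UPC string."""
    # Missing values (None) are treated as empty strings.
    return UPC.fullmatch(upc or "") is not None


# Made-up sample data; the field names are assumptions for this sketch.
products = [
    {"id": "A1", "upc": "012345678905"},  # 12 digits: kept
    {"id": "B2", "upc": ""},              # empty: excluded
    {"id": "C3", "upc": "12345"},         # too short: excluded
    {"id": "D4", "upc": None},            # missing: excluded
]

kept = [p["id"] for p in products if has_valid_upc(p["upc"])]
print(kept)  # ['A1']
```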

Those are only a few of the countless examples of how RegExp can be used. The rule of thumb is that whenever you need some complex mapping, RegExp is your “weapon of choice”.

If you have any mapping issues, please describe them in the comments and I will try to find a proper RegExp to deal with them (if possible).

 



Posted in: Data Feeds,DataFeedWatch News,Features & Functionality,Tips & Tricks

  • http://www.i7midia.com.br paulo rossini

    Hello,

    Great post with good examples, congratulations !

    What would a regex for limiting the number of characters look like? Google Shopping recommends using no more than 70 characters in the product title, for example.

    • Jacques van der Wilt

      The regexp for that would be: replace (.{70}).* with $1. However, there is no need to do this, as DataFeedWatch now automatically truncates titles to 70 characters.

  • http://www.webprofits.com.au Mark

    Love it, awesome article, thanks!
