Pseudonymization and you – optimizing data protection

Reducing the risk of data leaks is important, but it doesn't have to be challenging. Pseudonymization is a great first step in the right direction.



Everyone hated the privacy policy email armageddon, businesses included. Not because some of their emails were going straight to spam, but because a lot of businesses had to take a second look at their security measures. Revamping security measures can suck – but losing a ton of personal information is the absolute worst.

What can we do?

Businesses can reduce their risk of having information stolen by implementing the right practices for data protection – and some are pretty simple to get behind. One method we’re pretty fond of is pseudonymization, which is the fancy way of saying sensitive data camouflage. Pseudonymization replaces identifying information in a data record with fake identifiers (pseudonyms), which makes it difficult to trace any given data point back to a person. If you think about it, it’s kind of like that fake Myspace you had in 2007.

Sounds complicated, why do this?

The great thing about pseudonymization is that it’s the same data just under an assumed name, or in this instance, a very long string of characters. While no one protection is enough on its own, combining it with other practices like encryption, hashing, or tokenization helps reduce the risk of re-identification. Applying pseudonymization to your data is relatively simple, and there is more than one way to accomplish it. In this example, we’ll be looking at the Logstash Fingerprint filter plugin, but you can also try a generic file script using a Ruby filter plugin if this doesn’t work out for you. Both methods will mask the username and IP fields, so keep that in mind!
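If you'd like to see the "very long string of characters" idea outside of Logstash first, here is a minimal Python sketch. It is not the Fingerprint plugin itself, just a keyed SHA-256 hash (HMAC) applied to the same two fields. The key value, the function names, and the `username`/`ip` field names are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only; a real key would live in a secret store.
SECRET_KEY = b"change-me"


def pseudonymize(value, key=SECRET_KEY):
    """Return a keyed SHA-256 digest (HMAC) of the value as a 64-character hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_event(event, fields=("username", "ip")):
    """Return a copy of the event with the chosen fields replaced by pseudonyms."""
    masked = dict(event)  # shallow copy so the original event is left untouched
    for field in fields:
        if field in masked:
            masked[field] = pseudonymize(str(masked[field]))
    return masked


# Hypothetical sample event; only username and ip are replaced.
event = {"username": "alice", "ip": "10.0.0.1", "action": "login"}
masked_event = mask_event(event)
```

Because the hash is keyed and deterministic, the same username always maps to the same pseudonym, so you can still count or group events per user without seeing who the user is.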

Implementing Pseudonymization

Before we get started, grab some Mountain Dew because nothing makes you feel more like a computer mastermind than questionable soda choices. Once you’ve cracked open that cold one, download the files in the repository to a local directory. Here is some code from GitHub that makes it easier to download the files individually:

Check that the directory with the downloaded files is shared with Docker. Then go into the directory and execute the following command:

Look for the log line below. It tells you Logstash has started and can now accept data.

Now take a sip, babes. We’re almost there.

For the Fingerprint filter plugin, execute the following command:

Ta-da! You can now inspect and use the data! The pseudonymized information will be indexed to an events index which you can access through the following query:

It should look a little something like this:

Cool, right? But wait, there’s more! The key-value pair lookups are in an identities index, which you can access with this query:

If you’re wondering what that looks like, take a peek below: 

You should always have 200 documents in the identities index no matter how many times you index the data. That's because there is one document for each unique value of the masked fields, which in this case are the username and IP address. Need to reidentify a value? You can look it up by ID in the identities index. ICYMI – this is what a pseudonymized value looks like:


If you need the original value, you can get it with this command:
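To make the lookup idea concrete, here is a minimal Python sketch (not the Elasticsearch command from this walkthrough). It assumes the pseudonym doubles as the document ID, and it uses a plain dictionary as a stand-in for the identities index. The key and the function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key for illustration only.
SECRET_KEY = b"change-me"

# Stand-in for the identities index: pseudonym (document ID) -> original value.
identities = {}


def record(value, key=SECRET_KEY):
    """Pseudonymize a value and store the pseudonym -> original mapping."""
    pseudonym = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Writing under the same ID overwrites the entry instead of adding a duplicate.
    identities[pseudonym] = value
    return pseudonym


def reidentify(pseudonym):
    """Look up the original value by pseudonym; returns None if it is unknown."""
    return identities.get(pseudonym)


# Index the same two values three times over.
for _ in range(3):
    ids = [record(v) for v in ("alice", "10.0.0.1")]

print(len(identities))     # 2
print(reidentify(ids[0]))  # alice
```

Since the same input always produces the same ID, re-indexing the data never grows the lookup table, which is the behavior described above for the identities index.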


BAM! Pseudonymization! It’s like the witness protection program for data – we’re big fans. All of that pseudonymized data makes it difficult for bad actors to do anything with it, even if they’re good at what they do. With a solid data retention policy, your risk of theft can be drastically minimized, and who doesn’t love that? If you’re curious about Mailgun’s data processing, check out our website! We get real technical with email, real fast.
