Weekly product update: 50% better database performance and moving to Mailgun.com
After a few weeks of rolling out frequently requested webhook-related features, we worked on some less visible things to keep Mailgun new and shiny for you. Specifically, we decreased the number of queries per second on our main MongoDB instance by 50%, which improves Mailgun’s performance and makes it easier to scale as we grow. We’re now operating at only 10% of our maximum capacity and can comfortably handle giant traffic spikes, which we’re seeing more and more often, without a noticeable increase in latency. We also redirected all the vestiges of mailgun.net on our documentation and control panel to mailgun.com, which will speed up future development.
Redis caching leads to over 50% reduction in queries per second to main Mongo database
This week we worked to make our main application databases more performant. When customers send and receive email through Mailgun, our main MongoDB is constantly being queried. Databases can’t scale infinitely though, so we are always looking to improve performance and do more with less.
When we examined the queries coming into MongoDB, we realized that if we used Redis to cache one particular value (Redis is great for caching), we’d be able to dramatically reduce the number of queries. Take a look at this graph of the period directly before and after caching was implemented. We were able to decrease the average number of queries per second from 13,000-15,000 (with peaks of 20,000 or more) to 4,000-6,000. That means we can more easily handle the huge spikes in emails sent by our beloved customers.
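To make the idea concrete, here is a minimal cache-aside sketch in Python. It is not Mailgun’s actual code: the post doesn’t say which value was cached, so the key, the 60-second expiry and the `FakeMongo` and `FakeRedis` classes are all hypothetical stand-ins that run without any server. The point is only to show why a cache in front of the database cuts the query count.

```python
import time


class FakeMongo:
    """Hypothetical stand-in for the main database; counts queries."""

    def __init__(self, docs):
        self.docs = docs
        self.queries = 0

    def find_value(self, key):
        self.queries += 1
        return self.docs.get(key)


class FakeRedis:
    """Hypothetical stand-in for Redis: get/set with a time-to-live."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired entries are dropped so the next read goes to the database.
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def lookup(key, cache, db, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database."""
    value = cache.get(key)
    if value is not None:
        return value
    value = db.find_value(key)
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value


db = FakeMongo({"example.com": "settings-for-example.com"})
cache = FakeRedis()
results = [lookup("example.com", cache, db) for _ in range(1000)]
print(db.queries)  # prints 1: only the first lookup reaches the database
```

The trade-off in a setup like this is staleness: a cached value can lag behind the database for up to the expiry time, so it suits values that are read constantly and change rarely.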
It’s a little-known fact (mostly because no one ever asked!) that when Mailgun launched, mailgun.com was not available. So we started with mailgun.net as our main domain and only later bought the .com domain too. Operating two different domains (mailgun.com for our main website, and mailgun.net for our documentation, control panel and API) has been a real pain and was confusing to customers. Simple tasks, like implementing Google Analytics to track product usage, have been overly complicated, requiring extra code to track the most basic actions in our product. This makes it hard to understand what customers like about Mailgun and what they don’t. And most importantly, it takes engineering time away from all the things customers are asking for. Let this be a lesson to all you startups out there: pick one domain and stick with it! Otherwise you are going to be constantly battling to understand what your users are doing across your application. Cross-domain tracking is never as easy as anyone would have you believe.
When we started to look at this migration, this is what we wanted the code to look like for the 301 redirect of all the .net pages to the equivalent .com page (we needed a 301 redirect so that Google and other search engines would pass along the link equity built up over the years; it’s an SEO thing):
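The snippet itself is not reproduced here. Purely as an illustration of the intent, and not Mailgun’s actual server config, the mapping can be sketched in Python: swap the .net host for the .com one, keep the path and query string, and answer with a permanent 301. The function name `redirect_for` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_DOMAIN = "mailgun.net"
NEW_DOMAIN = "mailgun.com"


def redirect_for(url):
    """Return (301, new_url) for URLs on the old domain, else None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    on_old_domain = host == OLD_DOMAIN or host.endswith("." + OLD_DOMAIN)
    if not on_old_domain:
        return None
    # Keep any subdomain prefix and replace only the trailing domain.
    new_host = host[: -len(OLD_DOMAIN)] + NEW_DOMAIN
    if parts.port:
        new_host += f":{parts.port}"
    new_url = urlunsplit(
        (parts.scheme, new_host, parts.path, parts.query, parts.fragment)
    )
    return 301, new_url


print(redirect_for("https://mailgun.net/docs?page=2"))
print(redirect_for("https://example.org/docs"))
```

Preserving the path and query string is what lets every old page land on its exact counterpart, rather than dumping all visitors on the new home page.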
However, because of the way Mailgun was structured, there were a number of edge cases to handle and things got complicated. Rather than hack it together and move on to the next thing, we decided to restructure the way that users interact with Mailgun via our website and API, reducing the number of moving parts and decreasing the likelihood of introducing some bug that we hadn’t thought about. But we didn’t want users to have to do anything themselves, not even change their bookmarks, let alone their code.
Cloud Load Balancers to the rescue
Since we joined Rackspace back in August 2012, we’ve gotten a lot of really cool toys to play with including the Dell R720s that run our main API & SMTP processes (this beast of a server deserves a whole blog post on its own and we hope to write that soon). We’ve also played around with the Rackspace Cloud and decided to split Mailgun into two parts as part of our refactoring.
Front-end processes like the website, control panel and documentation running on Rackspace Cloud Servers, behind a Rackspace Cloud Load Balancer to make scaling very easy. Adding new nodes is just a few simple API calls.
Core back-end processes like API and SMTP running on Dell R720s behind an F5 load balancer (this summer we’re hoping to visit the Rackspace datacenter where these machines live, to bow down before their sheer power).
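As a toy model of what adding a node to the front-end pool means, here is a small round-robin sketch in Python. It does not call the Rackspace API; the `Pool` class and the node names are hypothetical, and real load balancers offer other balancing algorithms too.

```python
from collections import Counter


class Pool:
    """Hypothetical in-memory model of a load-balanced pool of nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._counter = 0

    def add_node(self, address):
        # Scaling out: the new node starts receiving traffic right away.
        self.nodes.append(address)

    def next_node(self):
        # Round robin: hand requests to each node in turn.
        node = self.nodes[self._counter % len(self.nodes)]
        self._counter += 1
        return node


pool = Pool(["web-1", "web-2"])
before = Counter(pool.next_node() for _ in range(6))
pool.add_node("web-3")
after = Counter(pool.next_node() for _ in range(6))
print(before)
print(after)
```

With two nodes each one takes half of the requests; once the third is added, each takes a third, with no change needed on the client side.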
Structuring Mailgun this way let us shorten our nginx configs from 393 to 261 lines, a cut of about a third. Not a bad improvement. We’ll miss you, mailgun.net, but we’ll still see you when using the API, which is staying at
That’s it for this week folks.
Stay tuned next week when we talk about some more of the things we’ve been working on.