The scalability of Socialping hasn’t been looking too good lately. We’ve got hundreds of users in our beta and thousands of items being watched, collectively bringing in hundreds of thousands of tweets each and every day. These aren’t little terms that only get a few tweets an hour, either. Many of them are terms like Twitter, Facebook, Google, etc – you know, really popular things that really stress our systems. And we love you guys for it.
Part of this whole “beta” thing that we are in is to see what our limits are, when things are going to break, and how they break when they do. We’ve got more data on what was and wasn’t working, and how it was working (when it did) than we could have hoped for – and the sum of that data? We couldn’t scale to where we want to go with our current setup.
Recently we’ve had to make a lot of changes to our systems to handle the load. Those changes included rolling out Redis (a NoSQL key/value store and (sort of) database) and migrating from MySQL where it made sense. Other than one subsystem that gets used in our reporting systems, we’ve pretty much completed this migration and our load has been reduced considerably.
Sorry, I’m about to get all geeky, but if you have the time, take a look at Redis’ new “hash” types. They are perfect for some of the stuff we do. We also make heavy use of the Sorted Sets (zsets) and the regular Set types. I want to commend Salvatore Sanfilippo (@antirez) for all of his hard work on Redis. It’s truly one of the best key/value databases available.
So, back to the point of this post: lately we’ve been forced to scale, scale, scale. Now that we’ve done enough to give us some headroom, we’ve put together a plan that’ll hopefully let us scale to the point that we can open things up to the rest of the beta users who are still waiting to get in, and eventually to the public as a whole.
If you care what those changes will be: we’re planning on continuing to use Redis for the long term, both as a cache and as a data store, and adding a cluster of servers that will use Cassandra for storing our tweets. While we’re still evaluating search options (spoiler alert!), ElasticSearch will likely be used to let you guys better search through your tweets. We’ll still use MySQL for managing some of the account stuff, like your watchlist. I know I am glossing over a lot of details, but this is the main stack we intend to use. As more details are finalized and we start rolling these out, expect updates on how it’s going.
Thank you to all of the beta testers in there breaking things, and thank you for your patience as we work through our growing pains. For those of you still waiting to get in, we’re trying our best to get you in as soon as possible. Thank you for your patience as well.
P.S. Oh, and Salvatore, if you read this, when Redis Cluster is ready for testing, we want to be part of it. We are definitely going to need it.









Watchlists: How To Get What You Want
We pride ourselves on the many diverse notification options we offer, as well as the speed at which we can get them to you (normally within seconds). But sometimes what you’re watching for is more complex than just a single keyword or phrase. Sometimes you need to use boolean matching, or you want to match on partial words. Today we’re formally announcing support for both!
AND Support
In tweets and in emails to our users, we’ve long mentioned using “AND” in between keywords to make words match, but I don’t think it’s ever been formally written anywhere that you can do this. So some of the information below may already be known to you; if so, please consider it a refresher.
Before I go any further, I’d first like to explain how watchlist items work. By default, watchlist items are phrases, meaning if you’re watching for “new car” we’re only going to match tweets that have “new car” somewhere in them. Unfortunately, that means a tweet that said “I got a car, and it’s new” wouldn’t match (sorry about the bad example!). To get around this, we introduced AND support a while back. It lets you watch for “new AND car”, and you’d get the tweet with my bad example text in it. It doesn’t matter what order the terms are in, either.
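For the geeks, here is a rough Python sketch of the difference between default phrase matching and AND matching. It is only an illustration and not our actual matcher: the function names are made up, and it assumes matching is case-insensitive and that AND is recognized as a whole word in any case.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_phrase(phrase, tweet):
    """True if the words of `phrase` appear next to each other, in order."""
    words, tokens = tokenize(phrase), tokenize(tweet)
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))

def matches_item(item, tweet):
    """Split the item on AND (assumed case-insensitive); every part must match."""
    parts = re.split(r"\s+and\s+", item, flags=re.IGNORECASE)
    return all(matches_phrase(part, tweet) for part in parts)

tweet = "I got a car, and it's new"
print(matches_item("new car", tweet))      # False: the words are not adjacent
print(matches_item("new AND car", tweet))  # True: both words appear somewhere
print(matches_item("car AND new", tweet))  # True: order does not matter
```

Each side of the AND is still treated as a phrase in this sketch, so something like “new car AND red” would need “new car” together plus “red” anywhere in the tweet.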
Need to use “and” in your phrase? Use +AND. For example:
Garfield +and Friends
Partial Matches
Recently we’ve been letting people into our beta program left and right, and since it’s a beta program and we’re trying to figure out how people are actually using Socialping, we’ve been spying on the terms they use. Because of this, and with the increasingly large number of users, we noticed a trend forming: people want to be able to match things that change. The most common of these are URLs, like 4sq.com and bit.ly, where the beginning is the same but the ending changes.
Prior to today, our matching system didn’t allow partial matches. There were multiple reasons for this, but the biggest is that partial matching almost never gave people the results they wanted when we supported it. Yes, when we originally launched, we did support partial matches, but after seeing a number of the very early users struggle with it, we opted to remove it.
So, now that we’ve added it back, sort of, how exactly does partial matching work? Using our short URL example above, 4sq.com (Foursquare), say you wanted to get every tweet that mentioned “mayor” and had a 4sq.com URL. You would add this as your watchlist item:
mayor AND http://4sq.com/*
What if Foursquare ever got more complex and decided to use sub-domains? No problem, just use this:
mayor AND http://*.4sq.com/*
There are some caveats to partial matching…
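To make the wildcard idea concrete, here is a minimal Python sketch of how the two watchlist items above could be evaluated. Treat it as a guess at the behavior rather than a description of our matcher: the function names are invented, it assumes “*” stands for any run of non-space characters, it assumes each term is a single word, and the m.4sq.com sub-domain is a made-up example.

```python
import re

def term_to_regex(term):
    """Build a regex from a term; '*' is assumed to mean any non-space run."""
    pieces = [re.escape(piece) for piece in term.split("*")]
    return re.compile(r"\S*".join(pieces), re.IGNORECASE)

def matches_term(term, tweet):
    """True if some whitespace-separated token of the tweet matches the whole term."""
    pattern = term_to_regex(term)
    # Surrounding punctuation is stripped so "mayor!" still counts as "mayor".
    tokens = (tok.strip(".,!?;:()\"'") for tok in tweet.split())
    return any(pattern.fullmatch(tok) for tok in tokens)

def matches_wildcard_item(item, tweet):
    """Every AND-separated term (single words only in this sketch) must match."""
    parts = re.split(r"\s+and\s+", item, flags=re.IGNORECASE)
    return all(matches_term(part.strip(), tweet) for part in parts)

plain = "I'm the mayor of Joe's Diner! http://4sq.com/abc123"
sub = "New mayor here http://m.4sq.com/xyz"  # hypothetical sub-domain URL

print(matches_wildcard_item("mayor AND http://4sq.com/*", plain))    # True
print(matches_wildcard_item("mayor AND http://*.4sq.com/*", sub))    # True
print(matches_wildcard_item("mayor AND http://*.4sq.com/*", plain))  # False
```

In this sketch the sub-domain pattern does not match a URL without a sub-domain, because the literal “.” before 4sq.com has to appear in the URL. Whether the real matcher behaves the same way is not something this example settles.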
Between these two features, we’re pretty sure you can match anything in a tweet. But if you’re still unable to find a way to match something, let us know. Maybe it’ll be our next feature announcement.
Got any comments or suggestions about this feature? Let us know in the comments below, or on our Get Satisfaction site.