John Quinn

New Digg API Features and More Developer Control

Hi all,

Today, we’ve introduced some exciting new additions to the Digg API, so that developers can create increasingly useful and interesting applications using Digg data.

One of the most requested features has been for a search endpoint and we’re excited to make it available in this release. This feature utilizes the same search functionality introduced a couple months back with the overhaul of Digg search and provides a powerful solution for finding specific content on Digg. You’ll be able to use the advanced shortcuts, common search tricks, as well as search by source (domain). See the Search API documentation for more details.

We’ve also added a series of related stories endpoints, so that you can find related information for any story on Digg. One variation finds stories similar diggers have dugg. Another returns stories with similar keywords. Finally, we’ve included an endpoint for favorites on Digg, which are an indication of the stories that people find exceptional in some manner.

These features are just the beginning of some changes that we’re currently working on and plan to introduce in the coming months. These will include endpoints for participating (such as Digging and burying), so that even richer, full-featured applications can be created on the Digg platform.

Finally, a significant but subtle change we want to announce is a reworking of our API license. We want to give developers more control over the applications they develop by removing most of the commercial limitations. Developers should now have the confidence that they can benefit from the works they create using the Digg API, with full ownership free of fees!

We hope you enjoy these changes and, as always, welcome your feedback.

Cheers,

John

Micah Snyder

DUI.Stream and MXHR

Hola,

Although we usually wait until a feature is fully baked before showing off the technology behind it, we’re just too excited about this new project to keep it to ourselves. Digg’s Front End Architecture team has been experimenting a lot with performance improvements, and we’re just about done with one that we think will change the way you get data from high-traffic websites. First though, a bit of background:

One of the ways that high-performance websites like Yahoo suggest speeding up load times is by reducing the number of HTTP requests per page. We started thinking about what we could do to reduce HTTP overhead, and where we could get the biggest benefits from it. Well, one thing led to another and the next thing we knew we were talking about writing a generalized framework for bundling files, sending them through a single request, then separating them for use once they head down the pipe.

We call this technique MXHR (short for Multipart XMLHttpRequests), and we wrote an addition to our Digg User Interface library called DUI.Stream to implement it. Specifically, DUI.Stream opens and reads multipart HTTP responses piece-by-piece through an XHR, passing each chunk to a JavaScript handler as it loads.

Why do this? Well, DUI.Stream will allow developers to drastically improve the speed of uncached page loads by bundling most of their resources into a single HTTP request, with a single time-to-first-byte and no request throttling by the user agent. Additionally, the size of the response has no effect on the rendering time of each chunk, as the client handles each piece of the response on the fly and can inject it into the DOM for rendering immediately, in the exact order you specify. On a high traffic, high-activity site like Digg, we have to display incredible amounts of data on each permalink — typically hundreds of user images within the first 50 comment threads on a page alone, not to mention the UI chrome and actual comment data. (You can see this for yourself: notice the number of HTTP requests that queue up when you expand a page of comments). So our primary use case for DUI.Stream is turning that first long, arduous page load on an empty cache into something nearly indistinguishable from a page of data with fully cached resources.

Let’s talk a bit about the architectural benefits of implementing MXHRs with DUI.Stream. Back when the web was based largely on a page metaphor (i.e., one central document with external references), whenever you loaded the page, the page requested its images, stylesheets, etc., and then you were done. These days you’re just as often loading an application; the page progressively enhances into a stateful UI by loading extra stylesheets, scripts and a whole mess of UI chrome after the initial request. Yet we’re still using the old model flow of get markup –> render markup –> request external resources –> load and display externals.

Take our modal login dialog box for example. In order to reduce requests we bundle its JavaScript in with the rest of the page, we put its CSS up in the header with the rest of the styles, then we request only the markup for the dialog box, render it, and let it fire its own HTTP requests for the images that make up its chrome. In this broken model, HTTP connections and rendering behaviors split our UI architecture up into different parts of the page that all render at different times at the browser’s discretion. Even if we put everything into one cohesive structure and loaded the CSS link, script tag and markup together, they’d still all fire their own HTTP requests and the images would still come in afterwards on the first page load. This just won’t do.

Now, let’s rethink how our login dialog could work using DUI.Stream. We can request a Stream that contains everything needed to render and use the dialog box. As each part comes in, it gets passed through to be built, and renders immediately with no image backfill or delayed JS behavior. The DUI.Stream framework can then pass those resources back into cacheable elements for our next page load, which can happily 304 its way quickly through the rendering process. Pretty sweet, right? Right.

Before we get into actual code, I have to stress that the DUI.Stream client, with all its aforementioned badassitude, is an alpha release: it’s better than a proof of concept, but it’s definitely not ready for prime time. Think of it like Miles and Hurley’s time travel discussion on Lost: there are still a few plot holes, but what we’ve seen so far is pretty cool. With that in mind, let’s save the talk about future features for a bit later and get into what DUI.Stream does right now.

Here’s a TLDR example:

var s = new DUI.Stream();
 
s.listen('text/html', function(payload) {
    $('#stream').append(payload);
});
 
s.listen('complete', function() {
    alert("D. Plainview: I'm finished!");
});
 
s.load('testStreamData.php');

DUI.Stream works on a pretty simple mechanism: create a DUI.Stream, register listeners for each mimetype you plan to send (as well as an optional onComplete), then send a request to a URL that implements an MXHR responder. I’ll leave the details of DUI.Stream’s inner workings to those of you who want to look through the source. As always, our code is released under an open-source license, so if you find any fugly parts, feel free to let us know or send us a patch.
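To make the listener mechanism a bit more concrete, here is a rough sketch in Python (not the DUI.Stream JavaScript itself) of the dispatch step: split a multipart body on a boundary and hand each chunk to the handlers registered for its mimetype. The boundary string, the single Content-Type header framing, and the names MultipartDispatcher, listen and feed are all assumptions made up for illustration; the real wire format is whatever the DUI.Stream source defines.

```python
class MultipartDispatcher:
    """Toy model of the listen/load flow: route chunks by mimetype."""

    def __init__(self, boundary):
        self.boundary = boundary   # hypothetical delimiter between chunks
        self.listeners = {}        # mimetype -> list of handlers

    def listen(self, mimetype, handler):
        self.listeners.setdefault(mimetype, []).append(handler)

    def feed(self, body):
        """Split a complete body into chunks and dispatch each one in order."""
        for part in body.split(self.boundary):
            part = part.strip("\r\n")
            if not part:
                continue
            # Assumed framing: one "Content-Type: x" line, a blank line, the payload.
            header, _, payload = part.partition("\n\n")
            mimetype = header.split(":", 1)[1].strip()
            for handler in self.listeners.get(mimetype, []):
                handler(payload)
        # Fire the optional completion handlers once every chunk is handled.
        for handler in self.listeners.get("complete", []):
            handler()


received = []
stream = MultipartDispatcher("--chunk--")
stream.listen("text/html", received.append)
stream.listen("complete", lambda: received.append("done"))
stream.feed(
    "--chunk--\nContent-Type: text/html\n\n<p>one</p>\n"
    "--chunk--\nContent-Type: text/html\n\n<p>two</p>\n"
)
print(received)
```

Unlike this sketch, which waits for the whole body, the real client reads the response piece by piece as it loads; the routing idea is the same.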

You can see the demo code in action, and I’ve included some rough timing numbers to demonstrate just how much faster DUI.Stream is than a regular uncached page load. YMMV with the demo, so don’t be surprised if it falls over in your browser or occasionally reports wacky numbers. Check out the roadmap below for cross-browser info. Also, we have a demo that handles images; it performs atrociously well in Firefox and Safari, but at present IE doesn’t play nice with it (shocking, I know). You can test drive it at your own risk.

If you’re curious about responders, we’ve thrown together a few demos in Python, Ruby, Perl and Java in the DUI.Stream GitHub repository. We’ll be releasing a full set of documentation once we hammer out the rest of the DUI.Stream client. Rest assured you won’t have to implement anything based on a blog post and some demos.
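For a feel of what a responder has to do, here is a minimal Python sketch that bundles several resources into one response body. It is not one of the repository demos: the function name build_mxhr_body, the boundary string, the header framing and the choice to base64-encode binary payloads are all assumptions for illustration only.

```python
import base64


def build_mxhr_body(parts, boundary="--chunk--"):
    """Join (mimetype, payload) pairs into one multipart-style body.

    Assumed framing: boundary line, Content-Type header, blank line, payload.
    Binary payloads (bytes) are base64-encoded so the body stays plain text.
    """
    chunks = []
    for mimetype, payload in parts:
        if isinstance(payload, bytes):
            payload = base64.b64encode(payload).decode("ascii")
        chunks.append(f"{boundary}\nContent-Type: {mimetype}\n\n{payload}\n")
    return "".join(chunks)


body = build_mxhr_body([
    ("text/html", "<p>comment 1</p>"),
    ("image/gif", b"GIF89a"),
])
print(body)
```

The order of the list is the order in which the client would see the chunks, which is what lets a page control what renders first.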

Here are the things still on our short-term roadmap:

  • Cache detection. If you wanted to implement DUI.Stream right now, you’d have to switch between regular requests and MXHRs on your own. We don’t recommend that.
  • Background caching. A big part of implementing this is going to be loading the MXHR through DUI.Stream, then turning any cacheable chunks into their normal, cache-friendly tags.
  • Support for multiple headers per chunk. Specifically, we’ll be adding a set of custom headers to allow for UI-specific information to be sent, like CSS selectors for greater control in handlers.
  • XMLHttpRequest.multipart support. Right now we’re not using this flag, since we aren’t keeping connections open for a server-push implementation yet. Like I said, it’s an alpha ;)
  • IE7 and IE8 object tag workarounds. In the current build, bouncing binary streams into object tags in IE has some interesting results.

Anyways, I hope you like what you’ve seen so far. Several of my teammates have been busting their asses to get this ready to show off — specifically Jordan Alperin, Arsenio Santos, Chris Goffinet, and Arin Sarkissian — and we’re all pretty excited about it. As always, if you have any questions, comments or thoughts, let us know!

Cheers,
- Micah

John Quinn

DiggBar Changes Live Today

Hi all,

As promised last week, we pushed a few key changes to the DiggBar today. If you are logged out, you will no longer see the DiggBar by default. If you miss the DiggBar, we encourage you to log in, as we believe it creates a much more seamless Digging experience.

You can continue to use Digg to create short URLs from anywhere on the web by simply typing http://digg.com before any URL or using our bookmarklet.

If you are logged in and have the DiggBar enabled, you may notice a few additional changes. We shortened the height of the DiggBar to make it lighter and more compact. We also temporarily removed the ‘view count’ number, as the data no longer represents the global views for that content.

Stay tuned for more enhancements. As always, let us know what you think, as we appreciate all your feedback!

John

Kurt Wilms

PWDTAD: Item-Based Collaborative Filtering

You may have noticed the “People who Dugg this also Dugg” feature on the Digg story list pages as an interesting new way to sift through all the available items on Digg and discover new content.

We recently rolled out a few subtle changes to the placement of this feature, along with Related by Keyword. Many users wanted to see the comments directly under the story summary, so we are exploring different locations of these features based on what’s more relevant to you.

So how are the items that appear in the “People who Dugg this also Dugg” section chosen? The short answer is that it is based on a new item-item collaborative filtering algorithm recently developed at Digg.

The Digg Recommendation Engine predicts what items on Digg you might like by finding the users most similar to you, and then using their digging history in order to make predictions for you. There are various tweaks to the algorithm, but that is the basic model. The “People who Dugg this also Dugg” feature explores the relationships between items on Digg rather than the relationship between users. Instead of computing correlations between people, here we compute correlations between pairs of items on Digg. Computer scientists call this type of collaborative filtering ‘item-item’ because the algorithm is focusing on items rather than users. When we want to show “People who Dugg this also Dugg” items for a target item, we look at all the other items on Digg and select those that are most similar to the target item. Again, there are various tweaks to the algorithm, but that is the basic model.

The critical step in calculating the “People who Dugg this also Dugg” items is to compute the similarity between items. The basic idea in similarity computation between two items is to first isolate the users who have dugg the two items you are comparing and then to apply a similarity computation technique to determine their similarity. There are various strategies for determining this value. For those interested I would suggest this paper for a good overview. We’ve chosen to use an implementation of Jaccard coefficients in our calculation.
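As a small illustration of the similarity step, here is a sketch of a plain Jaccard coefficient in Python: the number of users who dugg both items divided by the number who dugg either. The function name and the sample users are made up, and the production calculation includes tweaks that are not described here.

```python
def jaccard(diggers_a, diggers_b):
    """Jaccard coefficient of two collections of user ids: |A & B| / |A | B|."""
    a, b = set(diggers_a), set(diggers_b)
    union = a | b
    if not union:
        return 0.0          # two items with no diggs share nothing
    return len(a & b) / len(union)


story_x = {"ann", "bob", "cho", "dee"}
story_y = {"bob", "cho", "eve"}
print(jaccard(story_x, story_y))  # 2 shared diggers / 5 distinct diggers = 0.4
```

A score of 1.0 means exactly the same people dugg both items, and 0.0 means no overlap at all.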

Launching a recommender on Digg comes with a unique set of challenges, many of which have to do with the tens of thousands of items submitted daily and the incredible volume and rate of diggs occurring on those items. As you can imagine, computing similarity for all these items on Digg fast enough to produce relevant recommendations can be a tricky problem to solve! Our recommender, written in Python, computes similarity in real time as digging activity occurs. When an item on Digg receives at least five diggs, we compute similarity scores, select the most similar items, and store the results in a special data structure. As digging activity continues, the similarities are recalculated and the most similar items are reselected. At any given point in time, the recommender is calculating similarities for tens of thousands of items on Digg.
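The selection step can be sketched roughly as follows. This is a brute-force toy, not the real-time recommender: only the five-digg threshold comes from the description above, while TOP_N, the function names, and the use of a plain dictionary in place of the special data structure are assumptions.

```python
import heapq

MIN_DIGGS = 5   # threshold mentioned in the post
TOP_N = 3       # hypothetical; the post does not say how many items are kept


def similarity(a, b):
    """Jaccard coefficient of two sets of user ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target, diggers_by_item, top_n=TOP_N):
    """Return up to top_n (score, item) pairs for target, best first."""
    target_diggers = diggers_by_item[target]
    if len(target_diggers) < MIN_DIGGS:
        return []   # too little signal so far
    scored = (
        (similarity(target_diggers, diggers), item)
        for item, diggers in diggers_by_item.items()
        if item != target
    )
    # Keep only items with some overlap, then take the highest scores.
    return heapq.nlargest(top_n, (pair for pair in scored if pair[0] > 0))


diggs = {
    "story-a": {1, 2, 3, 4, 5},
    "story-b": {1, 2, 3, 4, 9},
    "story-c": {5, 6, 7},
    "story-d": {8},
}
print(most_similar("story-a", diggs))
print(most_similar("story-d", diggs))  # fewer than five diggs, so nothing yet
```

Rerunning most_similar whenever an item picks up new diggs mirrors the "recalculate and reselect" loop, though doing that cheaply at Digg's volume is exactly the hard part.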

Of course, even with some of the technical challenges solved, our work is far from finished. We’ll keep on studying how our users interact with the related stories and we’ll continue thinking of innovative ways to use the data we have to improve the Digg experience. Since this is the beta version of the feature we want to know what you think. Please contact us with your thoughts and suggestions.

-Kurt

Joe Stump

Introducing Digg’s IDDB Infrastructure

I’ve been hinting at a mysterious piece of infrastructure, called IDDB, for some time in various talks, blog posts and mailing list posts. Today, with the introduction of Digg’s DiggBar and URL minifying service, IDDB is finally powering a significant feature. I say that because it’s actually been quietly powering user IM’s and links for about a month with no issues (this is why that feature temporarily disappeared a few weeks ago only to reappear after a major data migration).

So what, exactly, is IDDB? IDDB is a way to partition both indexes (e.g. integer sequences and unique character indexes) and actual tables across multiple storage servers (MySQL and MemcacheDB are currently supported with more to follow). It started out as a harebrained idea that a few guys from operations and I would chat about during lunch. After a few lunches Ron and I decided to put the marker to the whiteboard and hammer out the details. A few afternoons later we had the basics whiteboarded and I’d prototyped a working package.

From there Ian and the Core Infrastructure team took over implementing and refining IDDB. There were tools to be written for operations, features to be fleshed out and tests to be written (In fact, 41% of IDDB’s code is its unit test suite). In the end some of the features that were cooked up for IDDB include:

  • Ability to partition data across multiple storage nodes using arbitrary node assignment. The hashing framework allows node assignment to be pluggable (e.g., we could create hashing algorithms that hash ID’s by colocation facility, geographical location of the user, racks, etc.). In addition to this, we allow for multiple types of storage nodes.
  • Ability to break IDDB indexes, which store ID and location meta data, across up to 16 machines.
  • Each ID maps to N storage nodes and has an individual status on each node. All data, by default, is written to three storage nodes. MySQL and MemcacheDB both support replication so their slaves are added to each ID as a read-only node.
  • Each ID type can be served from its own cluster of servers. This means that the user ID sequence can be on a totally different cluster than the user email unique character index.
  • IDDB utilizes Gearman to move data around. For instance, we can take a user who’s consuming a lot of resources and migrate them to a less loaded set of storage nodes. We also use this to find ID’s that are in a state of error to fix them (e.g. A user ID requires 2 copies, but there’s only 1 so we lock it and make another copy).
  • Storage nodes can be arbitrarily added to the pool at any time and new ID’s will instantly start mapping to them. Additionally, we can set a storage node’s status to “full”, which means it can no longer accept new ID’s, but will continue to accept new data for existing ID’s. When combined with our migration tools we can elastically grow the storage clusters out and rebalance ID’s across them as needed. This also means we can keep heterogeneous hardware operating in unison (e.g. Older machines can hold 10,000 users, but new ones can hold 25,000).
  • All meta data about ID’s, which is pretty much entirely static, is kept in Memcached to reduce reads from the index cluster.
  • Ability to track the number of queries being run against a storage node or against a single ID.
  • An entire suite of CLI utilities to manage the clusters.
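A few of the ideas in this list (pluggable node assignment, several copies per ID, and "full" nodes that keep their existing ID's but take no new ones) can be sketched in a handful of lines of Python. This is only an illustration and not IDDB's code, which is PHP: the class names, the SHA-256 based strategy and the status strings are assumptions.

```python
import hashlib


class Node:
    def __init__(self, name, status="live"):
        self.name = name
        self.status = status   # "live" accepts new IDs, "full" does not


def hash_strategy(id_value, candidates, copies):
    """Example pluggable strategy: pick consecutive nodes starting at a hash."""
    digest = int(hashlib.sha256(str(id_value).encode()).hexdigest(), 16)
    start = digest % len(candidates)
    return [candidates[(start + i) % len(candidates)] for i in range(copies)]


class Index:
    """Maps each ID to the storage nodes that hold its data."""

    def __init__(self, nodes, strategy=hash_strategy, copies=3):
        self.nodes = nodes
        self.strategy = strategy
        self.copies = copies
        self.locations = {}    # id -> list of node names

    def create(self, id_value):
        candidates = [n for n in self.nodes if n.status == "live"]
        if not candidates:
            raise RuntimeError("no storage node accepts new IDs")
        copies = min(self.copies, len(candidates))
        chosen = self.strategy(id_value, candidates, copies)
        self.locations[id_value] = [n.name for n in chosen]
        return self.locations[id_value]

    def lookup(self, id_value):
        # Existing IDs keep their nodes, even after a node is marked full.
        return self.locations[id_value]


nodes = [Node(f"db{i}") for i in range(1, 5)]
index = Index(nodes)
first = index.create(1234)
nodes[0].status = "full"       # db1 stops taking new IDs
second = index.create(5678)
print(first, second)
```

Swapping hash_strategy for a function that picks nodes by rack or facility is the kind of pluggability the first bullet describes.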

What’s nice is that IDDB abstracts all of this, making it quite easy for developers to partition data without having to worry about the basics. Here’s how you’d fetch a user record and query for the user’s IM links:

<?php

require_once 'IDDB/ID.php';

// Fetch a user's IDDB record, which contains basic meta data and location data.
$id = IDDB_ID::factory('User', 1234);

// Get all of the user's data from a random node.
$links = $id->db()->getAll('SELECT * FROM UserLinks WHERE userid = ?', array($id->id));
print_r($links);

?>

Pretty simple. But what about creating a new ID? Let’s say we’re creating a new user and want to add some IM’s to their table. What would that look like?

<?php

require_once 'IDDB/ID.php';

// Creates a new unique, auto-increment, ID in IDDB.
$id = IDDB_ID::create('User'); 

// The execute() method will write this query to all of the ID's writable nodes within individual
// transactions. If it fails to write on all nodes then the exception is surfaced. If it fails on 1
// of, say, 3 writes it will mark the ID as being in an error state on that 1 node while being
// live on the other two nodes.
$id->db()->execute('INSERT INTO UserIMs SET userid = ?, service = ?, handle = ?', array(
    $id->id, $_POST['service'], $_POST['handle']
));

?>

The DiggBar and URL minifying service is powered by a 16-machine IDDB cluster, which includes 8 write masters in the index and 8 MySQL storage nodes. It’s the largest IDDB cluster Digg has pushed into production to date, but we have plans for much bigger IDDB clusters.

On a final note I really want to call out IDDB’s test suite and how putting extra effort into tests has helped us rapidly iterate over IDDB. The test suite contains 2,300 lines of code, 990 tests and a whopping 8,081 assertions! In fact, IDDB’s test suite has surfaced (and fixed!) a number of bugs in PHPUnit itself. It’s imperative that such a crucial piece of code be completely unit tested and I’m insanely impressed with Ian’s work in this area.

Thanks!

Joe

plathrop

Master of Puppets #2: speaking the language

Last time we talked about Puppet, I told you about ralsh, the tool which allows you to access Puppet’s Resource Abstraction Layer. It was a bit esoteric for many of you; I hope today’s entry will make up for it. I’m going to talk about the Puppet language today, as well as going over the development strategy I use when working on my Puppet manifests.

Let me start by giving you an overview of the language. The Puppet language is a domain-specific language (DSL) for describing system configuration. The language is declarative: you specify the configuration you desire, and it is up to Puppet to figure out how to get there. The fundamental “atom” of the Puppet language is the resource. Get used to that term, ladies and gentlemen, you’ll be hearing it a lot. A resource is a logical component of configuration: a file, a package, a user, a cron job; these are all resources which Puppet can manage natively. Puppet also gives you the ability to create defined resources, which are collections of native resources that you can parameterize and treat as a single logical resource. I’ll talk more about that later.

Puppet also has variables. Because it is a declarative language, Puppet has scoping and assignment rules that are different from other (imperative) programming languages. Many people find these rules frustrating and/or counter-intuitive, but they turn out to be less limiting in practice than you’d expect. The first thing to remember about Puppet variables is that you cannot change the value of a variable within a single scope. Since Puppet is declarative, you cannot rely on file order to represent the order in which a configuration is instantiated; changing a variable within a single scope would require file ordering to be consistent with instantiation. The second thing to remember about Puppet variables is that they are dynamically scoped, which essentially means that scope hierarchies are created based on where code is evaluated as opposed to where it is defined. The Puppet documentation has a very good example which helps illustrate this constraint in practice.

Collections of resources, defined resources, and variables can be grouped into a class. Although these classes share some of the semantics of object-oriented programming classes (inheritance, ability to over-ride certain declarations in a subclass), one should be careful to remember that Puppet is declarative, not object-oriented. One of the biggest sources of frustration I see in new Puppet users is not trying to think in line with Puppet’s model. Puppet is a somewhat opinionated tool; the faster you get used to the Puppet Way, the more benefit you will get from using it.

Collections of classes, definitions, and resources can be grouped together in a module. Modules are a special helping of Puppet’s awesome sauce, and I’ll get into them a bit more later in this article, when I walk you through the creation of a module.

Finally, Puppet allows you to define nodes, which can include classes, definitions, and resources. Nodes are similar to, but distinct from, classes; nodes can inherit other nodes, and they define a new variable scope when declared. However, as your Puppet infrastructure grows, you will probably transition away from using Puppet’s internal node declarations and use External Nodes instead (a topic I’ll cover another time).

Let’s talk about how I combine these elements when working on Digg’s Puppet manifests. As in many other programming languages, the development process in Puppet is one of iteratively creating abstractions to encapsulate lower-level details. From a top-down perspective, I tend to think of server roles: these are logically-contained units of functionality that a given node can provide. For example, a Digg webserver might have the following roles:

base node
This is the basic role that all Digg nodes fulfill. It encapsulates all the things we want to be present on every server; packages we always want, administrative scripts, ssh configuration, etc.
memcache
A server fulfilling this role has memcached running and is able to serve as part of a memcache pool.
digghttp
This role encapsulates all of the configuration we need in order for a server to be part of our web cluster.

Once I’ve come up with the high-level role I want to create, I will usually create a module for it. In my Puppet modulepath I create the minimal module structure; for example: mkdir -p memcached/manifests && touch memcached/manifests/init.pp. Since I know a high-level role like this will require config files and templates, I might create those as well: mkdir -p memcached/{manifests,files,templates} && touch memcached/manifests/init.pp. At this point I’ll often start an incremental development process. I’ll define a few resources I know I need, make sure they work, and then add a few more. This development style has caused me problems; it is easy to forget to set up proper relationships if you are adding things to your manifest as you go. Still, this works well for me and I usually catch ordering issues in later testing.

So, what do we need to configure a memcache node? Well, we need to get memcached installed, so we’ll need a package resource:

package {
  "memcached":
    ensure => installed;
}

After memcached is installed we need to configure it. This calls for a file resource. If our configuration is relatively standard across the cluster, we can use a static file. We put memcached.conf in the files/ subdirectory of our module and define a file resource:

file {
  "/etc/memcached.conf":
    source => "puppet:///memcached/memcached.conf",
    require => Package["memcached"],
}

If we’d like to parameterize this a bit more, we can use a template instead. We put memcached.conf.erb in the templates subdirectory of our module and define that file resource like so:

file {
  "/etc/memcached.conf":
    content => template("memcached/memcached.conf.erb"),
    require => Package["memcached"],
}

Puppet templates give you the full power of Ruby (they are just ERB with access to Puppet variables!). memcached.conf.erb might look like:

# AUTOGENERATED BY PUPPET
# memcached config file
# Run memcached as a daemon. This command is implied, and is not needed for the
# daemon to run. See the README.Debian that comes with this package for more
# information.
-d
# Log memcached's output to <%= memcached_log_dir %>
logfile <%= memcached_log_dir %>
# Be verbose
<% if verbose -%>
-v
<% end -%>
# Start with a cap of 64 megs of memory. It's reasonable, and the daemon default
# Note that the daemon will grow to this size, but does not start out holding this much
# memory
-m <%= memcached_memory_cap %>
# Default connection port is <%= memcached_port %>
-p <%= memcached_port %>
# Run the daemon as root. The start-memcached will default to running as root if no
# -u command is present in this config file
-u <%= memcached_user %>

I won’t go into detail about how to use ERB but, as you can see, if you define variables (perhaps in your site.pp file) like memcached_log_dir, they will be inserted into the appropriate places in the template.

To complete our memcached module, we probably want to be sure that the memcached service is running, and configured to start at boot. A service resource will do the job nicely:

service {
  "memcached":
    enable => true,
    ensure => running,
    subscribe => [ Package["memcached"], File["/etc/memcached.conf"] ],
}

In a Puppet service, “enable” defines whether the service will be started at boot, while “ensure” defines whether the service should be running or not. It is a subtle distinction, but an important one. The “subscribe” parameter tells Puppet to refresh the service when one of the referenced resources is changed; if the memcached package is upgraded, or the config file changes, Puppet will restart the service.
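To see why these relationships matter more than file order, here is a small Python sketch of the idea, not of Puppet’s internals: "require" and "subscribe" edges form a dependency graph, a valid apply order falls out of a topological sort, and a change to a subscribed-to resource marks the service for a refresh. The dictionaries and the helper services_to_refresh are hypothetical names for illustration.

```python
from graphlib import TopologicalSorter  # Python 3.9+

PACKAGE = 'Package["memcached"]'
CONFIG = 'File["/etc/memcached.conf"]'
SERVICE = 'Service["memcached"]'

# resource -> resources it must come after (from "require" and "subscribe")
depends_on = {
    PACKAGE: set(),
    CONFIG: {PACKAGE},
    SERVICE: {PACKAGE, CONFIG},
}

# A valid apply order, regardless of where each resource sits in the file.
order = list(TopologicalSorter(depends_on).static_order())
print(order)


def services_to_refresh(changed, subscriptions):
    """Return resources that subscribe to something that changed this run."""
    return sorted(r for r, watched in subscriptions.items() if watched & changed)


subscriptions = {SERVICE: {PACKAGE, CONFIG}}
print(services_to_refresh({CONFIG}, subscriptions))  # config changed: restart
print(services_to_refresh(set(), subscriptions))     # nothing changed: no restart
```

Forgetting an edge in depends_on is the sketch’s version of the missing relationship problem mentioned earlier: the sort is then free to place the resources in an order that fails.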

Putting it all together, we’ll end up with a memcached directory under our Puppet modulepath with a manifests subdirectory and a templates subdirectory. The templates subdirectory will contain the ERB file we defined above (memcached.conf.erb). The manifests subdirectory has a file, init.pp, containing the finished memcached manifest:

class memcached {
  package {
    "memcached":
      ensure => installed;
  }
  file {
    "/etc/memcached.conf":
      content => template("memcached/memcached.conf.erb"),
      require => Package["memcached"],
  }
  service {
    "memcached":
      enable => true,
      ensure => running,
      subscribe => [ Package["memcached"], File["/etc/memcached.conf"] ],
  }
}

Now, to set up a memcached node, all you have to do is add include memcached to the node definition. Puppet is smart enough to autoload classes from your modulepath and knows how to map the template path we used (memcached/memcached.conf.erb) to the actual location of the template file in modulepath/memcached/templates/. This autoloading magic is a big win and is the reason we use modules for all of our Puppet manifests.

I intentionally chose a fairly simple example this time around. Next time I hope to delve deeper into Puppet and show you how to get the real work done. See you then!

Scott Baker

Moving Digg

Most Digg users probably don’t think about the technology it takes to
keep Digg up and running. That’s a good thing. That means hopefully
the site always loads when you want to use it and the pages load fast
enough so you don’t give up and go somewhere else. That’s the
operations team’s primary goal: that Digg is always available.

That’s not to say we don’t occasionally perform maintenance that
limits your access to the features of Digg, or that there aren’t
occasional glitches in the systems. Right now, for example, Digg
is experiencing growing pains. We’re running out of space in our
current primary datacenter (the place where the servers, databases and
networks that run Digg live). So that we wouldn’t wake up one day and
just say, “Oops, gee, I guess no Digging today,” several months ago we
began the process of moving Digg to a new, larger datacenter space,
with more room for expansion.

Our goals were to find a space with the same (or better) ample power
and cooling as our current facility; to find a space with plenty of
room we could expand into as Digg grows; and to make sure the people
who ran the facility are as competent as we are used to with our
current datacenter provider, Equinix. Perhaps not surprisingly, we
found all these fulfilled in another, newly-renovated Equinix
facility! (Okay, the ops team is going to chide me for the use of the
bang (!) but I felt it warranted. : P)

Our cage in the new facility has room, power and cooling for 40
cabinets worth of servers, with easy expansion to twice that number.
The amount of power we can draw per cabinet has also increased by
roughly 50%. Also, we’ve taken the opportunity to upgrade some of our
hardware to new dual quad-core Intel-based systems that use
marginally more power but provide twice the computing power over our
current configuration.

One of the things making the move possible without huge amounts of
downtime is our effort to more easily and robustly automate the
deployment and configuration of servers (web servers, database
servers, MogileFS servers, recommendation engine servers, Solr
servers; we have a lot of different types of servers). Ron Gorodetzky
and his team of systems engineers have been working to “Puppetize” all
of our server configurations (see Paul Lathrop’s most excellent post
about Puppet here). Puppet is an open-source
configuration management tool and Paul’s blog post describes why we
use Puppet, along with some examples of how we use it.

One of the great things Puppet allowed us to do is to tell Paul
to create a new pool of recommendation engine servers in the new
datacenter. A few hours later, voila, there they are ready to use. We
change Digg’s configuration files on the webservers to look at the new
servers and the service has moved to the new DC (I get tired of
writing “datacenter”, so DC from now on). There are a few other home-
made tools we use and, hopefully, will be releasing as open-source
software in the not too distant future.

Right now, we are in the final stages of this migration. During the
next few weeks the remaining parts of Digg’s infrastructure will be
moved to the new DC and we’ll turn out the lights in the old cage.
Well, actually, they will stay on and belong to someone else in that
DC that’s been eyeing our space for some time.

My point in delving into this migration is to let Diggers know that
there may be more glitches as we get closer to completing the move.
So, you may try to go to Digg late some nights, Pacific Daylight
Time, and find us slow or not there. We’ll warn users before we
start doing these moves with a message on the home page, but,
if the site’s down and you weren’t there when we posted the message,
then you might be troubled by our absence.

Hopefully, all our plans will succeed flawlessly and you won’t notice
a thing. Digg will keep serving immaculate packets and your
queries will be returned with answers, fast and sure. But, just in
case our servers find a typo in a script before our engineers do, be
assured that we are watching and are working to get the site back to
you as soon as we can.

Joe Stump

New PECL extension libmemcached released

Hey all -

It’s no secret that Digg uses Memcached. Along with our friends at Flickr, Facebook, StumbleUpon, Yahoo!, etc. we use it to alleviate stress on our databases. For those of you who aren’t sure what Memcached is, it’s essentially a way to make RAM available over a network for caching purposes.

PHP has, however, lagged behind some of the exciting new developments in the Memcached world. CAS, asynchronous features, consistent hashing, multi-get, etc. were either missing or buggy in the PECL extension. When I asked around I found out that Facebook and Yahoo! were maintaining their own Memcached clients, for the reasons I listed above, and didn’t have plans to release them anytime soon. Well, crap. What’s a PHP coder to do?!

Luckily, Digg recently hired PHP hacker extraordinaire, Andrei Zmievski, specifically for stuff like this. So we sat down with Andrei, laid out our current problems, whiteboarded what we’d like the API to look like and estimated out a sprint.

Andrei has been hacking away on this for about two months over at Github, and today we’re happy to announce that version 0.1.0 of the new memcached PECL extension has been released publicly. Here’s a short list of the goodies this new extension supports:

  • Built entirely on top of libmemcached, which has emerged as the standard library in the community (Python and Ruby’s clients use it as well).
  • Support for Memcached’s CAS functionality
  • Support for asynchronous multi-get
  • Read-through cache and value callbacks
  • Support for Memcached’s binary protocol
  • Buffered writes
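
To make two of the items above concrete, here is a minimal Python sketch of what CAS (check-and-set) and a read-through cache mean in practice. It is an illustration only: `ToyCache`, `increment_with_cas` and `read_through` are hypothetical names, the cache is a plain in-memory dictionary rather than a Memcached server, and none of this is the API of the new PECL extension.

```python
import threading


class ToyCache:
    """In-memory stand-in for a cache server (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}            # key -> (value, version token)
        self._lock = threading.Lock()
        self._next_token = 0

    def set(self, key, value):
        """Store a value unconditionally and give it a fresh token."""
        with self._lock:
            self._next_token += 1
            self._data[key] = (value, self._next_token)

    def gets(self, key):
        """Return (value, token), or (None, None) on a miss."""
        with self._lock:
            return self._data.get(key, (None, None))

    def cas(self, key, value, token):
        """Store value only if nobody has written the key since `token`."""
        with self._lock:
            current = self._data.get(key)
            if current is None or current[1] != token:
                return False
            self._next_token += 1
            self._data[key] = (value, self._next_token)
            return True


def increment_with_cas(cache, key, retries=10):
    """Read-modify-write that retries if another writer got there first."""
    for _ in range(retries):
        value, token = cache.gets(key)
        if token is None:
            raise KeyError(key)
        if cache.cas(key, value + 1, token):
            return value + 1
    raise RuntimeError("too much contention on %r" % key)


def read_through(cache, key, loader):
    """On a miss, fetch the value with `loader` and cache it for next time."""
    value, token = cache.gets(key)
    if token is None:
        value = loader(key)
        cache.set(key, value)
    return value
```

The point of CAS is that a stale writer fails instead of silently overwriting newer data; the caller decides whether to retry. With a read-through cache, the database is only consulted when the cache misses.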

We haven’t started using this in production yet, but we plan on integrating it quickly, so I’m sure more work will be going into it over the coming months. If you end up using it on your own projects, we’d love to hear about how it works out for you and, of course, any bugs you find.

Yay!

-Joe

Bryan Whitehead

How Obama Helped Digg Fix Bugs

A historic moment has occurred with the election of Barack Obama as the next President of the United States. Many shared this moment with the Digg community through a furious level of digging and commenting. In fact, election night generated the most traffic and activity in the history of Digg.

At about 8pm PST, most TV networks called the election with Obama as winner. Some interesting stats for the 8pm hour (PST):
Submitting was 108% of normal.
Digging was 202% of normal.
Burying was 137% of normal.
Commenting was 278% of normal.
Comment Digging was 619% of normal.
Comment Burying was 689% of normal.

Note the resulting traffic on one of our DB chains:
[Graphs: load and CPU on one DB chain]

Wow. Talk about making the DBAs sweat. If you were on Digg at this time you might have noticed a couple of annoying issues. Logging in wasn’t pleasant. Page load times were longer than usual. Digging was… a bit… unresponsive, and submitting a new story may have been prone to failure.

If you recall the previous blog post from our lead DBA timeless, you’ll remember we have a concept of ‘selectors’, where developers select a pool and purpose for their DB-related code. An example selector might be “main_write”. This would give a handle to the main database chain for the purpose of updating or inserting a row. A selector like “main_read” would give a handle to one of the many slaves of our main master DB. Transactions going to “*_write” are supposed to be very quick and cheap. Our write masters do not run the dbmon.pl software to kill off connections – that is reserved for the massive pool of read slaves.
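
As a rough illustration of the selector idea, here is a small Python sketch that maps a selector name to a database host. The topology, the host names and the `resolve_selector` function are all made up for the example; Digg's real selector code is not shown in this post and certainly does more (connection handling, failover and so on).

```python
import random

# Hypothetical topology: one write master and several read slaves per pool.
POOLS = {
    "main": {
        "write": ["main-master"],
        "read": ["main-slave-1", "main-slave-2", "main-slave-3"],
    },
}


def resolve_selector(selector, rng=random):
    """Map a selector such as 'main_write' to a host name.

    The text before the last underscore is the pool, the text after it
    is the purpose. Reads are spread across the pool's slaves.
    """
    pool, _, purpose = selector.rpartition("_")
    try:
        candidates = POOLS[pool][purpose]
    except KeyError:
        raise ValueError("unknown selector: %r" % selector)
    return rng.choice(candidates)
```

Calling `resolve_selector("main_write")` always returns the single master, while `resolve_selector("main_read")` returns one of the slaves. A read sent through a `*_write` selector therefore lands on the master, which is the situation the bug below abused.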

Now, here’s the bug. A small number of our queries on “*_write” are not writes. They are reads for a very, very small subset of queries that need the absolute latest information. Any slave lag might cause weird problems with how we keep certain elements (like new user handles) unique. Unfortunately, a bug in our code that does this ‘quick check’ was generating 6 or 7 thousand quick checks… Multiply this by the huge amount of traffic Obama generated and our ‘main_write’ selector ran out of connections:
[Graph: connections on ‘main_write’]

While the graphs show problems, they are only clues that an investigation needs to start. For the above, I parsed all the queries from the hour before as a baseline measurement of ‘normal traffic’, and then took a look at the 8pm hour. Most of the work is in figuring out ways to classify and group similar queries together to pinpoint anomalies. Often, after grouping everything together, it is quite obvious that a certain class of queries is causing problems. In this case, we were seeing a disproportionately high number of queries of one class in relation to other classes of queries. Once the issue had been isolated, I sent the results via email to one of our software engineers, Kurt, who looked at the anomaly and tracked down this bug, which had been introduced some time ago.
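
The grouping step described above can be sketched in a few lines of Python. This is not the tooling actually used for the investigation; the `fingerprint` and `anomalies` functions, the regular expressions and the `min_ratio` threshold are assumptions chosen for illustration. The idea is to replace literal values with a placeholder so that queries differing only in their values fall into the same class, then compare class counts between the baseline hour and the incident hour.

```python
import re
from collections import Counter

_STRING = re.compile(r"'(?:[^'\\]|\\.)*'")   # single-quoted string literals
_NUMBER = re.compile(r"\b\d+\b")             # standalone integer literals
_SPACE = re.compile(r"\s+")


def fingerprint(query):
    """Collapse literals so queries that differ only in values group together."""
    q = _STRING.sub("?", query)
    q = _NUMBER.sub("?", q)
    return _SPACE.sub(" ", q).strip().lower()


def anomalies(baseline, incident, min_ratio=5.0):
    """Return (fingerprint, baseline_count, incident_count) for every query
    class whose count grew by at least `min_ratio` over the baseline."""
    before = Counter(fingerprint(q) for q in baseline)
    after = Counter(fingerprint(q) for q in incident)
    flagged = []
    for fp, count in after.most_common():
        base = before.get(fp, 0)
        # Treat classes unseen in the baseline as if they appeared once.
        if count >= min_ratio * max(base, 1):
            flagged.append((fp, base, count))
    return flagged
```

Fed one hour of normal traffic and the 8pm hour, a class that jumps from a handful of queries to thousands stands out immediately, while classes that merely grew in line with overall traffic stay below the threshold.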

After pushing the code we have seen much more consistent normal traffic:
[Graphs: questions and queries after the fix]

Thanks to Obama, the many people who took interest in the elections, and our digg community, we have been able to fix the bug. We’re always grateful for your participation and feedback on Digg. It’s incredibly helpful. So, keep it coming and stay tuned as more improvements are on the way.

Digg on.

Driver

plathrop

Master of Puppets #1: ralsh

Allow me to introduce myself. I’m Paul, Sr. Systems Engineer and resident Puppetmaster for Digg. “Puppetmaster?” I hear you say, “What do puppets have to do with Digg?” Well, that’s what I thought I’d write about today.

To answer the question, it isn’t “puppets”, it’s Puppet, the open-source configuration management tool, and a relatively new addition to the Operations Secret Sauce here at Digg. Dr. Timeless and Joe have already given you a good overview of the complex architecture that makes Digg possible; in my posts I’m going to go into detail on how we manage some of the components of that architecture using Puppet (and touching on other tools along the way).

First off, let me answer a couple of questions that I’ve had to answer in the past. Why configuration management? Why Puppet, specifically? Configuration management buys you a number of benefits, and other people have written about those benefits more extensively, and more eloquently, than I can. Suffice it to say the Ops team at Digg believes configuration management is an incredibly powerful tool for managing complexity in a large-scale architecture (which, by complete coincidence, is exactly what we have!)

We chose Puppet after evaluating a number of other options (cfengine and bcfg2, among others). Like many people, we thought that Puppet, cfengine, and bcfg2 were the top contenders in this space. Cfengine is, in a way, the grand-daddy of open-source configuration management tools. Unfortunately, certain design decisions and philosophical stances it assumes leave it somewhat behind the curve. Bcfg2 showed a lot of promise, but in the end it lost out due to a few key factors. First, there was a lack of “Bundles” supporting the things we needed to manage; Puppet had “types” which more closely matched our needs. Second, the philosophy of Bcfg2 is a “total management” philosophy; you can’t deploy a Bcfg2 configuration that only manages a small portion of a machine’s configuration. For our needs, the ability to implement configuration management incrementally was very important, and Puppet gave us that flexibility. Third, the overall design philosophy of Puppet makes a lot of sense to us. Last, and least importantly, I had more experience with Puppet that I could draw on as we rolled it out.

For those of you that are unfamiliar, Puppet is a multi-tentacled beast of a project with several components: a declarative language for describing system configuration, a standalone parser for that language, a resource abstraction layer for providing platform-agnostic manipulation of the resources described by the language, and a client-server daemon for distributing and applying configurations described by the language. (Phew, what a mouthful!) The foundation of Puppet’s power is the resource abstraction layer, which I will talk about below. The rest of the Puppet components (and how we use them) will have to wait for another post.

Puppet’s resource abstraction layer (the RAL, to save me some typing) allows us to think of various aspects of system configuration as “resources”. Puppet comes with a tool called ralsh which allows you to interact directly with the underlying RAL; you can use it to list resources or manipulate them in the same way Puppet does. “But what exactly is a resource!?” I hear you cry. A resource is a discrete component of system configuration. Some examples are: cron jobs, users, host file entries, mount points… Puppet’s RAL knows how to manage a wide variety of resources natively; it is also fairly easy to extend it to manage resources it doesn’t already understand.

ralsh is an incredibly powerful tool, and decidedly underused in the Puppet community. Let’s pretend you’ve inherited a decidedly non-homogeneous architecture, filled with various flavors of Linux and *BSD. Want to add a user? Just remember that on Linux you can use adduser, or useradd, but on FreeBSD you probably want pw, and who knows how many different command-line switches you need to remember, or maybe just consult the man pages every time… Or, use ralsh. It provides the same interface on every platform Puppet works on. Want to know what users are defined? ralsh user lists them. Just want to know about this guy named baduser? ralsh user baduser will give you the dirt. Want to add a user? Watch this:

~$ ralsh user newuser uid=9999 gid=9999 home=/home/newuser shell=/bin/bash ensure=present
notice: /User[newuser]/ensure: created
user { 'newuser':
uid => '9999',
gid => '9999',
home => '/home/newuser',
shell => '/bin/bash',
password => '!',
ensure => 'present'
}

This doesn’t get really cool until you realize you can use the exact same tool and syntax to manage any type of resource that Puppet understands. Check it out:

~$ ralsh package xmlstarlet ensure=present
package { 'xmlstarlet':
ensure => '1.0.1-2'
}


~$ ralsh cron check_my_mail command="/usr/bin/fetchmail mail.my.mail.server" user=plathrop hour='*' minute='*/30' ensure=present
notice: /Cron[check_my_mail]/ensure: created
cron { 'check_my_mail':
command => '/usr/bin/fetchmail mail.my.mail.server',
monthday => 'absent',
hour => ['*'],
environment => '',
target => 'plathrop',
special => '',
minute => ['*/30'],
user => 'plathrop',
ensure => 'present',
weekday => 'absent',
month => 'absent'
}

This is totally cool! Suddenly I don’t have to care what platform I’m on, and I can think of these things in an abstract, encapsulated manner. I can choose platforms based on what they are good at instead of requiring homogeneity in order to minimize the costs of management. Like OpenBSD’s firewall, but you have a Debian network? By all means, throw an OpenBSD box in there; you can use the same commands to manage both. Want the performance of a FreeBSD network stack for a certain application? Go for it, you already have the tools you need to administer it.
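
To give a feel for what an abstraction layer like this saves you from, here is a toy Python sketch that builds the platform-specific command for adding a user from one common description. It is emphatically not how Puppet’s providers are implemented (Puppet is written in Ruby and does far more); `add_user_command` and the platform names are hypothetical, and the flags shown are the common ones for useradd and pw, so check the man pages on your own systems before relying on them.

```python
def add_user_command(platform, name, uid, gid, home, shell):
    """Return the argv that would create the user on the given platform.

    The command is only built, never executed, so this is safe to run
    anywhere.
    """
    if platform == "linux":
        return ["useradd", "-u", str(uid), "-g", str(gid),
                "-d", home, "-s", shell, name]
    if platform == "freebsd":
        return ["pw", "useradd", name, "-u", str(uid), "-g", str(gid),
                "-d", home, "-s", shell]
    raise ValueError("no provider for platform %r" % platform)
```

The caller describes the user once; the per-platform knowledge lives in one place. The RAL applies the same idea to every resource type it knows about, and it also checks the current state first so that it only makes a change when one is needed.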

Not only that, but if you are building a Puppet infrastructure, you can use ralsh to explore your existing configuration; the output you see above is valid Puppet code, and can be used in a Puppet configuration as-is!

Next time I’ll talk more about the higher-level components of Puppet; the RAL, though awesome, is just the foundation. The Puppet language allows us to treat configurations as code; we can use the same techniques software engineers use to manage their codebases to manage our systems configurations. The Puppet client/server allows us to apply consistent configurations across a cluster of machines, and manage the entire life-cycle of each system, from deployment to retirement. Combined with a flexible and robust automated deployment system like Debian FAI, Puppet can help you drastically reduce the intervention required to bring a machine up from bare metal to production, saving both time and money as well as giving you a chance to focus on more important issues.

See you next time.

–Paul