We've found a simple method for creating a circular buffer using a normal MySQL table. This technique is obvious once you've seen it, and I'd be surprised if it hasn't been done before.

Why Would You Want a Circular Log?

Say you want to log messages, but you don't need to keep old ones. If you were logging to files, you could use a log rotation program. But what if you're logging to a database?

Couldn't you just regularly truncate the table? Well, that's what we tried at first. But when someone wanted to see a message from 22:00 the night before, and the truncation had run at midnight, they were out of luck. What we wanted was a way to keep at least 24 hours' worth of entries at all times.

Features of the Circular Log
  • Each log entry requires only a single SQL statement.
  • The maximum number of rows in the table can be strictly controlled (and resized).
  • It's fast.
  • There's no maintenance required.

Rolling Your Own

First, create the log table.

CREATE TABLE circular_log_table (
    log_id BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    row_id INTEGER UNSIGNED NOT NULL UNIQUE KEY,
    timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    payload VARCHAR(255),
    INDEX (timestamp)
);

Next, decide on the number of rows you'd like to retain. We'll call that number MAX_CIRCULAR_LOG_ROWS.

Finally, to add new rows:

REPLACE INTO circular_log_table
SET row_id = (SELECT COALESCE(MAX(log_id), 0) % MAX_CIRCULAR_LOG_ROWS + 1
              FROM circular_log_table AS t),
    payload = 'I like turtles.';

That's it.
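
MAX_CIRCULAR_LOG_ROWS above is just a placeholder, not something MySQL understands, so whatever issues the statement needs to substitute a real number for it. As a minimal sketch, assuming a limit of 100, a session variable can stand in for the constant:

-- @max_circular_log_rows stands in for the MAX_CIRCULAR_LOG_ROWS constant
SET @max_circular_log_rows := 100;

REPLACE INTO circular_log_table
SET row_id = (SELECT COALESCE(MAX(log_id), 0) % @max_circular_log_rows + 1
              FROM circular_log_table AS t),
    payload = 'I like turtles.';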

The payload column is here as an example. Any number of additional columns of any type should work, as long as they're all set in the REPLACE statement.
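
For instance, a hypothetical severity column (not part of the schema above) could be added like this, and then set right alongside payload in every REPLACE:

-- severity is a made-up example column, added only to illustrate the point
ALTER TABLE circular_log_table
    ADD COLUMN severity TINYINT UNSIGNED NOT NULL DEFAULT 0;

From then on, the REPLACE just gains a severity = ... assignment next to the payload one.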

How Does It Work?

If you've used Linux, you're probably familiar with one circular log: the kernel's ring buffer, accessed and controlled through dmesg. The buffer has a fixed size. Once it fills up, it loops back on itself and starts overwriting old messages with new ones. That's essentially what happens with the MySQL log table as well.

Carrying the analogy dangerously far: the log_id modulo the buffer size (plus one) acts as a pointer to the slot (row_id) in the table to write to.

Watching It in Action

Let's say that MAX_CIRCULAR_LOG_ROWS was set to 100. When there are no rows in the table, the subselect will give us 1 for the row_id (COALESCE(MAX(log_id), 0) % 100 + 1 = COALESCE(NULL, 0) % 100 + 1 = 0 % 100 + 1 = 1). This means that the first row inserted will have log_id = 1, row_id = 1. So far so good.

When it's time to insert the second row, MAX(log_id) will evaluate to 1 (since we haven't yet inserted the second row) and so the row_id will be 2, which again matches the log_id of the row upon insert (log_id = 2, row_id = 2).

This proceeds as expected up until 100 rows have been inserted into the table (log_id = 100, row_id = 100).

On insertion of the 101st row, row_id rotates back to 1 (COALESCE(MAX(log_id), 0) % 100 + 1 = 100 % 100 + 1 = 1). Now, when the row is inserted, the unique constraint on row_id means it will replace the row with row_id = 1, and the new row will have log_id = 101, row_id = 1.

The process continues to repeat itself now thanks to the modulo. At log_id 201 we'll be back to row_id 1, and again at 301, ad infinitum.
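
You can watch the wraparound without doing 100 inserts by using a toy limit. Here's a sketch, assuming a fresh, empty table and MAX_CIRCULAR_LOG_ROWS = 3:

-- Insert four entries with a limit of 3 (repeat this statement with the
-- payloads 'first', 'second', 'third', 'fourth'):
REPLACE INTO circular_log_table
SET row_id = (SELECT COALESCE(MAX(log_id), 0) % 3 + 1
              FROM circular_log_table AS t),
    payload = 'first';

-- Then read back what survived, in insertion order:
SELECT log_id, row_id, payload
FROM circular_log_table
ORDER BY log_id;

-- Expected: three rows with (log_id, row_id) = (2, 2), (3, 3), (4, 1);
-- 'fourth' has overwritten 'first' in slot 1.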

Resizing the Log Table

To grow the table, just increase MAX_CIRCULAR_LOG_ROWS. The table won't actually grow until row_id passes the old MAX_CIRCULAR_LOG_ROWS; after that it fills out to the new limit.

To shrink the table, decrease MAX_CIRCULAR_LOG_ROWS and then DELETE all rows whose row_id is greater than the new MAX_CIRCULAR_LOG_ROWS, since the modulo will never reach them again. Again, there will be a lag until the entries are continuously ordered without gaps. And keep in mind that the DELETE could lock the table and take a while.
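
As a concrete sketch, shrinking from 100 rows down to a hypothetical new limit of 50 means cleaning up slots 51 through 100, which the modulo will never reach again:

-- New limit: 50. Anything sitting in row_id 51-100 is now stale forever,
-- so it has to be removed by hand (and this can lock the table for a while).
DELETE FROM circular_log_table
WHERE row_id > 50;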

Is It Stable?

We've been using this technique for almost 2 years now on a 2,000,000-row table with a dozen columns and multiple composite indexes. The log_id is up to 615,069,600 at the time I write this. The table has accumulated some overhead, but the overhead is still a fraction of either the table's data or index size.
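
One way to check those numbers, though not necessarily the way we collect them, is SHOW TABLE STATUS, where Data_length, Index_length, and Data_free report the data size, index size, and accumulated overhead:

SHOW TABLE STATUS LIKE 'circular_log_table';
-- Data_length  = size of the row data
-- Index_length = size of the indexes
-- Data_free    = allocated but unused space (the "overhead")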

Eventually the log_id column will be exhausted, but BIGINT UNSIGNED holds about 1.8 × 10^19 values; even at 10,000 inserts per second that works out to roughly 58 million years.

DWait and Dependencies

Mon Nov 15, 2010, 5:46 AM by kemayo
It is a truth universally acknowledged, that a website in possession of much JavaScript, must be in want of a way to reduce HTTP connections.

The more files you include on a page, the longer it takes to download everything. Even when all you have is a lot of tiny files like JS, there's still a big limitation: browsers cap the number of HTTP connections they'll make to a single website at once. This limit is normally 2 connections per domain. So only two files can be downloaded at a time, and there's a certain amount of negotiation overhead when moving on to the next file.

Since page rendering is held up by all of the scripts and CSS in the head, you really want to have as few files as possible loading in the head. Otherwise your viewers are left watching a blank page for precious fractions of a second while the 20 files in your head are downloaded two at a time.

deviantART has a lot of CSS and JavaScript. I counted right now (ack -G ".js$" -f | wc -l), and we have 560 JavaScript files and 310 CSS files. Not all of them are needed on every page, of course... but our core set of JS that gets loaded everywhere consists of 53 files, and the equivalent CSS is 34 files.

Back when we were a young site we just stuck all the JS into the head, because we didn't know better, and also because we didn't have much JS back then. But then we added more functionality, and we noticed just how slow it was making us. So we set out to develop a way to not suck.

Nowadays we use a system of automatically bundling up our JS and CSS into big files, so a single HTTP connection can fetch them all at once. The 34 CSS files I mentioned, for example, get combined into one big file. In the case of the JS it's a bit more complicated, and we also minify all of the files using the YUI Compressor.

We define all of these bundles in "list files". These are simple text files listing the other files that should be combined. So v6core.js.list contains a bunch of file names, and it gets bundled together as v6core.js.

The bundling occurs in an svn commit hook. So whenever a developer makes a commit that touches a .css or .js file, it triggers a rebuild of the .list file that contains those files. The rebuild happens on our staging server, and the files get copied out to production when we do a release.

Dependencies

Now, because we know this .list system exists, we get to make lots of small JS files that contain single pieces of functionality. These components wind up having dependencies on each other... jQuery is used almost everywhere; lots of code creates modal windows; etc. So now we're faced with the problem of only including the .list files that contain the code we need for the current page.

We used to have to manually declare all of these dependencies in PHP when adding JS/CSS to a page, like so:


$gWebPage->addJSDependency("lib/json2.js");
$gWebPage->addJSDependency("lib/difi.js");
$gWebPage->addJSDependency("lib/events.js");
$gWebPage->addJSDependency("pages/awesome.js");


This is obviously somewhat unwieldy, and it's prone to us forgetting a dependency that happens to work anyway because, for the moment, the missing file is in a .list that's being included regardless, and then breaking later when we rearrange the .list files so that less commonly used code is only loaded when needed.

So now what we do is have some special comments at the top of our JS/CSS files which look a little like this:


/* This is the hypothetical pages/awesome.js
@require jms/lib/difi.js
@require jms/lib/events.js
*/


Then in the PHP we just have to do:


$gWebPage->addModule("jms/pages/awesome.js", MODULE_FOOTER);


...and it'll take care of the rest without us having to think about it. It guarantees that the dependencies (and their dependencies) will be loaded before the requested file. The second argument ("MODULE_FOOTER") is a priority; the caller can say whether they need this JS to be output in the head, the top of the body, or the end of body. This makes sure that the only JS in the head is the JS that really needs to be there.

The dependency mapping is built in the same commit hook that I mentioned earlier, and is serialized out into a file that's loaded when we need to resolve dependencies.

DWait dwhat?

When we were trying to remove as much JS as possible from the head, because it blocks rendering, we encountered the problem of JS in the head that controls behavior on the page. It obviously needs to be there as soon as possible, because otherwise a user who quickly clicks somewhere might see an error, or just have nothing happen. But in the vast majority of cases people won't click really quickly, and if we put it in the head we'll have delayed rendering for nothing.

Our solution to this problem is called DWait. It's a way for our JS to request that an action be delayed until a dependency has loaded. This lets us stick a lot of code in the very footer of the page, without worrying about whether some link in the page depends on it.

So you'll see a lot of code like this on dA:


<a onclick="return DWait.readyLink('jms/pages/gruzecontrol/gmframe_gruser.js', this, function () { GMI.query('GMFrame_Gruser', {match: {typeid: 62 }})[0].loadView('submit') } )" href="#" id="blog-submit-link" class="gmbutton2 gmbutton2plus">


This says that the click handler for the link depends on gmframe_gruser.js. If the file is already loaded in a .list then it'll execute the handler immediately. Otherwise it'll remember the click and run the handler as soon as the load has happened.

To detect the loading, every bundle file created by our commit hook gets a line of JS appended to the end which tells DWait that the individual files within it have been loaded.

There are also a few JS files that have a special command in their header called "@@fastcall". This means that the file is so important to the page that it has to be output directly in the head of the page as an inline script. We cache a minified version of the JS in the dependency map so that this case doesn't involve extra file reads on the webservers.

There's one more trick that DWait has for cutting down load time, and it goes back to the priority argument to addModule that I mentioned earlier. We can tell it to use the priority "MODULE_DOWNLOAD" which means that dependency information is passed to DWait, but that the JS file itself isn't loaded. Instead it waits until a DWait.ready call asks for it and then dynamically loads the file. This is fantastic for rarely used functionality, with the tradeoff being a slight delay when the user first uses it.

Takeaway

These techniques are important for any website, no matter how small. Page load speed has a major effect on how people perceive your site, and there's a lot you can do to improve it. As a first step, just get as much as possible bundled up and out of the head of your page, and see how much effect it has.

Devastating scrabble play

Wed Nov 3, 2010, 1:27 AM by randomduck
Author: chris
Date: 2010-11-03 00:20:40 -0700 (Wed, 03 Nov 2010)
New Revision: 129415
Log: I am a bit drunk.

[11/3/10 12:42:05 AM] randomduck: bolt, did you declare all those  $secdb you making inserts on?
[11/3/10 12:43:28 AM] chris: thank you
[11/3/10 12:45:10 AM] pachunka: it's my fault
[11/3/10 12:45:16 AM] pachunka: my last scrabble play really shook him
[11/3/10 12:45:21 AM] pachunka: it was devastating

Author: chris
Date: 2010-11-03 00:46:38 -0700 (Wed, 03 Nov 2010)
New Revision: 129417
Log: I am sorry.


At the close of my first week with the deviantART dev team, I thought I would take a moment to reflect on my observations. I've got my VM environment set up (which is awesome, btw… anyone with a complex PHP development environment and more than a handful of developers should consider going this route) and have already fixed a few bugs on the site, so it seems like a good time to step back and reflect on my first week.

Getting set up

The setup process was actually a breeze: download the VM, follow the instructions in the "Getting Started" page on the internal wiki, and within less than an hour I had a copy of nearly the entire deviantart.com website running on my laptop. This is truly amazing, especially when I consider that at my last job it took a new developer 1-2 days, sometimes more, to download and set up local copies of Apache, PHP, Java, Ruby, MySQL, and all the necessary libraries, plus get everything configured and working, even with clear instructions and scripts to help automate the process.

Beyond that, once my deviant account was upgraded to administrator privileges and I had access to email, the code repository, and the developer wiki, I was ready to take a stab at a ticket or two in practically no time, faster than I think I've ever been set up at a new company before. Low overhead!

Organization

The deviantART dev team is a very modern, well-run, distributed team, with developers collaborating all over the world, as far away as Europe and South America, in a variety of time zones (all speaking English, thankfully!). Working within a flat organizational hierarchy, team members are encouraged to raise ideas and ask questions with little reason to fear stepping on anybody's toes or going over somebody's head. Most communication happens over chat (instead of email, which is growing outdated). And while I find chat to be a distraction at times, I recognize it is incredibly efficient at disseminating knowledge rapidly. The team does a good job of archiving its knowledge on the internal wiki pages.

As a new developer facing an advanced codebase over ten years in the making, I obviously had (and still have!) a number of questions. Hitting up any of the tech team's chat rooms for help usually results in me getting an answer within just a few minutes. I am very impressed by how happy everyone on the team is to help. This is a culture of collaboration, and it makes me feel very much a part of a team, even as I write this many thousands of miles away from the main corporate office.

To maintain unity, everyone is brought together once per week in a massive conference call to review the progress of all active projects. I was wary of such a large call at first, wondering how so many people could collaborate effectively on a single call; however, the participants are very respectful of an implied etiquette not to interrupt or "talk over" when someone has the "floor" and is speaking. During the week, self-managing project teams meet as needed.

Cool technology

Delving into the internals of what makes deviantART tick reveals a treasure trove of really cool, "state-of-the-art" technology.  These are the guys behind the first skinnable mp3 players and online music communities from way back, and they are very talented, forward thinking computer scientists.  deviantART started building social networking technology before there was really much social networking technology around, and I find that there's innovation underneath the covers that predates modern convention in almost every corner.  

I've been delving deep into the JavaScript layer, for example, and I find a very advanced dependency framework, with dynamic loading and a super slick event handling system, never mind what Mike D's been doing w/ deviantART muro  …  that's hot!  

As I dig into more layers, from the front-end JavaScript through to the app layer, search engine, database, server infrastructure, and more, I see more evidence of very modern, progressive thinking. I'd love to pore over the details of it all, like the component-driven architecture system or the developer runtime environment, but all I can say right now is to stay tuned to the dT tech blog, as I know the developers are planning to share some of the details moving forward!

Close interaction with a passionate community

What impresses me the most so far is how well integrated the tech team is not just with each other, but with the users of the site.  This part is really cool, and I think it's the most valuable part of the deviantART business, which clearly puts community building and user satisfaction at the top of the priority chain.

I fixed a rather simple bug with the ShoutBox, a small public chat room that hangs off the side of the main chat area, as well as optionally on group pages. When I entered the ShoutBox, just to confirm the ticket I was assigned, I was immediately identified as a developer and greeted by the users there.

See, deviantART's website places a unique character identifier in front of each user's username indicating their role on the site. For example, a tilde ('~') represents a common user. An asterisk ('*') represents a premium member. There are others I have yet to understand, like the equals sign ('='), and more. My user, $saladoche, has a dollar sign, which means I have admin privileges on the site. Additionally, when you go to my deviant profile page, you'll instantly see I'm a member of the #dt tech team (note the pound sign identifying a group; for more, see the FAQ entry about symbols).

So basically, when I entered the public ShoutBox, everybody could see I was one of the site's admins, and they greeted me kindly like anyone else; but since they could see I was a developer, one of the users informed me of an additional glitch that was interfering with their experience, different from the ticket I was assigned. I told everybody I'd get on the case.

A few hours later, after making my way through the code, finding the bug, and fixing it in my local VM, I pushed the change live and returned to the ShoutBox to let everybody know it was fixed (actually, my partner kemayo, who assisted me with my first deployment, let them know before me, but for the purposes of this blog post, I'll take credit ;)). Everybody was so happy. Instant gratification -- I think the person who reported the problem to me was still in the ShoutBox. I've worked on a number of websites, including top-ten portals (from eons ago) and small local community sites, but I have never found one where the developers are as close to the users as they are at dA. This is awesome, I thought to myself, and powerful.

This is the very customer-first mentality that the likes of Tony Hsieh recently wrote a book about, happening right here in front of my eyes. Developers interacting with customers, and putting customer needs at top priority. That's rare! Most places hide the developers as far away from the customers as possible. In some cases, that's a good thing. In other cases, though, it works really well to have the developers interacting with the customers, especially when the developers are really passionate about the product they're building, which is exactly what I find at deviantART.

Open to ideas

I posted my first journal entry commenting that it took me three days to figure out how to change my avatar picture on the site. I got a few comments on that journal post (I didn't think /anybody/ would read it), but what surprised me most was that the user interface team brought it up in the weekly developers' meeting and committed to fixing it, as they'd heard from other people with the same problem. (Since then I've raised a few more issues internally and they are **on it**, working to solve problems and improve the user experience, with some really sleek items in the pipeline.)

A work of art

My dad, a longtime computer programmer, imparted to me that software development is very much an art form, an ideology I believe more and more with each passing year. Donald Knuth expressed this concept in the title of his book, The Art of Computer Programming, influencing a number of famous computer scientists to agree (Richard Stallman, Guido van Rossum, and Bjarne Stroustrup, just to name a few…).

Certainly my happiest realization of being a part of the deviantART dev team is that I've discovered deviantART's community of artists includes not just the users who contribute digital works on the public facing side, but also the software developers contributing their digital works on the back end.  When I take this into consideration, that I'm contributing to a piece of art, suddenly the work becomes so much more enjoyable.  

And here lies my conclusion: deviantART the site, and the community around it, is, by its own composition, a work of art. From the digital media the users contribute, to the code the developers write, to the digital interactions that take place between everyone involved, it is all creative in nature, and artistic. I think the more I take this philosophy to heart, the more I will enjoy my job, and the more the website, company, and community as a whole will benefit. And that feels fabulous.

Think you've got what it takes?

If you're passionate about programming and being creative, this is surely the place to be. I'm happy I joined the team; now I've got to go be creative. Hasta luego...

Quoting Donald Knuth, "A programmer who subconsciously views himself as an artist will enjoy what he does and will do it better."

deviantART muro and HTML5

Mon Oct 18, 2010, 9:34 AM by mudimba
I was recently asked by the folks at HTML5Rocks.com to write a "case study" about deviantART muro and how it uses HTML5.  For those of you who don't know, HTML5 is a new web standard that quite a lot of the tech community is excited about.  Here is the article for your reading pleasure:

Case Study: HTML5 in deviantART muro


