
DT 2012 Baguette meet at Breadlands

Tue Jul 17, 2012, 1:19 AM
Breadlands Logo By Randomduck-d52tawr by randomduck


Our dt development team is 100% remote. We documented our first foray into cautiously meeting each other last summer in the blog DT 2011 Army Barracks Raccoon-Con, the event that took place in the Marin Headlands near San Francisco. Our team is also fairly evenly distributed across a couple of continents and several timezones, with many of the developers hailing from the UK, France and other European countries. This year, those Europeans managed to coerce the North America based developers into getting their papers in order and tricked them into embarking on the long flight. Destination: the beautiful village of Les Sauvages (which literally means “the savages”) in the French countryside near Lyon, in the Beaujolais region. The event was named the "Baguette meet", and the camp we would be staying in for the next two weeks "Breadlands", in homage to our previous year's meet at the Marin Headlands.

Les Sauvages by micahgoulart

Planet Breadlands by ArtBIT

And this is our own germ city on the tennis ball!

Untitled by randomduck   Untitled by randomduck

The morning ride by samshull


We did presentations, ignite talks, and many pair-coding sessions encompassing work on small ideas and tech debt. We completed two project hackathons with 19 mini project sprints. And what didn't we have this year? We did not have a story of thousands of trunk commits, hundreds of deployments, rogue servers, long read-only periods and broken-site disasters! Why? Because the first order of business during the meet was to switch our repository and deployment system from svn to git. Several months of planning went into this switch, and it's a subject for its own future blog post. All the work was done in private or shared feature branches and code reviewed using our now-standard phabricator process. Some successful bits have been merged already and the rest are to follow in the coming months.

DT Discussions by micahgoulart

Untitled by randomduck  Sergey by randomduck

Untitled by randomduck  Untitled by randomduck

The morning when we found out about our remote workers' attendance practices


Last year, a little over 20 people converged on San Francisco for our meet. This year we had 39 people in attendance. DT has grown, and in addition to the usual crew, we had some new faces to meet and welcome. We also invited our friends and occasional nemeses from Product and Product Marketing to join us. On the second day we had our first infirmary patient, stricken down by what was later diagnosed as the "Tahitian Plague". This is a cautionary tale about the dangers of gathering so many remote developers together and exposing their untrained immune systems to germs, presumably spread by office workers in Product Marketing! Also, a word of caution to others who might plan a meetup of a large group of net addicts in the tranquil countryside: do not count on the internet to function! Even with a top-of-the-line bandwidth-shaping system, and after throttling every bandwidth-hungry service and chasing down every rogue downloader, the internet was... spotty at best. Complaints about reddit-withdrawal suffering rolled in regularly, but what else would you expect from a group where you won't blink an eye when you see two people sitting shoulder to shoulder in an empty room, having a very lively conversation on... Skype chat!

456342 10100887311732097 310159810 O by randomduck

Pachunka insists that he is "on the mountain"

Breadlands 2012 by micahgoulart

Untitled by randomduck      Untitled by randomduck

Common scene: Product team is pacing and thinking, while DT developers are eating gravel!

Untitled by randomduck   Untitled by randomduck

Untitled by randomduck  Coding Hat By Artbit-d545fje by randomduck

Untitled by randomduck  Untitled by randomduck

Performance of a Traditional Canadian folk music band

Untitled by randomduck

Introducing Mr. June, from "Accordian Babes" pinup calendar

Da Breadlands 2012 By Knightar Web By Micahgoulart by randomduck

This is the group shot where we were told to act normal ..


When we called our camp "Breadlands", we had no idea how much bread we were going to see and eat over the next two weeks. And cheese, and sausage, and more cheese and bread. And wine, of course. But despite three healthy, regularly scheduled meals each day, the fine fast-food establishments of the nearby town were raided regularly with the aid of a gypsy cab.

Untitled by randomduck    Untitled by randomduck

Here is about a three-hour supply of beverages. And there is jekor, wondering how to say "omelet" in French

Untitled by randomduck     Untitled by randomduck

"I would kill for the cheeseburger"

Untitled by randomduck   Untitled by randomduck


The tradition of very large bonfires continued. And it appears that several people had spent the offseason practicing their axe handling. Unlike last year, we also did not have any contractual obligations not to consume alcohol, which interestingly led to a fairly mellow pace of said consumption.
Untitled by randomduck Untitled by randomduck


Attentive readers might remember that one of the hits of last year's meetup was the encounters with ravaging troops of raccoons that possessed no fear of humans. In contrast, Breadlands had an abundance of rather mellow farm animals that had no interest in stealing candy and ibuprofen.

2012-06-13 21.30.26 Hdr 2 by randomduck

Where DT sees a real cow for the first time

Untitled by randomduck  Untitled by randomduck Untitled by randomduck

We've just finished moving every message in every user's message center — all 6 billion of them — to a brand new set of database servers, and we managed to do it without anyone noticing. We didn't find much published material about the best way to do this, so we wanted to share a few details with you. It does get a little bit technical though so I won't be offended if you skip over some parts!

The data that powers message center was stored on one of our main sharded MySQL database clusters. That same cluster powers a lot of different features of the site and we recognised a lot of traffic was caused by this relatively simple but large set of data.

To free up capacity on that cluster, and help keep the site quick as we continue to grow, we decided to move the message center data to its own servers.

Some fun figures:
  • ~6 billion rows of data in 13 tables across 16 shards
  • Average 5k, daily peak of 10k queries per second for this data across the cluster
  • Average 100k, daily peak of over 150k rows read per second
  • Average 500, daily peak of over 3k rows written per second

So we had a pretty large set of data and the high traffic to it meant that we'd have to be very careful how we copied it over without losing any data, taking the message center down for a few days or otherwise affecting users.

To add a bit of fun to the project, the old tables were still using the MyISAM storage engine, as they'd been around since before InnoDB's more recent rise to dominance. We decided it would be a good time to switch to InnoDB. If this doesn't mean much to you, it means we decided to switch to a different type of database engine, one which has become a much better choice than it was a few years ago when we first designed message center.

How Much Hardware?

We started out not totally sure of how much hardware we would need because it was quite hard to get meaningful measurements of our current database server load when it was only a part of the traffic on the old cluster.

So, to enable us to experiment with and prove the new setup, we developed an extension for our PHP database API that allowed us to duplicate query traffic from the current tables to the new cluster. We made it possible to control the percentage of read, write and delete queries mirrored individually through our distributed config system, and ensured that failures in the new cluster would not impact the real traffic.
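The mirroring idea can be sketched in a few lines. This is a hypothetical Python illustration, not our actual PHP extension: the config structure and function names here are invented for the example.

```python
import random

# Per-query-class mirror percentages, imagined as values pulled from a
# distributed config system so they can be tuned at runtime.
MIRROR_PCT = {"read": 10, "write": 100, "delete": 100}

def maybe_mirror(query_type, run_on_new_cluster):
    """Duplicate a controllable fraction of live traffic to the new cluster.

    Any failure on the new cluster is swallowed, so experiments can
    never impact the real request.
    """
    if random.random() * 100 < MIRROR_PCT.get(query_type, 0):
        try:
            run_on_new_cluster()
        except Exception:
            pass  # the experimental cluster must never break live traffic
```

Writes and deletes would typically be mirrored at 100% so the new cluster's data stays meaningful, while reads can be ramped up gradually to probe capacity.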

After some experimentation we found a setup which worked well and gave us much more capacity to grow.

The new cluster has 8 servers as masters, replicating to 8 identical machines as a hot-backup.

Each machine has these specs:
  • 12 Cores
  • 96GB RAM
  • 200GB Intel SSDs

Since our existing cluster is sharded over 16 masters, we kept the data split into 16 logical shards and put 2 on each master machine in separate databases.
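To make the layout concrete, here is a small Python sketch of one plausible mapping of the 16 logical shards onto 8 masters; the modulo scheme and database names are assumptions for illustration, not our actual routing code.

```python
N_SHARDS = 16   # logical shards, carried over from the old cluster
N_MASTERS = 8   # physical master machines in the new cluster

def shard_for(folderid):
    # Assume rows are sharded by folder id, so a whole in-box
    # always lives on one shard.
    return folderid % N_SHARDS

def master_for(shard):
    # Two logical shards per master, e.g. shards 3 and 11 share master 3.
    return shard % N_MASTERS

def database_for(shard):
    # Each shard sits in its own database on the machine.
    return "messages_shard_%02d" % shard
```

Keeping the original 16 logical shards means no resharding of the data itself: if a shard later needs its own machine, only the shard-to-master mapping changes.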

Syncing the data

Satisfied with the hardware setup, we needed to find a way to synchronise all the data. The initial import step was made more difficult by the move to InnoDB - it took several days to convert the most recent backup to InnoDB before the new cluster could start receiving new write traffic.

This left us with a gap in the data between the old and new clusters and we needed to find a way to get them back in sync without impacting the performance of the live cluster.

The nature of the data meant that even if we could efficiently copy only rows inserted within the missing period, that would still leave the data inconsistent where deletes happened during that time. People would start to see notifications they had already dismissed popping back up again.

So to be thorough, we needed to crawl all the data and ensure its consistency. Rather than copying everything again, we looked into ways to compare consistency between the live data and the new cluster without impacting users.


We couldn't find any documented technologies or techniques used by others that would work for us without taking the production cluster off-line, so we came up with a solution that, although not perfect, has proven useful in this case.

It's all based on checksums.

A checksum is a number obtained by doing some calculations on the numbers that make up any piece of data on a computer. In general, if the checksums for two pieces of data are the same, the data is very, very likely to be identical; if the checksums are different, the two pieces of data are guaranteed to be different. The important thing is that using a checksum we can compare many thousands of rows with just one query to each cluster and very little data transferred.
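The principle can be demonstrated in a few lines of Python, using zlib's CRC32 as a stand-in for MySQL's: checksum each row's key, sum the per-row checksums, then checksum the total. Because addition is commutative, two clusters agree even if they return rows in different orders.

```python
import zlib

def batch_checksum(rows):
    """Checksum a batch of primary-key tuples: CRC32 each row's key,
    sum the per-row checksums, then CRC32 the total. One small number
    summarises thousands of rows."""
    total = sum(zlib.crc32("|".join(map(str, r)).encode()) for r in rows)
    return zlib.crc32(str(total).encode())
```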

We came up with a query that could calculate the CRC32 checksum of all the primary keys in a batch of in-boxes surprisingly efficiently. In this dataset, rows are never updated - just inserted or deleted - so primary keys are the only data points needed to ensure consistency.

The query looks something like this:

SELECT CRC32(SUM(row_cs)) as cs
  FROM (
      SELECT CRC32(CONCAT(folderid, type, itemid)) as row_cs
        FROM table
       WHERE folderid IN (1,2,3,4,...)
    ORDER BY folderid, type, itemid
  ) as tmp

[edit] rehael pointed out correctly in the comments below that the ORDER BY is unnecessary here. SUM is commutative so it doesn't actually matter if natural order is inconsistent in the inner query. This must have slipped in from an earlier attempt to accumulate rows without the sub query. In this case sort is on primary key so I guess it made little difference, but a good spot. I've left it in for posterity.[/edit]

[edit2] Another interesting potential flaw here is that CONCAT could produce false positives on some data sets. A more correct solution would be to use CONCAT_WS() with a delimiter char such that a row like (123, 456, 78) would not falsely checksum equal to (12, 345, 678). I'll credit pt-table-sync for this realisation (see bottom of article).[/edit2]
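The false positive described in the second edit is easy to reproduce. Here is a tiny Python model of the two SQL functions (hypothetical helpers mirroring CONCAT and CONCAT_WS):

```python
def concat(*parts):
    # like SQL CONCAT: joins values with no separator
    return "".join(map(str, parts))

def concat_ws(sep, *parts):
    # like SQL CONCAT_WS: joins values with a delimiter
    return sep.join(map(str, parts))
```

The rows (12, 345, 678) and (123, 456, 78) both concatenate to "12345678" without a delimiter, so they would checksum as equal; with a separator the ambiguity disappears.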

Using MySQL's EXPLAIN showed that this query didn't create a temporary table and could select and sort efficiently using the primary key. This meant it was more likely to run quickly and not cause problems for real users of the site. In testing, we found it ran in under 200ms even for the biggest batches of folders we could find.

Despite the checksum being apparently quick, we still couldn't actually copy the out-of-sync data from big in-boxes from our master servers without locking extremely high-traffic tables, so we ran the checksums and copied the data from our hot backups.

To balance the amount of data copied against the number of checksums needed, we first checksummed an entire batch of 50 in-boxes at a time. If there was a mismatch, we checksummed each in-box in the batch in turn; for any that didn't match, we deleted all of its rows from the new cluster and copied all of its rows from the old. There are more sophisticated options we could have used at this level to optimise for big in-boxes where not a lot changed, but none that were without downsides.
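In outline, the two-level compare-and-repair step looks like this (a Python sketch; the `checksum` and copy helpers are hypothetical stand-ins for the real queries):

```python
BATCH_SIZE = 50  # in-boxes checksummed per batch query

def sync_batch(folder_ids, old, new):
    """Compare a batch of in-boxes between clusters and repair mismatches.

    `old` and `new` are assumed to expose checksum(folder_ids),
    delete_rows(fid) and copy_rows_from(old, fid) helpers.
    Returns the ids that had to be re-copied."""
    if old.checksum(folder_ids) == new.checksum(folder_ids):
        return []                          # whole batch already consistent
    repaired = []
    for fid in folder_ids:
        if old.checksum([fid]) != new.checksum([fid]):
            new.delete_rows(fid)           # wipe the in-box on the new cluster
            new.copy_rows_from(old, fid)   # and recopy it from the old one
            repaired.append(fid)
    return repaired
```

Most batches match on the first checksum, so the expensive wipe-and-recopy path only runs for in-boxes that actually drifted.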

In reality the checksum query turned out not to be super-fast all the time, but we found more than 99.5% of the few million we ran in the end executed in under 200ms and over 99.8% in less than 1 second. In general checksum query load was minute compared to the load of the data-copying queries for big in-boxes.

You may not want to try this at home

There are a few flaws in this technique.

Firstly, the fact that we had to sync from replication slaves meant that any replication lag between master and slave caused inconsistency in the copied data. Once there were a few slow copy queries running, the replication lag would get worse. We found that it generally kept up very well, but we couldn't guarantee 100% consistency in one pass.

Secondly, even ignoring replication lag, checksums could not be atomic between the clusters - a user may have deleted or received a message between the time you get the checksum from one cluster and from the other.

So we knew a single pass through the data would not be sufficient. However, thanks to the checksum trick, we could continue running the sync job through the dataset and each time a few orders of magnitude less data would be found out of sync and need to be copied. Each pass also got substantially quicker as less data needed to be transferred.
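The repeated passes can be framed as a simple driver loop (illustrative only; the real job ran many folder chunks in parallel on our distributed job cluster):

```python
def run_sync_passes(total_inboxes, sync_pass, target=0.999995, max_passes=10):
    """Keep making full passes until measured consistency hits the target.

    `sync_pass` performs one crawl over all in-boxes and returns how many
    it found (and repaired) out of sync; each pass should find orders of
    magnitude fewer than the last."""
    for n in range(1, max_passes + 1):
        out_of_sync = sync_pass()
        consistency = 1.0 - out_of_sync / total_inboxes
        if consistency >= target:
            return n, consistency
    return max_passes, consistency
```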

After just 3 passes (several days for the first, a few hours for subsequent) we measured 99.9995% consistency. We called the job good and switched users over to read and write primarily from the new cluster. We're still writing to the old one for a week or so until we are totally sure of the new cluster.

Obviously this sort of slightly fuzzy consistency would not be suitable for other data that may require absolute integrity (like customers' orders). For us, it's acceptable because:
  • a significant proportion of the < 0.0005% of in-boxes still measured as out of sync will actually be fine - just reported due to the measurement error noted above
  • statistically it is likely that any in-boxes genuinely out of sync are very large ones (100k+ notifications) and that their owners never look through and curate or remove them
  • in the highly unlikely case that a user did notice a discrepancy, it is more likely to be a notice they deleted re-appearing (they can fix that in one click)
  • these are not personal messages or unique content, just notifications of activity elsewhere on site

The useful take-aways

Despite not being a perfect solution to all live migration problems, I hope this has been somewhat interesting and may even help others with similar tasks. The exact plan we used is pretty specific to our situation but I think there are a couple of more general points to take away from this:
  • Migrating large amounts of data from one MySQL server cluster to another without affecting operation or losing writes is hard and apparently no one has come up with a really good solution yet
  • Under different circumstances, it may be possible to be more robust than this although probably at the expense of taking longer, requiring more resources, or making the service unavailable for some time
  • Checksum queries, although not a panacea, are a potentially useful tool that are not widely discussed for on-line MySQL consistency checking

Big thanks to chris and randomduck for their hard work on this upgrade.

[edit] A number of people pointed us at pt-table-sync from Percona's Toolkit. That script actually ends up doing something almost exactly like what we did here (with different checksumming techniques). I didn't know of it and somehow didn't find it when looking for similar solutions. It looks like a much more robust solution in general although would not (easily) have been able to take advantage of our distributed job cluster which enabled our migration to run sync queries on many folder chunks in parallel, so it probably would have taken much longer.[/edit]

We've launched a set of APIs for deviantART, some of which were previously available but had yet to be documented in a central place.

What can be done with these APIs?

Embed artwork from deviantART on your website or app, get feeds of quality art based on any keyword, and submit any content for instant sharing.

The API calls we've made available are deliberately simple but open many doors. For example, we provide the ability to update an item that has already been submitted through the API. That opens the possibility of creating a live link between your app's instance of the artwork and the version we host. It even makes it possible to create live deviations that are updated automatically - the kind of possibility one might not consider at first glance when reading the specification.


The APIs have gone through a beta period and several products are already using them.

Other APIs?

We have already received requests for new APIs to integrate with deviantART, and we will consider all suggestions we receive. Bear in mind that the decision process for adding new API calls can be lengthy, because when we add an API call we want to support it forever.

We don't want to be like some other large social network's API, where calls are deprecated every 6 months and the rules change all the time. We intend to provide stability: if we add an API call, it's meant to stay, which is why we're being careful about what we add.

Get started

All the information and working code samples can be found on our new developers page:

5962999454 1633eeeb7d B by randomduck

The Place
The dt developer group is 100% remote. We wrote about it before in We're all remote. Well, what we did not mention is that once per year, we get to see each other (often for the first time) and work in the same room on exciting new projects.

Untitled by drigh
We all decided to go to this place!

Untitled by randomduck
(it looks much nicer in HDR)

Untitled by drigh
Among the many sights: A rare action shot of Pachunka racing swiftly to the conference room at night, shooting beams of light and magic and rollerskates

Untitled by drigh
Ah, Sunny San Francisco!

DT People
Ten years ago, a crack commando unit was sent to prison by a military court for a crime they didn't commit.... Wait, no. That's not us. We're an assortment of developers from around the world who largely are not in prison.

Conehead by mudimba
A typical Irishman.

DEVlANT always has bugs in his sights. :lmao:

P7170245 2 by randomduck
chris, enamored by the prospect of installing Java, hardly able to contain himself.

We're still not sure who this is. Possibly poodude.

Quiet Moment by mudimba
A strict requirement for joining DT is the ability to look killer-cool in a night shot like randomduck here. Do you have what it takes? Apply Now

mr $jekor by allixsenos
jekor, unable to see straight after a 20-hour filibuster about why emacs is the ultimate editor... His real reasoning? M-x tetris

Wait'll They Get a Load of Me by mudimba
We admit our Product Managers here are as creepy and demented as anywhere else.

sneaking up on ArtBIT by allixsenos
Cell phone reception at Marin Headlands was so spotty that ArtBIT resorted to wearing antennae'ed helmets in hopes of picking up a better signal.

The Raccoons
There were a lot of raccoons there. They possessed no fear of humans. They have the ability to unzip bags and rummage through them. Apparently they desire our medication. It's a bit weird.

Raccoons! by 20after4
I can haz ibuprofen?

5972542452 Ecc75c6036 B by randomduck
Seecret Raccoon Brotherhood Alliance headquarters, mortal foes of the elite red scooter gang

Img 20110721 214136 by randomduck
Contrary to popular belief, any site breakage during the meetup was caused by these masked jerks with freakin' laser beams.

5963257462 5b4c749784 B by randomduck
"Suhweeeet, it looks like they're leaving! Now's my chance to sync AND enter read-only mode!"

Untitled by randomduck
"I bet I can pee on ArtBIT's jacket from here!"

5963007498 20b80eb808 B by randomduck
20after4 vanished mysteriously one chilly July night to engage in hand to hand combat with a horde of graham-cracker wielding raccoons. Some say they were after his tomato router.  drigh assisted, notoriously denying after the fact that he *only* kills man. A svelte raccoon sporting LGnome's gray jumper still parades the headlands, high on ibuprofen. Best part: free decals!

The Work
During our Pair Coding and two Hackathon sessions, we got to build a lot of stuff together, in the same space, for one of the few times in DT history. So here goes the story of a thousand commits, 127 deployments, 17 tech talks and 7 ignite talks, 55 project pitches and 16 completed projects, two lost+found developers, 1 rogue server, 19 pair-coding sessions, a chat session with the community, and a long read-only.

Untitled by randomduck
Now, some of the work involves thinking. It's not easy.

Pairing off session by KnightAR
The wonderful room where the work happened. Gosh, there's a lot more of us than there used to be. Pair-coding partners were picked by drawing random ducks out of two boxes. Yes, we found over 20 unique ducks. This possibly explains a sudden change in dt member avatars toward... ducks.
:iconbanks: :iconamoniker: :iconartbit: :iconrandomduck: :iconkouiskas: :iconloopstart: :iconartbitfailplz:

Untitled by randomduck
mudimba and kouiskas working hard(ly?).

We code in helmets by KnightAR
The Flow helmet...

Untitled by randomduck
Now, this shot is a blurry night shot. Why? We can't get DT to stop working even at 1AM.

Untitled by randomduck
kemayo is oblivious to the base murder being perpetrated behind him on poor departed jekor

Untitled by randomduck
Yes we do have a lot of macbooks, why do you ask?

Untitled by randomduck
Admit it; you like the song, says ArtBIT. Little known fact: dt is working on a new primitive web language, similar to LOLCode, based around the collected works of Rick Astley. This should be a significant upgrade from the COBOL/LOGO combination that we are currently using.

Untitled by randomduck
aMoniker pair codes with a raccoon sneaking on the roof ledge

Untitled by randomduck
A mutant sneak peek of sorts? A combination of nearly every single Hackathon project in one screenshot.

The Food
Three healthy, regularly scheduled meals a day was a mind-blowing new experience for most of dt. Several varieties of raccoon basted finely in the juices of other raccoons - a culinary treat had by all, served on the side with biscuits, not scones.

Untitled by randomduck
8 A.M. breakfast. Turnout is staggering. Never before has mudimba been seen at this hour.  And apparently the food was tied with our phones at capturing our interest.

a working lunch by allixsenos
Sometimes you have to finish a demo during lunch. Well, LGnome does, anyway.

P7170240 2 by randomduck
spyed loves corn, amirite?

P7170258 2 by randomduck
Nah, too obvious.

The Fire
Approximately every other night there was a campfire. There we sat, and were contractually obligated to not consume alcohol.

Untitled by randomduck
Bip bideh bip bideh bip bideh beaaaah  (modem-off).  Participants attempted to communicate entirely in modem mode. Alas, handshakes were broken due to beer gulping interference and inability to negotiate the baud rates. mccann told a cautionary campfire tale warning us that all men in the entirety of the universe use 69camaro as their password while all women have a password declaring that the object of their infatuation is "dreamy."

The Barracks
You smell that? Take a deep breath through the nose. Really let that seep in. What are you getting? Because to me, that's part man-smell, and the other part is really bad man-smell. I don't know why, but overall it just smells like the color brown. Your thoughts?

Untitled by randomduck
mccann and xraystyle enjoy the ambience of the nuclear outpost on arrival

Untitled by randomduck
We may have gone overboard on providing mouthwash.

Untitled by randomduck
Are mudimba, Pachunka and ArtBIT grinning because they just found out that 20after4 stole randomduck's smashed-up wake-up trumpet?

And the music video
Look, we've already rickrolled you in this post. It must be safe to play a video now.

No raccoons, other animals or the deviantART site were harmed during the filming of this episode. $chris was hungover, and someone shot $pachunka, but that's about it. All characters appearing in this work are fictitious. Any resemblance to real persons, living or dead, is purely coincidental.

#DT Live Chat Today

Tue Jul 19, 2011, 12:23 PM by randomduck:iconrandomduck:
As our About Us page attests, dt, or Devious Technology, is deviantART's largest department.  These shy, elusive creatures work furiously behind the scenes to make deviantART run smoothly, zap bugs with precision, and create brand new tools to make dA a better place.

Today at 1:30 PM PST (a mere 3 hours from now!), dt will make a rare, exclusive appearance in the #Auditorium, allowing you the opportunity to observe them in their natural habitat.  This is your chance to get those tech gears in your brain a-churnin’ with some wonderfully geeky dialogue!

:star: Have you ever wondered:

:bulletblue: How deviantART functions technically from day-to-day?
:bulletblue: What kind of server power it takes to run a website as large as this?
:bulletblue: How bugs are detected and fixed?
:bulletblue: Why so many dt members have a rubber duck as their avatar?

These questions and more can be answered by attending our chat with dt today in the #Auditorium!  If you have a question you want to ask, leave it in a comment on this news article, where you can find all the details about this event!

While members of the dt team live all over the world, they've banded together to fight crime and talk tech for the past few days in the headlands of San Francisco.  They will be coming to you live from their secluded cabin, ready and able to answer your questions.  Join us in the #Auditorium at 1:30 PM PST (click here for your local time) for a tech-y good time! :woohoo:
