Project: BCX: Exporting

Where should we run exports?

Posted by Jeffrey Hardy

Hey guys,

Full account export for BCX is nearly ready to ship. But before we do that, we've got to make sure we've got the horsepower to handle it!

It's quite similar to Classic's exporter, and I expect its memory footprint to match. It iterates over an entire account's contents. That can potentially be a lot of records.

I've picked the low-hanging fruit already (batch-finding, query caching, and template compilation) -- the stuff we added to the exporters in Classic and Highrise after initial performance was dismal.
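
(Roughly what that batching looks like, in sketch form -- account, write_page, and render_project are stand-ins, not the real exporter code:)

# Batch-find so we never hold a whole account's records in memory,
# and run inside the query cache so repeated lookups are cheap.
ActiveRecord::Base.cache do
  account.projects.find_each(:batch_size => 500) do |project|
    write_page render_project(project)  # render_project reuses a precompiled template
  end
end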

We can make adjustments later, once we see how it performs with real accounts.

One big difference from Classic's exporter is that it includes files. We iterate over each attachment, symlink it into our local export sandbox, and shell out to the system zip command to archive the whole thing.

Here's where we symlink:
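
(The snippet itself didn't survive the trip to this page; in sketch form, with sandbox_path and path_on_depot as stand-ins:)

# Symlink each attachment's depot file into the sandbox rather than
# copying it -- instant, and no extra disk used.
account.attachments.find_each do |attachment|
  link = File.join(sandbox_path, "attachments", attachment.filename)
  FileUtils.mkdir_p(File.dirname(link))
  File.symlink(attachment.path_on_depot, link)
end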

And here's where we archive:
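
(Likewise a sketch:)

# Shell out to the system zip. zip follows symlinks by default, so
# the archive ends up with the real file contents.
Dir.chdir(sandbox_path) do
  system("zip", "-rq", "export.zip", ".") or raise "zip failed"
end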

When we're done, we move the archive from the sandbox to the depot (a soon-to-be-written service will remove these from the depot after 3 days):
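
(Sketch again; depot.put is the client call we already use elsewhere, though the exact signature here is approximate:)

# Push the finished archive to the depot, then toss the scratch space.
depot.put(File.join(sandbox_path, "export.zip"), export_key)
FileUtils.rm_rf(sandbox_path)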

So, we'll need plenty of memory and plenty of space.

My question is: on which meaty server should we run these jobs?

I'm guessing export-01.


Noah Lorang
A suggestion I've made before, but maybe we can do it this time:

Can we avoid putting the archive back on the depot? Either by serving it directly from the export host or sticking it on S3?

Seems silly (and hard on the poor little Isilons) to pull a bunch of files off the depot, zip 'em up, and stick them back on the Isilon, only to delete them a couple of days later.
Jeffrey Hardy
The depot is certainly convenient (and I'm lazy). Uploading to S3 before serving will take longer, and means more moving parts to maintain. Happy to defer to the experts on what's best, though.  
Jeffrey Hardy
Something I forgot to mention: we can run exports against the slave (for both BCC and signal_id's databases). Exporters only need read access.

I haven't written the code to do this yet, but it's easy to pop it in. Any reason not to?
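
(Something like this, with "slave" standing in for whatever the replica entry is called in database.yml:)

# Point ActiveRecord at the read replica for the whole export worker.
# Exporters never write, so reads can safely come from the slave.
ActiveRecord::Base.establish_connection(:slave)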
Jeffrey Hardy
Is S3 the consensus then? I'm going to push back a little. If the depot can handle it, we save a lot of complexity by taking advantage of it.
Taylor Weibley
What about just adding a tiny app that does nothing but auth, and putting the file on the local fs of export-01 (thus letting it get served from there)?

NFS is dead. We want to build less dependency, not more.
Jeffrey Hardy
The problem with just adding a tiny app is that it's more moving parts. Isn't this what our depot abstraction is for? We're already dependent on NFS. I don't think this will make our divorce any more difficult.
David Heinemeier Hansson
Also, wouldn't we be paying S3 transfer costs if we put them there? That might not be peanuts if people are exporting GBs of data regularly.

My vote would be for local NFS storage as well. If we are moving away from NFS, presumably the same solution can be applied to everything at once?
Taylor Weibley
Cue my increasing frustration (not with you Jeff, but with NFS)...

I'm pushing for an alternative, which yes requires more work up front, but decidedly less pain and work later.

By the same logic, adding a non-NFS depot should be easy since the abstraction is already there.

There are real performance problems with our existing NFS depot that we encounter and deal with every single day. We've made zero, no less than 0, progress with Isilon in resolving the performance problems. Yet we continue to deploy more and more onto the same depot.

My opinion is that we have a habit of deploying things to production that hurt too much and/or verge on broken (performance-wise) because it's good enough or that's the way we already do it. Then, once things begin to crash, we all rally around Jeff, or Jeremy, or Jamis while some workaround is developed to relieve the pressure. At the same time, we spend more time troubleshooting the broken system instead of working on something without the same issues.

Large files cause lots of pain for NFS. Every. single. day. 2G files stall the BC and BCX NFS depots on individual servers.

My suggestion remains that we use S3, Kennel, or any local storage. It can be wired up by making X storage target publicly available (via private uri), and then returning that storage location once the user is auth'ed. I've talked this over with JK a number of times and the most painful part is migration of the attachment table, which we can do online now.
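
(Hand-waving the storage side, the wiring is roughly this -- signed_url and export.key are stand-ins for whatever the backend gives us:)

# Auth, then hand back a private, expiring URL pointing at wherever
# the export actually lives (S3, Kennel, local disk behind nginx).
def download
  export = current_account.exports.find(params[:id])  # auth via account scoping
  redirect_to export.depot.signed_url(export.key, :expires => 10.minutes.from_now)
end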

I'm on board with helping however I can. It's a lot better than fighting NFS.
Jeffrey Hardy
David, that's what I was hinting at with "depot abstraction". We used the same depot client when we were on MogileFS. If we weren't using that abstraction, changing storage backends would be painful.
Jeffrey Hardy
I'm sympathetic to your NFS frustration, Taylor. If 2GB files are causing problems already, then perhaps we don't have a choice.

I can start investigating S3. I'm going on support next week. If I'm not able to finish this up in time, someone else can take it the last mile.
Taylor Weibley
Sounds good. Thanks for being sympathetic to the reality of how NFS works today and the problems that causes.

Jeff, one thing that would be good to track is a) how often someone grabs the export (vs. never), and b) how many times.

Again I'm happy to help however possible.

For everyone's reference: There is no charge for bandwidth-in on S3. We also have a free 1G pipe from Nlayer so we don't even use our paid bandwidth. Total cost last month was $38.
David Heinemeier Hansson
If NFS is so unworkable, let's get a group going on coming up with something else. Or better yet, find something else out there. We can't be the only ones out here with this problem.

I just don't see how coming up with some one-off for exporting is going to do us that much good. If this is a big problem, let's treat it as such and come up with a proper, comprehensive, and system wide solution.
Jeremy Kemper
I'd say pile this on and hope it's the straw that breaks the camel's back. :trollface:

Run the export jobs on export-01 itself, put the staging directory on local disk, symlink files to NFS, and build the zipfile on local disk. Serve the .zip download URL via nginx secure link or internally reproxy it using our new nginx hotness.
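
(From the app's side the reproxy half is just a header -- the /protected location name is made up:)

# Let nginx stream the finished zip straight off export-01's local disk.
def show
  export = current_account.exports.find(params[:id])
  response.headers["X-Accel-Redirect"] = "/protected/#{export.filename}"  # internal nginx location
  response.headers["Content-Type"] = "application/zip"
  head :ok
end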

Let's lean on NFS where it works well, like this case, and pay for that luxury by putting pressure on NFS -- and eliminating it -- where it's a painful misfit, like all the apps still doing per-request file stats.
Taylor Weibley
Jeremy,

I'd be willing to "trade" fixes in all of the major applications around transactions and depot.put/copy for this. As long as we all agree that once this starts causing problems it goes to the top of the list and doesn't get put on the back burner.
Jeffrey Hardy
Before we go there, looking at the FileRepository::S3 depot you showed me last night, Taylor, it seems like I'll be able to use it without much hassle.

BCX.s3_depot = FileRepository::S3.new
Export.depot = BCX.s3_depot if Rails.env.production?

Export doesn't care about its backend. It just wants to talk to a depot.

Exposing a general S3 depot will let us consolidate our S3Backup concerns too, which currently talk to S3 directly. Seems a worthy investment.

I'll take a stab at it today. If it proves too much of a pain, NFS is waiting in the wings.
Noah Lorang
An old one right back again:
(Sorry, couldn't resist. I know there are more in the archives too...)
Jeffrey Hardy
Update: I switched us over to S3 for hosting and serving completed exports. I think we're about ready to roll!

The only remaining question I have is how to force these jobs to run on export-01. I haven't checked on how to do this yet. Anyone know off the top?
Jeremy Kemper
Could do a separate resque-pool config for export-01, like we have for beta, and only enable the job there. Need to ensure the queue name doesn't match bcx_*_production, though, or else the main worker pool will pick it up.
Jeffrey Hardy
John, it's just a Resque job, but it performs the export. So we need to make sure we're only running it somewhere with enough juice.

Jeremy, makes sense. Having a hard time thinking of a name that doesn't match bcx_*_production though! We could use export_* as the prefix. If other apps need to use export-01 similarly (we've discussed it for HR), they could follow the same convention.
Jeffrey Hardy
Ok! Getting closer. We're running against the slave database now, and the export queue is named such that it won't get picked up by the main worker pool. (I think, anyway. It'll be named "export_bcx_production" in production -- kosher?)
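
(For concreteness, the job boils down to something like this -- class and method names approximate:)

class AccountExportJob
  # Deliberately outside the bcx_*_production glob the main pool watches.
  @queue = :export_bcx_production

  def self.perform(export_id)
    Export.find(export_id).run  # kicks off the full export
  end
end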

I could use a little help on the final bits: making sure export-01 is set up as a deploy target (it's being used in the :asset_precompiler role), configuring it as the sole processor of the export queue, and whatever else it needs to assume its new role.

Is anyone from ops interested in huddling with me on this? I'm on call this week and haven't had much spare time. I'd really appreciate a hand.
Taylor Weibley
Adding a request:

  • We need some method of making the config on export-01 include that queue in its resque-pool.yaml file.
  • deploy/production.rb changes to add the host to the right role. (Both sketched below.)
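
Roughly what I'm picturing (role name and worker count illustrative):

# deploy/production.rb
role :export_worker, "export-01"

# and in export-01's resque-pool.yaml, something like:
#   production:
#     export_bcx_production: 1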