Background Processing of Long running Tasks

Grunthos edited this page Mar 6, 2012 · 4 revisions

a.k.a. Back-to-the-Future, a.k.a. We Get to write a Batch System, a.k.a. if it's good enough for Oracle, it's good enough for us, a.k.a. if it's good enough for Oracle, we should reconsider...

###Status: Implemented

Implemented in a separate open source library stored in git: TaskQueue. The jar files are included in the BookCatalogue project.

###Why?

  • GoodReads integration (especially export/import/sync) will create very long running tasks for large libraries (perhaps 1 hour for 1000 books).
  • It is unacceptable to lock a user out of an app for that time.
  • We have existing code ('update fields from internet') that already locks up the app.
  • The user may just quit the app and lose progress.
  • The phone may die (we sucked the battery dry!).
  • The network may become unavailable halfway through.

So...we need a way to recover and restart with minimal loss of data, and maximal utility for the user.

###Can we do it any other way?

Probably.

  • We could use a background thread and save the thread state when the app exits. To achieve flexible results, this would be almost as complex as a 'service' running a batch queue, while not as cleanly defined in the Android world. The Android dev docs recommend services for long-running background tasks, and once you have a service, you need to manage it, so it becomes a batch queue in all but name.
  • We could do it in foreground and recognize that 90% of users would be ok. But I have > 1000 books and would be really annoyed.

###How?

Research needed, but current plan is:

  • Create a Service
  • Create a new 'background_tasks' table in the database
  • Create a new BackgroundTask base abstract class that is serializable
  • Subclass it for specific tasks
  • The service reads the table in ID order and runs the tasks (perhaps more than one at a time, but probably not), saving them on failure and deleting them on success.

####BackgroundTask Definition

Methods:

  • boolean run(int id): the id is the task's ID from the DB, in case the task stored context under it
  • load(int id): maybe, but probably not; see if it is needed. It would be called after creation
  • whatever is needed to support deserialization

The run() method will return true on success and false on failure, and will throw an exception otherwise.
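A minimal sketch of what such a base class might look like, assuming plain Java serialization is used to store tasks as bytes (for example in a BLOB column of the 'background_tasks' table). Only run(int id) comes from the definition above; toBytes(), fromBytes() and the ExportBookTask subclass are hypothetical names used for illustration, and the actual TaskQueue library may do this differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical base class: only run(int id) is taken from the design notes.
abstract class BackgroundTask implements Serializable {
    private static final long serialVersionUID = 1L;

    /** Returns true on success, false on failure; throws for anything else. */
    abstract boolean run(int id) throws Exception;

    /** Serializes this task to bytes so it could be saved in a database row. */
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds a task from bytes previously produced by toBytes(). */
    static BackgroundTask fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (BackgroundTask) in.readObject();
        }
    }
}

// Hypothetical subclass for one specific kind of task.
class ExportBookTask extends BackgroundTask {
    private static final long serialVersionUID = 1L;

    final long bookId;

    ExportBookTask(long bookId) {
        this.bookId = bookId;
    }

    @Override
    boolean run(int id) {
        // A real task would talk to the network here; this stand-in
        // simply reports success for any positive book ID.
        return bookId > 0;
    }
}
```

With something like this, a task could be queued by writing `new ExportBookTask(42L).toBytes()` into the table, and the service could later call `BackgroundTask.fromBytes(...)` on the stored bytes before invoking run().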

####Service processing

Jobs will be removed from the queue in ID order and the run() method called.

  • Any exception will result in the job being marked as 'failed, user intervention required'.
  • A 'false' return will result in the job being queued for execution later using a back-off strategy (assumption being that it is a transient error). The job will have a retry counter, and will retry up to a certain number of times. Each retry will result in a longer delay.
  • A 'true' result will mean that the job is deleted (or marked for deletion).

In the first pass, if a job fails with an exception or 'false' return, then the next job will start.
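The rules above can be sketched in plain Java, using an in-memory map as a stand-in for the database table. This is only an illustration under several assumptions: the names Task, QueuedJob and QueueSketch are hypothetical, the retry limit and base delay are made-up values, the delay doubles on each retry (the notes only say "longer"), and a job that runs out of retries is treated as failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the task's run() contract.
interface Task {
    boolean run(int id) throws Exception;
}

// Hypothetical queue entry: the task plus its retry bookkeeping.
class QueuedJob {
    enum State { QUEUED, FAILED }

    final Task task;
    State state = State.QUEUED;
    int retries = 0;
    long nextRunAtMillis = 0;

    QueuedJob(Task task) {
        this.task = task;
    }
}

class QueueSketch {
    static final int MAX_RETRIES = 5;               // assumed limit
    static final long BASE_DELAY_MILLIS = 60_000L;  // assumed first delay

    // A TreeMap iterates in ascending key order, i.e. in job ID order.
    final TreeMap<Integer, QueuedJob> jobs = new TreeMap<>();

    /** Assumed back-off policy: the delay doubles with every retry. */
    static long backoffDelayMillis(int retries) {
        return BASE_DELAY_MILLIS << (retries - 1);
    }

    /** Runs every job that is due; one job failing never blocks the next. */
    void runPass(long nowMillis) {
        List<Integer> succeeded = new ArrayList<>();
        for (Map.Entry<Integer, QueuedJob> entry : jobs.entrySet()) {
            QueuedJob job = entry.getValue();
            if (job.state != QueuedJob.State.QUEUED
                    || job.nextRunAtMillis > nowMillis) {
                continue; // failed earlier, or still waiting out its delay
            }
            try {
                if (job.task.run(entry.getKey())) {
                    succeeded.add(entry.getKey());           // 'true': delete
                } else if (++job.retries > MAX_RETRIES) {
                    job.state = QueuedJob.State.FAILED;      // retries used up
                } else {
                    job.nextRunAtMillis =
                        nowMillis + backoffDelayMillis(job.retries);
                }
            } catch (Exception e) {
                // Exception: failed, user intervention required.
                job.state = QueuedJob.State.FAILED;
            }
        }
        jobs.keySet().removeAll(succeeded);
    }
}
```

Passing the current time into runPass() keeps the sketch easy to test; a real service would presumably read the clock itself and persist the retry counter and next-run time alongside the job.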

In an ideal world, it would be possible to mark one job as depending on 0 or more other jobs...but that can wait.