Scaling with Concurrency

Scalability Pattern: Concurrency

  • Do more with the available resources
  • Do more things at the same time
  • Resource idle time is your enemy
    • CPU
    • Network
    • Disk
    • Database

Scenarios when concurrency is an option

  • Users can ask for a detailed report about their Twitter traffic
    • It can be a one-shot, or regenerated at midnight
    • It can be displayed on the screen or emailed
    • Generating it inline (inside the request) is no good because of computational cost and error handling
  • User enters a new value in a spreadsheet
    • The cell can update immediately
    • The recalculation can be done concurrently (asynchronously)
  • Note: AJAX uses a very common form of concurrency
  • Note: Count the user’s own computer as one of the processors!
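
The report scenario above can be sketched as a background job: the request returns a job id at once, and the caller checks back later. This is only a minimal sketch using a thread pool from the Python standard library. The names `build_report`, `request_report`, and `report_status` are hypothetical, and a production system would more likely use a background processing framework.

```python
from concurrent.futures import ThreadPoolExecutor
import itertools

# Assumption: a small in-process pool stands in for a real job system.
executor = ThreadPoolExecutor(max_workers=2)
jobs = {}                      # job id -> Future
_next_id = itertools.count(1)  # simple job id generator


def build_report(user):
    # Hypothetical stand-in for the expensive report computation.
    if not user:
        raise ValueError("unknown user")
    return f"traffic report for {user}"


def request_report(user):
    """Start the report in the background and return a job id at once."""
    job_id = next(_next_id)
    jobs[job_id] = executor.submit(build_report, user)
    return job_id


def report_status(job_id):
    """Return (state, payload) without blocking the caller."""
    future = jobs[job_id]
    if not future.done():
        return "pending", None
    error = future.exception()
    if error is not None:
        # Errors surface here instead of breaking the original request.
        return "failed", str(error)
    return "done", future.result()
```

Because the failure is stored with the job, the error handling moves out of the request path, which is one of the reasons given above for not generating the report inline.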

Scaling pattern: concurrency on a single computer

  • Difference between synchronous and asynchronous
  • Difference between concurrent and parallel
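
A rough way to see the distinction is to run the same blocking calls one after another and then overlapped in threads. This is a sketch only: `fetch` is a hypothetical stand-in for a network or disk call, simulated here with `time.sleep`. The overlapped version is concurrent (the waits overlap), but it is not necessarily parallel, since the tasks need not execute on several cores at the same instant.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(delay):
    # Hypothetical blocking call; the sleep simulates waiting on I/O.
    time.sleep(delay)
    return delay


def run_sequentially(delays):
    """Synchronous: each call finishes before the next one starts."""
    start = time.perf_counter()
    results = [fetch(d) for d in delays]
    return results, time.perf_counter() - start


def run_concurrently(delays):
    """Concurrent: the calls wait at the same time in separate threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fetch, delays))
    return results, time.perf_counter() - start
```

With four delays of 0.1 seconds, the sequential run should take roughly the sum of the delays, while the concurrent run should take roughly the longest single delay.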

Each team: do you plan to use concurrency?

Processes and Threads - General

  • Key Operating System facility for concurrency
  • Processes (“forking”)
    • Use more memory (new virtual address space for each process)
      • for the data + the program + everything
      • “Copy on write”
    • A child that exits before its parent collects its status (wait) becomes a “zombie” process; if the parent dies first, the children become orphans
    • Context switching very expensive
    • Communication expensive (IPC or file system)
    • Slower to create and destroy
    • Easier to program and debug (but still not easy!)
  • Threads
    • Use less memory (Shared memory space)
    • All threads die when the process dies
    • Context switching cheap
    • Communication cheap (via queues and shared memory)
    • Fast to create and destroy
    • Harder to program and debug
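  • Ruby and Python “GIL”
    • Global interpreter lock (in the standard MRI and CPython implementations)
    • Only one thread runs interpreter code at a time, so CPU-bound code is essentially single-threaded
    • Except for asynchrony provided by the OS via I/O operations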
    • “It’s complicated”
  • Ruby libraries that use processes
    • Resque - background processing framework
    • Unicorn - HTTP server
  • Ruby libraries that use threads
    • Sidekiq - background processing framework
    • Puma - HTTP server
    • Thin - HTTP server (event-driven, with an optional threaded mode)
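
The memory and communication differences between threads and processes listed above can be illustrated with a small sketch. It assumes a Unix-like system, because it calls `os.fork` directly. The names `counter`, `bump_in_thread`, and `bump_in_forked_child` are hypothetical. A thread changes the parent's data in place, while a forked child changes only its own copy and has to report back through IPC (a pipe here).

```python
import os
import threading

counter = {"value": 0}  # lives in this process's memory


def bump_in_thread():
    """A thread shares memory, so its change is visible to the caller."""
    def work():
        counter["value"] += 1
    t = threading.Thread(target=work)
    t.start()
    t.join()
    return counter["value"]


def bump_in_forked_child():
    """A forked child changes only its own copy (Unix only)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: modify the copy and report it through the pipe.
        try:
            os.close(read_fd)
            counter["value"] += 1
            os.write(write_fd, str(counter["value"]).encode())
        finally:
            os._exit(0)  # leave without running the parent's cleanup
    # Parent process: read the child's view, then reap the child.
    os.close(write_fd)
    child_view = int(os.read(read_fd, 64).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)  # collecting the status avoids a zombie
    return child_view, counter["value"]
```

Calling `bump_in_thread()` changes the value the caller sees, whereas after `bump_in_forked_child()` the parent's value is unchanged and the child's value is known only because it was sent over the pipe.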

Thread-safe

  • A property of software: a program, a routine, or a class
  • Does it behave correctly when run by several threads at once (sharing memory)?
  • Deadlock (“deadly embrace”)
    • Example with two people and two tools
  • Race Condition
    • When the results vary due to the timing or ordering of the threads
    • How to avoid: using semaphores, queues, and other techniques
  • Higher level constructs, e.g. Actors (threads that only talk through queues)
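
One way to make the race condition concrete is a shared counter incremented by several threads. The sketch below shows one common fix, a lock around the read-modify-write step. The names `SafeCounter` and `count_with_threads` are hypothetical, and a semaphore or a queue would be equally valid choices.

```python
import threading


class SafeCounter:
    """A counter whose increment is protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may read, add, and write back.
        with self._lock:
            self.value += 1


def count_with_threads(n_threads, n_increments):
    """Increment one shared counter from several threads."""
    counter = SafeCounter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the total is always `n_threads * n_increments`. Without it, two threads could read the same old value and one update could be lost, depending on how the threads happen to interleave.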

Demonstrations…