MIX09 – Windows Azure Storage

Blobs

  • Two kinds of storage in Azure: SQL Data Services and Windows Azure Storage. These notes cover the base storage offering
  • Three kinds of storage: blobs, tables and queues
  • All accessible via a REST API
  • Access secured via a 256-bit key (requests are signed with HMAC-SHA256)
  • Two separate data centres in US (Northwest, Southwest)
  • Affinity for storage and computation to reduce latency (available April)
  • Blobs – named objects; the hierarchy is account, container, blob (containers are like S3 buckets)
  • Tables – structured storage (like SimpleDB)
  • Queues
  • Sharing policies are set on a container basis
  • 8KB of name/value pairs can be associated with each container
  • Listing abstractions for blobs in a container
  • Blob name space http://<Account Name>.blob.core.windows.net/<Container>/<BlobName>
  • Blobs can be up to 50GB in size
  • PutBlob, GetBlob, DeleteBlob
  • 8KB of metadata per blob
  • Support for MD5 checksum native to Storage API
  • Can use a range get to retrieve part of the blob
  • Support for block level upload to allow interruptible uploads (S3 doesn’t do this)
  • PutBlock 1-N, then commit with PutBlockList
  • Blocks can be uploaded out of order or in parallel
  • Blocks can be uploaded twice, newer overwrites older
  • PutBlockList will delete unused blocks
  • Blocks can be up to 4MB
  • Blocks can vary in size
  • Each block has a 64-byte ID, scoped by blob name
  • Overlapping get and put? A get always sees a single version of the blob, so while a put is in progress only the old blob is visible
  • First PutBlockList wins in the case where multiple PutBlockLists occur.
  • Conditional Put/Get operations to support optimistic concurrency
  • Use a hash of the block as its block ID
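
A rough sketch of the block upload flow from the notes above, in Python. It is an in-memory model only, not the real storage API: `split_into_blocks`, `InMemoryBlob` and its method names are made up for illustration, and using a SHA-256 hex digest as the block ID is just one way of following the "hash of the block" tip (the digest happens to be 64 characters long).

```python
import hashlib

MAX_BLOCK_SIZE = 4 * 1024 * 1024  # the notes say blocks can be up to 4MB


def split_into_blocks(data, block_size=MAX_BLOCK_SIZE):
    """Split data into (block_id, chunk) pairs, using a hash of each chunk as its ID."""
    blocks = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        # Assumption: a SHA-256 hex digest (64 ASCII characters) is used as the ID.
        block_id = hashlib.sha256(chunk).hexdigest()
        blocks.append((block_id, chunk))
    return blocks


class InMemoryBlob:
    """Hypothetical stand-in for a single blob; not the real service."""

    def __init__(self):
        self.uncommitted = {}  # block_id -> bytes, staged by put_block
        self.committed = b""   # the only version that readers ever see

    def put_block(self, block_id, chunk):
        # Uploading the same ID again simply overwrites the older copy.
        self.uncommitted[block_id] = chunk

    def put_block_list(self, block_ids):
        # Assemble the blob in the listed order, then drop every staged block,
        # including any that were uploaded but never listed.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

    def get(self):
        return self.committed


data = bytes(range(256)) * 40                    # 10240 bytes of sample data
blocks = split_into_blocks(data, block_size=4096)

blob = InMemoryBlob()
for block_id, chunk in reversed(blocks):         # upload out of order
    blob.put_block(block_id, chunk)

assert blob.get() == b""                         # nothing visible before the commit
blob.put_block_list([block_id for block_id, _ in blocks])
assert blob.get() == data                        # the commit restores the original order
```

Because the ID is derived from the content, an interrupted upload can be resumed by re-sending any block; a repeat just overwrites the same ID. The model simplifies the commit step by clearing all staged blocks, so it does not cover re-using previously committed blocks or competing PutBlockList calls.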

Tables

  • Billions of entities, TB of data
  • Highly available, durable
  • Account, table, entity are the key concepts
  • Table names are scoped by storage account name
  • A table is a set of entities (rows)
  • An entity is a set of properties (columns)
  • Every table has a partition key column
  • Table partition: all entities in a table with the same partition key
  • Application controls granularity of partition key
  • A heavily partitioned table makes it easier to load balance
  • Entities in the same partition will be stored together
  • Atomic handling of multiple operations over multiple entities is planned for the future
  • Partition key and row key gives primary index
  • If the partition key is part of the query it's fast; if it isn't, the query ends up scanning
  • Each entity can have up to 255 properties; the mandatory properties are partition key and row key
  • All entities have a system maintained version
  • No fixed schema, just name/value pairs
  • Access via ADO.NET Data Services (supports REST API)
  • Default number of connections is 2
  • Expect: 100-continue is on by default. Turn this off to save round trips.
  • Turn tracking off for read only queries
  • Bug in ADO.NET relating to deserialisation; the fix is to name the entity class the same as the table name
  • Be prepared for partial results from your queries
  • A query is limited to 60 seconds. After this, the results so far are returned and you must continue the query to get the rest
  • Not a relational database, no joins, foreign keys
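
To make the partition key and partial-result points above concrete, here is a small in-memory sketch in Python. Nothing in it is the ADO.NET Data Services client or the REST API: `InMemoryTable`, `make_pager` and `fetch_all` are hypothetical names, and the integer continuation token is an assumption chosen to keep the example short.

```python
class InMemoryTable:
    """Hypothetical model: entities grouped by PartitionKey, then indexed by RowKey."""

    def __init__(self):
        self.partitions = {}  # partition_key -> {row_key -> entity dict}

    def insert(self, partition_key, row_key, **properties):
        # No fixed schema: any name/value pairs can be stored alongside the two keys.
        entity = {"PartitionKey": partition_key, "RowKey": row_key, **properties}
        self.partitions.setdefault(partition_key, {})[row_key] = entity

    def get(self, partition_key, row_key):
        # Partition key plus row key acts as the primary index: a direct lookup.
        return self.partitions.get(partition_key, {}).get(row_key)

    def query_partition(self, partition_key, predicate=lambda e: True):
        # The partition key is part of the query, so only one partition is touched.
        rows = self.partitions.get(partition_key, {})
        return [e for e in rows.values() if predicate(e)]

    def scan(self, predicate):
        # No partition key: every entity in every partition has to be examined.
        return [e for rows in self.partitions.values()
                for e in rows.values() if predicate(e)]


def make_pager(items, page_size):
    """Hypothetical server: returns at most page_size items plus a continuation token."""
    def fetch_page(token):
        start = token or 0
        next_start = start + page_size
        more = next_start < len(items)
        return items[start:next_start], (next_start if more else None)
    return fetch_page


def fetch_all(fetch_page):
    """Client side: keep asking until no continuation token comes back."""
    results, token = [], None
    while True:
        page, token = fetch_page(token)
        results.extend(page)
        if token is None:
            return results


table = InMemoryTable()
table.insert("customer-1", "order-001", total=20)
table.insert("customer-1", "order-002", total=75)
table.insert("customer-2", "order-001", total=75)

assert table.get("customer-1", "order-002")["total"] == 75
assert len(table.query_partition("customer-1")) == 2         # one partition only
assert len(table.scan(lambda e: e["total"] == 75)) == 2      # walks all partitions
assert fetch_all(make_pager(list(range(10)), 3)) == list(range(10))
```

The loop in `fetch_all` is the part worth copying: treat every response as possibly partial and continue until the service stops handing back a continuation.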

Queues

  • Web Roles, Worker roles
  • Reliable message delivery
  • Access via REST
  • Account, Queue, Message
  • No limit on messages in queue
  • A message is stored for at most a week
  • Messages <= 8KB
  • http://<Account>.queue.core.windows.net/<QueueName>
  • Create/Delete/Clear Queues
  • Enqueue/Dequeue/Delete
  • Dequeue makes the message invisible. You delete it after processing. If delete isn't called, the message becomes visible again once the invisibility timeout expires.
  • Message processing should be designed to be idempotent. Each message is processed at least once and may be processed more than once.
  • No fixed ordering for dequeue of messages, but approximates to FIFO
  • Use queue length to scale your worker tasks
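
The dequeue, delete and invisibility behaviour above can be sketched with a small in-memory queue in Python. This is a model under assumptions, not the REST API: `InMemoryQueue`, the explicit `now` argument (used instead of a real clock so the example is repeatable) and the integer message IDs are all invented for illustration.

```python
import itertools


class InMemoryQueue:
    """Hypothetical queue where a dequeued message is hidden, not removed."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}  # message_id -> [body, time at which it is visible]

    def enqueue(self, body, now):
        message_id = next(self._ids)
        self._messages[message_id] = [body, now]
        return message_id

    def dequeue(self, now):
        # Return the first visible message and hide it for the timeout period.
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        # Called by the worker once processing has succeeded.
        self._messages.pop(message_id, None)

    def __len__(self):
        return len(self._messages)


processed = set()


def handle(message_id, body):
    """Idempotent handler: a redelivered message is recognised and skipped."""
    if message_id in processed:
        return False
    processed.add(message_id)
    return True


queue = InMemoryQueue(visibility_timeout=30)
queue.enqueue("resize-image", now=0)

first = queue.dequeue(now=1)            # worker A takes the message...
assert queue.dequeue(now=2) is None     # ...so worker B sees nothing
assert handle(*first) is True           # A does the work but never calls delete

second = queue.dequeue(now=40)          # the timeout has passed, message reappears
assert second == first
assert handle(*second) is False         # duplicate delivery is harmless
queue.delete(second[0])
assert len(queue) == 0
```

The handler records the IDs it has already seen, so a second delivery does no harm, and `len(queue)` is the kind of figure a scaling rule for worker tasks could be based on.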
