Skip to content

Latest commit

 

History

History
264 lines (219 loc) · 13.3 KB

3.1.1. Storage - Azure Storage.md

File metadata and controls

264 lines (219 loc) · 13.3 KB

Azure Storage

  • Massively scalable to support big data scenarios & small amount of data.
  • Access via REST API or SDKs: .NET, Java/Android, Node.js, PHP, Ruby, Python, PowerShell.
  • Auto-partitioning with automatically load-balancing based on traffic.
  • Billing is calculated by usage, not capacity.
  • Elastic & decoupled from application.
  • All resources in Storage can be protected from anonymous access and can be used in the Valet-Key pattern.
    • Valet-Key Pattern
      • Access resources using valet key (=key with granular & time-limited access for pre-defined operations/scopes).
      • E.g. SAS

Storage Types

Azure Blobs

  • VHDs and large blocks of data (e.g. images or documents)
  • 📝 Soft delete allows you to have a retention policy for files up to X days after deletion up to 365 days.
  • Entities
    • Page blobs
      • Lend themselves to storing random access files up to 8TB in size
      • Ideal for VHD storage for Virtual Machines.
    • Block blobs are for images or other documents and files up to 4TB in size.
    • Append blobs are similar in format to block blobs but allow append operations.
      • Ideal for auditing and logging applications such as log or monitoring data from many sources.

VM disks

  • Azure IaaS VMs can attach OS and Data Disks.
  • Disks can be premium storage (SSD based) or Standard storage (HDD).
  • ❗ Each storage account has a limit of 20000 IOPS throughput.
    • 💡 To be able to provide maximum performance with multiple disks => those data disks across many storage accounts.
  • Deployment
    • Through Portal, PowerShell or Azure CLI.
    • You can deploy as top-level disk resource that'll be attached later or can create within VM creation template
Azure Disk Encryption
  • Azure Disk Encryption is a capability that helps you encrypt your Windows and Linux IaaS VM disks.
  • Disk Encryption leverages the industry standard BitLocker feature of Windows and the DM-Crypt feature of Linux to provide volume encryption for the OS and data disks.
  • The solution is integrated with Azure Key Vault to help you control and manage the disk-encryption keys and secrets.
  • The solution also ensures that all data on the VM disks are encrypted at rest in your Azure storage.
Disk types
Unmanaged disks
  • Available in Standard and Premium tiers.
  • Any Azure VM size can have multiple standard disks attached.
  • A storage account can only store one type of disk.
  • Can have disk snapshots
  • ❗ Maximum of 20,000 IOPS per account, managed disk has not IOPS limit.
Managed disks
  • Simplifies the creation and management of Azure IaaS VM Disks.

    • Azure manages the storage accounts.
    • Only choose storage type and the disk size.
  • Other benefits:

    • Upgrade a standard disk to a premium disk and downgrade a premium disk to a standard disk.
      • ❗ Requires you to detach the disk from a VM before it's upgraded or downgraded.
    • The ceiling of 20000 IOPS disappears without needing to manage additional storage accounts.
    • ❗ Allows up to 10000 VM disks per subscription.
      • In VM Scale Sets allows up to 1000 VMs per scale set using a Marketplace image.
    • Better reliability for Availability Sets
      • All the disks in a VM are stored in the same scale unit
      • Each VMs disks will be stored in separate scale units
    • 99.9% availability
    • Granular access control
    • Managed disk snapshots
      • Stand-alone objects and can be used to create new disks.
      • 💡 You are only billed for the used size of the data on the disk.
        • E.g. if a 4 TB disk that holds 500 GB of data, the snapshot will be billed for 500 GB.
    • Images: Managed Disks support creating a managed custom image
    • Images vs. snapshots
      • An image is a copy of the VM and all disks attached.
      • A snapshot is a single disk copy.
      • A snapshot is not a suitable choice for the scenario with multiple striped data disks. No snapshot co-ordination is available.
    • Managed Disks and Encryption
      • By default Managed disks are encrypted by Azure Storage Service Encryption
        • Provides encryption at rest for disks, snapshots, and images
        • Also available at the VM level
          • In Windows it uses BitLocker Drive Encryption
      • Azure Key Vault integration is included which allows users to bring their own disk encryption keys.
  • Sizing and pricing

    Type Allowed disks Pricing
    Premium P4 (32 GB) -> P50 (4 TB) Pay for allocated size without transaction charges
    Standard S4 (32 GB) -> S5 (4 TB) Billed only for the number of transactions performed

Azure Tables

  • Structured data (a NoSQL store)
    • Schema-less entities with strong consistency.
    • Entity group transactions for atomic batching.
    • Best for Key/value lookups on partition key and row key.
  • 💡 Ideal for: User, device and service metadata, structured data.
  • Supports OData protocol.
  • Tables scale as demand increases.
    • No limits on number of table rows or table size.
    • Dynamic load balancing of table regions.
  • Components
    • URL format: http://<storage account>.table.core.windows.net/<table>
    • Storage Account : All access to Azure Storage is done through a storage account.
    • Table
      • Collection of entities
      • Don't enforce a schema on entities
        • which means a single table can contain entities that have different sets of properties.
      • The number of tables that a storage account can contain is limited only by the storage account capacity limit.
    • Entity : An entity is a set of properties, similar to a database row. An entity can be up to 1 MB in size.
    • Properties
      • Name-value pair.
      • Each entity can include up to 252 properties to store data
      • Each entity also has 3 system properties:
        • Partition key
        • Row key
        • Timestamp

Partitioning

  • Rate limit
    • ❗ A partition has a scalability target of 500 entities per second.
      • The throughput may be higher during minimal load on the storage node, but it will be throttled down when the node becomes hot or very active.
  • PartitionKey & RowKey properties
    • Can store up to 1 KB of string values.
    • Empty strings are also permitted; however, null values are not.
    • Clustered index sorting: Ascending PartitionKey then by ascending RowKey.
      • The sort order is observed in all query responses.
    • Each partition server can serve one or more partitions.
    • You should use more partitions, so that the Azure Table service can distribute the partitions to more partition servers.

Azure Queues

  • For message passing in applications and for backlog processing
  • Ideal for: Data sharing, Big Data, Backups. Functions
  • Functions:
    • Decouple components and scale them independently.
    • Scheduling of asynchronous tasks.
    • Building processes/work flows.
    • No limits on number of queues or messages.
    • Message visibility timeout to protect from component issues.
    • UpdateMessage to checkpoint progress part way through.

Azure Files

  • 📝 Shared storage for applications using the standard SMB 3.0 protocol.
  • Access data
    • VMs and cloud services: mounted shares via file I/O APIs.
    • On-premises: in a share via the File storage API.
    • E.g. on linux you run sudo mount
  • Typical uses
    • Migrating on-premises applications that rely on file shares.
    • Transfer file between vm in different subnet without vpn.
    • Storing shared application settings, for example in configuration files.
    • Storing diagnostic data such as logs, metrics, and crash dumps in a shared location.
    • Storing tools and utilities needed for developing or administering Azure virtual machines or cloud services.

Azure Files Components

  • Storage Account
  • Share
    • SMB 3.0 file share in Azure.
    • All directories and files must be created in a parent share.
    • An account can contain an unlimited number of shares.
    • A share can store an unlimited number of files, up to the capacity limits of the storage account.
  • Directory: An optional hierarchy of directories.
  • File: A file in the share. A file may be up to 1 TB in size.
  • URL format: https://[account].file.core.windows.net/<share>/<directory/directories>/
    • E.g. http://[account].file.core.windows.net/logs/CustomLogs/Log1.txt

Azure File Sync

  • On-premises -> File share in Azure
    • Turns your file server into a cache of the Azure-based file share.
      • Centralizes your file shares in Azure Files, whilst maintaining the compatibility of an on-premises file server.
    • Any protocol installed on the Windows Server can access the file share, including SMB, NFS, and FTPS.
Components
Storage Sync Service
  • Azure resource created to host the Azure File Sync service.
  • Required since service can sync between multiple storage accounts, hence an additional resource is required to manage this.
  • A subscription can contain multiple storage sync services.
Sync Group
  • Defines and controls the hierarchy and topology of the files to be synced.
  • The sync group will contain Cloud and Server endpoints.
  • Async service can contain multiple sync groups.
Registered Server
  • Before adding a server endpoint to a sync group, the server must be registered with a storage sync service.
  • A server can only be registered to a single sync service.
  • Async service can host as many registered servers as you need.
Azure File Sync Agent
  • To register a server, you need to install the Azure File Sync Agent.
  • This is a small downloadable MSI package comprising three components:
    • FileSyncSvc.exe: Monitors changes on Server Endpoints, and for initiating sync sessions to Azure.
    • StorageSync.sys: A file system filter, which handles tiering files to Azure Files.
    • PowerShell Management cmdlets: PowerShell cmdlets for the Microsoft.StorageSync Azure resource provider.
Server Endpoint
  • Once registered, you can add a server to a Sync group, this then becomes a server endpoint.
  • A server endpoint is synonymous with a folder or a volume on the server that will cache the contents of the Azure File Share.
  • Cloud tiering is configured individually by server endpoint.
Cloud Endpoint
  • When added to a sync group, an Azure File Share is a cloud endpoint.
  • One Azure File Share can only be a member of one Cloud Endpoint and thereby can only be a member of one Sync Group.
  • Any files that exist in a cloud endpoint or a server endpoint before they are added to the sync group, automatically become merged with all other files in the sync group.
Features
  • Adding files to the File Share
    • Adding and removing files directly within the Azure file share.
    • Syncs every 24 hours as detection job runs every 24 hours.
    • Only from on-prem to azure
  • Cloud tiering
    • Caching mechanism to save space.
    • Flow:
      1. File is tiered
      2. The sync system filter replaces the local file with a pointer to the location if the Azure file share.
      3. When accessed locally the file is downloaded and opened for use, e.g.Hierarchical Storage Management (HSM).
  • Supported versions of Windows Server: Windows Server 2012 R2 and Windows Server 2016
  • Access control lists (ACL)
    • Supported and enforced on files held on Server endpoints.
    • ❗ Azure Files do not currently support ACLs.
  • NTFS compression
    • Fully supported, and Sparse files are fully supported but are stored in the cloud as full files, and any cloud changes are synced as full files on server endpoints.
  • Failover Clustering
    • Supported for File Server for General Use but not for Scale-out file server for application data.
    • Cluster Shared Volumes are not supported.
    • To function correctly, the sync agent must be installed on every node of a cluster.
  • Data Deduplication
    • Fully supported for volumes that do not have cloud tiering enabled.
  • Encryption solutions
    • Azure File Sync, is known to work with BitLocker Drive Encryption and Azure Rights Management Services.
    • NTFS Encrypted File System does not work with Azure File Sync.

Storage Performance and Pricing

  1. Standard
  2. Premium
    • SSD disk support for VMs.
    • Using multiple disks gives your applications up to 256 TB of VM storage
    • Up to 80,000 I/O operations per second (IOPS) per VM, and a disk throughput of up to 2,000 megabytes per second (MB/s) per VM.
    • Only LRS redundancy
    • Applications that require consistent high performance and low latency such as SQL Server, Oracle, MongoDB, MySQL, and Redis can run happily in Azure Premium Storage.
    • More expensive: The storage is allocated on creation and is charged for the whole disk size rather than the data used.

Disk Types

Disk Type Description 💡 Suggested Use
Azure Files SMB 3.0 interface, client libraries and a REST interface access from anywhere to stored files. Application lift and shift using native file system API, Windows File Shares.
Azure Blobs Client libraries, REST interface, for unstructured data stored in Block blobs Streaming and random access, access application data from anywhere.
Azure Disks Client libraries, REST interface for persistent data accessed from a VHD, Access to data inside a Virtual machine on a VHD, lift and shift file system API apps that write to persistent disks.