Computing Infrastructure — TiTilda Notes

Contents

Data Centers

As internet adoption grew, computing shifted from local machines to centralized data centers, large facilities housing thousands of servers and providing computational power and storage for various applications.

Data centers are geographically distributed in areas with favorable cooling and power conditions, reducing user latency and improving fault tolerance.

Benefits of Centralized Computing

User Benefits:

Vendor Benefits:

Infrastructure Benefits:

Warehouse-Scale Computing

Warehouse-scale computing (WSC) is a type of data center architecture that treats thousands of interconnected servers as a single unified system.

This enables running large-scale applications (search engines, social media platforms, online gaming services) that require significant computational resources to be efficiently managed and scaled.

Many such providers also offer cloud services, virtualizing their infrastructure for external customers, allowing a traditional data center to be built on top of a warehouse-scale computing infrastructure.

Geographic Distribution

Global cloud infrastructure is hierarchically organized for redundancy and low latency:

Physical Architecture

Data center architecture is similar to that of personal computers but at a massive scale.

Computing Components:

Support Infrastructure:

Server

Servers are fundamental computing units in data centers, designed for performance, reliability, and scalability.

Form Factors

Servers are made in different standard form factors, such as:

Components

All components are standardized for quick replacement and maintenance, with hot-swappable parts to minimize downtime.

Thermal Management

Data centers use a cold aisle/warm aisle configuration to maximize air-cooling efficiency:

Storage

Over time, data has moved from local storage toward cloud providers. This is due to:

File System Abstractions

OS manages data through hierarchical abstractions:

When a file is deleted, its clusters are only flagged as deleted, allowing them to be overwritten later.

Space Allocation

The space a file occupies on disk is a multiple of the cluster size:

\text{Disk Size} = \lceil \frac{\text{File Size}}{\text{Cluster Size}} \rceil \times \text{Cluster Size}

When a file's size is not an exact multiple of the cluster size, some of the allocated space is wasted, leading to internal fragmentation:

\text{Wasted Space} = \text{Disk Size} - \text{File Size}

When a file's clusters are non-contiguous (fragmented), read/write operations require multiple seeks, degrading performance. In these cases it's useful to perform defragmentation to rearrange the clusters into sequential blocks.
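The allocation and wasted-space formulas above can be sketched as follows (the 4 KiB default cluster size and the 10 KB file are illustrative values, not taken from the notes):

```python
import math

def disk_usage(file_size: int, cluster_size: int = 4096) -> tuple[int, int]:
    """Return (allocated bytes, wasted bytes) for a file on disk.

    Allocation is rounded up to a whole number of clusters, so any file
    whose size is not an exact multiple of the cluster size wastes the
    remainder (internal fragmentation).
    """
    allocated = math.ceil(file_size / cluster_size) * cluster_size
    return allocated, allocated - file_size

# A 10 KB file with 4 KiB clusters occupies 3 clusters (12 KiB),
# wasting 2288 bytes.
allocated, wasted = disk_usage(10_000)
print(allocated, wasted)  # 12288 2288
```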

Hard Disk Drives

Physical Structure:

HDDs contain rotating magnetic platters coated with ferromagnetic material. Data is stored as magnetic patterns organized into:

The platters are mounted on a spindle and spin at high speeds (RPM). An actuator arm with a read/write head moves across the platters to access data.

The entire assembly is enclosed in a sealed case to protect against dust, scratches, and environmental contaminants, while also providing shock resistance.

Access Time Components

During the read/write process, several time components contribute to the total access time:

The total access time is the sum of these components: T_\text{Access} = T_\text{Seek} + T_\text{Rotation} + T_\text{Transfer} + T_\text{Controller}

Data locality, the tendency for related data to be stored close together, can significantly reduce access time by minimizing seek and rotational delays. The locality factor is denoted \alpha. The access time adjusted for locality is:

T_\text{Access} = (1-\alpha)(T_\text{Seek} + T_\text{Rotation}) + T_\text{Transfer} + T_\text{Controller}
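As a numeric sketch of the two formulas above (the millisecond values below are hypothetical, chosen only for illustration):

```python
def access_time(t_seek, t_rotation, t_transfer, t_controller, alpha=0.0):
    """HDD access time in ms; alpha in [0, 1] is the locality factor,
    which removes a fraction of the seek and rotational delays."""
    return (1 - alpha) * (t_seek + t_rotation) + t_transfer + t_controller

# Hypothetical drive: 4 ms seek, 2 ms rotation, 0.1 ms transfer, 0.2 ms controller.
print(access_time(4.0, 2.0, 0.1, 0.2))             # ≈ 6.3 ms (no locality)
print(access_time(4.0, 2.0, 0.1, 0.2, alpha=0.8))  # ≈ 1.5 ms (high locality)
```

With \alpha = 0.8 most of the mechanical delay disappears, which is why locality dominates HDD performance.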

To reduce access time, HDDs include buffer (cache) memory that exploits spatial locality by storing neighboring sectors.

Writes target the cache first and are then flushed to the platters. This reduces repeated disk accesses for frequently accessed data.

Scheduling

When multiple I/O requests are issued, the disk scheduler determines the order in which they are processed. The goal is to minimize total access time and maximize throughput. This introduces a scheduling delay, as the disk may need to finish the current request before processing the next one. Common scheduling algorithms include:
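As a sketch of how the choice of algorithm affects total head movement, the snippet below compares two common policies, FCFS and SSTF, on a hypothetical request queue (the cylinder numbers and starting head position are illustrative):

```python
def fcfs(requests, head):
    """First-Come-First-Served: serve requests in arrival order."""
    total, pos = 0, head
    for r in requests:
        total += abs(r - pos)
        pos = r
    return total

def sstf(requests, head):
    """Shortest-Seek-Time-First: always serve the closest pending request."""
    pending, total, pos = list(requests), 0, head
    while pending:
        nxt = min(pending, key=lambda r: abs(r - pos))
        pending.remove(nxt)
        total += abs(nxt - pos)
        pos = nxt
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
print(fcfs(queue, 53))  # 640 cylinders traveled
print(sstf(queue, 53))  # 236 cylinders traveled
```

SSTF drastically reduces total seek distance here, at the cost of possible starvation of far-away requests.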

Solid State Drives

SSDs use flash memory (no mechanical parts), managed by a silicon controller, and come in the same form factors as HDDs.

At the beginning of its life, an SSD is faster than an HDD because it has no seek time or rotation delay. However, as the SSD fills up and undergoes more write cycles, its performance can degrade, mainly for writes.

Data is organized into:

Each cell has a limited number of write cycles, leading to wear-out as the oxide layer degrades.

Each page can be in one of three states:

Writes always target empty pages; updates write to new pages and mark old pages dirty. Blocks containing only dirty pages can be erased. This prevents repeated wear on single cells but introduces challenges:

To mitigate wear-out, SSDs implement a technique called Wear Leveling, which distributes write/erase cycles evenly across cells by periodically relocating data so that \text{max cycles} - \text{min cycles} < e (a small threshold).
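A toy simulation of the idea (the block counts, write counts, and the naive "hot block" workload are invented for illustration; real controllers use far more sophisticated policies):

```python
def simulate(n_blocks=8, n_writes=10_000, leveled=True):
    """Count erase cycles per block. Without leveling, a hot workload
    hammers the same physical blocks; with leveling, each erase is
    redirected to the least-worn block, keeping the spread small."""
    cycles = [0] * n_blocks
    for i in range(n_writes):
        if leveled:
            target = cycles.index(min(cycles))  # coldest block
        else:
            target = i % 2                      # two hot blocks, rest idle
        cycles[target] += 1
    return cycles

hot = simulate(leveled=False)
even = simulate(leveled=True)
print(max(hot) - min(hot))    # 5000: two blocks absorb all the wear
print(max(even) - min(even))  # 0: wear is spread evenly
```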

SSDs use a Flash Translation Layer (FTL), a firmware component that manages the mapping between the logical block addresses (LBAs) used by the operating system and the physical addresses of the flash memory. Mapping strategies include:
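One common strategy is page-level mapping. A minimal sketch (the class and its fields are hypothetical, modeling only the mapping/dirty/empty logic described above):

```python
class PageLevelFTL:
    """Toy page-level FTL: every write of a logical page goes to a fresh
    physical page, the old physical page is marked dirty, and the
    logical-to-physical map is updated."""

    def __init__(self, n_pages: int):
        self.mapping: dict[int, int] = {}   # logical -> physical page
        self.free = list(range(n_pages))    # empty physical pages
        self.dirty: set[int] = set()        # invalidated pages awaiting erase

    def write(self, lba: int) -> int:
        if lba in self.mapping:             # update: invalidate the old page
            self.dirty.add(self.mapping[lba])
        phys = self.free.pop(0)             # writes always target empty pages
        self.mapping[lba] = phys
        return phys

ftl = PageLevelFTL(n_pages=8)
ftl.write(0)                       # logical page 0 -> physical page 0
ftl.write(0)                       # update goes to physical page 1
print(ftl.mapping[0], ftl.dirty)   # 1 {0}
```

Once every page in a block is dirty, the block can be erased and its pages returned to the free list (garbage collection, not shown here).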

Reliability

The Unrecoverable Bit Error Ratio (UBER) evolves differently on the two technologies: HDD UBER increases roughly linearly with age, while SSD UBER starts low, increases as the drive wears out, and rises sharply near the end of its lifespan.

Storage Architectures
Direct Attached Storage (DAS)

Direct Attached Storage is physically connected to a single server (internal or external via SATA, USB).

Network Attached Storage (NAS)

Network Attached Storage is storage that is connected to a network and has its own IP address, appearing as a file server. It provides file-level access to data over the network, allowing multiple clients to access and share files simultaneously.

Storage Area Network (SAN)

Storage Area Network is a network that provides block-level access to data, allowing servers to access storage as if it were directly attached.

Dependability

Systems fail due to: defects, degradation, radiation, design errors, bugs, attacks, and human errors. This leads to economic losses, information loss, physical harm, and reputation damage.

Dependability is a measure of the trust that can be placed in a system. It comprises five key attributes:

Fault-Error-Failure Chain

A fault is a defect or anomaly in a system.

When a fault is activated, it becomes an error, a deviation from correct operation.

If an error is not detected and corrected, it propagates and ultimately causes a failure, meaning that the system ceases to perform its intended function.

Dependability Approaches

Two primary techniques address dependability:

This is a tradeoff between cost (hardware, performance overhead, and development effort) and dependability. Design decisions depend on: technologies, requirements, context, and environment.

Reliability Metrics

Reliability follows an exponential failure model: R(t) = e^{-\lambda t}

where \lambda is the (constant) failure rate and t is the operating time.

Mean Time To Failure (MTTF): expected time until first failure: \text{MTTF} = \int_0^\infty R(t) \, dt = \frac{1}{\lambda}

Mean Time To Repair (MTTR): expected time to detect, repair, and recover: \text{MTTR} = t_{\text{detect}} + t_{\text{repair}} + t_{\text{recover}}

Mean Time Between Failures (MTBF): expected time between consecutive failures in repairable systems: \text{MTBF} = \text{MTTF} + \text{MTTR}

Availability formula: A = \frac{\text{MTTF}}{\text{MTTF} + \text{MTTR}} = \frac{\text{uptime}}{\text{uptime} + \text{downtime}}

Failures In Time (FIT): number of failures per billion device-hours: \text{FIT} = \frac{10^9}{\text{MTBF}}
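The metrics above can be computed together in a few lines (the failure rate and repair time below are hypothetical example values):

```python
import math

lam = 1e-4      # hypothetical failure rate: 1 failure per 10,000 hours
mttf = 1 / lam  # MTTF = 10,000 h
mttr = 10       # detect + repair + recover, in hours
mtbf = mttf + mttr

reliability_1y = math.exp(-lam * 8760)  # R(t) after one year (8760 h)
availability = mttf / (mttf + mttr)
fit = 1e9 / mtbf                        # failures per 10^9 device-hours

print(f"R(1 year)    = {reliability_1y:.3f}")  # ≈ 0.416
print(f"Availability = {availability:.4f}")    # ≈ 0.9990
print(f"FIT          = {fit:.0f}")             # ≈ 99900
```

Note how a drive with a 10,000-hour MTTF has less than a 50% chance of surviving its first year without any failure, even though its availability (thanks to fast repair) stays near 99.9%.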

Component Lifecycle

A component experiences three phases during its operational lifetime:

System updates and new deployments risk introducing failures into production. Some common strategies mitigate this risk:

Reliability Block Diagram

The system structure is represented as a block diagram where each component is a block and links show dependencies.

A system functions if there exists at least one operational path from start to end.

Connections represent two reliability configurations:

Standby Redundancy: A redundant component remains idle until the primary component fails, then automatically activates. This approach approximately doubles the MTTF compared to a single component.
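The "approximately doubles the MTTF" claim can be checked with a small Monte Carlo sketch, assuming an ideal failure detector/switch and a spare that does not age while idle (the failure rate is an arbitrary example value):

```python
import random

def standby_mttf(lam: float, n_trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the MTTF of a two-unit standby pair: the system fails only
    after the primary fails AND then the activated spare also fails, so
    each trial sums two exponential lifetimes with rate lam."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += rng.expovariate(lam) + rng.expovariate(lam)
    return total / n_trials

lam = 1e-3               # single-unit MTTF = 1000 h
print(standby_mttf(lam))  # ≈ 2000 h: roughly double a single unit
```

Analytically, the sum of the two lifetimes gives \text{MTTF} = 2/\lambda, matching the simulation.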

r-out-of-n Redundancy

An r-out-of-n system requires at least r of its n components to function correctly for the system to operate.

The system reliability for r-out-of-n redundancy (assuming identical components, each with reliability R):

R_{\text{voting}} = \sum_{i=r}^{n} \binom{n}{i} R^i(1-R)^{n-i}

This formula sums the probability that at least r components are operational.

When a majority of components must be operational, the reliability of a single component can exceed that of the entire system, especially when the failure rate \lambda is high (i.e., when R is low).
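The voting formula, evaluated for 2-out-of-3 (Triple Modular Redundancy) with two example component reliabilities, illustrates both cases:

```python
from math import comb

def r_out_of_n(r: int, n: int, R: float) -> float:
    """Reliability of a system needing at least r of n identical
    components, each with reliability R, to be operational."""
    return sum(comb(n, i) * R**i * (1 - R)**(n - i) for i in range(r, n + 1))

# 2-out-of-3 voting (TMR):
print(r_out_of_n(2, 3, 0.9))  # 0.972 — better than a single component
print(r_out_of_n(2, 3, 0.4))  # 0.352 — worse than a single component
```

With highly reliable components (R = 0.9), voting improves on a single unit; with unreliable components (R = 0.4), the majority requirement makes the system worse than one component alone.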

Last modified:
Written by: Andrea Lunghi