How can data be stored
Computer Data Representation
How Data is Stored
The computer industry has given names to groups of bits.
Most references to computers use the number of bytes as a measure for the computer’s memory (primary storage) capacity and storage (secondary) capacity.
Computer memory is partitioned (divided) into a number of data containers called memory cells
Each cell stores a specific amount of data called a word (e.g., in our class, we will usually use examples using 8 bits.)
Each cell has an associated location identifier called an address
Data to be processed is coded in a binary (base-2 number) form using various encoding schemes discussed below:
To begin with, digits 0 and 1 are binary digits and each is referred to as a bit for short.
Again, 0 represents an OFF state and 1 represents an ON state
The capacity of a computer’s memory is determined by the number of bits per cell and the number of cells into which memory has been partitioned, i.e., computer memory depends on how many bits may be stored in each cell and how many cells there are available.
The industry settled on a sequence of 8 bits (given the unit name byte) as the basic unit of memory
The term byte, preceded by a prefix, is used to express the memory/storage capacity of a computer. See Chart #1 below.
Units for Measuring Memory (Data Storage) Capacity:
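As a rough illustration of the capacity relationship described above (bits per cell multiplied by the number of cells), here is a minimal Python sketch. The function name and the example figures (8-bit cells, 65,536 cells) are assumptions chosen for illustration and do not come from the course material.

```python
BITS_PER_BYTE = 8  # the industry-standard byte described above


def memory_capacity_bits(bits_per_cell: int, cell_count: int) -> int:
    """Total capacity in bits: bits stored per cell times the number of cells."""
    return bits_per_cell * cell_count


# Assumed example: 8-bit cells, 65,536 cells (addresses 0 through 65535).
capacity_bits = memory_capacity_bits(bits_per_cell=8, cell_count=65_536)
capacity_bytes = capacity_bits // BITS_PER_BYTE

print(capacity_bits)   # 524288
print(capacity_bytes)  # 65536
```

With 8-bit cells, the number of bytes equals the number of cells, which is why byte counts are a convenient measure of capacity.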
Digital data storage – A guide to modern systems
by Casey Schmidt | December 3, 2019
Modern digital data storage requires systems capable of backing up data and preventing data overload. Learn more about data storage and how it works in order to meet your own company’s needs. Here’s an in-depth breakdown.
What is digital data storage?
Digital data storage is mostly offline storage for backup and fail-safe data. In recent years, digital data has evolved to include cloud storage. It is a server that hosts all types of uploaded data, including media files. It generally exists to serve companies that have large amounts of data and need it protected and backed up.
How is digital data stored?
In a technical sense, data is stored as code or numbers for a computer to read and control. It’s then guided based on the computer input rules and stored in different locations. Data within files can be stored offline in different drive types, on a physical location like a hard drive and online in the cloud.
Data is stored in a manner that computers understand.
Different ways to store data
There are different types of data storage, and it’s important to understand how they differ from one another. RAM, or memory, is temporary data storage that the computer can access quickly. The data stored here isn’t permanent. Instead, it allows a computer to read data fast, as opposed to the slower storage alternatives.
The other type of storage is a device like a hard drive, which holds permanent data unlike the RAM storage. This storage is potentially mobile, as the drive can be external as well as internal. Personal data within different types of files including media generally are stored onto things like the hard drive or external drives for future use. It also helps the uploader retrieve data quickly in the future.
Data is stored both permanently and temporarily.
Alternatives to traditional storage
Digital data storage began as a computer storage technology for audio. It then transformed to include different digital files. In modern times, companies are moving away from traditional on-site locations in favor of cloud systems that optimize collaboration. When companies need to share data with outside parties, they implement a digital data center. These systems save space, money and time. They are also important because of the security they provide by backing up data immediately.
Companies protect sensitive data through sufficient, around-the-clock service. Because of this, they’re trusting their data to enterprises whose main goal is digital data storage. These systems prevent data overload, and they also store data within infrastructures that allow companies to quickly access their data.
There are alternatives to traditional data storage.
Digital data storage systems
Data storage systems are servers that host company data in a location different than the company location. Digital asset management (DAM) software similarly stores data but there are a few nuanced differences. DAM is suited for media files and documents – it boosts retrieval, sharing and collaboration.
DAM technically stores data, though it’s more geared toward companies wishing to store and share digital files with third parties. It also organizes files efficiently. Consider a DAM system when your storage needs include a broader range of file types, such as media and other documents.
The future of digital storage
As data continues to grow and evolve, companies in turn need more ways and space to store it. Nobody can predict the future with certainty. However, it’s possible to give strong estimates based on what we know so far. If we follow the trajectory that we are currently on, data should continue to increase exponentially. Furthermore, companies will need to keep extensive records for legal reasons.
Will this data continue to be stored on local servers? Will it mostly take to the cloud? There’s a lot of uncertainty as to what will win out in the end. There’s even talk of things like magnetic tape data storage becoming a large presence. The most important thing you can do is pay attention to what you need specifically to successfully maintain your data.
Digital data storage should not only help free up storage space but also allow your company a chance to quickly retrieve files. Use the right storage system in order to get the most efficient data use.
What is data storage?
Data storage refers to magnetic, optical or mechanical media that record and preserve digital information for ongoing or future operations.
There are two types of digital information: input and output data. Users provide the input data. Computers provide output data. But a computer’s CPU can’t compute anything or produce output data without the user’s input.
Users can enter the input data directly into a computer. However, they found early in the computer era that continually entering data manually is time- and energy-prohibitive. One short-term solution is computer memory, also known as random access memory (RAM). But its storage capacity and memory retention are limited. With read-only memory (ROM), as the name suggests, the data can only be read but not necessarily edited. ROM controls a computer’s basic functionality.
Although advances have been made in computer memory with dynamic RAM (DRAM) and synchronous DRAM (SDRAM), they are still limited by cost, space and memory retention. When a computer powers down, so does the RAM’s ability to retain data. The solution? Data storage.
With data storage space, users can save data onto a device. And should the computer power down, the data is retained. And instead of manually entering data into a computer, users can instruct the computer to pull data from storage devices. Computers can read input data from various sources as needed, and they can then create and save the output to the same sources or other storage locations. Users can also share data storage with others.
Today, organizations and users require data storage to meet today’s high-level computational needs like big data projects, artificial intelligence (AI), machine learning and the internet of things (IoT). And the other side of requiring huge data storage amounts is protecting against data loss due to disaster, failure or fraud. So, to avoid data loss, organizations can also employ data storage as backup solutions.
How data storage works
In simple terms, modern computers, or terminals, connect to storage devices either directly or through a network. Users instruct computers to access data from and store data to these storage devices. However, at a fundamental level, there are two foundations to data storage: the form that data takes and the devices on which data is recorded and stored.
To store data, regardless of form, users need storage devices. Data storage devices come in two main categories: direct area storage and network-based storage.
Direct area storage, also known as direct-attached storage (DAS), is as the name implies. This storage is often in the immediate area and directly connected to the computing machine accessing it. Often, it’s the only machine connected to it. DAS can provide decent local backup services, too, but sharing is limited. DAS devices include floppy disks, optical discs—compact discs (CDs) and digital video discs (DVDs)—hard disk drives (HDD), flash drives and solid-state drives (SSD).
Network-based storage allows more than one computer to access it through a network, making it better for data sharing and collaboration. Its off-site storage capability also makes it better suited for backups and data protection. Two common network-based storage setups are network-attached storage (NAS) and storage area network (SAN).
NAS is often a single device made up of redundant storage containers or a redundant array of independent disks (RAID). SAN storage can be a network of multiple devices of various types, including SSD and flash storage, hybrid storage, hybrid cloud storage, backup software and appliances, and cloud storage. Here is how NAS and SAN differ:
NAS
SAN
Types of storage devices
SSD and flash storage
Flash storage is a solid-state technology that uses flash memory chips for writing and storing data. A solid-state disk (SSD) flash drive stores data using flash memory. Compared to HDDs, a solid-state system has no moving parts and, therefore, less latency, so fewer SSDs are needed. Since most modern SSDs are flash-based, flash storage is synonymous with a solid-state system.
Hybrid storage
SSDs and flash offer higher throughput than HDDs, but all-flash arrays can be more expensive. Many organizations adopt a hybrid approach, mixing the speed of flash with the storage capacity of hard drives. A balanced storage infrastructure enables companies to apply the right technology for different storage needs. It offers an economical way to transition from traditional HDDs without going entirely to flash.
Cloud storage
Cloud storage delivers a cost-effective, scalable alternative to storing files to on-premise hard drives or storage networks. Cloud service providers allow you to save data and files in an off-site location that you access through the public internet or a dedicated private network connection. The provider hosts, secures, manages, and maintains the servers and associated infrastructure and ensures you have access to the data whenever you need it.
Hybrid cloud storage
Hybrid cloud storage combines private and public cloud elements. With hybrid cloud storage, organizations can choose which cloud to store data in. For instance, highly regulated data subject to strict archiving and replication requirements is usually more suited to a private cloud environment, whereas less sensitive data can be stored in the public cloud. Some organizations use hybrid clouds to supplement their internal storage networks with public cloud storage.
Backup software and appliances
Backup storage and appliances protect against data loss from disaster, failure or fraud. They make periodic data and application copies to a separate, secondary device and then use those copies for disaster recovery. Backup appliances range from HDDs and SSDs to tape drives to servers, but backup storage can also be offered as a service, also known as backup-as-a-service (BaaS). Like most as-a-service solutions, BaaS provides a low-cost option to protect data, saving it in a remote location with scalability.
Forms of data storage
Data can be recorded and stored in three main forms: file storage, block storage and object storage.
File storage
File storage, also called file-level or file-based storage, is a hierarchical storage methodology used to organize and store data. In other words, data is stored in files, the files are organized in folders and the folders are organized under a hierarchy of directories and subdirectories.
Block storage
Block storage, sometimes referred to as block-level storage, is a technology used to store data into blocks. The blocks are then stored as separate pieces, each with a unique identifier. Developers favor block storage for computing situations that require fast, efficient and reliable data transfer.
Object storage
Object storage, often referred to as object-based storage, is a data storage architecture for handling large amounts of unstructured data. This data doesn’t conform to, or can’t be organized easily into, a traditional relational database with rows and columns. Examples include email, videos, photos, web pages, audio files, sensor data, and other types of media and web content (textual or non-textual).
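To make the block storage idea above more concrete, here is a minimal Python sketch, assuming a toy in-memory store. It splits data into fixed-size blocks, stores each block under a unique identifier, and reassembles the original from an ordered list of identifiers. The names split_into_blocks and reassemble, the 4-byte block size and the use of random UUIDs are illustrative assumptions, not how any particular storage product works.

```python
import uuid

BLOCK_SIZE = 4  # bytes; deliberately tiny so the result is easy to inspect (assumed value)


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks, each stored under a unique identifier."""
    store = {}   # block id -> block contents
    order = []   # block ids in the order needed to rebuild the data
    for start in range(0, len(data), block_size):
        block_id = uuid.uuid4().hex
        store[block_id] = data[start:start + block_size]
        order.append(block_id)
    return store, order


def reassemble(store, order) -> bytes:
    """Rebuild the original data by reading the blocks back in order."""
    return b"".join(store[block_id] for block_id in order)


store, order = split_into_blocks(b"hello, storage")
print(len(order))                # 4 blocks: 14 bytes at 4 bytes per block, rounded up
print(reassemble(store, order))  # b'hello, storage'
```

The blocks themselves carry no file names or folder structure; something above the block layer (here, the order list) has to remember which blocks belong together.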
Data storage for business
Computer memory and local storage might not provide enough storage, storage protection, multiple users’ access, speed and performance for enterprise applications. So, most organizations employ some form of a SAN in addition to a NAS storage system.
SAN
Sometimes referred to as the network behind the servers, a SAN is a specialized, high-speed network that attaches servers and storage devices. It consists of a communication infrastructure, which provides physical connections, allowing an any-to-any device to bridge across the network using interconnected elements, such as switches and directors. The SAN can also be viewed as an extension of the storage bus concept. This concept enables storage devices and servers to interconnect by using similar elements, such as local area networks (LANs) and wide-area networks (WANs). A SAN also includes a management layer that organizes the connections, storage elements and computer systems. This layer ensures secure and robust data transfers.
Traditionally, only a limited number of storage devices could attach to a server. Alternatively, a SAN introduces networking flexibility enabling one server, or many heterogeneous servers across multiple data centers, to share a common storage utility. The SAN also eliminates the traditional dedicated connection between a server and storage and the concept that the server effectively owns and manages the storage devices. So, a network might include many storage devices, including disk, magnetic tape and optical storage. And the storage utility might be located far from the servers that use it.
SAN components
The storage infrastructure is the foundation on which information relies. Therefore, the storage infrastructure must support the company’s business objectives and business model. A SAN infrastructure provides enhanced network availability, data accessibility and system manageability. In this environment, simply deploying more and faster storage devices is not enough. A good SAN begins with a good design.
The core components of a SAN are Fibre Channel, servers, storage appliances, and networking hardware and software.
storage (computer storage)
Data storage is the collective methods and technologies that capture and retain digital information on electromagnetic, optical or silicon-based storage media. Storage is used in offices, data centers, edge environments, remote locations and people’s homes. Storage is also an important component in mobile devices such as smartphones and tablets. Consumers and businesses rely on storage to preserve information ranging from personal photos to business-critical data.
With the advent of big data, advanced analytics and the profusion of internet of things (IoT) devices, storage is more important than ever to handle the growing amounts of data. Modern storage systems must also support the use of artificial intelligence (AI), machine learning and other AI technologies to analyze all this data and derive its maximum value.
Today’s sophisticated applications, real-time database analytics and high-performance computing also require highly dense and scalable storage systems, whether they take the form of storage area networks (SANs), scale-out and scale-up network-attached storage (NAS), object storage platforms, or converged, hyper-converged or composable infrastructure.
By 2025, it is expected that 163 zettabytes (ZB) of new data will be generated, according to a report by IT analyst firm IDC. The estimate represents a potential tenfold increase from the 16 ZB produced through 2016. IDC also reports that in 2020 alone 64.2 ZB of data was created or replicated.
The term storage can refer to both the stored data and to the integrated hardware and software systems used to capture, manage, secure and prioritize that data. The data might come from applications, databases, data warehouses, archives, backups, mobile devices or other sources, and it might be stored on premises, in edge computing environments, at colocation facilities, on cloud platforms or any combination of these.
Storage capacity requirements define how much storage is needed to support this data. For instance, simple documents might require only kilobytes of capacity, while graphic-intensive files, such as digital photographs, can take up megabytes, and a video file can require gigabytes of storage.
Computer applications commonly list the minimum and recommended capacity requirements needed to run them, but these tell only part of the story. Storage administrators must also take into account how long the data must be retained, applicable compliance regulations, whether data reduction techniques are being used, disaster recovery (DR) requirements and any other issues that can impact capacity.
This video from CHM Nano Education explains the role of magnetism in data storage.
A hard disk is a circular platter coated with a thin layer of magnetic material. The disk is inserted on a spindle and spins at speeds of up to 15,000 revolutions per minute (rpm). As it rotates, data is written on the disk surface using magnetic recording heads. A high-speed actuator arm positions the recording head to the first available space on the disk, allowing data to be written in a circular fashion.
On an electromechanical disk such as an HDD, blocks of data are stored within sectors. Historically, HDDs have used 512-byte sectors, but this has started to change with the introduction of the Advanced Format, which can support 4,096-byte sectors. The Advanced Format increases bit density on each track, optimizes how data is stored and improves format efficiency, resulting in greater capacities and reliability.
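As a small worked example of the two sector sizes mentioned above, the following Python sketch counts how many sectors a file occupies at 512 bytes and at 4,096 bytes per sector. The 1,000,000-byte file size and the function name are assumptions for illustration. The sketch only shows sector counts and unused space in the last sector; it does not model the per-sector overhead that drives the format-efficiency gains of the Advanced Format.

```python
def sectors_needed(file_bytes: int, sector_bytes: int) -> int:
    """Number of whole sectors required to hold file_bytes (rounded up)."""
    return (file_bytes + sector_bytes - 1) // sector_bytes


file_bytes = 1_000_000  # assumed example file size

for sector_bytes in (512, 4096):
    count = sectors_needed(file_bytes, sector_bytes)
    unused = count * sector_bytes - file_bytes  # slack in the final sector
    print(sector_bytes, count, unused)
# 512 1954 448
# 4096 245 3520
```

Larger sectors mean far fewer sectors to track for the same file, at the cost of more unused space in the final sector.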
On most SSDs, data is written to pooled NAND flash chips that use either floating gate cells or charge trap cells to retain their electrical charges. These charges determine the binary bit state (1 or 0). An SSD is not technically a drive but more like an integrated circuit made up of millimeter-sized silicon chips that can contain thousands or even millions of nanotransistors.
Many organizations use a hierarchical storage management system to back up their data to disk appliances. Backing up data is considered a best practice whenever data needs to be protected, such as when organizations are subject to legal regulations. In some cases, an organization will write its backup data to magnetic tape, using it as a tertiary storage tier. However, this approach is practiced less commonly than in years past.
An organization might also use a virtual tape library (VTL), which uses no tape at all. Instead, data is written sequentially to disks but retains the characteristics and properties of tape. The value of a VTL is its quick recovery and scalability.
Digital information is written to target storage media through the use of software commands. The smallest unit of measure in a computer memory is a bit, which has a binary value of 0 or 1. The bit’s value is determined by the level of electrical voltage contained in a single capacitor. Eight bits make up one byte.
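To illustrate how eight bits make up one byte, here is a minimal Python sketch that converts a value between its integer form and its 8-bit pattern. The helper names byte_to_bits and bits_to_byte are hypothetical and chosen only for this example.

```python
def byte_to_bits(value: int) -> str:
    """Return the 8-bit binary pattern for an integer in the range 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("a single byte holds values from 0 to 255")
    return format(value, "08b")


def bits_to_byte(bits: str) -> int:
    """Interpret a string of 0s and 1s as a base-2 number."""
    return int(bits, 2)


print(byte_to_bits(65))          # 01000001
print(bits_to_byte("01000001"))  # 65
print(2 ** 8)                    # 256 distinct values fit in one byte
```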
Computer, storage and network systems use two standards when measuring storage amounts: a base-10 decimal system and a base-2 binary system. For small storage amounts, discrepancies between the two standards usually make little difference. However, those discrepancies become much more pronounced as storage capacities grow.
The differences between the two standards can be seen when measuring both bits and bytes. For example, the following measurements show the differences in bit values for several common decimal (base-10) and binary (base-2) measurements:
The differences between the decimal and binary standards can also be seen for several common byte measurements:
Fortunately, many systems now distinguish between the two standards. For example, a manufacturer might list the available capacity on a storage device as 750 GB, which is based on the decimal standard, while the operating system lists the available capacity as 698 GiB. In this case, the OS is using the binary standard, clearly showing the discrepancy between the two measurements.
Some systems might provide measurements based on both values. An example of this is IBM Spectrum Archive Enterprise Edition, which uses both decimal and binary units to represent data storage. For instance, the system will display a value of 512 terabytes as 512 TB (465.6 TiB).
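The two examples above can be checked with a few lines of arithmetic. This Python sketch converts decimal (base-10) capacities into binary (base-2) units; the constant and function names are assumptions made for illustration. Note that the figures quoted above (698 GiB and 465.6 TiB) appear to be truncated rather than rounded.

```python
# Decimal (base-10) units
GB = 10 ** 9
TB = 10 ** 12

# Binary (base-2) units
GIB = 2 ** 30
TIB = 2 ** 40


def decimal_to_binary_units(amount: float, decimal_unit: int, binary_unit: int) -> float:
    """Convert an amount expressed in a decimal unit into the matching binary unit."""
    return amount * decimal_unit / binary_unit


print(f"{decimal_to_binary_units(750, GB, GIB):.2f} GiB")  # 698.49 GiB
print(f"{decimal_to_binary_units(512, TB, TIB):.2f} TiB")  # 465.66 TiB
```

The gap between the two standards is about 7% at the gigabyte scale and about 9% at the terabyte scale, which is why the discrepancy grows more noticeable as capacities increase.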
Few organizations require a single storage system or connected system that can reach an exabyte of data, but there are storage systems that scale to multiple petabytes. Given the rate at which data volumes are growing, exabyte storage might eventually become a common occurrence.
Binary vs. decimal data measurements compared
Random access memory (RAM) is computer hardware that temporarily stores data that can be quickly accessed by the computer’s processor. The data might include OS and application files, as well as other data critical to the computer’s ongoing operations. RAM is a computer’s main memory and is much faster than common storage devices such as HDDs, SSDs or optical disks.
A computer’s RAM ensures that the data is immediately available to the processor as soon as it’s needed.
The biggest challenge with RAM is that it’s volatile. If the computer loses power, all data stored in RAM is lost. If a computer is turned off or rebooted, the data must be reloaded. This is much different than the type of persistent storage offered by SSDs, HDDs or other non-volatile devices. If they lose power, the data is still preserved.
Although most storage devices are much slower than RAM, their non-volatility makes them essential to carrying out everyday operations.
Storage devices are also cheaper to manufacture and can hold much more data than RAM. For example, most laptops include 8 GB or 16 GB of RAM, but they might also come with hundreds of gigabytes of storage or even terabytes of storage.
RAM is all about providing instantaneous access to data. Although storage is also concerned with performance, its ultimate goal is to ensure that data is safely stored and accessible when needed.
Organizations increasingly use tiered storage to automate data placement on different storage media. Data is placed in a specific tier based on capacity, performance and compliance. Data tiering, at its simplest, starts by classifying the data as either primary or secondary and then storing it on the media best suited for that tier, taking into account how the data is used and the type of media it requires.
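As a loose sketch of the classification step described above, the following Python example assigns data sets to a primary or secondary tier. Everything in it is hypothetical: the DataSet fields, the 30-day window and the rule itself are assumptions for illustration, and real tiering products apply far richer policies covering capacity, performance and compliance.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    days_since_last_access: int
    business_critical: bool


def choose_tier(item: DataSet, hot_window_days: int = 30) -> str:
    """Hypothetical rule: critical or recently used data is primary, the rest secondary."""
    if item.business_critical or item.days_since_last_access <= hot_window_days:
        return "primary"
    return "secondary"


items = [
    DataSet("orders-db", days_since_last_access=0, business_critical=True),
    DataSet("2019-archive", days_since_last_access=900, business_critical=False),
    DataSet("weekly-report", days_since_last_access=5, business_critical=False),
]

placement = {item.name: choose_tier(item) for item in items}
print(placement)
# {'orders-db': 'primary', '2019-archive': 'secondary', 'weekly-report': 'primary'}
```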
The meanings of primary and secondary storage have evolved over the years. Originally, primary storage referred to RAM and other built-in devices, such as the processor’s L1 cache, and secondary storage referred to SSDs, HDDs, tape or other non-volatile devices that supported access to data through I/O operations.
Primary storage generally provided faster access than secondary storage due to the proximity of storage to the computer processor. On the other hand, secondary storage could hold much more data, and it could replicate data to backup storage devices, while ensuring that active data remained highly available. It was also cheaper.
In contrast, secondary storage can include just about any type of storage that’s not considered primary. Secondary storage might be used for backups, snapshots, reference data, archived data, older operational data or any other type of data that isn’t critical to primary business operations. Secondary storage typically supports backup and DR and often includes cloud storage, which is sometimes part of a hybrid cloud configuration.
Digital transformation of business has also prompted more and more companies to use multiple cloud storage services, adding a remote tier that extends secondary storage.
In its broadest sense, data storage media can refer to a wide range of devices that provide varying levels of capacity and speed. For example, it might include cache memory, dynamic RAM (DRAM) or main memory; magnetic tape and magnetic disk; optical discs such as CDs, DVDs and Blu-rays; flash-based SSDs, SCM devices and various iterations of in-memory storage. However, when using the term data storage, most people are referring to HDDs, SSDs, SCM devices, optical storage or tape systems, distinguishing them from a computer’s volatile memory.
Spinning HDDs use platters stacked on top of each other coated in magnetic media with disk heads that read and write data to the media. HDDs have been widely used in personal computers, servers and enterprise storage systems, but they’re quickly being supplanted by SSDs, which offer superior performance, provide greater durability, consume less power and come in a smaller footprint. They’re also starting to reach price parity with HDDs, although they’re not there yet.
An external hard disk drive
Most SSDs store data on non-volatile flash memory chips. Unlike spinning disk drives, SSDs have no moving parts and are increasingly found in all types of computers, despite being more expensive than HDDs. Some manufacturers also ship storage devices that use flash storage on the back end and high-speed cache such as DRAM on the front end.
Unlike HDDs, flash storage does not rely on moving mechanical parts to store data, resulting in faster data access and lower latency than HDDs. Flash storage is non-volatile like HDDs, allowing data to persist in memory even if the storage system loses power, but flash has not yet achieved the same level of endurance as the hard disk, leading to hybrid arrays that integrate both types of media. (Cost is another factor in the development of hybrid storage.) However, when it comes to SSD endurance, the types of workloads and NAND devices can also play an important role in a device’s endurance, and in this regard, SSDs can vary significantly from one device to the next.
Since 2011, an increasing number of enterprises have implemented all-flash arrays based on NAND flash technology, either as an adjunct or replacement to hard disk arrays. Organizations are also starting to turn to SCM devices such as Intel Optane SSDs, which offer faster speeds and lower latency than flash-based storage.
Intel’s 3D XPoint-based Optane SSD
Various optical media formats
Flash memory cards are integrated in digital cameras and mobile devices, such as smartphones, tablets, audio recorders and media players. Flash memory is also found on Secure Digital cards, CompactFlash cards, MultiMediaCard (MMC) cards and USB memory sticks.
Flash memory
Physical magnetic floppy disks are rarely used these days, if at all. Unlike older computers, newer systems are not equipped with floppy disk drives. Use of floppy disks started in the 1970s, but the disks were phased out in the late 1990s. Virtual floppy disks are sometimes used in place of the 3.5-inch physical diskette, allowing users to mount an image file like they would the A: drive on a computer.
Enterprise storage vendors provide integrated NAS systems to help organizations collect and manage large volumes of data. The hardware includes storage arrays or storage servers equipped with hard drives, flash drives or a hybrid combination. A NAS system also comes with storage OS software to deliver array-based data services.
Diagram of a storage array
Many enterprise storage arrays come with data storage management software that provides data protection tools for archiving, cloning, or managing backups, replication or snapshots. The software might also provide policy-based management to govern data placement for tiering to secondary data storage or to support a DR plan or long-term retention. In addition, many storage systems now include data reduction features such as compression, data deduplication and thin provisioning.
Three basic designs are used for many of today’s business storage systems: direct-attached storage (DAS), NAS and storage area network (SAN).
Pure Storage’s FlashBlade enterprise storage array
The simplest configuration is DAS, which might be an internal hard drive in an individual computer, multiple drives in a server or a group of external drives that attach directly to the server through an interface such as the Small Computer System Interface (SCSI), Serial Attached SCSI (SAS), Fibre Channel (FC) or internet SCSI (iSCSI).
NAS is a file-based architecture in which multiple file nodes are shared by users, typically across an Ethernet-based local area network (LAN). A NAS system has several advantages: it doesn’t require a full-featured enterprise storage operating system, NAS devices can be managed with a browser-based utility, and each network node is assigned a unique IP address, helping to simplify management.
Closely related to scale-out NAS is object storage, which eliminates the necessity of a file system. Each object is represented by a unique identifier, and all the objects are presented in a single flat namespace. Object storage also supports the extensive use of metadata.
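The object storage model described above (a unique identifier per object, a single flat namespace and rich metadata) can be sketched as a toy in-memory class. The ObjectStore class and its put, get and find methods are hypothetical names invented for this illustration and do not correspond to any vendor's API.

```python
import uuid


class ObjectStore:
    """Toy object store: one flat namespace, no folders, metadata on every object."""

    def __init__(self):
        self._objects = {}  # object id -> (data, metadata)

    def put(self, data: bytes, **metadata) -> str:
        """Store data with its metadata and return the new unique identifier."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        """Return the (data, metadata) pair stored under object_id."""
        return self._objects[object_id]

    def find(self, **criteria):
        """Return ids of objects whose metadata matches every given key/value pair."""
        return [
            object_id
            for object_id, (_, metadata) in self._objects.items()
            if all(metadata.get(key) == value for key, value in criteria.items())
        ]


store = ObjectStore()
photo_id = store.put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
log_id = store.put(b"sensor reading", content_type="text/plain", owner="bob")

data, metadata = store.get(photo_id)
print(metadata["owner"])                               # alice
print(store.find(content_type="text/plain") == [log_id])  # True
```

Because lookups go through identifiers and metadata rather than directory paths, no file system hierarchy is needed to locate an object.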
A SAN can be designed to span multiple data center locations that need high-performance block storage. In a SAN environment, block devices appear to the host as locally attached storage. Each server on the network can access shared storage as though it were a direct-attached drive.
Advances in NAND flash, coupled with falling prices in recent years, have paved the way for software-defined storage. Using this configuration, an enterprise installs commodity-priced SSDs on x86-based servers and then uses third-party storage software or custom open source code to apply storage management.
Non-volatile memory express (NVMe) is an industry-standard protocol developed specifically for flash-based SSDs. NVMe is quickly emerging as the de facto protocol for flash storage. NVMe flash enables applications to communicate directly with a central processing unit (CPU) via Peripheral Component Interconnect Express (PCIe) links, bypassing the need to transmit SCSI command sets through a network host bus adapter.
NVMe can take advantage of SSD technology in a way not possible with SATA and SAS interfaces, which were designed for slower HDDs. Because of this, NVMe over Fabrics (NVMe-oF) was developed to optimize communications between SSDs and other systems over a network fabric such as Ethernet, FC and InfiniBand.
A non-volatile dual in-line memory module (NVDIMM) is a hybrid NAND and DRAM device with integrated backup power that plugs into a standard DIMM slot on a memory bus. NVDIMM devices process normal calculations in the DRAM but use flash for other operations. However, the host computer requires the necessary basic input-output system (BIOS) drivers to recognize the device.
NVDIMMs are used primarily to extend system memory or improve storage performance, rather than to add capacity. Current NVDIMMs on the market top out at 32 GB, but the form factor has seen density increases from 8 GB to 32 GB in just a few years.
Non-volatile dual in-line memory module (NVDIMM) is a hybrid of NAND and DRAM.
Consolidation in the enterprise market has winnowed the field of primary storage vendors in recent years. Those that penetrated the market with disk products now derive most of their sales from all-flash or hybrid storage systems that incorporate both SSDs and HDDs.
Market-leading vendors include:
Smaller vendors such as Drobo, iXsystems, QNAP and Synology also sell various types of storage products. In addition, a number of vendors now offer hyper-converged infrastructure (HCI) solutions, including Cisco, DataCore, Dell EMC, HPE, NetApp, Nutanix, Pivot3, Scale Computing, StarWind and VMware. Many enterprise storage vendors also offer branded converged and composable infrastructure products.
Understanding data storage
Data storage has come a long way since the days of disk systems. Sure, those disk systems might still be used here and there—but now all that data is attached to a network and software-defined.
What is data storage?
Data storage is the collection and retention of digital information—the bits and bytes behind applications, network protocols, documents, media, address books, user preferences, and more. Data storage is a central component of big data and data management.
Think about it like this. Computers are like brains. Both have short-term and long-term memories. Brains handle short-term memory in the prefrontal cortex, while computers handle it with random-access memory (RAM).
Brains and RAM process and remember things while awake, and both get tired after a while. Your brain converts working memories into long-term memories while you sleep, and computers transfer active memory into storage volumes when they sleep. Computers also distribute data by type in the same way brains distribute memories by type: semantic, spatial, emotional, or procedural.
A brief history of data storage devices
Perhaps the best consolidated history of data storage devices is contained within the first dozen pages of Gordon Haff and William Henry’s From Pots and Vats to Programs and Apps: How Software Learned to Package Itself.
In it, Haff and Henry describe how a textile worker in 1725 programmed looms using punchcards that were inspired by the cylinders of automated organs. Punchcards fed information into a 19th-century computer as part of the 1890 U.S. Census and remained popular until the era of magnetic tape drives began in the 1950s. From there, the size of magnetic tape drives shrank until they became cassette tapes.
Right before the 1970s, IBM released the floppy disk, which was used for almost everything. Floppies initialized mainframes, stored software applications, and were the only persistent storage device available until hard disk drives (HDDs) dropped in price. Compact discs (CDs) followed in the 1980s, and solid state drives (SSDs) later replaced spinning disks with solid chips and flash memory. Flash storage now fits in our pockets as flash drives that hold copies of everything we want or need.
Data storage types
Software-defined storage
Software-defined storage (SDS) uses abstraction management software to decouple data from hardware before reformatting and organizing it for network use. SDS works particularly well with container and microservice workloads that use unstructured data, since it can scale in ways hardwired storage solutions simply can’t.
Cloud storage
Cloud storage is the organization of data kept somewhere that can be accessed through the internet by anyone—given the right permissions. You don’t need to be connected to an internal network (that’s known as NAS) and aren’t accessing the data from hardware directly attached to your computer. Popular cloud storage providers include Microsoft, Google, and IBM.
Network-attached storage
Network-attached storage (NAS) makes data more accessible to internal networks by installing a lightweight operating system onto a server that turns it into something called a NAS box, unit, or head. The NAS box becomes an important part of intranets because it processes every single storage request.
Object storage
Object storage, also known as object-based storage, is a flat structure in which files are broken into pieces and spread out among hardware. In object storage, the data is broken into discrete units called objects and is kept in a single repository, instead of being kept as files in folders or as blocks on servers.
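The flat, single-repository idea can be illustrated with a minimal Python sketch. Everything here is hypothetical and in-memory: the FlatObjectStore class, its put/get/metadata methods, and the choice of a SHA-256 content hash as the object ID are assumptions made for illustration, not how any particular object storage product works. The sketch also stores each object whole rather than spreading pieces across hardware.

```python
import hashlib


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no folders.

    Hypothetical illustration only; real systems distribute objects
    across many devices and usually let callers choose the key.
    """

    def __init__(self):
        # Single repository: object ID -> (data, metadata)
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # Assumption: the ID is the SHA-256 hex digest of the content.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> bytes:
        # Lookup is by ID alone; there is no directory path to walk.
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
photo_id = store.put(b"raw image bytes", {"content-type": "image/png"})
print(photo_id[:12], store.metadata(photo_id))
```

The point of the sketch is that each object carries its own identifier and metadata, so nothing about its location in a folder tree is needed to retrieve it.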
File storage
File storage arranges data as hierarchical files that users can open and navigate from top to bottom. Since files are stored on back ends and front ends the same way, users can request files by unique identifiers such as names, locations, or URLs. This is the predominant human-readable storage format.
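As a rough illustration of navigating a hierarchy from top to bottom, the following Python sketch models folders as nested dictionaries and walks a path one component at a time. The tree contents and the resolve function are hypothetical names invented for this example; a real file system tracks far more (permissions, timestamps, on-disk layout).

```python
def resolve(tree: dict, path: str):
    """Walk a nested-dict 'file system' from the root down to path.

    Returns the folder (a dict) or file contents (bytes) found there.
    Raises KeyError if any component of the path does not exist.
    """
    node = tree
    # Ignore empty components so "/" and "/home/" behave sensibly.
    for part in (p for p in path.split("/") if p):
        node = node[part]
    return node


# Hypothetical hierarchy: folders are dicts, files are bytes.
tree = {
    "home": {
        "alice": {"notes.txt": b"buy milk"},
        "bob": {},
    }
}

print(resolve(tree, "/home/alice/notes.txt"))
print(sorted(resolve(tree, "/home")))
```

Here the path itself is the identifier: each file is found by its location in the hierarchy, which is what makes the format easy for people to browse.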
Block storage
Block storage splits storage volumes into individual instances known as blocks. Each block exists independently, which gives users complete configuration autonomy. Because blocks aren’t burdened with the same unique identifier requirements as files, blocks are a faster storage system—making them ideal formats for rich media databases.
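A minimal Python sketch of the splitting step, under stated assumptions: the 4-byte block size is chosen only to keep the output readable (real block sizes are much larger), and split_into_blocks and read_blocks are hypothetical helper names, not part of any storage API.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny for illustration


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Cut data into fixed-size blocks keyed by block number.

    The final block may be shorter than block_size.
    """
    return {
        offset // block_size: data[offset:offset + block_size]
        for offset in range(0, len(data), block_size)
    }


def read_blocks(blocks: dict, block_numbers) -> bytes:
    """Fetch blocks by number and join them in the order requested."""
    return b"".join(blocks[n] for n in block_numbers)


blocks = split_into_blocks(b"abcdefghij")
print(blocks)                   # block number -> raw bytes
print(read_blocks(blocks, [1]))  # any block can be read on its own
```

Because each block is addressed by number alone, a single block can be read or rewritten without touching the others, which is the independence the paragraph above describes.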
How do I learn to use storage?
The way you learn to do anything else: practice. Deploying a new storage system is a lot smoother with training, and we have a ton of ways to make sure you’re ready. If you think you’re blessed with an innate knowledge of storage systems—or just want to see if you know enough to be dangerous—take this little storage quiz to assess your skill level. If you need some training, take a few courses from our cloud computing, virtualization, and storage curriculum, complete the whole thing, or take the ones required for you to get a Red Hat Certificate of Expertise in Hybrid Cloud Storage.
Why Red Hat?
Software-defined storage is inherently open. It decouples hardware from software, freeing you from vendor lock-in. Red Hat has taken "open" a step further. Our software-defined storage is also open source. It draws on the innovations of a community of developers, partners, and customers. This gives you control over exactly how your storage is formatted and used, based on your business's unique workloads, environments, and needs.