Your A to Z Dictionary to What’s What in Object Storage
This Object Storage (OBS) dictionary is a compilation of the most important Object Storage terms and acronyms. From Namespace to Zeta Architecture, we’ve put together helpful definitions for anyone looking to understand Object Storage. For everything from Active Archiving to Metadata, read part I of our blog here.
For further learning click the hyperlink on each of the terms.
Namespace is the collection of objects held within an object storage. Namespaces may span multiple physical storage systems and locations but act as a single unifying construct under which all of the objects reside.
Objects are the singular entity, similar to a file that contains data stored within an OBS system.
Protection is the safeguarding of data against loss or inability to access it when needed. Common data protection schemes within OBS include erasure code and geo-spreading.
Query analytics is the ability to analyze data from a query, or question. The availability of large-scale storage enables queries that are more effective since there is more data to interrogate, which typically improves results.
Representational State Transfer
(REST/ReST) APIs are an architectural style for networked applications that require interoperability across servers, storage, etc. generally through the HTTP protocol. REST is a stateless, client-server, cacheable communications protocol.
Sharding is the breaking up of an object into smaller shards/fragments from which the original object can be reconstructed. Typically, in an OBS environment erasure coding is applied to each shard, which is then distributed as widely as possible across physical storage to protect data availability against any localized component failures.[Tweet “Here’s Part II of the A-Z Guide on Everything #ObjectStorage”]
Tape consolidation is the rationalization of multiple backup processes and/or copies made of data into a single copy. Deployment of an active archive storage solution can enable an enterprise to reduce the retention period of its onsite backups as well as the number of offsite backup copies resulting in operational savings.
Unstructured data is any kind of data does not have a predefined model or organizational layout such as that found within a columnar database. Common types of unstructured data include office productivity files, machine generated log files, images, audio, video, and other multimedia content such as web pages, etc.
The volume of data is increasing dramatically. International Data Corporation (IDC) estimates that by 2020 there will be more than 44 zettabytes (44 billion terabytes) of data stored.
Web-scale describes an architectural approach to computing that historically assumed very large deployments such as an enterprise data center (or larger) but today tends to describe the scalability associated large cloud service providers, if not frankly, the entire planet.
X100 is the integrated object storage solution from Western Digital’s ActiveScale™ product family that can scale up to 52PB within a single namespace.
Yottabyte is one septillion bytes, an immense amount of data. IDC estimates by 2020 there will be 44 zettabytes[i] of stored data globally, which would equate to 0.044 yottabytes.
It is a high-level enterprise architecture associated with Big Data that seeks to simplify business processes and define a scalable way to integrate data rapidly into a business environment.
is a high-level enterprise architecture associated with Big Data that seeks to simplify business processes and define a scalable way to integrate data rapidly into a business environment.
[i] Source: IDC, The Digital Universe of Opportunities Study, 2014