How Data Storage Has Changed
Types of Data
Data can be classified as structured or unstructured based on how it’s stored and managed. Structured data is organized in rows and columns in a rigidly defined format so that applications can retrieve and process it efficiently. Structured data is typically stored using a database management system (DBMS).
Data is unstructured if its elements cannot be stored in rows and columns, which makes it difficult for business applications to query and retrieve. For example, customer contacts may be stored in various forms such as sticky notes, email messages, business cards or even digital files such as .doc, .txt and .pdf. Because these contacts have no common structure, they are difficult to retrieve using a customer relationship management application. Unstructured data might not have the required components to identify itself uniquely for any type of processing or interpretation. Businesses are primarily concerned with managing unstructured data because more than 80% of enterprise data is unstructured and requires significant storage space and effort to manage.
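The difference can be illustrated with a minimal sketch in Python. The contacts table, the sample names and the free-text note below are all hypothetical, made up for illustration only. The structured half uses an in-memory SQLite database; the unstructured half has to fall back on a pattern match to guess where the email address is.

```python
import re
import sqlite3

# Structured: contacts stored in rows and columns (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Ann Lee", "ann@example.com", "Boston"),
        ("Raj Patel", "raj@example.com", "Austin"),
    ],
)

# A precise query is possible because every field has a defined column.
structured_hits = [
    row[0]
    for row in conn.execute("SELECT name FROM contacts WHERE city = ?", ("Austin",))
]

# Unstructured: the same kind of information buried in a free-text note.
note = "Met Raj at the Austin trade show, follow up at raj@example.com next week."

# Without columns, the application can only guess with a pattern match.
unstructured_hits = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", note)
```

The query against the table names exactly the field it wants, while the search of the note depends on the text happening to match a pattern, which is why unstructured data takes more effort to manage.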
Structured versus Unstructured Data
Structured data is data organized into rows and columns. Unstructured data is data found in PDF files, email messages and attachments, instant messages, x-rays, scanned images, manuals, forms, contracts, audio, video, invoices, rich media and Web page content.
Data, whether structured or unstructured, does not fulfill any purpose for individuals or businesses unless it’s presented in a meaningful form. Businesses need to analyze data for it to be of value. Information is the intelligence and knowledge derived from data.
Businesses analyze raw data in order to identify meaningful trends. On the basis of these trends, a company can plan or modify its strategy. For example, a retailer identifies customers’ preferred products and brand names by analyzing their purchase patterns, and then maintains an inventory of those products. Effective data analysis not only benefits existing businesses, but also creates the potential for new business opportunities by using the information in creative ways.
A job portal is an example of this. In order to reach a wider set of prospective employers, job seekers post their resumes on various websites offering job search facilities. These websites collect the resumes and post them in centrally accessible locations for prospective employers. In addition, companies post available positions on job search sites. Job-matching software matches keywords from resumes to keywords in job postings. In this manner, the job search engine takes data and turns it into information for employers and job seekers.
Because information is critical to the success of a business, there’s an ever-present concern about its availability and protection. Legal, regulatory and contractual obligations regarding the availability and protection of data only add to these concerns.
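The job-matching step described above can be sketched in a few lines of Python. This is only an illustration of keyword matching in general, not how any particular job site works; the stop-word list, the function names and the sample resumes are all assumptions made for the example.

```python
import re

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "with", "for", "to"}


def keywords(text):
    """Return the set of lowercase words in text, minus stop words."""
    return set(re.findall(r"[a-z0-9+#]+", text.lower())) - STOP_WORDS


def match_score(resume, posting):
    """Fraction of the posting's keywords that also appear in the resume."""
    wanted = keywords(posting)
    if not wanted:
        return 0.0
    return len(wanted & keywords(resume)) / len(wanted)


def rank_resumes(resumes, posting):
    """Return resume names sorted from best to worst match."""
    return sorted(
        resumes,
        key=lambda name: match_score(resumes[name], posting),
        reverse=True,
    )


# Made-up sample data for illustration.
resumes = {
    "alice": "Storage administrator with SAN and NAS experience",
    "bob": "Retail sales associate with inventory experience",
}
posting = "Administrator for SAN storage"
ranking = rank_resumes(resumes, posting)
```

Here the raw resumes and the posting are data; the ranked list that comes out is information an employer can act on.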
Data created by individuals or businesses must be stored so it’s easily accessible for further processing. In a computing environment, devices designed for storing data are termed storage devices, or simply storage. The type of storage used varies based on the type of data and the rate at which it is created and used. Devices such as memory in a cell phone or digital camera, DVDs, CD-ROMs and hard disks in personal computers are examples of storage devices. Businesses have several options available for storing data, including internal hard disks, external disk arrays and tapes.
Evolution of Storage
Historically, organizations had centralized computers (mainframes) and information storage devices (tape reels and disk packs) in their data centers. The evolution of open systems, and the affordability and ease of deployment they offer, made it possible for business units and departments to have their own servers and storage. In earlier implementations of open systems, the storage was typically internal to the server.
The proliferation of departmental servers in an enterprise resulted in unprotected, unmanaged, fragmented islands of information and increased operating cost. Originally, there were very limited policies and processes for managing these servers and the data created. To overcome these challenges, storage technology evolved from non-intelligent internal storage to intelligent networked storage.
Highlights of the technology include:
• Redundant Array of Independent Disks (RAID): This technology was developed to address the cost, performance and availability requirements of data. It continues to evolve today and is used in all storage architectures, including DAS and SAN. (See below for RAID Chart.)
• Direct-Attached Storage (DAS): This type of storage connects directly to the server (host) or a group of servers in a cluster. Storage can be either internal or external to the server. External DAS alleviated the challenges of limited internal storage capacity.
• Storage Area Network (SAN): This is a dedicated, high-performance Fibre Channel (FC) network to facilitate block-level communications between servers and storage. Block storage is just that: evenly sized blocks of data. Database servers can often take advantage of block storage systems. Storage is partitioned and assigned to a server for accessing its data. SAN offers scalability, availability, performance and cost benefits compared to DAS.
• Network-Attached Storage (NAS): This is dedicated storage for file-serving applications. Unlike SAN, it connects to an existing communication network (LAN) and provides file access to heterogeneous clients. This is the most familiar kind of storage; it’s what we interact with most on a daily basis. Users of file storage have access to files and can read and write either a whole file or part of it. File systems are what operating systems provide on all of our personal computers. In a shared environment, file storage is often seen as a network drive. Because NAS is purpose-built for file serving, it offers higher scalability, availability and performance, as well as cost benefits, compared to general-purpose file servers.
• Internet Protocol SAN (IP-SAN): IP-SAN is a convergence of technologies used in SAN and NAS. IP-SAN provides block-level communication across a local or wide area network (LAN or WAN), resulting in greater consolidation and availability of data.
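The difference between block-level access (as in a SAN) and file-level access (as in NAS) can be shown with a rough sketch in Python. The BlockDevice class is a toy in-memory stand-in, not a real storage driver, and the 512-byte block size and the file name are assumptions chosen for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes


class BlockDevice:
    """Toy in-memory volume that is addressed by block number only."""

    def __init__(self, num_blocks):
        self.data = bytearray(num_blocks * BLOCK_SIZE)

    def write_block(self, index, payload):
        if len(payload) != BLOCK_SIZE:
            raise ValueError("a block write must supply exactly one full block")
        start = index * BLOCK_SIZE
        self.data[start:start + BLOCK_SIZE] = payload

    def read_block(self, index):
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])


# Block-level access: the caller works in whole, evenly sized blocks.
volume = BlockDevice(num_blocks=4)
volume.write_block(2, b"x" * BLOCK_SIZE)
block = volume.read_block(2)

# File-level access: the caller names a file and may change just part of it.
path = os.path.join(tempfile.mkdtemp(), "report.txt")
with open(path, "wb") as f:
    f.write(b"hello world")
with open(path, "r+b") as f:
    f.seek(6)          # jump into the middle of the file
    f.write(b"there")  # overwrite five bytes in place
with open(path, "rb") as f:
    contents = f.read()
```

With block access, the server decides what the numbered blocks mean, which is why databases can use them efficiently. With file access, the storage system keeps track of names and offsets on the user's behalf.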
Object storage is probably the least familiar type of storage to most people. Object storage doesn’t provide access to raw blocks of data and does not offer file-based access. Instead, it provides access to whole objects, or blobs of data, and generally does so with an API specific to that system. Unlike file storage, object storage generally does not allow writing to one part of an object; objects must be updated as a whole unit. Three of the most common commercial object storage systems are Amazon S3, Google Cloud Storage and Microsoft Azure Blob Storage. Object storage excels at storing content that can grow without bound. Ideal use cases include backups, archiving and static Web content like images and scripts. One of the main advantages of object storage systems is their ability to reliably store a large amount of data at relatively low cost.
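The whole-object behavior described above can be sketched with a toy in-memory store in Python. The ObjectStore class, its put and get methods and the sample key are hypothetical and do not correspond to any vendor's API; they only illustrate that an update means reading the whole object and writing the whole object back.

```python
class ObjectStore:
    """Toy in-memory object store: whole objects in, whole objects out."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # A put always replaces the entire object stored under the key.
        self._objects[key] = bytes(data)

    def get(self, key):
        # A get always returns the entire object.
        return self._objects[key]


store = ObjectStore()
store.put("site/logo.txt", b"version 1")

# To change part of an object: read it all, modify a copy, write it all back.
updated = store.get("site/logo.txt").replace(b"1", b"2")
store.put("site/logo.txt", updated)
result = store.get("site/logo.txt")
```

There is no seek or partial write in this interface, which suits content such as backups and static images that is written once and then read many times.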
Where to go from here
To speak with a data storage specialist, call (631) 789-9595 or fill out our Information Request Form and a representative will call you back shortly.