Did you notice that we are living in a world where objects either have or will have the ability to communicate or generate some kind of data? Data growth is inexorable:
- the amount of data created in the past few years is probably more than was created in all the prior history of mankind[1]
- stored data is expected to grow close to 3x in the next 3 years[2]
- both unstructured data (images, videos) and structured data (logs, sensors) are expected to continue to grow exponentially[3]
Big data has already started helping organizations improve their bottom lines, gain agility, spot meaningful innovations, and transform their businesses[4]. The data revolution has turned out to be an impetus for enabling industries such as compute, storage, and networking. This unprecedented growth of data raises a question: why is data so important today?
Let’s walk through some examples to understand and appreciate how data is creating value and how data is changing our world. The list is by no means exhaustive, but illustrates the incredible data-enabled changes in our world.
Connecting People and Societies
Social media platforms are the best illustration of how data has created an entirely new industry. The new ways of connecting people and societies have generated new kinds of financial metrics such as average revenue per user, revenue or cost per click, monthly and daily active users, and new users per day. Companies that succeed at analyzing data and pushing more personalized content generally score high on these metrics and hence create more value. The “always on” platforms continuously monitor and record individual connections, preferences, places of interest, priorities, and so on. The social graph data is stored in different storage tiers (memory, solid-state drives, magnetic media, and others), transformed, and then mined with complex mathematical algorithms. The machines then predict individual behavior and decisions in a given environment, which helps companies push personalized advertisements and generate revenue.
New Ways to Measure, Manage, and Control
Technology innovations have delivered many new ways to measure, manage, and control applications. Easy examples are healthcare, temperature, gas, identification, security, and collision sensors and detectors. The machine-generated data from these applications is stored either on the device or in the cloud. Compute resources quickly run standard algorithms on the data and then trigger predefined actions such as contacting a doctor or ambulance, shutting off the gas or power supply, or raising a threat alert.
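As a rough illustration of "run a standard algorithm on the data, then trigger a predefined action," here is a minimal sketch of a threshold-based rule check. The sensor names, thresholds, and action strings are hypothetical values chosen for the example; they are not taken from any real product or standard.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    sensor: str                             # sensor type the rule applies to
    is_triggered: Callable[[float], bool]   # test applied to a single reading
    action: str                             # predefined action to take


# Hypothetical thresholds, for illustration only.
RULES = [
    Rule("gas_ppm", lambda v: v > 1000.0, "shut off gas supply"),
    Rule("heart_rate_bpm", lambda v: v < 40.0 or v > 150.0, "contact a doctor"),
    Rule("temperature_c", lambda v: v > 70.0, "raise a threat alert"),
]


def evaluate(sensor: str, value: float, rules: List[Rule] = RULES) -> List[str]:
    """Return the actions triggered by one reading from one sensor."""
    return [r.action for r in rules
            if r.sensor == sensor and r.is_triggered(value)]


if __name__ == "__main__":
    print(evaluate("gas_ppm", 1500.0))
    print(evaluate("gas_ppm", 200.0))
```

A real system would add debouncing, logging, and escalation, but the core loop is the same: a reading comes in, it is compared against known limits, and an action fires.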
Shifting gears to something a little more complex, consider big machines such as aircraft, ships, oil and gas rigs, excavators, heavy cranes, trains, and manufacturing lines. All these machines require regular maintenance or part replacements to avoid costly, unplanned downtime. In some cases, such as aircraft and railways, preventive maintenance is required because failure can lead to economic loss as well as loss of human life. Therefore, it is essential to continuously measure and store vital data, transmit it to a high-performance cloud, and let software analyze it and trigger critical actions. One example would be analyzing the behavior of several machine parts to predict failures, forecast replacements, schedule and synchronize replacements, and automatically order new parts using cloud-based ERP systems. This is how new data-driven systems help eliminate unplanned downtime and streamline supply chain processes. Overall, the growing use of data and analytics in the industrial world is improving productivity, efficiency, and safety.
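To make "predict failures and forecast replacements" a little more concrete, the sketch below fits a straight line to a wear measurement (for example, vibration amplitude) and estimates how many operating hours remain before it reaches a limit. The function name, the linear-trend assumption, and the sample numbers are all illustrative; production systems typically use far richer models.

```python
from typing import List, Optional


def hours_until_limit(hours: List[float],
                      readings: List[float],
                      limit: float) -> Optional[float]:
    """Estimate operating hours left until a wear reading reaches `limit`.

    Fits a least-squares line through (hours, readings) and extrapolates.
    Returns None when the readings show no upward trend.
    """
    n = len(hours)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two paired measurements")

    mean_h = sum(hours) / n
    mean_r = sum(readings) / n
    var_h = sum((h - mean_h) ** 2 for h in hours)
    if var_h == 0:
        raise ValueError("measurements must span more than one point in time")

    slope = sum((h - mean_h) * (r - mean_r)
                for h, r in zip(hours, readings)) / var_h
    if slope <= 0:
        return None  # wear is flat or decreasing; no crossing to forecast

    intercept = mean_r - slope * mean_h
    crossing_hour = (limit - intercept) / slope
    # Never report a negative remaining life if the limit is already passed.
    return max(0.0, crossing_hour - hours[-1])


if __name__ == "__main__":
    # Hypothetical readings taken every 100 operating hours.
    remaining = hours_until_limit([0, 100, 200, 300], [1.0, 1.5, 2.0, 2.5], limit=4.0)
    print(f"Estimated hours until replacement: {remaining:.0f}")
```

The estimate could then feed a scheduling or parts-ordering step, which is where the supply chain benefit described above would come from.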
A Profound Way to Search
Just a few decades ago it was hard to find information. People had to meet experts or get their hands on journals and books to filter out any meaningful information. Then, much as with digital libraries, people created digital indexes and developed query languages to search for keywords. When someone looks for information, machines simply query an index table, locate the data, and start fetching it. As time passed, data grew rapidly and required more indexing so that even more information could be organized and searched. Indexes grew in size and complexity, which meant it took more time to find the desired data.
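The index lookup described above can be sketched as a tiny inverted index: a table that maps each keyword to the documents containing it, so a query only has to read the table instead of scanning every document. This is a simplified illustration with made-up sample documents, not how any particular search engine is implemented.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lowercase word to the set of document ids that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return ids of documents that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Copy the first posting set so intersecting does not modify the index.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical sample documents.
    docs = {
        1: "data is changing our world",
        2: "big data and search",
        3: "search the world",
    }
    index = build_index(docs)
    print(search(index, "data"))
    print(search(index, "search world"))
```

As the number of documents and distinct keywords grows, so does this table, which is why large indexes become slower to consult unless they are kept in fast storage.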
Today’s search often relies on caching large, complex indexes in DRAM (or on SSDs) so that it is faster to locate the required data and then pull it from its address. Once the data is located, users are provided with results in a fraction of a second[5]. Today’s search has reached new heights, with boundless access to everything from knowledge bases to information on local businesses and services, available on the go on our handheld devices.
Activity Automation by Machines
Structured machine-generated data, such as logs, carries a ton of actionable information about security, transactions, and compliance. Machines can mine this type of structured data very well, process it quickly with complex mathematical transformations to detect anomalies, and almost immediately trigger actions such as fraud and safety alerts. A typical example is how credit card companies identify fraudulent activity by looking for patterns of unusual monetary sums, frequency of cash withdrawals or purchases, geographic location, IP address, and other signatures. Once fraudulent activity is detected, transactions are immediately blocked and the customer receives a new card. In other industries such as restaurants, logs help provide real-time operational intelligence such as customer food preferences by location, discount coupon usage, marketing campaign results, and customer service time.
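One of the simplest anomaly checks of this kind is a z-score test: flag a transaction whose amount lies many standard deviations away from the customer's usual spending. The sketch below assumes a threshold of three standard deviations and uses made-up amounts; real fraud systems combine many more signals (location, frequency, device, and so on).

```python
from statistics import mean, stdev
from typing import List


def is_unusual(history: List[float], amount: float, threshold: float = 3.0) -> bool:
    """Flag `amount` if it is more than `threshold` standard deviations
    away from the mean of the customer's past transaction amounts."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts were identical; any different amount stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical past purchase amounts for one card.
    past = [20.0, 25.0, 22.0, 30.0, 18.0, 27.0]
    for amount in (24.0, 950.0):
        status = "BLOCK and alert" if is_unusual(past, amount) else "approve"
        print(f"{amount:>7.2f}: {status}")
```

The appeal of a rule like this is speed: it needs only a running mean and standard deviation per customer, so it can be evaluated while the transaction is still in flight.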
Enabling New Discoveries
What previously seemed impossible is possible today. With the advancements in distributed compute and storage, high-performance machines can view and analyze data patterns in several ways to derive many useful conclusions. Healthcare is one area where data is propelling incredible advancements. Deep learning can enable machines to scan thousands of medical records, cluster the data in many different ways, and use statistics to identify interesting and useful patterns. The machines can then use these patterns and probability models to predict whether a patient is likely to be affected by a disease. Climate change is yet another good example. We have data on socio-economic populations, historic temperature measurements, flood and drought statistics, natural calamities, and more. Once all that data is connected and analyzed, it is expected to give us many insights that will help us understand global climate patterns, or even how to restore natural resources and fauna.
Data is Changing Our World
The examples above, definitely not exhaustive, highlight how data has turned into a high-value strategic asset because of its ability to connect people, provide new ways to measure and control, find deeper insights for efficiencies, enable search, and bring about new discoveries. Data is changing our world. We are amid a transition to a data-centric world, and the technology that powers it through storage, compute, and network is bringing forth new possibilities affecting each one of us.