It is important to understand the concept of big data and the benefits it produces, considering 90 per cent of the world’s data has been generated over the past 12 months, according to Cameron Bahar, software executive at Huawei Technologies.
Bahar was speaking during his presentation at the Gartner Symposium in Cape Town.
“At its core it’s about getting information from data… Processing the information across disparate data sources, multiple databases, and then coming up with insight that wasn’t obvious is the goal of this big data space,” said Bahar.
Bahar said 600 blog posts, 240,000 Facebook posts, 45,000 Tweets, and 35 hours of YouTube video content are shared every minute.
Google processes a massive 20 petabytes (PB) of data every day, said Bahar, while seven billion text messages are sent and received daily.
One million devices were connected to the internet in 1992, a figure that grew to 600 million in 2006 and to five billion in 2010. According to Bahar, an estimated 50 billion devices will be connected to the internet by 2020.
Bahar said the volume of machine-generated data is considerably greater than that of human-generated data, and that it is driving big data.
He said it is important to process data quickly once it is stored, “otherwise it becomes stale and useless – the older it gets the less relevant it becomes”.
Bahar said he has been involved with data warehousing for companies such as Walmart, where the system was structured to advise which product to put on which shelf, in which city and during which month, in order to maximise sales.
“The companies who bought into that vision… and [companies] that didn’t buy our database at the time fell behind largely because they couldn’t make decisions fast enough,” said Bahar.
Infrastructure requirements for effective big data include high-throughput ingest, capacity and bandwidth that scale on demand, and the ability to process data in near real time.