Online video is a huge part of our connected world today. It’s a medium that we use daily to share, communicate, learn, and of course, be entertained, and there seems to be no limit to its growth. Facebook is a great example: it now gets 8 billion video views a day, more than double what it saw 6 months earlier. According to a recent Cisco report, video traffic will be 80% of all consumer Internet traffic in 2019, up from 64% in 2014, and mobile video will increase 11X in the next 5 years. In China alone, the online video market is expected to reach more than $17B by 2018, according to iResearch.

As a video and tech enthusiast, I find these developments hugely exciting, but this relentless deluge of video presents some very real challenges (and opportunities). The infrastructure challenges are obvious, given the need for increased storage and compute to process, transcode, and manipulate videos for end-user consumption. However, there is another, less straightforward problem to overcome. How can viewers best navigate the flood of online video content? And how can content providers and advertisers efficiently and intelligently provide video content that is relevant (and useful) to consumers?

This is certainly a daunting task, and one that we as humans are ill-equipped to handle. Frankly, it is no wonder that many companies are investigating intelligent systems that leverage machine learning and deep neural networks (DNNs) to help automate these tasks.


With this in mind, Intel, Quanta, and Viscovery came together to build a full-stack solution to this problem that combines a deep learning-based application from Viscovery, the power and scalability of Intel® Xeon® processors, and Quanta’s efficient platform designs. We created a turnkey solution specifically designed to solve the video content recognition problem. At Intel, we recognize that it is critical to take a holistic view when tackling these types of challenges and to enable solutions that include everything from the silicon and server hardware to the libraries and open source components, all the way to the end application. And of course, all of these ingredients must be optimized for cloud-scale deployments. Below is a high-level view of the solution stack:

To tackle these problems at scale, libraries like Intel® Math Kernel Library and optimized open source components like Caffe* are tightly integrated into Viscovery’s deep learning-based video content recognition engine to take full advantage of the performance of Intel® processors. The result is a solution that runs seamlessly across Intel® Xeon® and Intel® Xeon Phi™ processor-based platforms, providing the capability to train DNNs quickly and deploy at scale at an efficient total cost of ownership. Below is an example of the types of content that the Viscovery application uses to train its DNNs. As you can see, they’ve moved significantly beyond simple image and object classification:

[Image: Screenshot 2016-06-06 4.26.06 PM]

Of course, the real proof of success is in the usage of this platform by end customers. Leaders in video content delivery such as LeEco, 8sian, Sohu, and many others have already deployed solutions based on this stack.

(The author is Joseph Spisak from Intel Corporation)
