SCREEN AFRICA EXCLUSIVE:
Written by Jerôme Wauthoz, VP Products, Tedial
The industry is buzzing with talk about, and deployments of, AI and machine learning technologies, which are making strides in production environments. But have you thought about the difference between the two?
In the parlance of computer science, machine learning uses statistical techniques to give computer systems the ability to progressively improve their performance on a specific task by collecting and analysing data, without being explicitly programmed. Machine learning is closely related to computational statistics, which also focuses on making predictions through the use of computers. It is sometimes conflated with data mining, a related field that focuses on exploratory data analysis and is often associated with unsupervised learning.
On the other hand, artificial intelligence (AI) is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals. AI research is the study of “intelligent agents”: any device that perceives its environment and takes actions that maximise its chance of successfully achieving its goals. Colloquially, the term artificial intelligence is applied when a machine mimics cognitive functions that humans associate with other human minds, such as problem solving.
What can be considered AI is changing all the time: as computers become increasingly capable, tasks once thought of as requiring intelligence are often removed from the definition. For example, optical character recognition is often no longer considered artificial intelligence because it has become routine technology.
In the twenty-first century, AI techniques have experienced a resurgence following concurrent advances in computing power, the availability of large amounts of data, and better theoretical models. The techniques of AI have become an essential part of the technology industry, helping to solve many challenging problems in computer science, software engineering and operations research.
While we can consider AI to be a blanket term that encompasses many media applications, the reality of our industry is that most of these software applications are not yet truly intelligent and must be taught to carry out their particular function. In a real sense, we are applying machine learning tools to mundane, repetitive chores, and those tools have no intelligence beyond the data that we, as trainers, supply.
A clear example is often discussed in our industry: the review and metadata annotation of all the assets stored in a deep archive accumulated over years of programming. A software tool assigned this task must “learn” to identify the images, recognising the important from the inconsequential, and must have a clear set of objects to note. Cloud-sourced systems that leverage crowd input can be useful for modern media applications such as recognising a location or a celebrity.
However, a unique library of historical media may have no external reference or crowd-sourced knowledge to draw on, so the system must be taught specifically to recognise the criteria for evaluation and annotation before it can be a useful tool. As media executives and broadcasters, we must recognise that regardless of what we call this service, it is really machine learning we are implementing, and effort is needed to train a computer to be a useful tool in our applications.
What’s happening now?
Although many companies are investigating applications and requesting proof-of-concept demonstrations, not many end users have taken the plunge into the deep waters of machine learning because the applications and return on investment are still unknown. What we are all looking for is the “killer app”: the clever application that increases efficiencies, reduces labour or creates new opportunities for monetising an existing library.
As AI systems evolve, more of these specific applications are becoming apparent, and at Tedial we like to think of them as “clever AI”.
Let’s take a look at some examples of clever AI. Software tools have been used for years to convert speech to text, and systems have been applying this technology to annotate frame-accurate proxies with the text for media searches. A clever application is to enable the underlying data model to recognise specific key words, typically describing a specific action in a sporting event, and to trigger automated orchestration workflows that create an edited clip of the action or distribute the media to appropriate locations or downstream services.
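To make the key word trigger concrete, here is a minimal Python sketch. It assumes the speech-to-text stage delivers transcript segments with start and end times in seconds. The names `TranscriptSegment`, `ClipRequest` and `find_clip_requests`, along with the pre-roll and post-roll values, are hypothetical and used only for illustration; they do not describe any particular product's API.

```python
import re
from dataclasses import dataclass


@dataclass
class TranscriptSegment:
    """One speech-to-text result (hypothetical shape)."""
    start: float  # seconds from the start of the media
    end: float
    text: str


@dataclass
class ClipRequest:
    """A request a downstream workflow could turn into an edited clip."""
    keyword: str
    start: float
    end: float


def find_clip_requests(segments, keywords, pre_roll=5.0, post_roll=10.0):
    """Return one ClipRequest for each key word spotted in each segment.

    Only single-word key words are matched; pre_roll and post_roll are
    assumed padding values, in seconds, around the spoken segment.
    """
    wanted = {k.lower() for k in keywords}
    requests = []
    for seg in segments:
        # Compare whole words so that "goal" does not match "goalkeeper".
        words = set(re.findall(r"[a-z']+", seg.text.lower()))
        for keyword in sorted(wanted & words):
            requests.append(
                ClipRequest(
                    keyword=keyword,
                    start=max(0.0, seg.start - pre_roll),
                    end=seg.end + post_roll,
                )
            )
    return requests
```

In a real deployment, each request would be handed to an orchestration workflow that cuts and distributes the clip; that step is outside the scope of this sketch.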
Many cloud-sourced AI services can be leveraged in clever ways to augment or annotate video and audio, recognising celebrities, locations, music beds and so on. These tools can make searches faster and more relevant, and they can reduce risk by recognising and alerting management to license infringement problems or by managing the details of image release documentation. But these applications don’t provide a clever AI result. The ability to evaluate media and judge its “sentiment” is currently an important machine learning exercise under investigation by many media outlets; the clever AI application is taking that recognised sentiment and using it to replace human-generated recognition and action steps. The clever AI application would be to train a system to recognise players, athletes or fans in the crowd to bring more editorial value to a feed. For example, in live sports, a system could be trained to recognise colourful pictures of fans cheering or an athlete celebrating and automatically add the scene into the event highlights at the right position.
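The final step of that example, placing a recognised scene into the highlights "at the right position", can be sketched as follows. This is an illustration under stated assumptions: a trained classifier (not shown) is assumed to return a label and a confidence score for each scene, and the names `Scene` and `add_to_highlights`, as well as the 0.8 threshold, are hypothetical.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class Scene:
    """A recognised scene; ordering uses the timecode only."""
    timecode: float  # seconds from the start of the event
    label: str = field(compare=False)
    confidence: float = field(compare=False, default=1.0)


def add_to_highlights(highlights, scene, min_confidence=0.8):
    """Insert the scene into a timecode-sorted highlights list.

    Scenes below the (assumed) confidence threshold are skipped.
    Returns True when the scene was added.
    """
    if scene.confidence < min_confidence:
        return False
    bisect.insort(highlights, scene)  # keeps the list in timecode order
    return True
```

Keeping the list sorted by timecode means a late-arriving recognition result still lands in chronological order within the highlights.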
At Tedial, as we build more functionality into our SMARTLIVE live sports event system, we are working to enhance our AI integrations to provide measurable clever AI.