SCREEN AFRICA EXCLUSIVE:
Back in 2004, software designer Martin Fowler, of the Chicago-based consultancy ThoughtWorks, visited the rainforests of Queensland on the east coast of Australia. He was intrigued by the huge strangler vines, which seed themselves in the upper branches of fig trees and gradually work their way down until they root in the soil. Over many years they grow, strangling and killing the tree that was their host. Equating this to rewriting critical systems software – where the chosen route is to gradually create a new system around the edges of the old, letting it grow slowly over several years until the old system is no longer used – he coined the metaphor the ‘Strangler Application Pattern.’
The strangler pattern is now common in software engineering, and it is best understood as incrementally migrating a legacy system by gradually replacing specific pieces of functionality with new applications and services. As features of the legacy system are replaced, the new system eventually covers everything the old one did, ‘strangling’ it and allowing you to decommission it. There are hundreds of media asset management (MAM) systems out there and, as technology develops, some organisations are applying artificial intelligence (AI) within a strangler pattern to slowly take over legacy MAM functions until nothing is left but the new system. Many MAM systems, furthermore, are allowing third-party AI applications to let their users train the machines to think like them – with the end goal of eventually replacing the user in order to cut costs.
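In code, the strangler pattern is often implemented as a facade that routes each request either to the legacy system or to its replacement, flipping one feature at a time. A minimal sketch, with all class and feature names hypothetical:

```python
# Minimal strangler-facade sketch: requests are routed per feature,
# so the new system can take over one function at a time while the
# legacy system keeps serving everything not yet migrated.
# All class and feature names are illustrative assumptions.

class LegacyMAM:
    def handle(self, feature, clip):
        return f"legacy:{feature}:{clip}"

class NewMAM:
    def handle(self, feature, clip):
        return f"new:{feature}:{clip}"

class StranglerFacade:
    def __init__(self):
        self.legacy = LegacyMAM()
        self.new = NewMAM()
        self.migrated = set()          # features the new system now owns

    def migrate(self, feature):
        self.migrated.add(feature)     # flip one feature to the new system

    def handle(self, feature, clip):
        target = self.new if feature in self.migrated else self.legacy
        return target.handle(feature, clip)

facade = StranglerFacade()
assert facade.handle("logging", "clip01") == "legacy:logging:clip01"
facade.migrate("logging")              # 'strangle' the logging feature
assert facade.handle("logging", "clip01") == "new:logging:clip01"
assert facade.handle("search", "clip01") == "legacy:search:clip01"
```

Once every feature is in the `migrated` set, the legacy object receives no traffic and can be decommissioned, which is the pattern's end state.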
I have been fortunate enough to be involved in the beta testing of software designed as a potential plug-in to many existing MAM systems, with the aim of using AI to do all the hard graft of logging and metadata creation – by having the system learn to do what the operator does.
The ultimate objective of these initiatives is a completely unmanned media asset manager: a system with no human involvement at all, and a typical strangler pattern in practice. The software was originally developed to count and identify the sex of salmon swimming through a canal leading to an upstream river breeding area. It not only extracts enough metadata from a camera clip to infer season, GPS location and time of day, but also ‘visualises’ the clip to identify the type of subject (human, animal, insect, etc.), identify the species itself (e.g. lion, leopard) and possibly even its sex. Audio sampling is also used, which further assists the system in logging the shot.
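The season and time-of-day inferences described above can be derived from nothing more than a clip's embedded timestamp and GPS coordinates. A hedged sketch of that kind of logic, with field names and thresholds as illustrative assumptions rather than the actual product's code:

```python
# Sketch of inferring season and time of day from a clip's embedded
# timestamp and GPS latitude. Season boundaries (meteorological) and
# time-of-day cut-offs are illustrative assumptions.
from datetime import datetime

def infer_context(timestamp: datetime, latitude: float) -> dict:
    southern = latitude < 0
    # Meteorological seasons, flipped for the southern hemisphere.
    seasons = (["summer", "autumn", "winter", "spring"] if southern
               else ["winter", "spring", "summer", "autumn"])
    season = seasons[timestamp.month % 12 // 3]
    hour = timestamp.hour
    if 5 <= hour < 12:
        time_of_day = "morning"
    elif 12 <= hour < 18:
        time_of_day = "afternoon"
    else:
        time_of_day = "night"
    return {"season": season, "time_of_day": time_of_day}

# A clip shot at a southern-hemisphere watering hole in January:
ctx = infer_context(datetime(2019, 1, 15, 7, 30), latitude=-24.5)
assert ctx == {"season": "summer", "time_of_day": "morning"}
```

The visual species identification itself would sit in a trained classifier; this sketch covers only the metadata-driven part of the logging.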
While all of these things can easily be done by human eyes and ears, the advantage of having artificial intelligence perform the tasks is phenomenal when it comes to speed. Though it is still at a very rough beta stage, this system – in a test conducted using footage that I supplied – identified all the species of animals around a watering hole and, from the camera’s metadata, told me where the footage was filmed, the time of day and the season of the year. There were 27 shots; the logging took less than a second and, even at this early stage of development, was 100% accurate.
The beauty of AI is that the system continually learns from any errors it generates. When human intervention overrides incorrect information, the system remembers the changes – and even improves on them.
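At its simplest, that human-in-the-loop behaviour means the system records each operator override and applies it to future predictions. A minimal sketch, with hypothetical class and method names, assuming corrections are keyed on the raw predicted label:

```python
# Minimal human-in-the-loop correction sketch: when an operator
# overrides a tag, the override is remembered and applied to all
# future identical predictions. Names are hypothetical.

class CorrectableLogger:
    def __init__(self):
        self.overrides = {}                 # raw prediction -> corrected label

    def predict(self, raw_label: str) -> str:
        # Apply any remembered correction before returning the tag.
        return self.overrides.get(raw_label, raw_label)

    def correct(self, raw_label: str, fixed_label: str):
        # Operator override: remember it for all future shots.
        self.overrides[raw_label] = fixed_label

logger = CorrectableLogger()
assert logger.predict("cheetah") == "cheetah"
logger.correct("cheetah", "leopard")        # human fixes a misidentification
assert logger.predict("cheetah") == "leopard"
```

A production system would go further and fold the corrections back into the model as training data, which is where the "even improves on them" behaviour comes from.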
This developmental software is not unique at all. Third-party plug-ins are being used in many asset management solutions, such as Squarebox Systems’ CatDV, one of the pioneering media asset management solutions out there. The CatDV developers have started integrating video and image analysis options from AI vendors into their suite of systems and, through these integrations, CatDV is offering a range of advanced capabilities, including:
- Speech-to-text, to automatically create transcripts and time-based metadata;
- Place analysis, including identification of buildings and locations without using GPS-tagged shots;
- Object and scene detection, e.g., daytime shots or shots of specific animals;
- Sentiment analysis, for finding and retrieving all content that expresses a certain emotion or sentiment (e.g., “find me the happy shots”);
- Logo detection, to identify when certain brands appear in shots;
- Text recognition, to enable text to be extracted from characters in video; and
- People recognition, for identifying people, including executives and celebrities.
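Whatever the analysis engine, the output of capabilities like these ends up as tags attached to clips, which the MAM can then query ("find me the happy shots"). A sketch of such a tag index, where the vocabulary and structure are illustrative assumptions, not CatDV's actual schema:

```python
# Sketch of storing and querying AI-generated tags in a MAM index.
# Tag names and clip ids are illustrative; real systems also attach
# timecodes so a tag points at a span within a clip, not just the clip.
from collections import defaultdict

class TagIndex:
    def __init__(self):
        self.by_tag = defaultdict(set)      # tag -> set of clip ids

    def add(self, clip_id: str, tags: list):
        for tag in tags:
            self.by_tag[tag].add(clip_id)

    def search(self, *tags) -> set:
        # Return clips matching ALL requested tags.
        sets = [self.by_tag[t] for t in tags]
        return set.intersection(*sets) if sets else set()

index = TagIndex()
index.add("clip01", ["daytime", "lion", "happy"])
index.add("clip02", ["night", "leopard"])
index.add("clip03", ["daytime", "happy"])
assert index.search("happy") == {"clip01", "clip03"}
assert index.search("daytime", "lion") == {"clip01"}
```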
Another example is Avid | AI Media Analytics, a framework that automates content indexing tasks such as facial detection, scene recognition and speech-to-text conversion. It uses Microsoft Cognitive Services (Azure Video Indexer) to auto-index content with machine learning algorithms, creating a library of rich metadata that can readily be searched.
Avid have also teamed up with Finnish start-up Valossa, which grew out of one of Europe’s leading computer science and AI labs at the University of Oulu, and integrated its software into Avid | MediaCentral. The comprehensive audio/visual recognition solution from this partnership can detect and identify people based on their age, speech patterns, sounds, emotions, colours and dialogue.
Dave Clack, CEO of Squarebox Systems, sums it up: “An AI-powered MAM solution offers a way forward. A great approach is to add to the MAM’s existing logging, tagging and search functions through integrations with best-of-breed AI platforms and cognitive engines, such as those from Google, Microsoft, Amazon and IBM, as well as a host of smaller, niche providers. These AI vendors and AI aggregators enable media asset managers to leverage AI analysis tools for speech recognition and video/image analysis, with the flexibility to be deployed either in the cloud or in hybrid on-premises/cloud environments.”
With the media and entertainment industry slowly being transformed by artificial intelligence, the future is bright for AI-powered MAM. In the right hands, AI becomes the key that unlocks the next generation of MAM technologies, and – just like the strangler vine found in the forests of Queensland – artificial intelligence is slowly, ever-so-slowly growing through many broadcast applications, and in some, faster than you think.