Machine Learning and Systems: A conversation with 2020 Field Award winners Al Hero and Anders Lindquist
Hero and Lindquist took a few minutes to talk about the impact of machine learning on Signal Processing and Control Systems, and what they plan to do about it
Al Hero and Anders Lindquist, 2020 IEEE Field Award winners in the areas of Signal Processing and Control Systems, discussed how machine learning (ML) is impacting both of their fields, and how they plan to make their mark on ML in the future.
Alfred O. Hero received the 2020 IEEE Fourier Award “for contributions to the foundations of statistical signal processing with applications to distributed sensing and performance benchmarking.” Hero is the John H. Holland Distinguished University Professor of Electrical Engineering and Computer Science and R. Jamison and Betty Williams Professor of Engineering at the University of Michigan. His current research is on data science and developing theory and algorithms for data collection, analysis and visualization that use statistical machine learning and distributed optimization.
Anders Lindquist received the 2020 IEEE Control Systems Award “for contributions to optimal filtering, stochastic control, stochastic realization theory, and system identification.” Lindquist is Zhiyuan Chair Professor at Shanghai Jiao Tong University and Professor Emeritus of Optimization and Systems Theory at KTH Royal Institute of Technology, Sweden. He works on mathematical problems in systems and control, as well as in signal processing, modeling, estimation, identification and image processing.
Lindquist visited the University of Michigan as an invited speaker, giving us the perfect opportunity to borrow a few minutes of both researchers’ time and hear their thoughts on the future of their respective disciplines, and how those disciplines may be converging.
The discussion quickly centered on the relatively new technique known as Machine Learning (ML). ML has proven to work so well that it has become the shiniest coin in the box of techniques for applications including computer vision, autonomous driving, and mining big data – even though it exists without a formal underlying theory.
The following is a summary of the discussion:
Where is the area of Signal Processing headed?
To start at the beginning, signal processing has its roots in radar, speech, and acoustics – areas which were very connected to physical systems. Basically, in the early days of signal processing the main issue was the development of mathematical techniques to extract information from noisy, complex continuous-time physical signals.
The field grew as advances in electronics and analog computing permitted implementation of more sophisticated mathematical models. When we entered the digital age, digital computing enabled digital signal processing, which led to widespread integration of signal processing algorithms into technologies we use every day.
Digital signal processing was the first step towards what we now call data science.
Then came information theory, and specifically [Claude] Shannon’s sampling theorem.
“Signal processing and machine learning have become closely related fields.” – Al Hero
Yes! Shannon’s contributions to sampling theory underpin digital signal processing, and complement his contributions to communications and coding, cybersecurity, computer engineering, cryptography and other areas.
We are now in the third revolution in signal processing.
We’re seeing a move towards signals being generalized to structures like function spaces, graphs and tensors as data sources become more diversified and provide increasingly complex data. Dealing with the complexities of high data dimension, data heterogeneity, and missingness has sharpened the focus of signal processing onto emerging challenges, like data integration, dimensionality reduction, sparse approximation, and, more generally, learning reliable and accurate signal models from data. Thus signal processing and machine learning have become closely related fields.
But what is machine learning?
As it is a highly multidisciplinary field championed by computer scientists, statisticians, mathematicians, and electrical engineers, there is not a single widely agreed-upon definition of machine learning. Most experts would agree, however, that a main objective of machine learning is to develop principles for automated detection of meaningful patterns in data.
One problem, of course, is the lack of mathematical theory underlying machine learning.
Yes – especially when considering recent machine learning trends towards more and more complex architectures, i.e., deep learning networks, it is difficult to understand why an algorithm works. Signal processing, on the other hand, like Control, is based on explaining why things work.
Precisely. There’s a mathematical structure.
In our field we strive to develop sufficient understanding to be able to predict when a signal processing algorithm will fail. As a result, you have signal processing algorithms that are in cars and in airplanes. They are an integral part of control systems that are reliable precisely because you know what the failure conditions are and can account for them in the design.
When dealing with complex machine learning algorithms, you often can’t predict how well an algorithm that’s been developed on one training set is going to work when applied to a different one.
An interesting development along these lines is what researchers in machine learning call adversarial learning, where one tries to purposefully break a machine learning model by inputting unusual data. This is an encouraging trend that is moving towards what we’ve been doing in control and signal processing for decades.
We [signal processing and control researchers] approach machine learning in a different way than a typical computer scientist. Computer science is a discipline that has always been able to rely on more and more computer memory and faster processing to solve problems. Signal processing and control are disciplines built on physical models and theory. We believe you’re not doing things in the smartest way if you don’t have an underlying theory.
Theoretical computer scientists have developed an elegant theory for machine learning, but the spectacular advances in deep learning networks did not grow out of theoretical computer science.
We in signal processing and control are attempting to identify applicable frameworks for deep learning which describe perturbations and variations in the operating conditions of an algorithm, so that it will be robust, and can be depended upon.
“It’s a great open problem to try to create a mathematical theory for machine learning.” – Anders Lindquist
Because of the demands of new technology, I see signal processing and control becoming even more intermingled in the future. We are in the fourth industrial revolution – which is cyber-physical systems (CPS).
CPS stands on three pillars: communication, computing, and control. This brings together people like Shannon, [Alan] Turing and [Rudolf] Kalman.
These people were both mathematicians and engineers. This is very important because they came up with mathematical theory for real engineering problems.
And with cyber-physical systems, you are combining the physical world and the computational world. And then there are the devices that interface between both worlds: sensors that collect data and actuators that act on it.
It’s a great open problem to try to create a mathematical theory for machine learning.
Many of us are trying to develop theory that can put highly overparameterized and complex deep learning algorithms on firm theoretical footing. These algorithms are often based on millions of tunable parameters, and theory could lead to insights on how to tune them. Our vision is that there will eventually exist design principles for deep learning similar to those used to design the automatic control systems, predictors, and adaptive filters that are ubiquitous in applications ranging from channel equalization in communications to automatic braking systems in cars. In those settings, you can do some analysis based on the models of uncertainty and determine how perturbations of the data can affect the decisions coming out of an algorithm.
Generally speaking, we are not yet at this point for machine learning algorithms.
So what I’ve seen, and I think it’s similar in control, is that the unexplainable success of deep learning has piqued our interest. These vastly overparameterized systems have too many degrees of freedom to satisfy the intuitive and mathematically understood principles of generalization, error and stability. Furthermore, the algorithms seem to work despite the fact that they use very high dimensional generic models for the data that require massive amounts of experimentation to tune the parameters.
It is worth saying that there was widely held skepticism in the signal processing community toward neural networks and overly complex systems that were difficult to understand or analyze. They couldn’t be analyzed, and you couldn’t mathematically predict the accuracy, stability or robustness of their performance. In the early days, machine learning methods didn’t really demonstrate that they could do any better than standard techniques.
But then – I remember attending machine learning conferences, where they were showing a continuous improvement of neural network results over the last decade. And this was while using an increasingly complex architecture – more layers – with increasingly complex computational algorithms and larger and larger databases to train on.
You just couldn’t look away anymore.
And you think you’ll be able to find a model for why this has worked?
Let me put it in blunt terms. If we’re not able to do it, then disaster looms.
With the trend towards deregulation in the United States, this could lead to companies putting out inadequately tested beta versions of software on critical systems – leading to more tragic incidents like the recent Uber and Tesla autonomous driving software glitches. In the Uber incident a pedestrian was killed because of machine learning and computer vision software that failed to recognize a bicyclist crossing the road.
I’d like to add another perspective, and invite you to Shanghai. There’s some crazy traffic in Chinese big cities, even with regular drivers. You don’t need machine learning to cause accidents, but with machine learning, you don’t seem to have anyone to blame.
[And then the discussion became more philosophical]
The results of a recently published survey are relevant to this blame issue. People were asked to make the following choice: if they were going to die in a crash, would they prefer that a person was at the wheel of the opposing vehicle, or that an algorithm was controlling it? The survey indicated that more people preferred to die when there was a person whom they could either blame – or forgive – for human error.
And this all gets to a philosophical question of what is the responsibility of society.
This goes way beyond signal processing. But this is why I say that I believe that the consequences of not figuring this out are going to be very high. Now, perhaps the consequences will only be that machine learning ends up in the trash bin of history.
Yes – but then you also have all this artificial intelligence, which also has a lot of moral implications. Even if I do not believe in all the hype about AI, how much are we going to let AI take over our lives?
And how much is AI going to displace workers and disrupt economies? The economies may become more efficient at production, but the consumers won’t have jobs. And what will it mean for humans not to have to work? A fulfilling job gives a person a sense of purpose in life and interacting with diverse co-workers builds kinship beyond one’s immediate social group.
At this point, Hero had to rush out to catch his plane, and fulfill one of his purposes in life.