The obsession with models, meaning the neural network architectures that define the form of a machine learning program, has run its course, said Re. Re recalled how in 2017, “models ruled the world,” with the prime example being Google’s Transformer.
What Re termed “new model-itis,” researchers’ obsession with tweaking every nuance of an architecture, is just one of many “non-jobs for engineers” that he disparaged as something of a waste of time. Tweaking the hyper-parameters of models is another time waster, he said.
For most people working in machine learning, “innovating in models is kind-of not where they’re spending their time, even in very large companies,” he said.
Where people are really spending time in a valuable way, Re contended, is on the so-called long tail of the data distribution, the rare, fine-grained cases that confound even large, powerful models.
It is a discipline, ultimately, he said, in which engineers spend their time on more valuable work than tweaking hyper-parameters: “monitoring the quality and improving supervision,” with the emphasis on “human understanding” rather than data structures.
Overton and another system, Ludwig, developed by Uber machine learning scientist Piero Molino, are examples of what can be called zero-code deep learning.
“The key is what’s not required here,” Re said. “There’s no mention of a model, there’s no mention of parameters, there’s no mention of traditional code.”
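Ludwig makes the idea concrete. As a minimal sketch of the zero-code approach using Ludwig’s Python API (the column names and the file “reviews.csv” are hypothetical, and it assumes the ludwig package is installed), a task is declared as input and output features rather than written as model code:

```python
# A minimal sketch of zero-code deep learning with Ludwig's Python API.
# Assumptions: the ludwig package is installed, and "reviews.csv" is a
# hypothetical dataset with "review_text" and "sentiment" columns.
from ludwig.api import LudwigModel

# The task is declared, not programmed: inputs and outputs by name and
# type. No architecture, layers, or hyper-parameters appear anywhere.
config = {
    "input_features": [{"name": "review_text", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

model = LudwigModel(config)

# Ludwig chooses the model and training settings behind the scenes.
train_stats, _, output_dir = model.train(dataset="reviews.csv")
```

The declarative configuration echoes Re’s point: the engineer’s attention goes to the data and its supervision, while the system takes responsibility for the model.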