ML Proverbs
Model Maxims & Data Dogmas
October 23, 2023
This article was originally published on Leonard’s Substack.
Engineers relish a good aphorism; Rob Pike’s Go Proverbs are a case in point. In the same vein, I’ve tried to consolidate the frequent murmurs of the ML/AI community into somewhat tangible anchors:
Garbage in, garbage out. (💩+ 🧠 = 💩)
Bad ingredients almost always lead to a bad dish. Be mindful of what your model consumes.
Correlation does not imply causation.
Two variables moving together doesn’t automatically make one the puppet master.
Good models will become bad.
Models that don’t adapt to the changing landscape will degrade.
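One way to notice a changing landscape before the model degrades is to compare the feature distribution seen at training time with what arrives in production. The sketch below computes a Population Stability Index (PSI) for a single numeric feature. The function name, the equal-width binning, and the 0.2 alert threshold (a commonly cited rule of thumb, not a law) are illustrative assumptions rather than a prescribed method.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """PSI between two non-empty samples, using equal-width bins from the reference."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            idx = int((value - lo) / width)
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    ref_shares = bucket_shares(reference)
    live_shares = bucket_shares(live)
    return sum(
        (live_share - ref_share) * math.log(live_share / ref_share)
        for ref_share, live_share in zip(ref_shares, live_shares)
    )


# Hypothetical usage: flag a feature whose live distribution has shifted.
training_feature = [i / 100 for i in range(1000)]
production_feature = [x + 3 for x in training_feature]
if population_stability_index(training_feature, production_feature) > 0.2:
    print("feature drift detected: consider retraining")
```

A check like this only covers input drift on one feature; shifts in the relationship between inputs and labels need to be monitored through the model's actual performance.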
Measure twice, optimize once.
Ensure that you understand the problem and metrics before jumping into optimization.
Clear is better than clever.
A simple, clear model or approach is always preferable to something clever that is hard to understand and maintain. Beware of the black box.
Question perfection.
If the results look too good to be true, you have most likely made a mistake.
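A frequent culprit behind suspiciously good results is leakage between training and evaluation data. As a minimal sketch, assuming rows can be represented as tuples of hashable values, the hypothetical helper below reports what fraction of test rows also appear verbatim in the training set. It only catches exact duplicates, not subtler leaks such as target-derived features.

```python
from typing import Hashable, Iterable, Sequence


def leaked_fraction(
    train_rows: Iterable[Sequence[Hashable]],
    test_rows: Iterable[Sequence[Hashable]],
) -> float:
    """Fraction of test rows that are exact duplicates of a training row."""
    train_set = {tuple(row) for row in train_rows}
    test_list = [tuple(row) for row in test_rows]
    if not test_list:
        return 0.0
    leaked = sum(1 for row in test_list if row in train_set)
    return leaked / len(test_list)


# Hypothetical usage: one of the two test rows was also seen during training.
train = [(1, "a"), (2, "b"), (3, "c")]
test = [(2, "b"), (4, "d")]
print(f"leaked fraction: {leaked_fraction(train, test):.2f}")
```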
Control your dependencies.
Every external dependency introduced adds a layer of complexity and potential points of failure to your system. Choose them wisely and clean up unused dependencies.
Fall in love with the problem, not the solution.
Meaningful incremental progress comes from a sharp focus on the problem at hand. Make sure you know what to look for before you dive into the matrix.
Untested assumptions become unseen errors.
In ML/AI it’s particularly easy to introduce changes that unexpectedly cause fires and performance degradation. Failures deeply entrenched in the model’s internals are subtle in their manifestations. Such unseen errors can compromise the system and even lead to significant real-world consequences. It’s of utmost importance to engage in purposeful testing across the whole system.
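Purposeful testing can start with writing assumptions about the data down as executable checks. The sketch below is one possible shape for that, with a hypothetical `check_assumptions` helper and made-up feature names and ranges; in a real system the expected ranges would come from your own domain knowledge.

```python
import math
from typing import List, Mapping, Tuple


def check_assumptions(
    row: Mapping[str, float],
    expected_ranges: Mapping[str, Tuple[float, float]],
) -> List[str]:
    """Return a human-readable list of violated assumptions for one input row."""
    violations = []
    for name, (low, high) in expected_ranges.items():
        if name not in row:
            violations.append(f"{name}: missing")
            continue
        value = row[name]
        if isinstance(value, float) and math.isnan(value):
            violations.append(f"{name}: NaN")
        elif not (low <= value <= high):
            violations.append(f"{name}: {value} outside [{low}, {high}]")
    return violations


# Hypothetical feature names and ranges, purely for illustration.
ranges = {"age": (0, 120), "income": (0, 1e7)}
for problem in check_assumptions({"age": -5, "income": float("nan")}, ranges):
    print(problem)
```

Running checks like these in both the training pipeline and the serving path turns a silent assumption into a visible failure.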
Functionality frames filing.
Data’s utility and role in computations should dictate its storage blueprint.
Embrace uncertainty; it’s where clarity finds its measure.
The inherent imprecision that accompanies machine learning and AI methodologies is something we have to embrace. It’s crucial to understand the limitations and to establish an acceptable margin of error tailored to each specific context or objective.
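One way to put a number on that margin of error is to report an interval around a metric instead of a single value. The following is a minimal sketch of a percentile bootstrap for accuracy, assuming per-example correctness is available as a list of 0/1 values; the function name, the resample count, and the fixed seed are illustrative choices.

```python
import random
from typing import Sequence, Tuple


def bootstrap_accuracy_interval(
    correct: Sequence[int],
    n_resamples: int = 2000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap interval for accuracy from per-example 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(correct)
    # Resample the evaluation set with replacement and recompute accuracy each time.
    scores = sorted(
        sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = scores[int(tail * n_resamples)]
    upper = scores[min(int((1.0 - tail) * n_resamples), n_resamples - 1)]
    return lower, upper


# Hypothetical evaluation: 80 correct predictions out of 100.
outcomes = [1] * 80 + [0] * 20
low, high = bootstrap_accuracy_interval(outcomes)
print(f"accuracy 0.80, 95% interval roughly [{low:.2f}, {high:.2f}]")
```

Whether the resulting interval is acceptable depends on the context: a width that is fine for a recommendation widget may be far too loose for a safety-critical decision.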
Keep it Simple.
In the words commonly attributed to Einstein, “everything should be made as simple as possible, but not simpler.” In software development, we constantly strive against the tide of complexity. This battle intensifies in the realm of machine learning/AI. To navigate these waters effectively, we must double down on our commitment to simplicity and clarity.
While the world of machine learning and AI might appear as a distinct frontier on the surface, it’s crucial to remember that the guiding principles of traditional software development, such as KISS (Keep It Simple, Stupid), DRY (Don’t Repeat Yourself), YAGNI (You Aren’t Gonna Need It), Separation of Concerns, and loose coupling, still hold significant value here. After all, at its core, an ML/AI system shares the challenges and intricacies of any other software system.
This list is by no means exhaustive. I invite you to share your pearls of wisdom. Are there any guiding principles you believe deserve a spot on this list, or perhaps some that might not resonate as strongly? I’m eager to refine and expand upon these with your collective wisdom. After all, we’re all part of this journey, learning and iterating as we go. Please share your thoughts @substack or reach out to me directly.
References
These resources provide comprehensive insights into the broader principles of machine learning. I strongly recommend them for anyone looking to further their understanding:
Rules of Machine Learning: A robust guide offering a set of best practices for ML engineering. Explore it further here.
Reliable Machine Learning: This enlightening read steers you through the process of applying an SRE (Site Reliability Engineering) mindset to machine learning. It is authored by Cathy Chen, Kranti Parisa, Niall Richard Murphy, D. Sculley, Todd Underwood, and other guest authors. You can find it here.
“You must unlearn what you have learned.” – Yoda