The Things I Don't Know Yet

2020-11-13

I appreciate the value of taking a step back, and admitting things you don't really know yet.

There's a lot of technical stuff I haven't even begun to learn about: more advanced MLOps topics beyond infrastructure fundamentals, and the ins and outs of deep learning development workflows. I'm sure there is a lot more.

I see a long road ahead, but I know about those areas and can acknowledge them. They are not what I'm most curious about right now, though.

Unstated Unknowns

Some questions are not just hard to answer, they emit a kind of repellent force. Things you don't know how to even begin solving, or even how to ask about, can leave you feeling overwhelmed, along with plenty of other unpleasant emotions. Natural laziness senses that and tries to stop you from going near them - "oof, that looks like a bunch of work, where would you even start".

Those questions usually point to current, real and practical problems I'm facing but haven't gotten around to acknowledging yet.

In my experience, acknowledging these problems and trying to clearly state the questions that point to them is a great first step towards building clarity.

The Situation

Before diving deeper, I'd like to pin down my current context, as well as the purpose and motivation behind my curiosity about this field.

Why? I think AI products (more specifically applied deep learning) are making things possible which wouldn't have been feasible before. Problems where it's near-impossible to code up a solution from scratch can be tackled given enough data. It's the closest to applied magic I have witnessed. The potential impact is phenomenal. I want to be a part of this.

How? My angle is "experience with infrastructure & DevOps topics - Kubernetes and container technology in particular". It's an area I enjoy and have been working in for a while. It's also something which can benefit a lot of AI companies!

What? You can't solve everything at once. I think that "expensive GPU resources" is a nice, tangible and painful problem which can be solved with the skills and experience I have built over the years. It's not a complete solution to everything an AI company is facing, but it's a worthwhile contribution and a viable professional opportunity.

The Unknowns

Off the top of my head, here are some questions I have stumbled over in the last week, but haven't gotten around to acknowledging yet.

Some are broad, some are specific to my situation, and others will probably leave me wondering for the next few years.

How can I increase my exposure to the deployment and infrastructure challenges which AI companies (my shorthand for companies with a deep-learning-powered, GPU-heavy product) are facing?
What can I repeatedly do to reliably start conversations with potential clients?
How can I experiment with GPU infrastructure topics without working in a vacuum? (I'd like my learning to be guided by real-world needs.)
What happens before the realization "our GPU resources are costing too much" creeps up? Where does it lead?
What masks or delays the symptoms of high utilization of GPU compute resources?
What happens at the same time as "our infrastructure costs too much"? (my intuition: ops and deployment challenges, because most people don't have experience with the basics)
What tends to exist around "expensive GPUs" - things enhancing the pain, hindering resolution or distracting from it?
Where in the maturity curve of a company does "expensive infrastructure" show up? What challenges come before it, and what comes afterwards?
Which parts of MLOps are really relevant to a given company? (Just as with "big data", not everybody has MLOps problems, strictly speaking. My intuition says that they look like "nice-to-have problems", which kick in eventually.)
What isn't being talked about as much as it should be? What topics get too much attention?
What easily implementable solutions around GPU infrastructure are there that lead to easy wins?

I'm quite sure that this list is not complete. I'm also certain that I will re-read, grow and modify it in the coming months. But it's a place to start.
