MLOps Community

Experiment Tracking in the Age of LLMs

Posted Jul 31, 2023 | Views 478
# Experiment Tracking
# Prompts
# Neptune Ai
SPEAKERS
Piotr Niedźwiedź
CEO @ neptune.ai

Piotr is the CEO of neptune.ai. Day to day, apart from running the company, he focuses on the product side of things. Strategy, planning, ideation, getting deep into user needs and use cases. He really likes it.

Piotr's path to ML started with software engineering. He always liked math and started programming when he was 7. In high school, Piotr got into algorithmics and programming competitions and loved competing with the best. That got him into the best CS and Maths program in Poland, which, funnily enough, today specializes in machine learning.

Piotr did his internships at Facebook and Google and was offered the chance to stay in the Valley. But something about being a FAANG engineer didn't feel right. He had this spark to do more, to build something himself. So with a few of his friends from the algo days, he started Codilime, a software consultancy, and later a sister company, Deepsense.ai, a machine learning consultancy, where he was the CTO.

When he came to the ML space from software engineering, he was surprised by the messy experimentation practices, lack of control over model building, and a missing ecosystem of tools to help people deliver models confidently.

It was a stark contrast to the software development ecosystem, where you have mature tools for DevOps, observability, or orchestration to execute efficiently in production. And then, one day, some ML engineers from Deepsense.ai came to him and showed him this tool for tracking experiments they built during a Kaggle competition (which they won btw), and he knew this could be big.

He asked around, and everyone was struggling with managing experiments. He decided to spin it off as a VC-funded product company, and the rest is history.

Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios is constantly learning and engaging in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether that be analyzing the best paths forward, overcoming obstacles, or building lego houses with his daughter.

Vishnu Rachakonda
Data Scientist @ Firsthand

Vishnu Rachakonda is the operations lead for the MLOps Community and co-hosts the MLOps Coffee Sessions podcast. He is a machine learning engineer at Tesseract Health, a 4Catalyzer company focused on retinal imaging. In this role, he builds machine learning models for clinical workflow augmentation and diagnostics in on-device and cloud use cases. Since studying bioengineering at Penn, Vishnu has been actively working in the fields of computational biomedicine and MLOps. In his spare time, Vishnu enjoys suspending all logic to watch Indian action movies, playing chess, and writing.

SUMMARY

Piotr shares his journey as an entrepreneur and the importance of focusing on core values to achieve success. He highlights the mission of Neptune to support ML teams by providing them with control and confidence in their models. The conversation delves into the role of experiment tracking in understanding and debugging models, comparing experiments, and versioning models. Piotr introduces the concept of prompt engineering as a different approach to building models, emphasizing the need for prompt validation and testing methods.

TRANSCRIPT

Hello and welcome, everyone. This is the MLOps Community Podcast. I am your host, Demetrios, and today I was joined by Vishnu as a co-host, but he had to bounce real fast, so I am doing this one solo: the recap, the intro of our conversation with the CEO of Neptune, Piotr. I found a few things fascinating about this conversation.

I think Vishnu had some key takeaways that he relayed to me too. The first thing that I loved is how he is navigating and how he is effectively steering the ship through the incredible amount of changes that are happening when it comes to the machine learning and AI space, and where he plays and what his tool has traditionally played in this idea of experiment tracking or metadata management.

So, with him being the CEO of a company that is trying to navigate these waters, I thought he had some gems that I took away from it. I don't wanna give any spoilers, so I'm gonna let you come to your own conclusions, but it's nice to see how he thinks and how he makes decisions.

That being said, of course, we talked about traditional MLOps and Neptune and the way that they have brought themselves to the table, and now what they're looking at when it comes to LLMs and how that changes the product vision and the product that they are creating for data scientists and machine learning engineers.

So let's jump into this conversation. Before we do, I have one ask for you. Just one. And that is you share this episode with just one friend. So that we can keep the good words spreading about all of this knowledge and wisdom we are sharing on this podcast. That's it. Let's get into it.

Piotr, great to have you here. I'm super excited to talk all about LLMs and how you all at Neptune are doing some wild stuff with that and where your vision is. You just said something before we started talking that I feel like we have to address right now. It's almost the elephant in the room: you studied with Sam Altman, did you not?

Not with Sam, with Wojtek.

Oh. It's a little different there. Okay. Who's Wojtek, and what is going on? You connected with Sam?

Okay, okay. Wojtek is one of the co-founders of OpenAI, and I think he's still leading the research group there, so I would say he is the initial brain behind it when it comes to research and science. Actually, there are quite a few of my friends there from my studies and also from competitive programming times, because I used to be, let's call it, a professional player in competitive programming. It was not about data science. It was not Kaggle; it was before that. It was classical algorithms.

By the way, there is my professor from [unclear], from the time when I studied math and computer science. He had a kind of upgraded version of the Turing test, and the upgraded version is passed when the bot, the algorithm, is able to qualify for the national level of the Olympiad in Informatics. And we are not there yet. This question was asked of Wojtek today, and we are not there yet. Wojtek predicts that it will happen this decade, but it is not yet obvious how it will be done. So even for the guys who are behind large language models, when it comes to the capabilities and the projection, people definitely feel that it is not the end. We'll be able to do more.

Wojtek made a comparison today to a sigmoid function: where are we on the curve of progress? If progress was represented by a sigmoid function, he bet that we are in the middle. So it means that quite likely we'll see significant progress in the next few months and years, but then it should flatten. I think it's similar to the deep learning wave, maybe seven years ago or so, when deep learning had been kind of in the basement for years. Like, yeah, the techniques in neural networks are old, right? But nobody was talking about it. Then we noticed, okay, this is possible, this is possible. More people focused on it, there was more investment, and then the progress started booming. And to some extent, yes, we can always say that large language models are still deep learning models, but it's a different game. It's a different scale. I think it's not so fair to really compare them, even though technically they are, to some extent, similar.

Yeah. So you undersell yourself a little bit there on the professional aspect of your informatics abilities. I read somewhere that you were an Olympian for Poland in informatics. What's that all about?

Silver. I had a silver medal at the national level while I was in high school. By the way, the Polish Olympiad in Informatics and this whole competitive programming community gave me so much. On one hand, because of that, I was able to go to any university in Poland, so I picked the best one for this type of studies. Consequently, I was able to meet the smartest guys in this field, nationally but also internationally, because when I was a student I was also competing in competitions like Google Code Jam and Facebook Hacker Cup. So it gave me, I would say, some level of confidence that, yeah, I'm also quite smart, right? So I'm not afraid to try to build my own company, et cetera. So it gave me a lot.

Talk to us a little bit more about building your own company. You've done a few companies.

Which one now?

Yeah, exactly. You've got a few. Tell us about those.

And now you're on Neptune. How did that transpire?

So I started my, let's say, entrepreneurial journey when I was still studying. One day I decided that I was done with competitive programming, because after being top 30 worldwide in Google Code Jam, I got offers from most of the big companies in the Valley. So this kind of goal, being able to work as a software engineer at companies that I respect, was kind of done. So I decided to start my first company. At that time I was still a student and I wanted to finish my studies. I studied in Poland, and in Poland the VC ecosystem at that time didn't really exist. Maybe there was something, but I had to have money to build stuff. So I decided to create a first company that would be able to generate money to build a product company. And that is how my first company came about. I'm a co-founder of Codilime, a company that specializes in software-defined networking.

So, networking, but not hardware; something in between hardware and software. It was 12 years ago, and it was supposed to be a small company. At the beginning that was the plan, but today it is a highly profitable company of over 400 people.

Wow.

And growing. Obviously, I'm not operationally involved anymore, because I'm focused on Neptune. But my brother is the CEO, my friends who co-founded this company with me and I are still close, and I'm still a major shareholder. So that was one company. A few years later, I think seven or eight years ago, I co-founded deepsense.ai, which would become one of the leading consulting firms in Europe.

The idea was: let's focus on data science. Actually, I need to give credit to Zaremba again. It was after we met in San Francisco, and he was at the time pursuing a PhD around deep learning. I need to admit that Wojtek was always a believer, a visionary when it comes to the capabilities of those models. Today it is easy to say, yeah, they are amazing, whatever is possible. About eight years ago, it was not so easy. And I kind of felt that he might be right. When I looked around at the time, cybersecurity was big. Blockchain, I think; maybe it was before the blockchain wave. Definitely big data was a thing. But I felt that being able to do something relevant with the data quite likely would be a game changer.

If I were just an individual person without a company, quite likely I would have started studying deep learning more deeply. As a founder that already had a company, I decided to do both. On one hand, I wanted to get hands-on and understand these concepts deeper. On the other hand, I thought maybe it would be better to do it together with other folks. So I co-founded a company that decided to specialize just in this field. And we started by participating in Kaggle competitions.

Sure.

To get to the level where we could say, okay, we can build world-class models, maybe it's time to start providing services around it. So that is the other company. And deepsense is the place where Neptune was basically created. The first version was an internal tool. It wasn't my idea, to be honest. One of the data scientists, who also participated with me in the Polish Olympiad in Informatics, created an experiment tracker combined with some infrastructure management layer. Today that would be rather a bad idea, because cloud providers have made the API and the way you can use hardware for training way easier, but seven or six years ago it was not like that. It was a very engineering-heavy type of API and experience. So, yeah, that is how it started. I can tell you more about this, but long story short, from the two consulting companies I finally saw the problem and felt that this problem could be something that is not just deepsense-related.

I talked with my friends who were data scientists or data science leaders in other companies: yes, you are building models, but how do you manage it? And very often I heard what is still, I would say, a pretty popular solution today: I'm using a spreadsheet, or a homegrown type of solution. If you hear that many people do it, that is usually a good signal that there is a problem. Whether there would be a sustainable business around it is a different struggle, right? But for a problem, it was a good signal. So that is how Neptune was started.

So talk to me real fast about this.

It's a bit of a tangent, but I gotta ask it. When you meet somebody outside of Poland, like in the US, and you are chatting with them and they also have passions that line up with yours, is it like you all are in this small club? It's like, we're the two Polish people in the US that actually care about deep learning, and you give each other a bit of a head nod and you're on the same team.

To be honest, it wouldn't be. I think it is changing now. It was never a Polish thing for me, so I didn't care so much whether the guy was from Poland or not. But it used to be more like that if I met somebody who was really doing something with machine learning, even a year ago, on a kind of random occasion. I'm not talking about a conference around MLOps, right, but a random occasion. It was still a small club of people who were following the space. With large language models, it has changed. Two days ago I was using Uber and the driver brought up the topic of AI. So it is a signal that it got quite popular.

Yeah, it's gone mainstream and, yeah, we're not cool anymore 'cause it's too popular. We're not too popular. Yeah. Well, dude, there is something that I wanted to mention before we jump into this and that I would love to get your thoughts on. It's around the idea of how you've adapted with Neptune, because you mentioned before that the original product was built for a pain that the cloud providers now abstract away.

Yeah.

How have you looked at the adaptability of the product that you're building?

So, I think it is pretty challenging in this space to run a product, and I think there is a new wave of changes, so those kinds of observations can still be relevant. I think it is very important to note that it is very easy today, and it was easy five years ago, to see problems. In my team I'm trying to explain to the people who report to me that seeing problems in a startup is not a unique skill. I would say the same here: seeing problems in the space of MLOps, or large language model ops, depending on how you would define it, is not a challenge. The challenging thing is to decide which problems are temporary problems. Yes, I can solve it, right? If I am a student and I am still kind of skillful in coding, quite likely I would be able to solve a particular problem with a particular hack, not in a super generic way, quickly, faster than a cloud provider, right?

But will the cloud provider, for instance, solve it in half a year, in a year? What is my defensibility? So, one thing: you definitely need to look at the problems, but that's not super challenging. You need to be able to figure out what the solution is. That's maybe a little bit harder, but still, very often I wouldn't say it's rocket science. But whether you will be able to build a sustainable business around it, so that you will be able to iterate, keep making it better, and really do something with it in the long run, is a different story.

So, it is challenging. Maybe for some people even this is not challenging, but this is the level where I feel the challenge. And there is a framework to think about it that I realized relatively late. I think it is product management 101, but I don't have a product background. I have an engineering background. It is about understanding what the essence is: what are the main jobs to be done that we want to solve, and why should we solve them? Why shouldn't those jobs be solved by somebody else? And if you look at the competitor landscape from this perspective, you will realize that you're not necessarily always competing with, let's say, MLOps players.

I wrote a blog post about the DevOps space and its connection to the MLOps space. Not everyone in the MLOps community received it positively: why are we bringing in DevOps? We are MLOps engineers. Yes, it's true, it's a specialization. I'm not saying that having just DevOps knowledge is enough. But I think you need to think about the core problems. And I see a lot of similarities to DevOps when it comes to the core of the problems. But yes, when it comes to the way we are solving them, there are differences in the details.

I see something to some extent similar with large language models and how we are going to operate them, but that is another story. But when it comes to the product, going back to the original question, I think it is essential to really understand the scope of jobs to be done your product is supposed to serve, and to be super focused on that and explicit about that, also explicit when we communicate with customers. In this space it was also easy to be tempted. I remember it was very tempting. We were able to close bigger deals with enterprises, but they had needs that I felt were not in line with the core jobs to be done for Neptune.

So we decided to pass on those deals, even though closing one deal would have doubled the revenue. So there was pressure to do it, right? But I think that MLOps, and maybe large language model ops, is a marathon. It is not a winner-takes-all game. So if you communicate to your shareholders, to your investors, what the strategy is, you can make those decisions. And be careful what type of problems you're jumping to solve; it is very tempting, but very, very risky.

I think that is a fantastic introduction and summary of product management in the context of machine learning. I really like what you said about what are the temporary problems and what are the future problems. And it's clear: you started this conversation by saying, I wanted to start a product company but I didn't have the cash, and you're clearly now trying to run a very product-oriented company.

You're thinking in a very product-oriented way, and I hope the listeners appreciate it as much as I do, because it's always fascinating to hear real product visionaries think in those terms. I'm actually reading, funnily enough, Steve Jobs's biography right now, the Walter Isaacson one. And what you're saying just meshes so well with the way that Steve thought about product, right?

Because those guys... when I say those guys and say Steve Jobs, okay, there was one Steve Jobs. But, for instance, I really like an interview you can find recorded on YouTube. I think it was from 1997, with Jeff Bezos, about the internet. I think they had bought or built some centers, brick and mortar, right? And Jeff was asked: but you're supposed to be an internet company. What do your investors say when you're building physical buildings? And he said something like: internet, shminternet. What's important is what core values we are providing to our customers.

And he said vast selection, low price, and quick delivery. Those are fundamentally the three most important traits, or features, our company should provide to our customers. And I don't care whether it is via the internet or via building a building, right? There's a lot of noise, a lot of thoughts, a lot of ideas, and this understanding of how we can really see what the core is, and focusing just on that, simplifies thinking. So I think that leaders like Steve, like Jeff, have this ability. They were really capable of understanding what is vital.

Yeah. Yeah. And now I'm gonna ask you the tough question here, which is: the way that we build machine learning products is poised to change, right?

That is the promise of large language models, and that is the promise of this new era of foundation models. You don't necessarily need to build your own model from scratch. There are large models that truly can generalize in ways that are both exciting and almost scary. So for you as a company that exists to provide infrastructure and, as you say, help data scientists be data scientists, how has your product vision changed, or how has it stayed the same, in the context of these advances?

I look at it from this perspective. Maybe it sounds cliché, but I think that we have a pretty well-thought-out company mission. And when you have a mission, it should change sparingly, really sparingly. You should really be careful with that. The product vision, or the vision for the company, should still be relatively stable, but you can adjust it a little bit more frequently. So when it comes to the mission, I don't see a huge change.

And what is the mission, then, right? Maybe I should say it. I won't quote it one-to-one, but basically we want to support machine learning teams. Maybe they should be called AI teams; if I'm talking about large language models, maybe machine learning is too narrow. But putting this aside, we want to help them get the same level of control and confidence when they're building and testing their models.

It's another question what the model is nowadays that a machine learning team would be building. Is the model a prompt? But let's get to that in a moment. So: control and confidence around building and testing, similar to what we have in the software space. I'm coming from the software space, and today I think the processes around DevOps, okay, there's constant development around them, but they're pretty stable, and you can iterate, develop, test, deploy, and monitor software in production in a pretty controllable, confident way. And I think this confidence and control over the process is a fundamental need in the ML space or AI space too, if you want to truly use those models in production.

So, when you look at large language models, I see this is even more challenging now. Yes, we already see that there are models based on, let's say, ChatGPT or GPT-3 or 4 in production already, right?

Right. But I would say that mostly you see it in the use cases for the use cases where it can be wrong. Or where there is a human in the loop. Sometimes human in the loop is the customer, but it is still human in the loop type of thing. Because I think that we are to some extents with large language models in a similar times to, to, yes, like early deep learning times.

When we were initially kind of in the discovery phase, we, we spotted, we. We, we have new capabilities, and, and every month you have the, like today, this, every week you have a new announcement. What is, what extra capabilities unlock? What is, what is um, doable now? But it is still, I, I, I think that we'll see a huge progress here, but.

The use cases where this part, where, where such models are truly used on production today are quite limited to those where human is in the loop to really unlock it and, and use it also for like, you have in classical deep learning that you can control, you can test and see some validation metrics.

You, you can have some level of confidence. We are using them in predictive maintenance, in use cases where there is no, no human in the loop. So, so I think that we will see improvement, like, new methods, how we are going to test, validate those mo new models mm-hmm. Or new ways, let's call them models.

I don't know exactly what the solution would be, what the method would be here, but I see that this is the area where Neptune as a company, because of the mission, will be involved. We'll try, on one hand, to discover those ways and to support those ways of testing and validating those models, or prompts, quite likely.

I expect to see some level of combination, a hybrid, I don't know, with classical machine learning and deep learning models, maybe via agents. That is one visible way. You know that today you can allow a model to use Google Search, or use Wolfram Alpha to calculate something. I also expect that an agent can be another model, a more classical one, that would predict something, or vice versa, something closer to transfer learning or embedding techniques. But I think that when you look at the core need, having those models in production, we will need to have control, the ability to test those models and control them.

And that is where Neptune as a company is trying to contribute. So I think there will be a lot of work to be done around those new techniques. So that is one thing. Second thing: I still believe that the current techniques are still relevant for many use cases.

And do you remember, maybe people are not talking about them so often anymore, AutoML concepts?

Yeah. Oh my god, yeah. It's so funny. 2019 was big for AutoML.

Yeah. And there was a question: oh guys, maybe we won't need data scientists, right? I was asked by some of my customers: should they build an ML team, or would it be just AutoML?

Yeah.

And AutoML today is called, I would say, HPO or HPO-related techniques, right?

Sure.

I think that we will be mixing different techniques depending on the problem, and we'll be able to cover more and more types of problems. The amount of investment in this and related fields is growing.

Yeah. So we will also be discovering more stuff that is doable.

Right, so what you want to do is provide control and confidence to the people that are in charge of these models, so to speak, whether they're data scientists or whoever they might be, and you want to do that regardless of whatever paradigm we're in, whatever context we're in. It's almost like Neptune is indexed to the growth of machine learning. Machine learning is gonna grow, and so Neptune will grow, and the product will evolve along the way, but the mission stays the same: control and confidence.

Yeah.

And I love that. Well, there is one thing that I wanted to jump into, just something that you said earlier and kind of sprinkled into the conversation, around experiment tracking now becoming prompt tracking, and whether that is going to be the evolution. I would love to hear your thoughts on that, because it feels like there is a definite need for prompt tracking.

But I don't know that experiment tracking is going to a hundred percent go away. It still feels like there's a ton of use cases for that. And so how are you looking at both of those? I said this sarcastically. Okay. So, so, again, I, I, I am looking from jobs to be done perspective, eh? So in a short, from short time per, let's say one year perspective, and this is what I, what I see and I can share with you, I am, we are seeing more traction and demand for experiment tracking than without large language model.

I think it is a side effect of bigger investments into, broadly speaking, AI. So yes, we have a new big sun that makes everything else look way smaller, right? But the reality is that there are a lot of synergies.

So I think that is why we see it. But when it comes to experiment tracking, let's discuss what it is about. And by the way, I really don't like the name, but we use it, maybe as you do with the MLOps name: we use "experiment tracker" because it is what the market understands.

But if you look at the jobs to be done of an experiment tracker, they go way beyond just experimenting. "Experimenting" sounds like pure research, not production. So I would say that the main jobs to be done of an experiment tracker would be these. On one hand, when you are building models, you want to understand what is happening.

You want to understand the building process, you want to debug it, you want to compare it with other experiments. This way, you understand whether the model you're building is going in the right direction or not. You want to version it so you have some level of reproducibility, and some way to share a particular model or share it for feedback.

And you want to do it in a way that lets you hand the model over to an ops team. So from this perspective, when I think about prompt engineering, that is quite a different way of building models. I'm not even sure that we should be calling prompt engineering a model-building process, right?

Oh yeah. Very. Because the model is stateless. Think about fine-tuning, for instance: it is not available for the latest models. It's available for GPT-3, but GPT-3.5 and GPT-4, as far as I remember, you cannot fine-tune. Mm-hmm. So what you're left with is how you craft the prompt, plus more, right?

You can configure agents and build a prompt in a sequential way using different models. So yes, it is engineering. But here we are talking about the building phase: how I would understand how it works, et cetera. So yes, on our roadmap I see support for prompt visualizations or chain visualizations.

Integration with LangChain is an obvious thing. But this is just the beginning. I think that to really support ML teams, and maybe not only ML teams, that are building such models and using them in production, we will eventually also need to think about how we are going to support validating those prompts, and what the methods are.

Because today it is very much human opinion, right? I'm looking at this prompt, I'm looking at that prompt, this prompt looks better. There are ideas: let's do this validation also with foundational models that would be judging which set of prompts is better.
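The idea of letting a model judge which prompt is better can be sketched as a pairwise comparison loop. What follows is only a minimal illustration, not a description of how Neptune or any particular tool does it. The `generate` and `judge` callables, their signatures, and the toy stand-ins are all assumptions made for this sketch; in practice both would wrap real model calls.

```python
from collections import Counter
from typing import Callable, Dict, Iterable

# Hypothetical signatures, assumed for this sketch:
#   generate(prompt_template, text) -> model output
#   judge(text, output_a, output_b) -> "A" or "B"
Generate = Callable[[str, str], str]
Judge = Callable[[str, str, str], str]


def compare_prompts(
    prompt_a: str,
    prompt_b: str,
    inputs: Iterable[str],
    generate: Generate,
    judge: Judge,
) -> Dict[str, int]:
    """Count how often the judge prefers each prompt's output."""
    wins: Counter = Counter()
    for text in inputs:
        out_a = generate(prompt_a, text)
        out_b = generate(prompt_b, text)
        verdict = judge(text, out_a, out_b)
        if verdict not in ("A", "B"):
            raise ValueError(f"unexpected verdict: {verdict!r}")
        wins[verdict] += 1
    return {"A": wins["A"], "B": wins["B"]}


def toy_generate(prompt: str, text: str) -> str:
    # Stand-in for a model call: only fills in the template.
    return prompt.format(text=text)


def toy_judge(text: str, out_a: str, out_b: str) -> str:
    # Stand-in for a judging model: prefers the shorter output.
    return "A" if len(out_a) <= len(out_b) else "B"


result = compare_prompts(
    "Summarize: {text}",
    "Please write a summary of: {text}",
    ["first document", "second document", "third document"],
    toy_generate,
    toy_judge,
)
print(result)
```

The win counts are plain numbers, so they can be logged next to the prompt versions like any other metric, which turns "this prompt looks better" into a recorded result.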

But at the end of the day, we will also need to figure out how we are testing prompts that would symbolize models. We will need to think about what techniques we can really use in order to update those models. Because with those models, if you are asking questions with generic context, maybe you don't need to update, or you can just update with the new version of GPT available.

But if you are building, let's call it, a solution based on a foundational model that is doing something specific with your data, then maybe it will be fine-tuning, or maybe it will be done by an agent that asks your smaller, classical model for some prediction, and the prediction will be taken into consideration by the foundational model.

So I think we will also need to be able to connect the dots: what is using what, in which version. So it is very much what an experiment tracker is doing today, really. So I think it shouldn't be called a prompt tracker. That would be an even worse name than experiment tracker. I don't like "experiment tracker" either.
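"What is using what, in which version" is essentially a dependency graph over versioned artifacts. The sketch below shows one way that could be recorded and queried. It is an assumption-laden illustration, not Neptune's data model: the `Artifact` and `LineageStore` classes and every artifact name and version in it are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Artifact:
    """A versioned thing: a prompt, a model, a dataset, a whole solution."""
    name: str
    version: str


@dataclass
class LineageStore:
    # Maps each artifact to the artifacts it directly uses.
    edges: Dict[Artifact, Set[Artifact]] = field(default_factory=dict)

    def record(self, artifact: Artifact, uses: Iterable[Artifact]) -> None:
        self.edges.setdefault(artifact, set()).update(uses)

    def all_dependencies(self, artifact: Artifact) -> Set[Artifact]:
        """Everything the artifact uses, directly or indirectly."""
        seen: Set[Artifact] = set()
        stack = [artifact]
        while stack:
            current = stack.pop()
            for dep in self.edges.get(current, ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen


# Hypothetical example: a solution built from a prompt, a foundational
# model, and a smaller classical model trained on a dataset.
solution = Artifact("support-assistant", "v3")
prompt = Artifact("answer-prompt", "v7")
foundation = Artifact("foundation-model", "2023-06")
classifier = Artifact("churn-classifier", "v12")
dataset = Artifact("customer-data", "v2")

store = LineageStore()
store.record(solution, [prompt, foundation, classifier])
store.record(classifier, [dataset])

deps = store.all_dependencies(solution)
for dep in sorted(deps, key=lambda a: a.name):
    print(dep.name, dep.version)
```

Because the lookup is transitive, asking about the solution also surfaces the dataset behind the classical model, which is the kind of dot-connecting described above.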

I'm not sure. Somebody crafted the name "evaluation store". Oh yeah. It sounds close, to some extent. I'm not good at figuring out names. Plus, you know how it is with names. You can name something as you wish, and maybe you'd be super accurate, but what is important is what the market understands.

Yeah. That's why we stick to "experiment tracker". But I think the set of jobs to be done, at least in our understanding of an experiment tracker, is a little bit broader. Prompt engineering will be one of the relevant techniques, a very important one, and we will support it, but it is not the end.

It's basically getting that whole view, knowing all of the lineage and knowing everything about it. Exactly. Basically, I kind of look at it as the control pane: being able to sit back and say, okay, I understand everything that is happening within this use case, or what I'm trying to do with this job to be done.

What's my end state? And in getting to that end state, there's a million different things that have happened. So I want to be able to see all of these at a quick glance if I need to. So I like that. It is super important if you are talking about production. Mm-hmm. Because you need to have confidence and some level of control, right?

Yeah, from a technical point of view. But I believe, and I'm maybe not the biggest fan of regulations, that regulations and audits will be a bigger topic in the future. Yeah, I mean, yeah. So you will need to have some level of order and control over what you've been doing and how, not only from an engineering perspective. But I'm motivated mostly by engineering needs: having order, control, process, being able to reproduce stuff, being able to compare stuff.

Being able to say "this is better", and not "I feel it is better", but "it is better because we've performed those tests, here are the results, this is why we are making this call". So I think this has been a great conversation. We have touched on a lot here, especially when it comes to thinking about the way that

machine learning will be in the future, what kind of company Neptune is gonna be to accommodate that, and what the future of the experiment tracking category is. I kind of wanna ask you as we wrap up this conversation here, Piotr: two years from now, what is your prediction, or what is the future that you believe we'll be skating towards?

I want to get a sense of it, as we talked about product vision, right? What do you think the machine learning workflow will look like two years from now? And what kinds of tools will people need? What's your prediction there? Okay. I was not prepared for this question. But that's the best way to do it.

It's off the cuff. Yeah. Yeah. I think that we'll definitely have numerical methods to test and validate models that are built with or next to foundational models. So this will be kind of under control. I also believe we'll have defined ways of combining them with classical models. I don't feel that classical models won't be relevant. I believe they will be relevant, but for specific tasks that are less generic, where you don't have a lot of public data: maybe predictive maintenance types of problems, recommender systems, things like that.

So we'll have ways of combining foundational models with classical models. And I think the MLOps tech stack, or community, will need to understand how to use it in a comfortable, controllable way. I also believe that because of the, I would say, tech market correction, it is tougher for us, for ISTs.

We have always been building the company with a kind of marathon game in mind. So we have not spent a lot of money, and when it comes to the amount of money we burn, we are still super lean. But I think about what would be good for players like Neptune, which is more of a point solution than an end-to-end platform.

The good thing is that we will know with whom it makes sense to integrate. So we will be able to provide more end-to-end solutions without being an end-to-end solution. Because for us, it was always a problem. If you're selling to an enterprise, it's like: yeah, we are a component doing experiment tracking. Yes, great, but man, I have a problem.

Like, I need to provide a tech stack to my ML organization. And so I wanted to have: okay, sure, here's an end-to-end solution. Here's the whole solution. Neptune is here. Here, you can pick one of the partners. But it was problematic, because this market was so crowded that we didn't have the money to integrate with everybody, right?

So I think we'll have some level of natural selection. There'll be fewer players, fewer leaders, but clearer categories that will be well integrated. On one hand, I see a lot of development in end-to-end platforms, but I am not a big fan. I don't see a lot of end-to-end platforms in the software space.

I think we'll have clearer categories of point solutions that are well integrated. So I think this market will be, I would call it, a way more mature market. On the other hand, I'm super curious to see what the cap is, or where the ceiling is, for the capability of the current methods, and what new methods are coming.

So, yeah, what is the true potential here? But this part I cannot predict. Two years ago, I wouldn't have been able to predict, even though I was following this space very closely, that we would be able to do what we are able to do now with foundational models. Yeah, I love the way that you put that, and it's so funny.

I just wanna say, at the same time that we're talking about machine learning, we still have problems with Wi-Fi. So I don't know if the whole world is gonna be solved, but Demetrios and I had some Wi-Fi problems during the course of this conversation. So maybe, hey, some real-world things to solve.

But I think about the vision that you lay out there, in terms of knowing exactly where you'll be able to embed and connect in with other infrastructure providers to create the sort of end-to-end ecosystem, and the fact that we'll have those numerical methods, as you said, that give us the fine-tuned control and the precision that we need in order to scale these systems.

I just love that. And I gotta say, it's very good for not having thought about this before. And honestly, you probably have, so I'm gonna call a little bit of cheating. But Piotr, thank you so much for joining us, for making it a great conversation, for sharing your product vision with us, and for giving us a preview into the way Neptune is thinking about the future.

I'm long Neptune, and this was a great conversation. Thank you for having me. Thank you, guys. Really happy to be part of the biggest MLOps podcast. I think even bigger than MLOps, depending on the name, podcasts out there. Thank you, guys. There we go. Awesome, man. Demetrios, any last words here?

Yeah, this has been great. Perfect. Thank you, guys.
