What is the role of Machine Learning Engineers in the time of GPT4 and BARD?

Posted Apr 27, 2023 | Views 1.5K

Tags: GPT4 | BARD | API | Digits Financial, Inc. | Rungalileo.io | Snorkel.ai | Wandb.ai | Tecton.ai | Petuum.com | mckinsey.com/quantumblack | Wallaroo.ai | Union.ai | Redis.com | Alphasignal.ai | Bigbraindaily.com | Turningpost.com

speaker

Hannes Hapke

Principal Machine Learning Engineer @ Digits

As the principal machine learning engineer at Digits since 2020, Hannes Hapke is fully immersed, day to day, in evolving innovative ways to use machine learning to boost productivity for accountants and business owners. Prior to joining Digits, Hannes solved machine learning infrastructure problems in various industries including healthcare, retail, recruiting, and renewable energies.

Hannes is an active contributor to TensorFlow’s TFX Addons project and has co-authored multiple machine learning publications including the book "Building Machine Learning Pipelines" by O’Reilly Media. He has also presented state-of-the-art ML work at conferences like ODSC or O’Reilly’s TensorFlow World.

SUMMARY

With the fast pace of innovation and the release of Large Language Models like Bard or GPT4, the role of data scientists and machine learning engineers is rapidly changing. APIs from Google, OpenAI, and other companies democratize access to machine learning but also commoditize some machine learning projects.

In his talk, Hannes will explain the state of the ML world and which machine learning projects are in danger of being replaced by 3rd party APIs. He will walk the audience through a framework to determine if an API could replace your current machine-learning project and how to evaluate Machine Learning APIs in terms of data privacy and AI bias. Furthermore, Hannes will dive deep into how you can hone your machine-learning knowledge for future projects.

TRANSCRIPT

Hi, my name is Hannes, and I'm a machine learning engineer at Digits. We handle books for business owners and accountants in real time, and that involves a lot of machine learning, natural language processing, and these days also large language models.

But we had this big question: how does the role of machine learning engineering evolve in the time of GPT-4 and Bard? Does it commoditize our whole profession?

Take an example project. You would build an address parser with deep learning, for example, to handle some of the more unusual cases instead of just using a regex parser or regex implementation. You would write something like this, you would deploy your model, you would have an internal endpoint, and then you would get those results out. That might take you a week, maybe two weeks, depending on the complexity of the project and on the availability of the data.

You could also go to an API like OpenAI or, in the future, PaLM or Claude or other services. And, by the way, then you run into those rate-limit issues, and sometimes the API is not available. But once you get over this and submit your request, you get a decent answer back. So there's zero implementation time from the ML side.

And then it's sort of like: hm, what does that mean for our projects?

Then you dig a little deeper and you come across those examples from the initial demo, where somebody scribbles a description of a website on a page, and all of a sudden the API endpoint spits out some ideas about how to write the HTML and CSS, and it looks very much like what they had anticipated in the scribble.
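
The rate-limit and availability issues mentioned above are commonly handled by retrying with exponential backoff. The following is a minimal sketch under stated assumptions, not the setup used at Digits: `TransientAPIError` and `call_with_backoff` are hypothetical names, and `request_fn` stands in for whatever client call a given API provider offers.

```python
import random
import time


class TransientAPIError(Exception):
    """Hypothetical error standing in for a rate-limit or outage response."""


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying on TransientAPIError with exponential backoff.

    request_fn is any zero-argument callable that performs the API request.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except TransientAPIError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Wait 1x, 2x, 4x, ... the base delay, plus a little random jitter
            # so that many clients do not retry at the same instant.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

The `sleep` parameter exists only so the waiting can be swapped out in tests. Even with retries, the waiting time adds to the latency of the request, which matters for the latency discussion later in the talk.
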

In those moments, all of a sudden you're just like: wait a second, what is going to happen to the entire ecosystem of ML engineering, ops, et cetera?

So we're not like little Kevin here, running around and screaming. But if we step back, what does it mean for us as machine learning engineers, ML project owners, stakeholders, et cetera? Is this a Gutenberg moment?

I call it a Gutenberg moment because something interesting happened in the Middle Ages, around 1450. There was this German fellow who, quote unquote, invented the printing press. Until then, books had been transcribed in monasteries. There were monks who were trained to do this by hand, and they did it with high artistic value.

But then this inventor came around and said, hey, we have a printing press. We can print those documents or Bibles or books, or whatever they wanted to print, much quicker. That contributed to the democratization of access to books, and it had a massive impact on literacy across Europe and things like this.

If you zoom out from this moment, you could say that we have the machine learning engineer on one side, and the modern printing press is basically the API for those models. We're at a moment right now where, as a community, we need to figure out where we go with our projects.

So what this means is that machine learning got democratized. What I mean by that is that domain experts no longer need to be experts in machine learning. They can basically just take the data, run it against the APIs, and solve, quote unquote, some machine learning problems.

The interesting effect we also see in the community is that we now have discussions about machine learning and its impact in a broader public. You see articles and opinion pieces in the New York Times talking about artificial intelligence and machine learning. I have close relatives asking me all of a sudden, hey, how can I get a key to OpenAI, how can I use this? And those folks struggled before to even understand what I'm doing on a daily basis for work. So those are the good things.

Then, when we look at whether this is a Gutenberg moment for us or not: let's be honest, some projects got commoditized. We'll talk a little more later about what this specifically means for our projects.

We also need more education around the dangers of machine learning. I'm not saying "danger" to scare anybody here, but we need to talk about what safety means, what bias means, how we train those models, and what is actually happening behind the scenes.

That goes back to the criticism around OpenAI. Are they really as open as the name says, or more closed? I think they're maybe more on the latter side.

We also need to educate folks who use those APIs that the predictions are not objective. There's still subjective data behind the scenes, and there is bias in the data. Just because a machine has made a prediction doesn't mean we can classify it as objective.

So before we talk about what this means for our projects, let's summarize the early lessons we learned when we looked into those APIs.

The first results were really impressive, no question. You saw the demo, you realized what's actually going on behind the scenes, and we were like, wow. ChatGPT was great, but GPT-4 is a different level. That was super useful.

Then you start thinking: wait a second, those brief interactions can be misleading.

As an example, take the case I showed you earlier with the address parsing. If you submit the same request multiple times with the prompt I showed you, you would get the same values back, but the keys in the JSON structure would change. So it's inconsistent. Therefore, when we have conversations internally about those APIs, we need to be really aware that the first results can be misleading.

And then there are convincing hallucinations. We have seen tasks where we ran them through an API and the model returned something in a decent data structure, but it hallucinated the values themselves. It believed that it had extracted something, and it was convincing, but it was utterly incorrect.
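
Both problems described above, changing JSON keys and hallucinated values, can be caught with light post-processing of the API response. This is only a sketch under assumptions: the canonical field names and the alias table `KEY_ALIASES` are made up for illustration, and the grounding check is a crude substring match, not a full verification.

```python
import json

# Assumed canonical field names and the key variants we are willing to accept.
KEY_ALIASES = {
    "street": {"street", "street_address", "address_line"},
    "city": {"city", "town"},
    "state": {"state", "region"},
    "zip": {"zip", "zip_code", "postal_code"},
}


def normalize_keys(raw_json):
    """Map whatever keys the model returned onto one canonical schema."""
    data = json.loads(raw_json)
    normalized = {}
    for key, value in data.items():
        cleaned = key.strip().lower().replace(" ", "_")
        for canonical, aliases in KEY_ALIASES.items():
            if cleaned in aliases:
                normalized[canonical] = value
                break
        else:
            # No alias matched: fail loudly instead of guessing.
            raise ValueError(f"unexpected key from model: {key!r}")
    return normalized


def ungrounded_fields(source_text, parsed):
    """Return the fields whose values do not appear verbatim in the input."""
    haystack = source_text.lower()
    return [k for k, v in parsed.items() if str(v).lower() not in haystack]
```

For an extraction task like address parsing, every returned value should be traceable to the input text, so a non-empty result from `ungrounded_fields` is a cheap signal that a value may have been hallucinated. Short values can match by accident inside longer words, so this check reduces the risk rather than eliminating it.
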

We've seen this now in discussions about biographies and descriptions of people: the text is very convincing, but again, the whole text was hallucinated. And, as I just mentioned, the output is inconsistent.

So when we look at this, it feels a little bit like little Kevin dancing at home: not making things up, but pretending to be a little bit bigger than he actually is. That's how it feels right now with machine learning projects. We can do a lot of things, but boy, if one of the strings breaks, then all of a sudden everything stops dancing, like in the movie.

So let's talk about commoditization. That was my first impression when we saw GPT-4: a lot of projects got commoditized. Even in the first few hours after the release, folks were asking, hey, I'm doing a PhD in natural language processing, is my entire PhD still worth doing? I think the short answer is yes, because a lot of problems are still unanswered. But a lot of machine learning projects got commoditized.

So which ones would we consider commoditized?

Any project with public data available is basically commoditized, because there's a good chance that the data has been soaked up by the scrapers that build the data sets for GPT-4, or potentially GPT-5. So if it isn't solved yet, there's a good chance that somebody else will solve it in the future, and it gets commoditized.

Then there are projects without any specific environment requirements. Let's say you have a use case that needs a machine learning solution, but the stakeholders don't have requirements such as "this needs to run on device" or "we have security requirements". Then it is potentially threatened by a public API.

Then there are specific security requirements, like at Digits. We take the security of the data as paramount, and we do not ship data to third parties. That's really critical. So if we want to use large language models, we have to deploy them internally. That is a requirement we couldn't fulfill with OpenAI, and therefore we can't use those APIs at this point in time.

So what does it mean for our in-house machine learning?

We need to focus on machine learning projects on proprietary data. The best example is always this: there's a high likelihood that OpenAI won't replace radiologists. Yes, they have a lot of data, but if you have a custom-trained, highly domain-specific model, there's a good chance you will always outperform the generalized APIs. We don't need a machine learning model that makes a radiology assessment and also gives us cocktail recipes and tells us what to cook next Sunday.

Any project with very specific security requirements is an in-house project. If you can't ship the data to a third party, for whatever reason, say you want to run your machine learning model on an IoT device or you don't want to share the data with a third party, that makes a good qualification for an in-house project.

We have also seen great instability around the latency and the availability of those third-party APIs. So if you have low-latency requirements, that is a really good reason to pick up those projects and do them internally instead of shipping them through a third-party API.

And then, most important, there's a discussion you as machine learning engineers need to have with your stakeholders: what is the core intellectual property you want to preserve in your business or in your research project? If we want to classify the sentiment of a text, is this important to your entire IP value chain? If not, an open API or a model API might be a really good way to get you up and running, and then you can focus on the actual IP-related projects.
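
The criteria above (proprietary data, environment requirements, security, latency, core IP) can be written down as a simple checklist. The sketch below is one possible encoding, not a formal framework from the talk: `ProjectProfile`, its field names, and `reasons_to_keep_in_house` are hypothetical, and a real decision would weigh the criteria rather than just list them.

```python
from dataclasses import dataclass


@dataclass
class ProjectProfile:
    """Hypothetical description of an ML project, one flag per criterion."""
    uses_proprietary_data: bool
    must_run_on_device: bool
    data_may_leave_company: bool
    needs_low_latency: bool
    is_core_ip: bool


def reasons_to_keep_in_house(project):
    """Collect the criteria that argue against using a third-party API."""
    reasons = []
    if project.uses_proprietary_data:
        reasons.append("proprietary data")
    if project.must_run_on_device:
        reasons.append("environment requirement (on-device)")
    if not project.data_may_leave_company:
        reasons.append("security: data cannot be shipped to a third party")
    if project.needs_low_latency:
        reasons.append("low-latency requirement")
    if project.is_core_ip:
        reasons.append("core intellectual property")
    return reasons
```

An empty list suggests that the project is a candidate for a third-party API, as in the sentiment classification example. Any entry in the list is a point to raise with stakeholders.
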

So what does it mean for machine learning?

We as a machine learning community had objectives for all of our projects. We did not always achieve them, but we had a bunch of goals in the past.

We wanted to make our predictions unbiased. So there was a strong focus on how we handle the data up front to make sure that everything is as unbiased as possible: balanced training sets, known data sources, and making sure that we remove some of the biases in the underlying data.

There was the objective of adding transparency around the data and the training of the model. You have seen a lot of projects such as model cards from Google, for example, or data cards, to communicate what the limitations and constraints are of the machine learning models we produce.

When we run machine learning models, there is this great objective to add feedback loops to improve model performance. We know as machine learning engineers that no model is perfect. So if we see something that is not working well, we want to improve it in the next model generation. Therefore, we need to capture those misclassifications and add them to our training set. That was a key objective for all of our in-house machine learning projects.

Then we have the objective of user privacy. There are projects around federated learning, ML privacy, or even encrypted machine learning. We wanted to make sure that everything is at the highest standard for the user use cases.

And then, obviously, we have on-device developments. Sometimes we want to shrink models and make sure that they can run on a cell phone or on the latest iPhone, as you have seen, for example, with Stable Diffusion and those types of models. Those were the MLOps objectives for our machine learning projects.
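
The feedback-loop objective described above (capture misclassifications, then add them to the training set) can be sketched in a few lines. This is an illustration under assumptions, not the pipeline used at Digits: the JSONL log format and the function names `record_correction` and `load_training_examples` are hypothetical.

```python
import json
from pathlib import Path


def record_correction(log_path, model_input, predicted, corrected):
    """Append a misclassification and its human correction to a JSONL log."""
    if predicted == corrected:
        return False  # prediction was right, nothing to learn from
    entry = {"input": model_input, "predicted": predicted, "label": corrected}
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return True


def load_training_examples(log_path):
    """Read logged corrections back as (input, label) pairs for retraining."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [(e["input"], e["label"]) for e in map(json.loads, f)]
```

The pairs returned by `load_training_examples` would be merged into the training set for the next model generation. This loop is only available when you control the training of the model.
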

So when we zoom out again and take a look at the APIs, we're basically missing out on the last four points, and the question around unbiased predictions is up in the air because we don't know what the models have been trained on. There's very little information about the background. So we as a community need to be careful with the objectives we have as machine learning engineers when we use those APIs. We'll talk a little more about this later.

Here's one example. Let's say you get an incorrect response from GPT-4. There is no way right now to feed this back into your own training set. You cannot fine-tune this API. Potentially that works in the future, but it comes at a high cost. So for some of those problems, you might be able to use a large language model from an earlier generation, let's say a T5 or maybe a GPT-J or something like this, which doesn't have so many parameters.
And then you fine-tune it for your application. You might not need cocktail recipes being generated when you want to do address parsing. And in that moment, you can add back the feedback loops.

So we talked about what it means for the machine learning ecosystem. But what does it mean for you and me as machine learning engineers?

Our role has drastically shifted from developing machine learning models or creating the infrastructure of those machine learning systems to being the moderator between stakeholders. You might have a CEO or a CTO who doesn't have a full understanding of how complex machine learning is, and of what the benefits are of all the security aspects, the privacy aspects, and those types of things in machine learning systems.

So it comes down to you as a machine learning engineer to educate the other stakeholders: what are the benefits of in-house development versus using third-party APIs?

And you as machine learning engineers are now the core drivers when it comes to advising others in the organization about the risks and benefits of those third-party APIs. As soon as GPT-4 came out, we had very long conversations with the security folks on our team. They were very much concerned about sharing data with third-party APIs, et cetera. We unpacked this a little bit: what does it mean? What is being shared? Is there maybe a way we could get around this? What are the benefits of doing this in house?

So, just to conclude: where are we going from here as the community of ML engineers and MLOps people?
Prompt design: I'm not calling this prompt engineering, because right now there's too much guessing in this game. It seems more like a design process, which is very much iterative, than an engineering process, where there's a given path and structure. Yes, that will be part of machine learning, but it won't replace machine learning. That is my strong opinion. We will not be out of a job tomorrow because there's some massive model and we just sit there and tune a prompt. It will be part of it, but it won't be the full job.

There are lots of MLOps challenges around large language models. Even deploying a model with billions of parameters is not an easy task. As a community, we will need more experience in how to do this: how do we distribute models across multiple instances, running maybe across a bunch of GPUs for a single model, and how do we get latencies down to something we can use in real-time systems? Also, as a side note, the carbon footprint of those systems is massive and should not be neglected. This is something we need to focus on as a community.

And then the integration itself needs good ML understanding. We need to know how tokenizers work and why certain terms make models more sensitive than other terms. We can understand this with a good machine learning understanding, which we can provide as a community to other stakeholders who want to use those APIs.

This also goes further in terms of bias and safety. Instead of just handing over the API keys to somebody else in the organization, let's have a conversation with them about what it means in terms of safety or in terms of bias. We as a community have no insights into those models, and there could be consequences for the users.

So what can we focus on for the short-term or mid-term future? Focus on projects with proprietary data. If you have a custom data set in your organization, that is gold. Nobody will replace you; no OpenAI API will replace you and your project.

Focus on subjective machine learning. Our machine learning team is heavily focused on the accounting space. We advise on accounting questions, bookkeeping questions. Those are very subjective machine learning problems, because one classification for one user is very different from the classification for another user. There is no global machine learning model for us. We use something like similarity machine learning, but you could also think of recommendation systems as subjective machine learning.
Nobody will paste the entire shopping cart history from some system into a prompt for GPT-4 to make the next recommendation; that would simply be too expensive. And therefore subjective machine learning is the key for in-house developments.

Then avoid plain vanilla projects like getting the sentiment from text. That is something we can probably do through the APIs. We don't need another model to detect cats and dogs. So focus on the non-vanilla projects.

Then focus on projects with specific requirements, as I said: user privacy, security, low latency. Really hone those aspects and become the expert, for example, in low-latency language models.

And then be the moderator between the stakeholders. Be the moderator and advisor, and help other people in the organization to understand the benefits and the disadvantages of those APIs, and also drive the conversation.
And with that, you are basically in a really good spot to live alongside those OpenAI APIs or model APIs and to succeed as a machine learning engineer. Cool. Thank you.

Great. Thank you so much. That was an awesome talk. And I think.
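
As an illustration of the "subjective machine learning" point in the talk, where one classification for one user can differ from the classification for another and there is no global model, here is a minimal sketch in Python. It is not the speaker's or Digits' actual system. The class `PerUserClassifier`, the functions `embed` and `cosine`, and the example labels are all hypothetical, and a toy bag-of-words vector stands in for whatever learned embedding a real similarity-based system would use.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a learned embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class PerUserClassifier:
    """Keeps labeled examples per user; there is deliberately no global model."""

    def __init__(self):
        # user_id -> list of (vector, label) pairs (hypothetical storage layout)
        self.examples = defaultdict(list)

    def add_example(self, user_id, text, label):
        """Record one labeled example for one user (the feedback loop)."""
        self.examples[user_id].append((embed(text), label))

    def predict(self, user_id, text):
        """Return the label of this user's most similar example, or None."""
        query = embed(text)
        best_label, best_score = None, 0.0
        for vector, label in self.examples.get(user_id, []):
            score = cosine(query, vector)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

With this sketch, two users who labeled the same invoice text differently each get their own label back for a similar new invoice, and a user with no history gets `None`. Each new label a user provides is just another `add_example` call, which is the kind of feedback loop a third-party API does not offer.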

