Which Economic Tasks are Performed with AI? Evidence from Millions of Claude Conversations
SPEAKERS

Sophia Skowronski is a Data Scientist at Breckinridge Capital Advisors with previous experience as a Business Analyst at Pledge 1%. Sophia has also worked as a Data Science Intern at Candid, an AI Investigations Intern at Deep Discovery, and held roles at Singularity University and the Global CO2 Initiative. Sophia holds a Bachelor of Arts in Astrophysics and Cognitive Science, as well as a Master's degree in Information & Data Science from the University of California, Berkeley.

I'm a tech entrepreneur and I spent the last decade founding companies that drive societal change.
I am now building Deep Matter, a startup still in stealth mode...
I was most recently building Telepath, the world's most developer-friendly machine learning platform. Throughout my previous projects, I had learned that building machine learning powered applications is hard - especially hard when you don't have a background in data science. I believe that this is choking innovation, especially in industries that can't support large data teams.
For example, I previously co-founded Call Time AI, where we used Artificial Intelligence to assemble and study the largest database of political contributions. The company powered progressive campaigns from school board to the Presidency. As of October 2020, we had helped Democrats raise tens of millions of dollars. In April of 2021, we sold Call Time to Political Data Inc. Our success, in large part, is due to our ability to productionize machine learning.
I believe that knowledge is unbounded, and that everything that is not forbidden by laws of nature is achievable, given the right knowledge. This holds immense promise for the future of intelligence and therefore for the future of well-being. I believe that the process of mining knowledge should be done honestly and responsibly, and that wielding it should be done with care. I co-founded Telepath to give more tools to more people to access more knowledge.
I'm fascinated by the relationship between technology, science and history. I graduated from UC Berkeley with degrees in Astrophysics and Classics and have published several papers on those topics. I was previously a researcher at the Getty Villa where I wrote about Ancient Greek math and at the Weizmann Institute, where I researched supernovae.
I currently live in New York City. I enjoy advising startups, thinking about how they can make an excellent vehicle for addressing the Israeli-Palestinian conflict, and hearing from random folks who stumble on my LinkedIn profile. Reach out, friend!

Raised in Reykjavík, living in Berlin. Studied computational and data science, did R&D in NLP, and started making LLM apps as soon as GPT-4 changed the game.
SUMMARY
We break down key insights from the paper, discuss what these findings mean for AI’s role in the workforce, and debate its broader implications. As always, our expert moderators guide the session, followed by an open, lively discussion where you can share your thoughts, ask questions, and challenge ideas with fellow MLOps enthusiasts.
TRANSCRIPT
Sophia Skowronski [00:00:00]: Okay, cool. So I'm covering the intro today just to get everyone on the same page, and then Adam and Valdimar are going to highlight some parts of it that they thought were interesting and maybe raise some points of discussion. This paper is not as technical as some of the more recent reading groups, but it talks about questions that I'm sure we're all familiar with, like: is AI going to replace our jobs? I don't think you can scroll on LinkedIn for more than two seconds without seeing someone talking about how AI is reshaping how we do X task or replacing Y process. These conversations have really spiraled since November 2022. What's great about this paper is that it built a framework for measuring the economic impacts using actual Claude usage data. So it gives us a snapshot of how AI is starting to shape the workforce, as they say, and it might help us see where growth in certain skills and in AI usage might happen in the future.
Sophia Skowronski [00:01:12]: It would also be really interesting if other generative AI companies started sharing this type of data as well. Then we could do even more forecasting, with actual data, about where demand for jobs and industries and wages might start to change because of AI. So it's a big topic, but what's interesting is that they actually provided some measurement of it using Claude conversations. Here, I pulled this directly from the paper. These are their cited contributions, the main one being the usage-based measurement. In the intro they also mention that much of the AI usage research so far has come from surveys, so this is a novel contribution to the economic measurement of AI.
Sophia Skowronski [00:02:05]: The other stuff I'm going to kind of breeze through; there's a visualization component for each of these which does a better job of sharing the main takeaway. So, to get things started: how are they accomplishing this big empirical measurement? They built a framework for mapping the 4 million conversations from Claude to occupational categories in the O*NET database, which is from the US Department of Labor, and they used that to surface usage patterns. It starts with a conversation; the conversation gets mapped to a task; tasks are associated with occupations; and then they break it down into whether that specific task reflects automation or augmentation and which skill for that occupation it is associated with, and they also look at wage versus AI usage. And just to give you a quick understanding of what the O*NET database has.
Sophia Skowronski [00:03:13]: It's a program that was created by the US Department of Labor. I already mentioned it provides detailed information about jobs within the U.S. so it's, I guess it's continuously updated with surveys and workers and experts providing feedback. So I just took a. A screenshot of an interesting page on their website. So their data model has like six different components. You can look at worker characteristics, requirements, experience requirements, job requirements, and skills and abilities. And so I just showed just one particular ability that kind of highlights what top roles are related to it as well as like what job zone it is categorized into.
Sophia Skowronski [00:03:57]: And so those two components get added into some of the analysis in the paper. And so CLAUDE is focused. So I mentioned a bunch of data models and CLAUDE is focused on the 20,000 occupational tasks and mapping them to around 1,000 occupations. And each occupation is associated. All occupations are associated with around 35 different skills and like reading comprehension, negotiation, problem solving. And so they mapped between all these three. Three different pieces. And just so you can see it, the O NET database you can easily download too.
Sophia Skowronski [00:04:34]: So here's a CSV of all the or Excel file of all the 20,000 tasks. And you can see it's sorted by job type and it's associated with an id. And then it's also. Yeah, they also mapped out all the occupations in the US they're only using 1000 they found. So there's like different levels of it. So you can see there's like a more broad category here and a more specific. And they're mapping to the broad category of occupations here, which are associated with that code. Kind of cool.
Sophia Skowronski [00:05:12]: And so the contribution here about. Or the. I thought it would just be helpful to kind of talk about how they built the framework, how they measured it. And so it's not as simple as take input, Cloud, Claude conversation output label with 20,000 cardinality, like mapping it to a specific task. They said that wouldn't fit in the context window. So their approach had three different parts. All this is in the appendix, by the way. So they created hierarchical taxonomy of those 20,000 tasks.
Sophia Skowronski [00:05:49]: There's a top level and. Yeah, middle level and base level, which is the 20,000. And so it was kind of built through this recursive process where they first embedded all 20,000 tasks. They clustered it using K means they named the cluster using claude, they named the neighborhood with Claude and they kind of reshuffled and refined and continued until they found a specified, until they reached some criteria. And that's also listed in the Appendix. So they found that there was like deduplication they had to do across tasks. So they did a lot of cleaning here to to select this one in particular or select this taxonomy in particular. Once they created the taxonomy then it was really all about doing direct assignment of each conversation into the task.
Sophia Skowronski [00:06:41]: The conversation goes through some validation component. They first check whether or not is this occupationally relevant and they apply some other labels to the conversation. Which one of them is is it automation or is it augmentation? So they do some initial classifying of some pre processed model and then it goes into. And then it traverses the hierarchy from top to bottom and it selects the most appropriate task at each level. And what's cool about this is they provide all of the prompts for this in the appendix as well. And then finally the last piece is just mapping these tasks to the occupations. And I kind of already showed you there's like a specific ID for each task. And so they then did that aggregation and I think they did the same thing with the skills.
Sophia Skowronski [00:07:32]: But I didn't find where that specific data piece was. And so just wanted to give you just like easy, it's easy to kind of all the charts here are really easy to parse. But this is from the appendix. They're just doing counts. And this is counts by base level task. And so you can see the top is software engineering related or software related. Mid level, same thing. Develop and maintain software applications.
Sophia Skowronski [00:08:00]: And then third the top level is design, implement and maintain information technology systems. Cool. And so they build this classifier or direct assignment taxonomy, hierarchy, whatever. Oh I guess framework would be the right word to use. And so they pass through the 4 million conversations and then returned all these results that are all pretty interesting. And some of it might not be really surprising to some of us on the call, but the use of CLAUDE is frequently used for coding and content creation. So you can see it comprises of about half of the data. And so yeah, so this is like I think these tasks here are the mid level tasks for that they're using in the visualization.
Sophia Skowronski [00:08:47]: Probably makes nicer numbers numbers here. And also I think if you like go to the paper's website, they they actually they give you the CSV of each task with its percentage usage in, in the CLAUDE data data set. And then this is just how AI usage is mapped to occupations versus their representation in the economy. So again you can see outsize usage in computer and mathematical tasks versus other tasks requiring more physical, I think they said physical manipulation or specialized training. So let's See if there's. Yeah, I guess that would be like construction here and then farming, fishing and forestry. So, so this kind of shows where it has a lot of value add, I guess to put it in corporate jargon or where there's still like opportunities to like invent new usage across occupations. And then this, this is a cool chart.
Sophia Skowronski [00:09:49]: So it just plots each occupation and percentage of tasks that use AI or tasks that were found to have relevant related cloud conversations. And so the main, yeah, here's the kind of the main points that they put in the front of the paper that like around a third of occupations are using a quarter are at least using AI for a quarter of their tasks. So all pretty significant, pretty interesting. And then this is another account which just is related, which just maps the tasks to those specific cognitive skills that I mentioned. What I thought was funny was that active listening and they mentioned this in the paper too, that it's primarily they think the active listening is mostly because Claude kind of rephrases what the user prompt, what the user inputs. So they think it's more related to how Plaude is like trained to respond and reaffirm what the person is like is requesting. So thought that was funny. And then.
Sophia Skowronski [00:10:53]: Yeah, here is another, yeah, another point or contribution. Just the percent of conversations and how they're distributed across different wage. Yeah, boundaries I guess for each occupation. So then I just pulled in the table from the appendix that breaks down occupations by job zones. So this column is the percent of conversations and then this is the percent of occupations represented in the economy. So you can see where there's like outsized usage or at least percentage of conversations over their representation in the economy. And they say let's see. Yeah, okay, so yeah, this is occupations by barriers to entry, I guess.
Sophia Skowronski [00:11:46]: Yeah, they talk about like some example jobs that are associated with each zone. I guess like zone five is like requires extensive training. So like lawyers, doctors for is more like analysts I think sales managers. So like people requiring at least like a bachelor's degree I guess to do their day to day job. And so again you can kind of, it's all kind of reaffirming like kind of the same point over and over again that yeah, computer programmers and software generative AI has a lot of potential usage in automation and augmentation there. And then this is the final point and I think there's a lot here here. So I'm not, I'm going to leave some space for Adam and Voldemort to kind of talk through what they want to talk about. But so they also distinguish between automation and augmentation.
Sophia Skowronski [00:12:36]: Here again they. When a t. So like input cloud conversation, it actually first classifies the conversation as meeting as one of these five labels, I believe. And so there's. And so you can kind of just see that most occupations here are leveraging augmentation or as they're using AI to as a tool to augment human capabilities rather than fully automating specific tasks. And so I think that covers like all the main contributions that they listed. There is a lot more in the appendix kind of they discuss how they validated their direct assignment approach approach with human validation as well as like just using clustering to check the prevalence of occupations with different clustering criteria. And they also looked at count different using different counts and using multi label classification.
Sophia Skowronski [00:13:41]: And so it just be. Yeah, so there's a lot more to kind of dig into. But I, I felt like there's. It's all very um, easy to understand and I, I appreciate what Claude contributed here or what anthropic contributed here and Claude. I guess. But yeah, I guess I can. I. The only like question that I had was just around I wonder why they specifically didn't break do this plot but break it down by like augmentation versus automation.
Sophia Skowronski [00:14:16]: Because I know they have that data. So I just kind of wonder. Yeah. If that was like they didn't necessarily want to like make that point that like if it was done on purpose like to not report on which occupations are having higher percentage of automation of tasks. But it was. It's interesting as a first look, I guess and it'd be curious to see what other larger generative AI companies report as well. But yeah, I'll hand it off to Adam. And Adam, I can keep this up if you want to look at any.
Sophia Skowronski [00:14:52]: If we. If what you're talking about covers any of these charts.
Adam Becker [00:14:55]: Thanks. Yeah, no, I think it's. Sophia, well done. I think it did excellent job covering this. I have a little bit to add, but not much. So I think it might also be good to see if folks have questions. I already see that there's some questions in the chat, but Sofia, maybe. Yeah, I'll share my screen.
Sophia Skowronski [00:15:13]: Perfect.
Adam Becker [00:15:14]: And okay, let me know and you guys can see it.
Valdimar Eggertsson [00:15:23]: Yep.
Adam Becker [00:15:24]: Okay, cool. So Sophia, I think you covered this. This is something that I really appreciated them having done. I just wanted to like plant a flag here that it's almost like they're giving us a visual map for their entire research paper. Right. And I rarely see this. Maybe it's for like more consumer facing stuff. I just rarely see it in like, just very technical papers.
Adam Becker [00:15:46]: But I thought this was really nice. So again we're starting with conversations. These are the actual conversations that folks have with Claude. They go and they classify them into distinct tasks that they found from O Net. And each of these tasks is then associated with one or multiple occupations. And then on the basis of this they can start running an app. Right. So this is like the high level of what we've covered.
Adam Becker [00:16:10]: You can see the number of conversations and then based on median salary, again the salaries associated with occupations, whether or not we could see the distribution of augmentation and automation. And each one of these tasks is also associated with specific skills. So that's kind of the framework, I think what would be interesting to explore. Maybe we do this in a little bit in like a more of like a discussion setting is whether this type of thinking and breaking down of an occupation into distinct tasks, distinct skills is even a reasonable thing to do in the first place. I think that there's, there's some space there for, for deliberation. One of the things that they're doing is they're using clio. I don't know if people are familiar with Clio before, but question that I think you could ask yourself is how is it that people are just typing into Claude and then all of that ends up in a research paper? What if they've taken some private information? What if, is there anything that they're doing to like sanitize all of that work? Or maybe like right now you'll show up with all of your queries inside somebody's research paper. Well, they're using a system that they came up with called Clio.
Adam Becker [00:17:16]: And I think it's, I looked into it. It's kind of cool if people don't know it's in a browser. I think it's just Claude insights and observations. So this is what it stands for. And the way it works is diagram that I thought was instructive. So get this, this is random sample, let's say like real world traffic, right? User, how do I tie my shoes? I'm 27 and it's a bit of whatever. So that's the user query that then becomes a facet. And you can even see in their CLIO research paper they show you how exactly they derive facets.
Adam Becker [00:17:53]: So this is almost like a summary and some extracted metadata. So how to tie shoes. This was in English. We went back and forth five times. Right, Same thing here. Now user is saying I'm trying to put bows on my daughter. AI companion. Sure.
Adam Becker [00:18:11]: Sorry. A little zoom pop up. Okay, so user, I'm trying to put bows in my daughter Petunia. Happy to help. Okay, so now help tie bows in daughter's hair. So this is yet the other facet. Right. So then in a very similar way to what the authors of this paper did, where they just started with kind of like a ground kind of like level, and then began to abstract up, they do the same thing.
Adam Becker [00:18:34]: So now the summary of tying shoes and tying bows all went into tying various knots. Right. So these are initial clusters that are kind of like groups of related attributes. And you see at this point they've removed all of like the private aspects, at least they hope. And same thing happens with everything else. Right. So you see at some point, this is, by the way, still private. They're not sharing any of this.
Adam Becker [00:18:57]: But at some point they abstracted enough. It says, okay, daily life skills, tying various knots. So this is the stuff that's then visible to analysts. They also, one of the ways they think about extracting private information and then clearing it is that they say, you know, it's. If we have a cluster that's too small, maybe that's something in private, maybe that's something that just like one individual has contributed. So we're going to get rid of it. So they also get rid of certain clusters. Could that introduce some bias into the ultimate analysis that they come up with? Maybe.
Adam Becker [00:19:30]: Right, but that's. I thought this was kind of fun to take a look at. So this is clear. This is how they went about this in the first place. There is something about. Okay, Sophia, you shared this, which I thought was very interesting, and I thought maybe we could zoom into it a little bit more. So you see here, this is red and green, I imagine. Are these the colors? I'm colorblind, but I think this is what they stand for.
Adam Becker [00:19:54]: We have the darker one. Okay, so red and green. So you see the red one is the percent of US Workers, right? And the green is percent of cloud conversations. And sometimes it flips, right? So just like, pay attention to how it flips. And I think the flip is interesting. So we start with percent of US workers, let's say 5.8%. There's 5.8% of US workers are in education, industry and library, yet we see about 9% of the conversations being about that topic or mapping onto that occupation. So this is when the red is almost like lower than the green.
Adam Becker [00:20:36]: But you could see it the other way Around So in healthcare, only 0.3%, or let's say about 5%, 4.7% of US workers are in healthcare, but totally underrepresented in cloud conversations. Right. So I think it's interesting now, could that ultimately produce some forms of distortion for us? I don't know. Right. Like if all of a sudden all the data that we have is from very distinct occupations, will we then continue to just accelerate those occupations in some way and then ignore all the other ones? Insofar as it is very difficult for, you know, you're just kind of going where the data is. Right. So I don't know, but I thought that there's. There's something here that we can kind of continue to chew on.
Adam Becker [00:21:17]: So this was a cool figure. You mentioned this. Okay, so then there's the aspect of how they actually go about clustering. Right. So we start with 20,000 different. Okay, you, obviously you can't put all of this into the model's context window. It says direct classification via zero or few shot prompting impossible. Was impossible because the full list of tasks does not fit in the model's context window.
Adam Becker [00:21:44]: And so now they needed to start building up this hierarchy. And the way they did it is I thought was cool. They took each one of the task names, put it into this transformer to get the vector representation. Is there a bunch of things that get lost in the process here? So we can even just question whether this method is what's actually getting lost here. And so each time they do this embedding, obviously they're not starting with this. Then they start to group it so you have these neighborhoods. And then you ask Claude again, okay, can you give me another task name? Can you give me a name for this group? They end up trying to validate this with human labelers after the fact to give you some sense of confidence. And it seems to be doing a decent job.
Adam Becker [00:22:30]: So I don't know. Okay, so. But I thought that another thing that was pretty cool here is the number of tasks and the number of nodes that exist in each level. And they say that they require the final number of tasks at level L to be, you know, n sub L to be plus or minus 1.5 n l, where n l is chosen so that the ratio between successive levels follows some geometric distribution here. And I was trying to figure out kind of like why they chose something like this. And I tried to map it to see what it looks like. And Sophia, in your slide, you actually had the numbers right. I think we started with three levels, right? The first one had how many clusters or how many nodes?
Sophia Skowronski [00:23:18]: Yeah, it was like top level 12, middle level, like 450, bottom level, 20,000.
Adam Becker [00:23:24]: 20,000. So that sort of made sense based on this. So I tried it out here to see what it would actually look like and if they had more levels. I thought this was pretty good. Basically, the idea here is you can't. If you were to just try to do, like a linear expansion at every level, at some point you're gonna have like a bizarre jump, or you're gonna start with too many because you're gonna end up with like, 20,000. So the way they came up with a geometric distribution actually is pretty nice. So you can have like, insofar as you have, let's say, like, five.
Adam Becker [00:24:00]: If you have five levels, you can start with 10, end up with a thousand. They ended up with something similar, but with three. So that's why you see that jump. But if anybody. I was just inspired to try to come up with, like, different taxonomies myself afterwards. And I thought, okay, well, this would be, like, a nice way to solve it. Okay, so you can sort of see what this looks like here. Next thing was the augmentation and automation part.
Adam Becker [00:24:25]: I see. Are we running? How are we doing on time? Should we. Should we continue to zoom in? Okay, augmentation and automation. So we have the prompt for it. Maybe we can just look at the prompt first. Okay, yeah. So they show you, by the way, each one of these prompts, which I thought was very nice. So this is the question you're gonna.
Adam Becker [00:24:49]: And then you're gonna ask it. Okay, now just select. Is it first of all an occupational task? So think about it like you're asking a bunch of questions that might have nothing to do with occupations in the first place. Right. So the first level of filter is, is this even relevant to occupations or not? And what counts as an occupation? And I don't know. I suspect that there might be some bias potentially introduced even in this. Right. Like, I'm just asking, is sugar good for you? Is sugar bad for you? Is that actually replacing a diet, like a dietitian's task? I don't know.
Adam Becker [00:25:20]: I'm just asking a personal question. So I think there might be some distortion there. Okay. So then we map the conversation. But we wanted to get to automation augmentation. So human. Consider the following conversation. Give it the conversation.
Adam Becker [00:25:33]: Your task is to analyze human AI interactions and conversation transcripts to identify the primary collaboration pattern. And then they broke down into different collaboration patterns. Directive, feedback, loop, task Iteration, learning, validation. And then you saw afterwards, they end up grouping those into, okay, it's going to be augmentation or it's going to be automation. Okay. And by the way, it asks it to also think a little bit. So it says, okay, you could do something. They played this game a few times with their prompts.
Adam Becker [00:26:02]: They're like, here's some brackets. Do all of your thinking have a scratch pad here? And then afterwards come up with the answer. Okay, so this is how they ask it about the different modes of interaction. And then they end up coming up with whether or not it's augmenting or automating. And you could see that one here. So I think we should leave that up to the discussion because I think I have some, I have some questions and some qualms about this because I'm not entirely convinced that what they consider automation is automation. And I, I don't know, like, I think when they say, okay, here's an example of an error. Help solve this.
Adam Becker [00:26:47]: Is that augmenting or is that automating? I think they consider that to be a little bit more of like an automation part. I'm not feeling like it's automating my, my job, to be honest. I mean, it's, it feels like it's augmenting it every time I paste in some error. They think that maybe it's more automating because there is environmental constraints that sort of like exist beyond me. But I'm not sure that that's necessary necessarily true. So it isn't just that it's not compiling, it's just, it's not compiling and producing an error that is unique to me. And there's something about my business logic that is then not being reflected here. And I think it requires quite a bit of me.
Adam Becker [00:27:21]: So I'm not entirely convinced that it is fully automatable in that sense. But anyway, we can, we can open this up to the discussion in a little bit. We went over this, Sophia. I thought that was cool. This, I thought we can also open up soon in terms of. So, okay, these job zones describe again how much learning and expertise is required. Right. How much preparation is required for you to get a job in one of these zones? Right.
Adam Becker [00:27:52]: So five would be something like, you know, you need to be a lawyer or a doctor. Four, I suspect most of us are in four. You know, you need. Maybe a bachelor's would be nice. It's fine. One or two maybe things that are mostly just like physical labor. So it looks like for our sort of job Zone, that's where the most, that is the most usage of AI is. So I don't know, is that a concern for everybody here or not? Like, does that mean that we are actually going to be augmented or is that.
Adam Becker [00:28:30]: Or we're gonna be automated? It looks like if you were in. And they go into some specifics here that like maybe job zone. Things that are in job zone 5 either require access to private information and data that is more difficult for AI to access. Right. Or it also requires just like some nuance that is still missing from AI or it's missing something. So what was interesting to me is to think, okay, well, will AI just ultimately make it to those job zones as well? And so nobody's safe, right? Or maybe we should all go to law school. I don't know. That's what I got so far.
Valdimar Eggertsson [00:29:12]: Okay, should I take over? Take over. I'll just share some thoughts and then ask you guys, everyone here some questions and we can just turn this into a conversation. I have, I made some slides myself just to guide you. I'm not gonna repeat stuff though. Just use this to guide the thought. Okay, first, first of all, like, this is heavily biased towards the programmers and that's for various reasons. Like technical people tend to use the technical solutions first or adopt them first, obviously. And this is Claude.
Valdimar Eggertsson [00:30:09]: That's one like thing that I like. It was a bias to me because I only use CLAUDE for coding. I use ChatGPT for, you know, looking up nutrition or whatever you mentioned earlier, Cloud Excel coding. So that's where it's being used. This graph was interesting and I was just thinking of how, yeah, the flips like it's. You see the bias so well with the computer and mathematical jobs, whatever mathematical is supposed to be like. They say people are using CLAUDE to do mathematics, but I don't know, I wouldn't trust it to do that knowing how language models work. Maybe just use a calculator.
Valdimar Eggertsson [00:31:05]: One field that doesn't use LLMs that much is installation, maintenance and repair. Obviously you have to work with your hands. So it's a limiting factor of all of this. The main thing is just the AIs won't have hands anytime soon. But I think there's a lot of room for improvement in this. Like just if you have AI assistant that can help you debug your, you know, printer. When you have a printer jam, that is a lot of value. So see something there and legal services there.
Valdimar Eggertsson [00:31:42]: It's almost the same number here, 0.8%, 0.9%. A hundred years ago, only the rich could ever afford a lawyer; nowadays a lot of people can. I think it's kind of a democratization of legal services, once AI is able to give you advice. That's one of the things I've been using AI for, helping people understand some tax stuff, because it's so boring. This graph is the most interesting one.
Valdimar Eggertsson [00:32:25]: I think this can be looked at as one kind of data point. I mean, it's not a time series, it's, you know, lots of data, but we want to see how this evolves over time. So I can imagine, look, you have snapshots of this, January 2025, and you see how it evolves. And that's a very good image of.
Arthur Coleman [00:32:50]: What.
Valdimar Eggertsson [00:32:50]: Everyone is so scared of or excited about. Are they going to take our jobs? It's worth noting how it's all about tasks, not jobs. A job consists of, say, 20 tasks, and as Adam pointed out earlier, maybe we shouldn't think of them as fully individual. They depend on each other; your work is kind of a system of tasks that relate to each other.
Sophia Skowronski [00:33:24]: So.
Valdimar Eggertsson [00:33:25]: So it's not easily decomposable. And yeah, they mention in the paper how for some jobs you can automate some tasks, but not others. But I think we could automate more tasks. This of course only reflects what people are currently using it for, but I think it could be higher, and that's an interesting question we can maybe talk about in a couple of minutes. For example, writing grant proposals: they mention that as something people are not using AI for, apparently, which is a bit of a surprise to me, for scientists at least. As far as I can tell, people in science are using it a lot. My wife complains about how the reviewers of a paper she was submitting were obviously just using ChatGPT to review it.
Valdimar Eggertsson [00:34:31]: Let's see, I don't have that one here, but this one, the distribution of wages and educational skills. I studied this topic a bit soon after ChatGPT came out, and the predictions back then were kind of similar: people with bachelor's degrees, who have knowledge of a subject, are at risk, while people who are complete experts on a subject are not as much at risk. And the people who use their hands won't be unemployed yet. Automation versus augmentation: I think it was a bit limiting to categorize it into these five categories, because you have both.
Valdimar Eggertsson [00:35:30]: Because the chat is clause or chatgpt touches upon a lot of different things and use it in different ways. Like if I'm coding, we're both doing what they call task iteration. Like you're trying to. You have some design requirements in mind, you want to get there. And then there's a bunch of what they call feedback loops which are just like, here's my Python script. It's giving me an error. Can you help fix it? Oh, here's another error. That's kind of a straightforward process of fixing an individual thing.
Valdimar Eggertsson [00:36:04]: Yeah, I want to talk a little bit about some of the things in the discussion. They talk about dynamic tracking of AI usage; I think it would be cool to have a dashboard showing how this evolves over time, probably useful to investors or something. And the whole thing about augmentation versus automation: I think augmentation is more feasible and desirable, to augment the worker instead of replacing them. And since a job consists of a lot of tasks, what we want to do is replace the system of work with human-AI collaboration.
Valdimar Eggertsson [00:37:09]: And when you have this kind of system solution, it changes the whole system of work, instead of just automating individual tasks. It's cool to be able to solve a particular task. Say you only use AI for summarizing stuff; it can help with your work a lot and make it faster. But once you're able to interact with the AI in a way that completely changes your flow, it makes you like 10 times more efficient, not just saving 30% of the time. They talk about this in AI economics as point solutions versus system solutions. An analogy was when electricity was invented: we had factories with steam engines, but they were all relying on a centralized steam energy input.
Valdimar Eggertsson [00:38:11]: And the factories were built around that. But once we had electricity, we could make a distributed factory, organized in a ten times more efficient manner. And the same can somehow be done with AI and human-AI work. We see it, I think, with coding: using Cursor or something revolutionizes the process of making code. That's all I have. So my first question is: what does current AI need to be able to solve 30% of tasks for 50% of jobs? Currently this is 11%, according to the study. What's missing? Obviously robot hands, if you had hands. But also multimodal systems; a lot of people aren't using text, they're using voice.
Izak Marais [00:39:13]: Yeah.
Valdimar Eggertsson [00:39:14]: My second question is: how can the tools you currently have change this kind of system I'm trying to describe? I don't know if I conveyed it, but like how Cursor makes you able to develop programs much more efficiently. Are there any other examples? Just food for thought. That's all I have, and we'll open up to comments. Let's see.
Valdimar Eggertsson [00:39:54]: There's something in the chat.
Adam Becker [00:39:59]: Oh yeah, I think the. Andrew, go for it.
Andrew Smith [00:40:07]: Can you hear me all right?
Adam Becker [00:40:09]: Yeah.
Andrew Smith [00:40:11]: So I kind of work with some of the examples where there's currently not a lot of adoption. I'm working with agtech companies, so we're trying to figure out how you actually get the use of AI into the hands of, like, a farmer. The big operations will already have data scientists; they'll work with John Deere's giant analytics systems and stuff like that. For the smaller farmers, what we've been exploring is getting a tool in their hands, or making an app that we can get in front of the farm manager, who's the one actually figuring out what to plant, when to water, when to apply chemicals, which seed breeders to work with, and things like that. They're not really that interested so far. There's kind of pushback of the general nature of: we've been doing this for 100 years, we know what's best. So, talking to the point of how do we get to the.
Andrew Smith [00:41:09]: From 30% or higher, one of the things that I've noticed is we just have to find the right tools and actually get involved. There are big companies like Bayer that are using kind of RAG systems over their studies, trying to put a tool in front of some of the farmers: if you want to figure out how much fertilizer to apply, how much gain will you get in your yield over the next year. And it's one of those things where it almost feels like all we're trying to do is automate the advanced-level items. Not to say anything against farming; obviously farming requires a lot of knowledge. But what they're focusing on is automating the agronomists, the people who have all the super advanced degrees, which is interesting. But the people who are a little bit more stubborn towards technology, I think that's just going to be hard to overcome unless you start really proving that there's a return. So anyways, that's kind of my perspective on it.
Adam Becker [00:42:12]: Yeah, I think that's fascinating. One thought this inspires: naively, if you were to just read the paper, you could say, all right, just go to these farmers, see which tasks they're dealing with, and then say, okay, this is the kind of task that can be automated, this is the kind that can be augmented. And I suspect that might be the wrong way to look at it. I think it's probably better to just ask: what are the problems these people have? Because it could be the case that, to Valdimar's example, we already have an existing factory layout and we're just trying to convert it to electricity, instead of changing the layout to be supported by a completely new technology.
Adam Becker [00:42:58]: Right. And I think in media theory they sometimes call it the rear-view mirror society. Right. You're going forward, but really you're just looking backwards at how we used to do things. You're an expert in what we used to do, but the whole thing needs to change. Right. It is like an ecological change.
Adam Becker [00:43:14]: Right. The entire environment, and the meaning of what farming means in the world of AI, right. It isn't just: how can AI help us do what we've always done? So I imagine that a lot of the thinking has to be a little bit more user-centric. Right? Yeah.
Arthur Coleman [00:43:34]: The problem, Adam, is getting people to envision what that looks like; that's the harder part. And letting them use it for simple tasks and automation begins that education toward new adoption. Just to give you another example, this came from a conference I was at; I love this analysis. When horses were the main mode of transportation, getting water to horses was the big thing; it was like gasoline. Right. And so the analogy was that when cars came along, the thought would be: why don't we use cars to move water to feed the horses? Rather than thinking of cars as a new mode of transportation.
Arthur Coleman [00:44:10]: I think, to Vlad's point, that is exactly the issue. But how do you get people there? It's so hard. And you know this as a tech person: getting people knowledgeable in a new technology, understanding its implications, takes years. I mean, the Internet came along in '96, but social media didn't come along until 2007, which is the natural extension of it. So how, to Andrew's point, do you get people to envision that bigger use case? I think the only way to adoption is to let them use it for the things they're doing today, so that they can begin to think: oh, I could do more. I get it now, I get the idea. So anyway, that's my two cents; free advice is worth what you pay for it.
Andrew Smith [00:44:55]: What's actually kind of interesting, sorry, is that I just got back from an agtech conference in San Francisco, and one of the farmers there was saying: I have this tool, I have no idea how to use it. His perspective was that he needs an AI for the AI, essentially a tool that tells him how to use the tool. Speaking to your point, it's kind of interesting to observe that.
Adam Becker [00:45:26]: But my question for the both of you then is: is the concept of AI even relevant here? Should we even rally behind that flag and tell that farmer, hey, go use an AI, as opposed to: here's just a better tool for the job. That's it. You always wanted to do X; here's just a better way to do X. The fact that it's AI is just an implementation detail, right? Yeah.
Andrew Smith [00:45:47]: Yeah, that's the approach we've been taking. Also, at least in North America, a lot of the farmers have been kind of burned by existing tech that came out: they installed a bunch of sensors, then the company goes under, and they're stuck with a bunch of expensive sensors. So there's that natural hesitancy to begin with, and we also have to not say it's AI. One of the things we're working on is adjusting the prompts to sound not technical. Because right now, a lot of the way they get their information is through a seed breeder they know, a rep in their local area, and that seed breeder just comes out and says: we have this new seed, you should try it.
Andrew Smith [00:46:31]: It's going to be great. And it's kind of that, like, friendly relationship. So like, if you just say here's a cold AI chat interface, it's very like cold. And so we've been trying to like really work with it to make it warm and make it feel like a friend. So when we do get some traction that they do like kind of have that reinforcement of it's, it's really friendly. It's something that you, you like put in the back of your mind that's even an LLM of any sort. Right. You're trying to make it feel like comforting and just like something that they would like sit down and talk with, like either at the cafe or like, like and back at their, their house or something, someone they would invite over.
Andrew Smith [00:47:09]: So it's like really, we've been trying to like hone in that, that exact conversation dynamic to make it feel like it's not quite as cold and LLM based.
Adam Becker [00:47:19]: Yeah. Nice. I think there was even initial hesitancy, I remember, for farmers to plug into the electric grid.
Andrew Smith [00:47:29]: Oh, I can imagine. Yeah.
Adam Becker [00:47:31]: I remember a bunch of interesting stories there where they needed to go and sign contracts, you know, to get on, and they might never have signed contracts before. There was just deep mistrust, right, and skepticism about what you're going to do with that contract and how you're going to somehow constrain them financially in the future; they didn't really know how to make sense of it. And I suspect there's a lot of that mistrust happening here too, in all sectors of society. But something to get around. Izak?
Izak Marais [00:48:04]: Yes. Maybe shifting gears a little bit: I think there's something about responsibility. You, as a person doing your job, have ultimate responsibility for the things you do. I work as an ML engineer and we automate some tasks in e-commerce, but even for those tasks we automate, there's going to be some monitoring of how accurate it is, and somebody's going to be responsible for sampling. At the end of the day, somebody has to take responsibility for the cases where the AI makes mistakes. So how do you get around that when you're trying to automate a whole job?
Adam Becker [00:48:47]: Somebody in the chat, I think, wrote about evaluation. Right, I saw that earlier: how do you know when the thing is. Oh, here, Chris: "Software engineering tasks in particular are much easier to test and evaluate if the LLM output is correct. In that case, you can immediately run the code, see whether it runs or fails, and then move on."
Arthur Coleman [00:49:11]: Yeah.
Adam Becker [00:49:11]: In other cases I don't, I don't know if people have some ideas, but I feel like evaluation is probably key to establishing trust.
Sophia Skowronski [00:49:22]: Isn't having a human in the loop always going to be a central component of adopting generative AI in non-tech-native industries? Sorry if you can hear the dog barking in the background. I remember seeing a paper about doctors using AI and how that supports their work better than doctors on their own or AI by itself. So it seems like it's always going to be important, at least for generative AI tasks, to have some sort of human validation step in the process.
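One common shape for that human validation step is a confidence gate: auto-accept only high-confidence outputs and queue the rest for review. The sketch below is illustrative only; the threshold and the confidence score are assumptions, not anything from the paper being discussed.

```python
def route(prediction: str, confidence: float, threshold: float = 0.8):
    """Human-in-the-loop gate: outputs at or above the (illustrative)
    confidence threshold are auto-accepted; the rest go to a reviewer."""
    if confidence >= threshold:
        return ("auto", prediction)
    return ("human_review", prediction)

print(route("likely benign", 0.95))  # → ('auto', 'likely benign')
print(route("likely benign", 0.55))  # → ('human_review', 'likely benign')
```

The design choice is where to set the threshold: lower it and the humans drown in review queues; raise it and more model mistakes slip through unreviewed.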
Adam Becker [00:50:06]: Do you think that validation could be centralized? I don't know if you were getting at that, but that's where my head went. Is every company going to do their own evaluation? Sure, to a certain extent you need to, but shouldn't there also be something external? I don't know if most companies have the trust in their own expertise to do that evaluation properly in the first place. Right. So you could try some benchmark, whatever, that I created: is that real? Is that good? I don't know. Can you really hire very expensive data scientists to get very deep in the weeds of your business? I don't know. So maybe there's some opportunity for more external types of validators.
Sophia Skowronski [00:50:56]: Right, yeah. It seems like the biggest opportunity right now is honestly for LLM consultants becoming industry experts: teaching those industries what an app looks like end to end, getting them to start trusting it, and building industry-wide or industry-specific validation and the tools for that component. I don't know how, if you're not from that industry or don't have that expertise, you could go in and build transformative technology. It seems like it really has to come from industry-specific innovation, I guess, from building up tech expertise within those industries that are non-tech-native.
Izak Marais [00:51:47]: Yeah, that makes sense. If you think about lawyers using large language models, they can't just go and use ChatGPT. It needs to be something that's grounded in the law of their country and all the case law.
Adam Becker [00:52:07]: I see a few more things in the chat, but if anybody wants to jump up, say something. We're also running out of time, so maybe a final comment: was this fun as a paper? Do we want to do more of these? Do we want to swing back to something more deeply technical, or something higher-level, philosophical, sociological? Anybody?
Arthur Coleman [00:52:36]: Oh, for me, my perspective on the one hand, getting down in the weeds is really important and understanding technology. But these trends, these things are more important overall. For that view that you had about how do we go away from the use cases that we understand to the ideas that are going to use AI the way it, the power of it in the future. So I'm. I vote for both, but I'm just me.
Adam Becker [00:53:03]: I see support for variety in the chat. So, awesome. Valdimar, did you have a thing you wanted to.
Valdimar Eggertsson [00:53:18]: Not really. I feel like maybe some of the things in the chat kind of should have been addressed because people are.
Adam Becker [00:53:29]: Yeah.
Andrew Smith [00:53:33]: But I think we covered some.
Adam Becker [00:53:36]: Of it something we haven't yet.
Valdimar Eggertsson [00:53:39]: Someone says here, I feel that automation using generative AI is an overkill. Most applications that have gained traction have been by applying simple analytics. First of simple and manual use cases that people really want to solve. Generative AI does not need to be applied to all problem spaces. I think that's great. And people doing AI consulting probably know. Know that like people want, you know, people using chat CBT to multiply numbers or something. It's like.
Valdimar Eggertsson [00:54:10]: Yeah, I don't have any comments on it, just.
Sophia Skowronski [00:54:16]: Right. It seems like this generative AI conversation is going to force some of these industries that don't have good data governance practices into building them out. So I think it's like pushing people in that general direction. Even if. Yeah, because you need simple analytics before you can do anything like ML ML modeling, before you can do simple analytics, you need to have good data governance. So it's. Yeah, it's kind of just forcing people like managers becoming more aware of needing to have these practices in place in their industry. At least you hope.
Adam Becker [00:54:56]: Thank you very much everybody. This was fun and looking forward to the next one.
Izak Marais [00:55:01]: Thank you.
Valdimar Eggertsson [00:55:03]: Thank you.