Video Summary
Dr. Shawn Conley, UW–Madison Plant and Agroecosystem Sciences professor and Extension soybean and small grain specialist, and Jason Lo, data scientist with the UW–Madison Data Institute, present an innovative approach to using large language models (LLMs) for farm management insights.
This session explores how AI can help synthesize thousands of research papers into actionable recommendations for farmers, crop consultants, and agronomists. The team demonstrates a human-in-the-loop system that ensures accuracy, transparency, and traceability in AI-generated outputs. Learn how this AI-powered tool can accelerate research, improve decision-making, and bridge the gap between science and on-farm practice.
Resources
Transcript
0:05
Awesome. Thank you, Chris, and thanks, John, for setting up the foundation of what we're going to be speaking to.
0:12
That’ll help me speed through a couple of our slides to make sure we stay on time.
0:16
And it's funny, when John used the term vectorize, my daughter is actually going to be Vector for Halloween, and we think of the Vector dance. But I digress, as I usually do.
0:27
So anyway, as we go in here today, what I want to do is talk about this team that we've built and give special recognition to Jason Lo and the Data Science Institute.
0:40
So this is kind of an idea that Damon Smith and myself and my Cool Bean team came up with: how can we go through and sift through the immense amount of data that is published on an annual basis?
0:54
I know John Shutske has said, you know, sometimes it's hard to read these data-dense papers that are 35 to 40 pages long.
1:01
And you know, what do we want out of it?
1:02
We just want the bullet points, we want the facts, we want to be able to get basic recommendations and move on.
1:07
So that's kind of the concept and framework that we want to use to address some of this.
1:13
And I think we can look at how we want to use these large language models that John referred to in our science in a couple of different contexts.
1:25
First of all, we have this human discovery.
1:28
You know, that’s basically what we do here as scientists.
1:31
We're trying to come up with these ideas through discovery and be able to push that information out.
1:37
Then we're able to use these large language models on the knowledge that has been generated by scientists, not just here at UW and not just nationally but globally, to condense that information and really get a sense of where we are today.
1:54
And you know, what is the next step?
1:56
Because a lot of times, you know, we kind of get stuck in our little bubbles here.
2:01
Say, as a soybean agronomist, maybe there's something cool that a corn agronomist is doing, which I know is not true because they're not doing anything cool.
2:10
But outside of that, maybe there's something cool that a plant pathologist is doing that we can bring in and utilize some techniques from in our system.
2:18
So what these LLMs really allow us to do is have this discoverable knowledge adapted, make the next steps, develop new hypotheses, test these experiments, and really drive
2:30
innovation forward.
2:32
And one of the things we did, maybe many of you on our listserv saw the survey we put out probably about three months ago, just trying to get a sense of our clientele.
2:44
What is the sentiment you all have towards these large language models, this AI that you use on a daily basis?
2:51
And I think for the most part, 60 to 70% of the responses back said, yeah, they're super useful.
3:01
Yeah, really improves the efficiency of my system.
3:04
I can use them, but there's the question of data sharing.
3:06
But then we go down to the whole idea of trust.
3:08
And what really popped out is that only 40% of the individuals either somewhat agreed or agreed that they could trust the output.
3:22
And I think that's where we want to focus: how can we build trust in these systems so that our growers or crop consultants can take this information and adapt it into their systems?
3:35
And I think that's the challenge we're facing right now.
3:38
We have this global food production system that is under pressure again.
3:43
We have limited land and population growth.
3:45
We all know the challenges we have in agriculture right now, but then we also have this gap: we have a lot of practices that we've found through research and science that really aren't being adopted by farmers.
4:00
What is that research-practice gap, and why is the adoption rate not where it should be?
4:07
How can we extend that information out to our clientele and bridge that gap to really drive home some of these sustainable crop management practices?
4:17
OK. Now, one of the things that we often do in academia, and maybe you'll hear about these, is called either a meta-analysis or a systematic review.
4:29
And roughly what this is: let's say, and we just did this, we did a meta-analysis on cover crop termination, you know, pretty simple.
4:37
We say simple, but it's pretty straightforward what that combination would be.
4:41
But when we looked at the literature, there was over 500 research papers looking at the effect of cover crop termination in soybean in the United States.
4:52
And who's going to sit there and read through all the different iterations of information and be able to extract that?
4:57
Well, we did.
4:59
And by we, I mean my team, not specifically me.
5:02
And that process took us roughly nine months: manually pulling the papers, extracting that information, and coming up with a research paper that we could put out.
5:15
The challenge with that is updating it, because in science we're continuously generating new information: cover crops, fungicide application timings, resistance and weed management recommendations across the board are constantly being updated.
5:31
So the whole idea of using either a meta-analysis or systematic review in the traditional manner really slows us down, and I think gives us a bottleneck or an impediment in advancing our science and in bridging the research-practice gap that's happening on the landscape.
5:52
So what we did is kind of high level, and I'll have Jason Lo get into this in more detail here shortly.
5:59
But here's what we did; we had a couple of caveats.
6:03
We downloaded 1,000 research papers related to soybean production.
6:07
And I think our initial caveat was that the research had to be done in the last 10 years.
6:10
So we wanted relatively new information.
6:13
OK. So then we pulled in those 1,000 research papers, downloaded as PDFs.
6:19
And there's no way any of us would be able to sift through that information in a timely manner and characterize what we learned.
6:26
And what we wanted to do was screen them all to make sure they would fit for a Wisconsin farmer, distill that information, make sure it made sense, summarize it into an output form that farmers could use, and then give that suggestion to farmers.
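As a rough illustration, that four-stage flow (screen, distill, summarize, suggest) can be sketched in a few lines of Python; every function name, field name, and paper here is hypothetical, not the team's actual tool or data:

```python
# Hypothetical sketch of the pipeline described above:
# screen -> distill -> summarize. All names and data are illustrative.

def screen(papers, region="Wisconsin", max_age_years=10):
    """Keep only papers that fit the target region and the recency caveat."""
    return [p for p in papers
            if p["region"] == region and p["age_years"] <= max_age_years]

def distill(papers):
    """Reduce each paper to its key finding (a stand-in for the LLM step)."""
    return [{"id": p["id"], "finding": p["finding"]} for p in papers]

def summarize(findings):
    """Collapse findings into short, citation-tagged bullet points."""
    return [f"- {f['finding']} [{f['id']}]" for f in findings]

papers = [
    {"id": 1, "region": "Wisconsin", "age_years": 3,
     "finding": "Early planting raised yield"},
    {"id": 2, "region": "Georgia", "age_years": 2,
     "finding": "Irrigation timing mattered"},
    {"id": 3, "region": "Wisconsin", "age_years": 15,
     "finding": "Old tillage result"},
]

bullets = summarize(distill(screen(papers)))
print(bullets)  # only the recent Wisconsin paper survives screening
```

The point of the sketch is only the ordering of the stages: filtering on expert-set criteria happens before any distillation, so the LLM steps never see out-of-scope papers.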
6:41
OK.
6:42
So that's kind of the goal, the bigger picture of how we're looking at synthesizing all of this information.
6:49
OK.
6:50
And I’d like to thank John.
6:51
He kind of brought up this whole thing about safety.
6:53
And I think that's a big thing that we want to be able to do as we come up with our output.
6:58
Because how many of you remember the MAHA report? That was a big deal.
7:04
Again, we’re not getting political here.
7:05
But if you read through that report, the first iteration had multiple papers that were basically made up.
7:12
They weren’t actually real citations.
7:14
So I think one of the things that we on the academic side need to do is make sure that, as we develop these tools and information for farmers, we can trace our information back to its source.
7:26
And I think that's a big key here: whatever system you're working in, Gemini, ChatGPT, whatever, you should be able to go back and check to make sure that source of information is correct.
7:38
And then, unfortunately, sometimes you have to fact-check that fact check to actually make sure that source or citation is legitimate.
7:46
OK. And this is where we come in with this human-in-the-loop approach.
7:50
And again, I'll bring this forward when Jason gets in and talks about that usefulness.
7:56
That's a big key point.
7:57
And John did a good job of explaining that too: you have a 35 to 40 page research paper, but frankly, with how busy we are today...
8:09
Most farmers that I talk to want an answer in two to three bullet points, max.
8:14
So we have to make sure the claim is accurate, but also concise and clear: yes, you do this; this is our recommendation.
8:25
All right.
8:26
And then efficiency: being able to curate all of this knowledge within hours instead of months, or in this case, years.
8:35
So as part of this exercise, we, and by we I mean Damon, Jason Lo, and our team, built this tool.
8:44
It's like an internal chatbot built on these thousand research papers.
8:48
And we just asked it a simple question: basically, how do we control white mold in Wisconsin?
8:55
And this is kind of the output here; obviously it has been pulled from the literature, and we're lucky to have Damon Smith involved in this.
9:02
He's also one of the world leaders in white mold, so you can really trust that the information you're getting out of a bot like this is going to be accurate.
9:10
And if you see, we have here the citations, like 12, 5, 3, 26, 1, 4, and 2; these are all citations that a farmer can go back to and fact-check why we came up with these results.
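That traceability can be sketched as a citation map kept alongside the answer, so every bracketed number resolves to a real source; the sources and claims below are made up for illustration and are not the bot's actual library:

```python
# Illustrative: each claim in the bot's answer carries citation numbers
# that resolve to actual papers, so a reader can fact-check the output.

sources = {
    12: "Fungicide timing trial",
    5: "Row spacing and white mold incidence study",
    3: "Variety resistance screening report",
}

answer = [
    ("Use a resistant variety where available.", [3]),
    ("Time fungicide at early flowering.", [12, 5]),
]

def render(answer, sources):
    lines = []
    for claim, cites in answer:
        # Refuse to print a claim whose citation is not in the library:
        # this is the check that catches made-up references.
        missing = [c for c in cites if c not in sources]
        if missing:
            raise ValueError(f"untraceable citations: {missing}")
        nums = ",".join(str(c) for c in cites)
        lines.append(f"{claim} [{nums}]")
    return lines

print(render(answer, sources))
```

The design choice is that a claim without a resolvable citation fails loudly instead of being printed, which is exactly the failure mode seen in reports with fabricated references.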
9:21
And on the flip side, we also did this with a competitor.
9:26
And I know, who was going to say Gemini is a competitor?
9:29
I think they’re a little bit more of a competitor.
9:31
I think we did our work for $50,000.
9:34
And I don’t know how many millions or billions of dollars Gemini’s using on a daily basis to generate theirs.
9:40
But on a quick cursory overview, it looks just as good as something an internal chatbot like ChatGPT would produce, using, you know, cited literature.
9:54
However, one of the challenges, as we understand it, is that a lot of this stuff like Gemini, and this was addressed by John earlier, is using Internet data.
10:02
It’s out there searching Internet data and pulling stuff.
10:04
So it doesn't necessarily know what the trusted source is or where that information is coming from, and that's the factor of limited transparency.
10:11
We do have a couple of citations here, but for the most part, you, i.e., the crop consultant and farmer, just have to trust that Gemini is going to give you a real answer that would be trustworthy and, in fact, legal.
10:27
Can you actually apply these certain pesticides for that target pest?
10:31
So I know I went through things quickly, but I want to make sure there’s time at the end.
10:35
So I’m going to bring in Jason Lo here to kind of get into some of the technical details.
10:40
So Jason, you’re up.
10:42
So for the next few minutes, I would like to show you how our expert team, Shawn, Damon, Spiros, and Tatian, is trying to keep full control of this system, ensuring that, as John just mentioned, AI is used safely.
10:59
So it really comes down to three main responsibilities.
11:03
First, they have to decide which studies should be included, providing all the screening criteria, just like Shawn said.
11:12
And then, next, they guide the AI on exactly what information to extract.
11:18
So the experts provide a recipe, per se.
11:22
And finally, they handle any tricky cases that the AI cannot resolve on its own.
11:30
Maybe next slide.
11:31
Yeah. So then maybe I can show you how this works in practice.
11:38
So the first step we have is the screening stage.
11:41
So basically we start with expert-led screening.
11:46
Our team set precise inclusion and exclusion criteria for evaluating the literature.
11:52
Then the AI models provide initial ratings.
11:58
So we have multiple AI models rating on these criteria.
12:03
Most of the time these ratings match each other, but sometimes, maybe around 2% of the time, they conflict with each other.
12:13
So we introduce a human being, the domain expert, to check those cases.
12:18
So that makes sure everything is under our control. Slide, please.
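A minimal sketch of that triage step, with made-up model ratings; the real system's criteria, models, and paper set are of course more involved:

```python
# Sketch: several models rate each paper against the inclusion criteria.
# Unanimous verdicts are accepted automatically; conflicting ones
# (around 2% of cases in the team's experience) are queued for a
# domain expert to review.

def triage(ratings_by_paper):
    auto, needs_human = {}, []
    for paper_id, ratings in ratings_by_paper.items():
        if len(set(ratings)) == 1:    # all models agree
            auto[paper_id] = ratings[0]
        else:                         # disagreement -> human in the loop
            needs_human.append(paper_id)
    return auto, needs_human

ratings = {
    "paper-A": ["include", "include", "include"],
    "paper-B": ["exclude", "exclude", "exclude"],
    "paper-C": ["include", "exclude", "include"],  # conflict
}

auto, needs_human = triage(ratings)
print(auto)         # unanimous verdicts, accepted automatically
print(needs_human)  # conflicting papers, sent to the expert
```

This keeps the expert's attention on only the small conflicting fraction while every disputed case still gets a human decision.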
12:26
And after we have gathered the correct ingredients in the screening phase, we start to extract and distill the information a little bit further.
12:37
So here our experts guide the AI by providing specific questions to ensure only the most relevant information from each study makes it into the final report.
12:51
So basically, we have gathered a few questions from the experts to ask.
12:57
For example, as shown in the slide, "What is the effectiveness of foliar fungicide application...", something like that, and then we make sure we have the correct information on hand before feeding it into another LLM agent. Next slide, please.
13:17
So after the distilling this, this is what the data looks like.
13:23
Perhaps these are the cleaned ingredients; the details probably don't matter too much, but two points I want to highlight: it will extract whether a study has information relevant to the expert-guided questions, and it will also provide the supporting passage in each paper so that you can trace back to the original evidence in the source paper.
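That per-paper record, a relevance flag for each expert question plus the supporting passage, could be represented roughly like this; the field names and example passage are my own illustration, not the team's actual schema or data:

```python
# Illustrative record shape for the distilled data: for each paper and
# expert question, store whether the study is relevant and, if so, the
# supporting passage, so every claim can be traced to its source.

records = [
    {"paper_id": 7, "question": "foliar fungicide effectiveness",
     "relevant": True,
     "support": "Fungicide at R3 reduced white mold severity."},
    {"paper_id": 9, "question": "foliar fungicide effectiveness",
     "relevant": False, "support": None},
]

def traceable_claims(records):
    """Keep only relevant findings, each paired with its source passage."""
    return [(r["paper_id"], r["support"])
            for r in records if r["relevant"]]

print(traceable_claims(records))
```

Because irrelevant studies carry no passage and are dropped here, everything that reaches the summary stage arrives already paired with its evidence.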
13:53
Next, please. So finally, the system compiles all the expert-guided data into a complete summary report.
14:05
Every claim and finding is fully cited for verification. And those are basically the three steps of generating this AI-assisted report.
14:19
Next slide please.
14:22
So maybe you will ask, why does this matter?
14:27
I think a human could basically do all this manually.
14:31
But our approach creates a true partnership between experts and AI: the system handles the repetitive work, freeing experts to focus on the high-value decisions while staying in complete control.
14:45
So the results speak for themselves.
14:48
This approach cut the expert workload down to just 1.5% of the original effort, making the process over 50 times faster.
14:57
So with that, I will hand over to Shawn to share some final thoughts.
15:01
Thanks, Jason.
15:03
Yeah.
15:04
So I guess in conclusion, at least within our team, and I think John brought this up earlier, we're pretty cautiously optimistic moving forward about how these tools are going to be adopted.
15:15
I think the biggest thing that we in agriculture need to understand is how quickly our labels change, how quickly our programs change.
15:24
So these are things we just have to understand, and basically give the output we get from AI the sniff test, if you will: does it make sense?
15:34
Does it make sense for your specific location? Because you might ask a question and, depending on which LLM you're using, it could be pulling information from Georgia.
15:45
I mean, does that really apply to soybean production in North Dakota?
15:49
So again, things to think about moving forward.
15:52
I think, as John said earlier, LLMs can automate straightforward tasks pretty effectively.
15:59
And I think our next steps, which we're getting into right now, are to focus on validation and generate more quantitative insights.
16:08
Because the funny thing is, Jason, Spiros, and Damon helped put these slides together.
16:14
About a week ago, we said there really aren't many research-grade tools out there yet.
16:19
However, I just got an email from one of my colleagues about a tool he has just built on an LLM: you drop a pin in a field and ask a question, like what is the best cropping system for maximum profit?
16:36
And it's based on all the information it automatically pulls for that field, because it has the GPS location, which means it brings in all the soil type and historical weather data.
16:46
It can then go and pull in all the research data that we've generated across the country related to soybeans and basically come out with a bullet-point list of what farmers need to do for that specific field to have the most profitable system, and compare it to what, in this case, a worst-case scenario or system would be.
17:08
So I think, moving forward, the whole point is that this is advancing rapidly, and there will be a lot of these tools to test and ground-truth moving forward.
17:17
So with that, I'm going to pause right here, make sure we're done on time, stop sharing my slides, and see if there are any general questions for me.
17:25
And Jason, John Shutske, and I know Damon Smith have been actively involved in this as well, and Damon can pop on here if there are any questions for him.
17:33
So, you know, we covered a lot of meaty stuff.
17:36
So hopefully we kept it at a high enough level for y'all that it wasn't too deep, I guess.
17:43
Shawn, where are you headed next with this chatbot project?
17:49
Yeah, before we got on today, we were just talking about why I might be grumpy.
17:53
And part of it was we have all these grants
17:55
to do; you know, the grant portal just kind of opened up.
17:57
So we've been working on these projects and writing proposals with David and Jason and the Data Science Institute and my colleagues across the country.
18:06
Frankly, what we want to do is integrate the satellite imagery, the drone imagery, integrate what crop consultants are doing, and develop a framework for when they're out doing boots-on-the-ground scouting.
18:18
And then also bring in all of the best management recommendations we have from our research, funded by both industry and commodity groups, and have this in one spot.
18:27
You know, it's all going to be linked and it can be updated: a living entity, if you will.
18:35
What was the name of that? Skynet?
18:37
Hopefully we don't get to Skynet quite yet, but this would be kind of the protected Skynet version for soybean production: you can basically say, this is where I'm at, this is what I see.
18:49
And it'll sort through all the information and give on-time, accurate, in-field recommendations based on what we know today, not information that could be three years old.
18:58
So that’s, that’s the dream.
19:00
That’s what we’re working towards.
19:02
Yeah.
19:02
And I think too, John talked about what's for the general public.
19:07
I like how you and your team have narrowed this down like this is for farmers and this is how we’re going to make it specific to farmers.
19:15
And for us to see the difference too is really important: how you're making that data really precise.
19:22
And the recommendations on the flip side, are going to be more precise too.
19:27
Yeah.
19:27
Just for fun, I’ve asked Gemini some questions on how to control a specific weed, and Gemini was not right.
19:34
I'm just saying. It gets into the specifics of pesticide use and management, and especially how quickly herbicide resistance, and fungicide resistance for that matter, which Damon is working on, is occurring out there.
19:48
We need to be Johnny on the spot, if you will, to get this stuff out there for farmers.
19:53
And it goes back to the safety thing that John referred to.
19:56
We can't be suggesting an off-label herbicide or pesticide application, because that won't turn out well for anybody.
20:04
Yeah, and it’s happening so quickly.
20:06
So thank you again for bringing this topic to our Badger Crop series.
20:11
And again, thank you to our attendees.
Badger Crop Connect
Timely Crop Updates for Wisconsin
Second and fourth Thursdays 12:30 – 1:30 p.m.
Live via Zoom