Adam Parks (00:08)
Hello everybody. Adam Parks here with another episode of Receivables Podcast. Today I'm here with one of my favorite guests, Mr. Rob Grafrath. I always enjoy when he comes on to have a conversation with me, because whether it's in passing at a conference or we're sitting here preparing for a podcast or webinar call, I learn more than a little bit from every one of these conversations. So Rob, thank you so much for coming back again and having another conversation with me. I really appreciate you educating me and sharing your insights.
Rob Grafrath (00:38)
Thank you. Yeah, this is just, I'm a kid in a candy store. I love talking about this stuff. With our last one with the sci-fi stuff, you know, that was just too much fun. We're gonna have some fun today as well.
Adam Parks (00:50)
Look, there's always a lot to learn around artificial intelligence. And I think as an industry, one of the challenges that we have is things have started to move so fast that I don't even know that we're using the right language or that we truly understand the nuts and bolts behind the tools that we're evaluating or looking at end results. And we're so focused on the end game that I think we need to take a few minutes and think about the opening and how these things actually function.
But Rob, before we get into that, for anyone who has not been as lucky as me to get to spend some time with you at the conferences and on calls, could you tell everyone a little about yourself and how you got to the seat that you're in today?
Rob Grafrath (01:27)
Right. Yeah, I started as a debt collector back in 1999. Did that for maybe a year and a half. That was not my calling, but it was the industry that I ended up staying in for the next 26 years. Went over to IT, did development for a number of years, and then IT management. I was the vice president of enterprise systems at a major healthcare debt purchaser for a good long while. Most recently, I did independent technical consulting for the first part of the year. And now I am head of business development at CSS Impact.
Adam Parks (02:01)
Well, very excited to have you working with an organization that's so well known for their use of artificial intelligence and CRM systems and having the baseline from a system of record and kind of a holistic ecosystem. So I felt like that made you the perfect person to have this conversation with today just because you've got so many different perspectives that you're able to bring together into the conversation.
As we start talking about artificial intelligence and the tool sets that are available and really what the industry is starting to look at and use, it feels like we might not be using the right vocabulary. And as you and I were planning for this call, we started talking about some of the language that we're hearing in the marketplace and on webinars and other places that just might not be correct. So why don't we start at the beginning? It's probably the best place to go. Talk about how artificial intelligence came to be what it is today, and how the language around those different levels of tool capabilities has ultimately impacted the growth of artificial intelligence.
Rob Grafrath (03:10)
Yeah. All right. So buckle up. Because although this is very new to a lot of us, it started, let's say, around 2022, when ChatGPT came out and became really popularized. Wow, this stuff is amazing. And that was because we started chatting with it. We could write something down and have it respond, and boy, it sounded like a human. This is amazing stuff.
But AI has been around since probably the 70s or 80s. It's been around for a very long time, kind of simmering behind the scenes. And the use of terms like "neural network": this was something that the early AI pioneers were really using as an analogy, for anyone that knows how brain chemistry works.
The way that your neuron fires, you know, there's a certain threshold that it has to meet as a whole electrochemical system. A lot of the same concepts were sort of analogous in computer science, where you had particular connections between data points, and there were thresholds, and if the multiple inputs multiplied by their weights exceeded that threshold, you would give a certain output. So because there was a sort of close analogy, they were using that term. But I think sometimes people go a little too far with that idea, thinking that this is a brain inside of a computer. This is one of the first things I want to make sure that we knock completely off the pedestal. A brain is an electrochemical system that is plastic. It's constantly learning, constantly changing the chemistry and the connections in your brain.
A neural network, in the computer sense and in computer terms, is a statistical system. Think of it as statistics on steroids. Given this set of inputs, and maybe that set of inputs is trillions upon trillions of words off of the internet and out of books, everything the models have been trained on, the ways that humans speak and the ways that words and sentences connect together. The idea is that the new large language models are built off of all of this written text from us as humans. It's just taking what normally would be said under certain circumstances and saying, statistically, given this as your original wording, this is what follows. And literally, as it's replying to you, it is building word for word: given what I have in my large language model, and given the input, and given what I've said so far, this is the wording that should follow. That really is mathematical, statistical behavior, not true neural thinking in the sense of human thinking. Right? So then you have people that use AI to describe a deterministic system.
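The "weighted inputs against a threshold" idea Rob describes can be sketched in a few lines of Python. This is a hedged illustration of the classic artificial-neuron analogy, not how modern LLMs are implemented; the inputs, weights, and threshold here are made up.

```python
def artificial_neuron(inputs, weights, threshold):
    """A classic artificial 'neuron': multiply each input by a weight,
    sum the products, and fire (output 1) only if the total clears the
    threshold. Pure arithmetic -- no chemistry, no plasticity."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# Hypothetical example: two of three inputs active, equal weights.
print(artificial_neuron([1, 0, 1], [0.5, 0.5, 0.5], threshold=1.0))  # 1 (0.5 + 0.5 clears 1.0)
print(artificial_neuron([1, 0, 0], [0.5, 0.5, 0.5], threshold=1.0))  # 0 (0.5 does not)
```

Same inputs, same weights, same answer every time: the "neural" part is an analogy laid over plain statistics and arithmetic.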
Adam Parks (05:58)
Not an intelligence. Yeah.
Rob Grafrath (06:08)
So deterministic, defining our terms here: that is true/false, if/then logic where, given a certain input, you will always have a certain output. There's no wiggle room in a deterministic system. And you can make very complex deterministic systems and call them AI, and often people have. But mainly, when we're talking about AI in recent terms, if you have a deterministic system, you're just talking about good old-fashioned programming. Then I'd say the first place where AI took root in our industry is in machine learning. When you start to see propensity-to-pay scoring happening, that is actual AI in use. We can call it AI because that's what they were calling it as well.
And that also is just applied statistics. There's nothing wrong with calling that AI. I'm fine with it. But let's make sure that we know: if someone's saying, we're going to tell you the next best account to work, we're going to tell you what to do with this account, is that machine learning or is that an LLM? Two very different things, not the same tool. So much more we could get into.
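Deterministic, in the sense Rob defines it, is easy to show: identical input, identical output, every time. A toy sketch; the routing rule and thresholds below are invented for illustration, not anyone's actual collection strategy.

```python
def route_account(days_past_due: int) -> str:
    """Deterministic if/then logic: given the same input, this returns
    the same output every single time. No statistics, no sampling."""
    if days_past_due >= 180:
        return "legal-review"
    elif days_past_due >= 90:
        return "escalate"
    else:
        return "standard-queue"

# Run it twice with the same input: the answer can never differ.
print(route_account(120))  # escalate
print(route_account(120))  # escalate
```

An LLM, by contrast, samples from a probability distribution, so the same prompt can legitimately produce different outputs; that is the line Rob is drawing.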
Adam Parks (07:19)
So it's interesting. Let me go back to this and make sure, for the less technical folks, that we're having the same conversation, right? When we talk about artificial intelligence on the whole, it's a much broader perspective. We've got our large language models that we're used to playing with these days, the ChatGPTs, Perplexity, Claude, whatever your favorite flavor is. And then you have this probability-driven statistics. I always like to say that when we look at artificial intelligence, it's just a complex math problem. And the most important thing in math is the order of operations. So, and I know we're going to talk about prompting, I always think about the order of operations as one of the most important ingredients in terms of those inputs. Historically, when we think about artificial intelligence in the debt collection industry, it kind of kicked off with companies that were starting with that type of what's-the-next-account-to-work. Then it started to evolve into not only what's the next best account to work, but what's the next best action on the account you're going to work next. And I feel like that's where we started to see this kind of growth. Now, a lot of those original scoring companies are no longer in the industry. I think some of them probably came out earlier than the industry was ready for, or even maybe earlier than the technology was truly prepared to add the level of value it needed to add in order to hit the price point being charged at the time. And now we're starting to look at what the other options are in terms of how we're going to continue to see this evolution. And we've been in the speech-to-text, to model, to text, back-to-speech-again kind of mentality for a long time.
I hear organizations are working on this speech-to-model-to-speech technology, which would reduce latency times and potentially provide new, interesting opportunities for the industry to deploy it. But the industry is looking at AI from a variety of different paths. Now, in your roles throughout the industry, where have you seen artificial intelligence be the most impactful?
Rob Grafrath (09:35)
I would say where I first saw the most impact was on the propensity scoring that we were just talking about in recent times. But today that's almost table stakes. I don't know if I would call it passe, but it's something that you can do, you should do. There's plenty of products to do that and tools to do that with. You can hire your own decision scientist or you can figure out how to do it yourself.
I don't want to say it's boring to talk about, but it's not the bleeding edge, right? So the hot stuff we're talking about these days, generative AI based off of LLMs, is a whole other game. There's also the speech- and audio-based AI, which is not the same thing. So don't get speech analytics and sentiment analysis mixed up
Adam Parks (10:00)
No, it's still important. It's... Yeah.
Rob Grafrath (10:22)
and confused with LLMs. And I think this is something where people get a little bit confused. They think that when you're doing sentiment analysis, all you're doing is taking the words being said, transcribing them to text, and then deciding whether this person is upset with the way the conversation is going. That's not necessarily the case, right? Your audio-based AI systems are reading tone and cadence. They're not just taking pure text, but text and tone together. At CSS we have what we call Talk IQ Plus. Because we have an integrated dialer, for people on our system the calls are going through the dialer, and it's listening to your call. It can tell when the call isn't going well, and there's no mistaking that from a human's perspective. But if you had just used pure text analysis on that, it might not sound that bad. If someone said, "Okay, fine! I'll pay the bill!", you know that is an upset consumer. But if it's a calm, "Okay, that's fine, I'll pay the bill,"
Adam Parks (11:26)
Cadence and tone are playing a major role in that, right? It's not just about what is being said. It's about how it's being said and how it's being delivered.
Rob Grafrath (11:34)
Right, right. And so with a system like ours that can both detect that and then store that information for later review or later mining, that's a whole other tool set that sits on top of and coincides with the transcript, because you still want that call transcript. You need the speech-to-text AI, which again is another technology in and of itself, aside from just LLMs.
And then there's the other tooling. Really, so much of this is about tooling, about having these systems work together and connect to each other, rather than just having one pure LLM. If you just have one little LLM agent and you're leaning on this guy to be everything... You remember back when we started with ChatGPT and you would ask it some math problems, and boy, it was bad at math. I don't know if you remember that, but it was embarrassingly bad at math, and this was before they had started integrating other tooling. Now, if it's asked a math problem, they have the tools set up so that it'll take that question you asked, this divided by that plus that,
Adam Parks (12:28)
A little bit,
Rob Grafrath (12:48)
and run it, write some quick Python code or something like that behind the scenes, run the code, and have the code, the deterministic system, produce the answer. You remember we said deterministic earlier: if/then, pure logic. There's one way to answer a math problem in that case, right? So it runs some deterministic code, outputs the answer, and then reads you back that answer. So tooling. Let's say you have a chatbot.
That chatbot needs to have a calculator that is not LLM-based, right? It's not pulling from the language model, but is literally calling out and doing calculations using a calculator tool. And this is where you start to chain together judge LLMs that sit on top of the LLMs that are doing the behaviors. And another term I'm gonna throw out there, so brace yourselves, is MCP, the Model Context Protocol. An MCP server is an AI developer's way of basically making an interface. You can think of APIs as interfaces into your system; an MCP server is like an API into your system, into your database, that the LLM in turn interfaces with. So in your prompting, you say, perform this action. Let's say you have a voice bot. It's on a call with the consumer, and in your prompting for that voice bot, you've said, if the consumer says "I refuse to pay this debt," then I want you to run an action code in our system. I want you to run the RTP action code. Now, you do not give the LLM access to do that in your system directly. No, bad idea.
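The calculator pattern Rob describes, where the model hands arithmetic off to deterministic code instead of "guessing" from language statistics, looks roughly like this. The request format and tool name here are hypothetical, a sketch of the idea rather than any vendor's actual tool-calling API.

```python
import ast
import operator

# Deterministic calculator tool: safely evaluates basic arithmetic,
# the one job a language model is notoriously unreliable at by itself.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression: str) -> float:
    """Evaluate an arithmetic expression with pure if/then logic."""
    def ev(node):
        if isinstance(node, ast.BinOp):
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

def handle_request(llm_output: dict) -> str:
    """If the model asks for the calculator tool, run the deterministic
    code and hand back the exact answer; otherwise pass text through."""
    if llm_output.get("tool") == "calculator":
        return str(calculator(llm_output["arguments"]))
    return llm_output.get("text", "")

# Hypothetical model output requesting the tool:
print(handle_request({"tool": "calculator", "arguments": "250.75 * 12"}))  # 3009.0
```

The model never "does" the math; it only recognizes that math is being asked for, and the deterministic tool does the rest.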
Adam Parks (14:23)
Fair.
Rob Grafrath (14:24)
Have an MCP server that sits between the LLM and your system. The LLM says, hey, we're supposed to run the RTP action code. The MCP server then has the code, the deterministic code, the hard code, so to speak: okay, here's the API call that I'm going to make. It can also be programmed to check whether what the LLM is asking for is something it shouldn't be able to do, and kick back a response to the LLM and say, can't do that, you know, try again, go back and think twice. So the MCP really becomes both your security gateway and your tooling to make sure that your system can do things. You have this perception that the LLM makes the system do things. We have this Agent Co-Pilot, which is just about the coolest tool ever, here in the CSS Impact tool set.
While the agent is typing, they say, status this as a refusal to pay, that RTP action again that we were just talking about. And lo and behold, their screen has moved, a window has popped, and now you have the action code filled out with a refusal to pay, and a nice little summary of the conversation as a note that's ready to be added. Feels like magic. Really, it's the LLM calling the MCP, the MCP running an API call into your system, and then next thing you know the system has performed the action. All of that tooling working together is how some of the best things coming out today end up being created.
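A minimal sketch of the gateway pattern Rob describes: the LLM never touches the system of record directly; an MCP-style layer validates the requested action against an allowlist first. The action codes and the `call_system_api` stub are assumptions for illustration, not CSS Impact's actual interface.

```python
ALLOWED_ACTIONS = {"RTP", "DISPUTE", "PROMISE_TO_PAY"}  # hypothetical codes

def call_system_api(action_code: str, account_id: str) -> str:
    # Stand-in for the real API call into the system of record.
    return f"action {action_code} recorded on account {account_id}"

def mcp_gateway(request: dict) -> str:
    """Sits between the LLM and the system. Deterministic code decides
    whether the model's request is permitted; if not, the request is
    kicked back so the model can 'think twice' instead of executing."""
    action = request.get("action_code", "")
    if action not in ALLOWED_ACTIONS:
        return f"REJECTED: '{action}' is not a permitted action. Try again."
    return call_system_api(action, request["account_id"])

# The LLM asks to run the refusal-to-pay action code:
print(mcp_gateway({"action_code": "RTP", "account_id": "A-1001"}))
# ...while an out-of-bounds request is blocked, never executed:
print(mcp_gateway({"action_code": "DELETE_ALL", "account_id": "A-1001"}))
```

The security benefit is exactly what Rob names: the model can only ever request, and the deterministic layer decides.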
Adam Parks (15:59)
Everything is created out of necessity. I like what you're saying here about having this MCP that sits in between the AI-related tool and the system of record. So there's a replaceability to the LLM model here. One, it's a protection layer, but two, it's also a future-proofing layer, because now I'm in a position where, I mean, look, just look at how the tools have moved so fast over the last couple of months, right? Looking at how
Rob Grafrath (16:00)
Thank you.
Adam Parks (16:27)
Gemini breaks all the benchmarks, so ChatGPT releases 5.3, and we just see this constant movement. So we really don't know who's going to win the AI race at this point. We know who won the search race for the World Wide Web; Google has cornered that marketplace. But when it comes to artificial intelligence, we really have no idea who's going to be in control at any point going forward. So we should be future-proofing our organizations and making sure that our connectivity to these tools is running through some sort of a protocol, like you're saying, the MCP that creates that, call it, gateway in and out of our system of record, so that the AI model can't just go changing things. Because we've seen, or at least we've heard stories, I've never seen it in person, but we've heard stories about the AI going rogue and changing all of the things, and now you've got to find your way back again.
Rob Grafrath (17:01)
Yeah, and it makes it interchangeable. You change out the backend. If you didn't like the outputs you were getting using your Claude-based model, now you can use your ChatGPT-based model, OpenAI's rather, most likely.
So tool-set-wise, I just want to throw another one at you, brace yourself: retrieval-augmented generation, RAG, they call this. Let's talk about what that is and how it works, just in case you're talking to some IT guy and they say, you're going to use RAG to make sure that your outputs are in line with your knowledge base. It's like,
Adam Parks (17:30)
Makes sense.
Rob Grafrath (17:53)
Why am I going to use a rag? Well, it's a lot cleaner than it sounds.
Adam Parks (17:55)
Honestly, it's the first time I'm hearing the terminology. I'm curious. Again, this is where I say that I learned something from every conversation I have with you.
Rob Grafrath (18:04)
Yeah, why use a RAG? So imagine, if you will, you're very organized, you've got a great organization, you have Nancy over there in compliance, and she has written this beautiful backlog of all the policies your collectors need to follow. You have all of these wonderful training materials, you have the FDCPA and Reg F and all of these regulations. And now you have an LLM, and you want to be able to tell the LLM, okay, when I tell you, I want to do a thing. Let's say, and I'll throw another tool out there for the CSS side, we have the Management Co-Pilot, a back-office co-pilot so that management can ask questions about the data in their system. Well, you probably want to expose the FDCPA, Reg F, and all of your training materials to that Management Co-Pilot. So how do you do that? That's where the RAG comes in.
You take that knowledge base, let's say the FDCPA or whatever, and you run it through an algorithm. We won't get into too many details, but it's stored in what they call a vector database, and now it's almost like giving the LLM a table of contents, so to speak. So if I ask a question that has to do with the seven-in-seven rule, like if I ask as management, here's a great example:
Find the violations of the seven-in-seven rule. Okay. Without a RAG, it's gonna go back to the LLM, just the base model. Let's say that's OpenAI's model, and it's gonna scour the whole knowledge base of what it has from the internet history it has been trained on. How will it know what it's looking for? Seven-in-seven sounds very specific, so maybe it will find it, maybe it won't. But if you use retrieval-augmented generation, before that prompt goes to and gets consumed by your OpenAI LLM, it goes to your vector store, and it finds, here's a lookup from the table of contents that has to do with seven-in-seven, and this came from the Reg F documentation. So it pulls the relevant paragraph or two from Reg F, whatever snippet, or it summarizes the whole section from Reg F, and then that, alongside your prompt, gets bundled together into the package sent over to the OpenAI backend. Now it has the context, the true, meaningful, useful context from your business and from your industry to work with, rather than just going out and trying to figure things out on its own. Powerful stuff.
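Stripped of real embedding models and vector databases, the retrieval step Rob walks through can be sketched with a toy similarity score. Everything below is a stand-in: the "knowledge base" snippets are paraphrased placeholders, not actual regulatory text, and the word-count similarity is a crude substitute for learned embeddings.

```python
from collections import Counter
from math import sqrt

# Toy "vector store": in production these would be embedding vectors
# produced by a model and stored in a real vector database.
KNOWLEDGE_BASE = [
    ("Reg F call frequency", "the seven in seven rule limits call attempts to seven within seven days"),
    ("FDCPA disputes", "a consumer may dispute a debt and request validation in writing"),
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts -- a stand-in for embeddings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """The 'R' in RAG: find the most relevant snippet for the query."""
    return max(KNOWLEDGE_BASE, key=lambda kb: similarity(query, kb[1]))[1]

def augmented_prompt(query: str) -> str:
    """Bundle the retrieved context with the question before it ever
    reaches the LLM backend -- the 'A' in RAG."""
    return f"Context: {retrieve(query)}\n\nQuestion: {query}"

print(augmented_prompt("find violations of the seven in seven rule"))
```

The point of the sketch is the shape of the flow: retrieve first, then augment the prompt, then generate, so the model answers from your documents rather than from whatever it half-remembers of the internet.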
Adam Parks (20:39)
Well, context is mission critical because these models are great, but they're only great if they understand the context in which you're asking the question. Unless you're providing really detailed prompting and you're going through the engineering side of creating those prompts, but that context is everything.
Rob Grafrath (20:49)
Specifically. Context is king. And this is where I think people feel like LLMs are learning sometimes, when they really aren't technically learning. A lot of the time this comes up with a ChatGPT conversation. I'm talking to ChatGPT all day about certain things. And when I enter a new conversation,
Adam Parks (21:10)
Yes.
Rob Grafrath (21:23)
it's referencing things about me from prior conversations. It's still giving me advice on how to help with my consulting business, even though I've kind of turned the dial down on that since I started here at CSS. Because it knows me, it knows I'm interested in that, and that looks like learning, right? So what is happening there? All of those conversation histories are summarized into snippets of context. The older a conversation is, the more condensed and summarized that text ends up becoming. But as I keep interfacing with this tool, and this is just one of the features of something like ChatGPT, it's storing those histories, keeping them in the context in a very summarized format, so that when I give it new prompts, it's not just going from the base model.
It's going and seeing what it knows about me, what my preferences are, and pulling that context window, all that prior context, in, which makes it look like it knows me. If I just wiped all that out, I wouldn't have trained the model. I think the perception that these are always learning is a little buzzy, a little buzzwordy, this idea that the model is always learning from your behaviors.
It's technically not always learning. There's a period that is the building and modeling. This is when, let's say, they take all of known human history and try to condense it down and train on it: this is the way that language works, and these are all the books and all the articles and everything. It's creating that original base model for a large language model. That's when it's learning. Usually that's then basically set, and it is stored as a particular model, and then you're just interfacing with that model. Following that, you might have some retraining, some fine-tuning, but that's always an iterative process, and it's done offline. It's not happening as you're talking to it. And of course, if we're talking about voice agents, a voice bot that's taking an inbound call: as that call is happening, you might feel like that AI is learning from the prior parts of the conversation. It's not baking that learning into its base model. When they tell you it is, they're lying. What they might do is take all the conversations from a period of time and then either have humans rate them, okay, this went well, these strategies worked, these didn't, or have another AI model that sits on top of that and does the regression testing to fine-tune the original model and then update and change it. But that's more of an incremental, periodic, offline task. It's not the always-learning, always-on sort of paradigm that a lot of people claim AI systems have today.
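The "looks like learning" effect Rob describes is really stored context being prepended to new prompts. A crude sketch of the idea: the class, the truncating "summarizer", and the word budgets are all invented for illustration; a real system would use the LLM itself to condense history.

```python
def summarize(text: str, max_words: int) -> str:
    """Placeholder summarizer: just truncates to a word budget. In a
    real system the LLM itself would condense the conversation."""
    words = text.split()
    return " ".join(words[:max_words]) + ("..." if len(words) > max_words else "")

class ConversationMemory:
    """Older conversations get more aggressively condensed. New prompts
    are built from this stored context -- the base model is unchanged."""
    def __init__(self):
        self.history = []  # oldest first, newest last

    def add_conversation(self, transcript: str):
        self.history.append(transcript)

    def build_context(self) -> str:
        parts = []
        for age, convo in enumerate(reversed(self.history)):
            # the older the conversation, the smaller its word budget
            parts.append(summarize(convo, max_words=max(20 // (age + 1), 3)))
        return "\n".join(parts)

memory = ConversationMemory()
memory.add_conversation("user asked about consulting rates and marketing strategy")
memory.add_conversation("user mentioned starting a new role at a software company")
prompt = memory.build_context() + "\n\nNew question: any advice for week one?"
print(prompt)
```

Wipe `memory.history` and the "learning" vanishes, exactly as Rob notes: the model itself never changed.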
Adam Parks (24:24)
No, it can learn from the context, but that's within the context window of the particular conversation. It's within the ChatGPT thread itself, but it's not necessarily feeding the master model for future use.
Rob Grafrath (24:37)
Yeah, and that gets people, because then they're told that the big models are being trained off of your conversations. They are being trained off those conversations, yes, but not in real time, not necessarily this or that conversation. Like I said, it's offline; it's more of an iterative approach rather than literally an always-on sort of learning experience.
Adam Parks (25:00)
So for those that are actively using ChatGPT and these other tools, talk to me a little bit about prompt engineering and its importance in being able to actively leverage these tools and models.
Rob Grafrath (25:13)
Yeah, so you can't think of it in terms of just programming something. It's not like calling a function. LLMs are predictive, statistical types of tools. They are not, and I'm going to go back and use the word deterministic again, they're not deterministic. If you expect that every time you tell it a knock-knock joke it's going to respond with the exact same punchline, it's not necessarily going to, because it takes the possible outputs it could give you and then it's picking, depending on dials that you can have turned up or turned down. How creative do you want this to be? Do you want it to just take whatever the single most likely output from your model would be, or do you let it have a little bit more creativity? Turning it down all the way to zero creativity can end up with some very rigid, inflexible conversations. Too far on the other end, and now it's completely creative, saying nonsensical things. And creative really isn't the right word: random. It can become very random if you turn that dial up too much in the other direction. So of course, as we're thinking about chatbots, we want that dial turned pretty low, almost to zero, as far as the randomness we allow it to have. But it does have to have a certain amount; otherwise it becomes a deterministic system, and it's not using the power of what an LLM should be used for. So think of the variability of LLM output as an inherent feature of those systems rather than an issue. If you need predictable outputs, like, any time this is said, this is the response, then what you're looking for is not an LLM. What you're looking for is programming, is deterministic rules, which you can lay on top of your LLM, right?
So when the LLM says, here is the output that I want to create, before that gets produced and put out to a consumer, you can have something sitting on top of that, some code or a tree of possible logic, that checks it against what it should be allowed to do, and then stops it or lets it through.
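The "creativity dial" Rob refers to is usually called temperature: it reshapes the probability distribution the model samples its next word from. A toy illustration with made-up next-word probabilities; real models apply this over tens of thousands of tokens, not three words.

```python
import math

def apply_temperature(probs: dict, temperature: float) -> dict:
    """Rescale next-word probabilities. Near zero temperature, the most
    likely word dominates (rigid, repeatable); high temperature flattens
    the distribution toward randomness."""
    if temperature <= 0:  # degenerate case: fully deterministic choice
        top = max(probs, key=probs.get)
        return {w: (1.0 if w == top else 0.0) for w in probs}
    scaled = {w: math.exp(math.log(p) / temperature) for w, p in probs.items()}
    total = sum(scaled.values())
    return {w: v / total for w, v in scaled.items()}

# Made-up next-word distribution after "I can pay..."
probs = {"tomorrow": 0.6, "Friday": 0.3, "never": 0.1}
print(apply_temperature(probs, 0.0))  # always "tomorrow": rigid
print(apply_temperature(probs, 2.0))  # flattened: more variability
```

At temperature zero the system behaves deterministically, which is exactly the trade-off Rob describes: repeatable but stiff, versus varied but occasionally off the rails.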
Adam Parks (27:30)
Understood. So we've talked about prompt engineering, and we've talked about context being king and being able to bring the context to it. How can we prompt-engineer within these context windows? What capabilities do we ultimately have?
Rob Grafrath (27:38)
It's the capabilities of your system. All of these are growing so much; they're growing every day. And I feel like it's not about building just one agent anymore. It's more of a two-heads-are-better-than-one sort of tool set that you should be thinking about building, where you have the agent broken into different roles.
So you might have one role or one agent who's responsible for answering dispute questions. One that is responsible for payment arrangements or something like that. One that's got an overarching judge, the judge LLM that we've talked about before, that's making sure that everybody's following the rules based off of using that rag that we talked about earlier.
checking against your policies and procedures and laws and making sure that nobody has broken the rules among all of the LLMs you've implemented, really chaining these together and making them work. Although from a user-experience standpoint it feels like you're talking to one thing, in reality there's a whole crew back there, right? There's a whole army of agents doing their jobs.
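The division of labor Rob sketches, specialist agents behind one interface with a judge checking outputs, reduces to a router plus a rule check. Everything here (the role names, the keyword routing, the banned-phrase list) is invented for illustration; in a real system the router and judge would themselves be LLMs, with the judge consulting the RAG-backed policy base.

```python
def dispute_agent(message: str) -> str:
    return "I can open a dispute and send you a validation notice."

def payment_agent(message: str) -> str:
    return "I can set up a payment arrangement that fits your budget."

def route(message: str) -> str:
    """Router: pick the specialist agent for this message. A real system
    would use an LLM classifier; keyword matching stands in here."""
    agent = dispute_agent if "dispute" in message.lower() else payment_agent
    return agent(message)

def judge(response: str) -> str:
    """Judge layer: every outbound response is checked against policy
    before the consumer ever sees it."""
    banned = ["guarantee", "lawsuit", "arrest"]  # hypothetical policy list
    if any(word in response.lower() for word in banned):
        return "ESCALATE: response blocked for human review."
    return response

# One interface for the user, a crew of agents behind it:
print(judge(route("I want to dispute this debt")))
print(judge(route("Can I set up payments?")))
```

The user sees a single conversation; the routing, the specialist, and the compliance check all happen behind the curtain, which is the "whole army of agents" Rob describes.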
Adam Parks (28:54)
We've also started talking about ecosystems, how these things chain together and how you can start using some of these tools as part of a larger ecosystem. And I think about it similar to the way the industry has started to leverage omnichannel, because it's not just about the communications through each one of those channels. It's about orchestrating the next best action, kind of a conglomeration of all the things we've talked about today. How do you...
Rob Grafrath (29:08)
Okay.
Adam Parks (29:17)
get it to say the right things and not hallucinate? How do you start bringing all of these systems together into one ecosystem to create a capability that the debt collection industry ultimately needs? And I know when you and I were first talking, from a consulting standpoint and then through your move into CSS Impact, as you started telling me more about CSS, it sounded like you guys have started bringing together more of that, let's call it, combined ecosystem environment. Talk to me a little about how, as an organization, you've brought these tools together, not for siloed management, but for holistic integration.
Rob Grafrath (29:55)
Yeah, I wouldn't be here. I wouldn't be at CSS if they weren't doing this, honestly. Yeah.
Adam Parks (30:01)
That's why I asked the question, right? Because it's not something I would normally ask on a podcast, since it is a little bit more specific to your product. But that's what really interested me when you made that move over to CSS: okay, all of these things actually work together in a cohesive ecosystem. How do you actually execute on that? And what kind of impact does that bring to the space?
Rob Grafrath (30:22)
Exactly. It's a beautiful thing when you see this symphony of tools all working together. I think it requires a constant dedication to the cause, a constant forward-thinking, bleeding-edge sort of mindset. I've had the great privilege of sitting in on R&D calls, starting out as the new guy, a fly on the wall, hearing all the things that we're talking about doing here in 2026. Because we are an entire ecosystem of products, we have that power, that benefit: we have the data. You don't have a lot of collection systems these days that still have an integrated dialer, and I'm not talking about a white-label dialer or integrating a third-party dialer, but our own dialer, built by and for and within the collection system we're working on. Think about how powerful that is: to have that tight integration, have the call transcription happening in real time, have all that call data stored in your database, and be able to seamlessly query information about the account while calls are happening. And the Agent Co-Pilot experience: having the control to build our own MCP to then interface with the account and take whatever actions you could imagine you would want an agent to be able to take on the account. It's not something that just happens, and not every system thinks about how amazing a tool set you can make once you have your own integrated dialer, integrated texting, integrated emailing, web portal, and client portal. There's this whole suite of products, and a voice collection agent, all functioning together off the same core system, with the same interface, the MCP we were talking about. It's a beautiful thing.
Adam Parks (32:24)
It really is. It's unique to see everything coming together, versus going to all of these different vendors for all of these different pieces and parts. But Rob, like every conversation that we have, I really appreciate you spending a little bit of time with me today to educate me, not only on LLMs and judge LLMs, but on MCP, which was something I really had not thought about before, that additional layer between your system of record and all of these additional tools. I do learn something every time. Thank you for sharing your insights with me again today.
Rob Grafrath (32:55)
Yeah, thanks Adam. Always a pleasure. Always a pleasure. Let's do it again soon.
Adam Parks (33:01)
I know that we're gonna do it again soon. For those of you that are watching, if you have additional questions you'd like to ask Rob and myself, you can leave those in the comments on LinkedIn and YouTube, and we'll be responding to those. Or if you have additional topics you'd like to see us discuss, you can leave those in the comments below as well. And I know I'm gonna be able to get Rob back at least one more time to help me continue to create great content for a great industry. But until next time, Rob, I really appreciate all of your insights. Thank you for sharing with me today.
Rob Grafrath (33:26)
Thanks Adam. Catch you next time.
Adam Parks (33:29)
And thank you everybody for watching. We appreciate your time and attention. We'll see you all again soon. Bye.