I wanted to explore NotebookLM as a way of learning and loaded it up with multiple sources:
- links to Data Cloud Documentation
- links to Trailhead modules on identity resolution (a process which can consume loads of credits)
- various PDFs
- a link to Szymon Lewandowski’s “Salesforce Data Cloud Credits Guide” (which the AI incorrectly attributes to Salesforce at 3:44 as a source of an example)
- This is hands-down the best article on the resource, so make sure you visit his page and read the article
After that I was able to ask the tool about the details, and it responded with references to the sources, but the real kicker is the pseudo-podcast that sounds pretty convincing.
Sadly you can’t steer what exactly the audio should be about, but I can see myself exploring the topic in a similar fashion.
Well, this just changed on October 17th 2024…
Scroll down to view the transcript of this recording.
Working with NotebookLM
Adding Sources

As you can see here, the tool can ingest many types of content, including:
- PDFs, .txt & Markdown files, Audio files (e.g. mp3)
- Links to webpages and YouTube videos
- You can also paste text
- and write separate notes
Currently there’s a limit of 50 sources in a single Notebook.
Using the tool to learn
After you add sources, you can see a short summary and suggested questions, and you can start asking the tool about the topic you’re interested in via chat.

For example, let’s ask it what goes into Identity Resolution credit consumption when there are no active matching rules running.

See that circled 1 in the 3rd line of the response? You can click that and see the Source Guide showing you where the tool got its information from:

Generating the Podcast Audio Guide
As you may have noticed, the Notebook Guide also contains the Audio Guide section, which allows you to instantly generate a synthetic podcast episode where two hosts discuss the contents pulled from your sources.

The audio file was generated with this exact feature. A mere day after I shared it, Google added the option to steer the conversation directly:

Transcript
Another surprise?
Slack can transcribe audio quite well (it required some minor corrections though).
0:00: Ever feel like your customer data is off playing hide and seek, you know, spread out across like a dozen different systems.
0:07: Well, if so this deep dive is for you, we’re going to be tackling Salesforce Data Cloud specifically, we’re gonna be talking about how it takes all that crazy scattered information that you have in your business and brings it all together to give you that, you know, legendary single source of truth.
0:27: Yeah, it’s it’s like, you know, when you’re trying to solve a puzzle, but all the pieces are like in different rooms, you know, they fit together somehow, but it’s like almost impossible to see the full picture, right?
0:39: And businesses today are just drowning in data.
0:42: I mean, you have data from marketing interactions, sales calls, website visits, loyalty programs just it’s everywhere.
0:49: It really is.
0:50: But if you don’t have a way to actually connect the dots, you’re missing out on like all the really valuable insights.
0:56: Oh, absolutely.
0:57: And that is where Data Cloud, you know, kind of struts in, right?
0:59: It’s trying to be like that ultimate puzzle solver, right?
1:03: We’ve been digging deep into a whole bunch of Salesforce articles, Trailhead modules, you know, all those external analyses, really trying to uncover how this thing actually works.
1:14: So think of this deep dive as your cheat sheet, not just to understanding the basics, but how to actually use it well, how to use Data Cloud efficiently and avoid, you know, those surprise credit card bills because those are no fun.
1:29: Yeah, because the real aha moment with this isn’t just about seeing all your data in one place.
1:36: It’s about.
1:36: Ok.
1:37: Now how do I use this unified view to make really good business decisions?
1:41: Exactly.
1:42: That actually make financial sense, right?
1:44: And don’t bankrupt you.
1:45: Yeah, that’s what we’re really going to try to unpack.
1:47: I love it.
1:47: Ok.
1:48: So let’s get into the nuts and bolts here a little bit.
1:50: The heart of how Data Cloud does its magic and that starts with identity resolution which honestly, it sounds kind of intimidating like something out of a spy movie honestly.
1:59: But really it’s just the process of figuring out which pieces of information belong to the same person.
2:05: Right.
2:05: Exactly.
2:05: It’s basically Data Cloud’s way of saying, hey, this email address, this phone number, this loyalty program membership, those all belong to the same person, right?
2:15: Makes sense.
2:15: And that’s how it creates those accurate unified customer profiles.
2:19: Ok.
2:20: So how does Data Cloud even begin to make those connections?
2:24: What signals is it using to link those different pieces of data together?
2:29: So this is where the concept of a source profile comes in and this is a really important thing to understand if you want to understand how you get billed for Data Cloud.
2:38: OK.
2:38: This is important, I’m listening.
2:39: Yeah.
2:39: So they define a source profile as any individual record from any of your data sources, right?
2:46: So this could be a customer record from your CRM, a subscriber record from, let’s say, your email platform, or even a loyalty program member entry.
2:56: And here’s the kicker Data Cloud actually charges you based on how many of these individual source profiles it needs to process, not just on the final number of unified profiles that it spits out at the end.
3:10: Hold on.
3:10: I just want to make sure I’m clear on this again because that almost sounds like if Data Cloud has to process, say 1.4 million source profiles like individual puzzle pieces, right?
3:20: To create my unified customer view, I’m still getting billed for that initial 1.4 million.
3:26: Even if at the end of the day, there are way fewer unified profiles.
3:30: That is 100% right.
3:31: OK.
3:31: All right.
3:31: You got it.
3:32: So it’s like and think of it like this, they’re charging you for the work they’re doing, they’re examining every single piece, even if some of those pieces end up fitting together to make one, you know, complete picture.
3:44: And I mean Salesforce actually gives this example where they talk about 1.4 million source profiles taking 140,000 credits.
3:52: And that’s regardless of how many unified profiles you end up with.
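The arithmetic in the example the hosts quote can be checked with a quick sketch. The rate below is only what that single example implies (140,000 credits for 1.4 million source profiles, so 100,000 credits per million); it is an assumption for illustration, not an official rate card, and the function name is made up.

```python
# Rate implied by the example in the transcript:
# 1.4M source profiles -> 140,000 credits.
# Assumption for illustration only; check your own contract and rate card.
CREDITS_PER_MILLION_SOURCE_PROFILES = 100_000


def estimate_identity_resolution_credits(source_profiles: int) -> int:
    """Estimate credits from the number of source profiles processed.

    The number of unified profiles is deliberately not a parameter:
    per the discussion, billing is driven by source profiles.
    """
    if source_profiles < 0:
        raise ValueError("source_profiles must be non-negative")
    return source_profiles * CREDITS_PER_MILLION_SOURCE_PROFILES // 1_000_000


print(estimate_identity_resolution_credits(1_400_000))  # 140000
```

Integer arithmetic is used on purpose so the estimate never picks up floating-point rounding noise.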
3:56: OK.
3:57: So efficiency is key here.
3:58: Efficiency is key.
3:59: Yeah, we do not want to be wasting those precious credits.
4:01: So once Data Cloud starts putting these puzzle pieces together, how does it actually organize and store all this information?
4:09: So Data Cloud uses these things called data model objects or DMOs.
4:14: You’ve probably heard of them.
4:15: I’ve heard of them.
4:16: If you’ve been, you know, researching Data Cloud and instead of like throwing all your clothes into one giant drawer, each DMO is like a separate container organized for a different type of customer information.
4:29: So it’s like instead of a big messy drawer, it’s like you’ve got those dividers, you know, for socks and shirts and pants.
4:35: Exactly.
4:36: Exactly.
4:36: So each one has its own little purpose.
4:38: Yeah, we’ve got a DMO for individuals, you’ve got DMOs for their contact points, like their email addresses, their phone numbers, even ones for tracking any unique identifiers a business might use for their customers.
4:52: Ok.
4:52: So if the DMOs are our nice organized drawers, the individual DMO, that’s going to be the most important one, right?
5:00: That’s where we’re storing all the core data points that actually identify who a person is.
5:05: 100%.
5:05: Ok.
5:05: Think of the individual DMO as like the master record, the central hub for each customer.
5:10: And for this whole identity resolution thing to actually work, there are certain fields that you absolutely have to have in this DMO like their name, date of birth, maybe a customer ID.
5:21: If the business uses those, you’re on the right track.
5:26: But before we get into the specifics of like what those must have fields are and how Data Cloud actually links them.
5:32: Let’s take a quick detour and talk about another really important piece of this puzzle.
5:36: These are contact point objects.
5:39: So contact point objects, these are the DMOs that hold those individual pieces of contact information.
5:44: OK?
5:45: That let’s be honest, often end up different across different systems even for the same person.
5:50: Oh, absolutely.
5:51: Like I might use a different email address for, you know, my online shopping versus my loyalty program even though it’s still me.
5:58: Exactly.
5:59: But Data Cloud is smart enough to figure that out.
6:02: It can link those different contact point objects together, emails, phone numbers, even physical addresses and tie them back to that main individual DMO.
6:10: So even if the contact details are kind of scattered, it can still tell it’s the same person behind them all.
6:16: That’s how it keeps everything accurate and makes sure we’re not accidentally creating duplicate records for the same person.
6:21: It’s like having a detective on the case, you know, piecing together all the clues.
6:25: OK?
6:25: This is making the whole unification process a lot clearer but we’re just getting started.
6:30: We still need to uncover how Data Cloud actually maps our data and what all the rules are behind matching and merging profiles.
6:39: Stay tuned.
6:39: We’ll be right back after this and we are back diving even deeper into this whole world of Salesforce Data Cloud talking about how it takes all that, you know, messy customer data and turns it into hopefully a gold mine of insights.
6:55: And before the break, we were getting into those data model objects, the DMOs, the building blocks of that single customer view.
7:00: Yeah, and how Data Cloud is really smart about connecting those different DMOs, you know, like that individual DMO, those contact point objects, to build like a complete picture of each customer.
7:12: Even if their info’s, like, all over the place. It’s impressive, like Data Cloud’s like the super detective, you know, linking all those random pieces of info together.
7:23: But there’s another important step even before those identity resolution algorithms, you know, can really work their magic, and that is data mapping.
7:33: This is where we tell Data Cloud exactly how to understand the information that we’re giving it from all these different sources.
7:40: Exactly.
7:40: It’s like you’re giving Data Cloud a set of instructions.
7:44: It’s a roadmap basically.
7:46: So it knows like, OK, this field in this system, that’s the customer’s first name, and this field over here, that’s their email address.
7:54: So we’re not just like dumping everything in and hoping for the best.
7:58: No, no, we’ve got to be strategic about it.
8:00: You’ve got to map it out.
8:01: Yeah.
8:01: To make sure that it’s actually making the right connections.
8:04: Exactly.
8:05: And that’s where, you know, being smart about your credits comes in again.
8:08: Remember every time we go back and change our identity resolution rules, it’s going to refresh everything and that’s going to use up some of those credits.
8:15: Ok.
8:16: So we got to get it right the first time.
8:17: Exactly.
8:18: Data mapping.
8:19: It seems pretty simple.
8:20: Yeah, it does like you’re just telling it at a high level.
8:22: OK?
8:22: This column in my CRM, that’s the first name, this one’s the email and so on, right?
8:28: You’re on the right track, but there’s one field that you really, really, really need to make sure you get right.
8:35: OK.
8:36: What’s that?
8:36: And that is the individual ID.
8:40: OK?
8:40: The individual ID think of this as the like the golden thread.
8:44: It ties everything together.
8:46: It’s the thing that definitively says this data point and this data point and this data point, that’s all the same person.
8:54: So even if like I’ve got two John Smiths in my system who might even have birthdays that are kind of close, the individual ID is going to be that tiebreaker.
9:04: So it doesn’t accidentally merge their profile.
9:06: That’s exactly it.
9:07: It’s the ultimate identifier even more important than like names and email addresses, which let’s be real people change that all the time.
9:15: Yeah, exactly.
9:16: Salesforce has this really good visual example of this in their Trailhead module, they use this thing called the NTO loyalty program data stream.
9:26: OK.
9:27: So what is the NTO loyalty program example, how does that work?
9:30: OK.
9:30: So let’s say you’re a company with a loyalty program, you probably have some kind of unique ID, right that you assign to each member in your loyalty database.
9:42: And this ID or like a subscriber key is going to be unique to each person, even if they happen to have the same name or email address as somebody else.
9:51: So in your Data Cloud setup, in the data mapping part, you would actually link that subscriber key field from your loyalty program database to the individual ID field in that individual DMO.
10:02: Gotcha.
10:03: So Data Cloud’s thinking, OK?
10:05: Even if this John Smith changes his email address or his phone number, as long as that loyalty program subscriber key stays the same, right?
10:14: I know it’s the same guy. You got it.
10:16: OK?
10:17: And this is even more powerful when you’re dealing with data from multiple sources, right?
10:21: Oh yeah, for sure, because maybe you’ve got customer data in your CRM, in your marketing platform and your e-commerce system, and they all have their own unique ID for the same person.
10:30: But if we map them all to that same individual ID field in Data Cloud... Exactly.
10:36: We’re basically creating like a master key.
10:39: That’s a great way to put it.
10:40: Yeah, that unlocks the full picture across all these different places.
10:45: Absolutely.
10:46: It’s like giving Data Cloud a universal translator for customer identity.
10:50: That’s a really good way to put it.
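The data mapping idea the hosts describe can be sketched in a few lines. Everything here is hypothetical: the source names, the field names, and the `map_record` helper are made up for illustration and are not Data Cloud's actual mapping API. The sketch only shows differently named source fields being translated into one shared shape.

```python
# Hypothetical source field names mapped to hypothetical target fields
# on an "individual" record. None of these are real Data Cloud identifiers.
FIELD_MAPPINGS = {
    "loyalty": {
        "subscriber_key": "individual_id",
        "fname": "first_name",
        "email_addr": "email",
    },
    "crm": {
        "contact_id": "individual_id",
        "FirstName": "first_name",
        "Email": "email",
    },
}


def map_record(source: str, record: dict) -> dict:
    """Translate one source record into the shared target field names."""
    mapping = FIELD_MAPPINGS[source]
    return {
        target: record[field]
        for field, target in mapping.items()
        if field in record
    }


loyalty_row = {"subscriber_key": "L-001", "fname": "John", "email_addr": "john@example.com"}
crm_row = {"contact_id": "C-042", "FirstName": "John", "Email": "john@example.com"}

print(map_record("loyalty", loyalty_row))
print(map_record("crm", crm_row))
```

After mapping, both rows expose the same keys, which is what lets later matching steps compare like with like.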
10:51: Now, what if your business already has its own set of like internal unique customer IDs that you’ve been using across your different systems?
11:00: Right.
11:01: Do we have to, like throw those out and start over?
11:03: That would be a pain.
11:04: No, you don’t have to do that.
11:05: Salesforce thought of that.
11:06: They have something called the party identification object.
11:09: It’s like a special DMO just for these custom identifiers.
11:13: OK?
11:14: So you can map your own internal IDs to that and still use them for identity resolution.
11:19: OK?
11:19: So we’ve got options, options are good.
11:21: So now we’ve got our DMOs set up, right?
11:24: We’ve got all our data meticulously mapped.
11:27: Those individual IDs are linking everything together.
11:31: It’s all connected.
11:32: Now we get to the really fun stuff.
11:34: Now the real magic happens which is those match and reconciliation rules, right?
11:39: Yes, this is inside our identity resolution rule sets.
11:42: This is where Data Cloud stops just organizing data and it’s like all right time to put on my detective hat.
11:50: Figure out which profiles actually belong together.
11:53: How do I solve this case? Exactly.
11:54: Yeah, this is like the brains behind the operation, the match rules.
11:58: These are the ones that are saying, OK, these data points, they have to be identical if we’re even going to think about merging these two profiles.
12:06: So you might say, for example, the first name, the last name and the email address all have to be an exact match.
12:13: So we’re not just looking for any little similarity.
12:15: There’s a very specific criteria, it’s very specific, that it has to follow. Exactly. And how strict you are with that can really change your results.
12:25: Of course.
12:25: So Data Cloud gives you these different match methods to kind of fine tune things and we can start with exact normalized matching.
12:33: So exact normalized matching, that basically means Data Cloud is taking things super literally only merging profiles if like every single character is exactly the same even down to like if you put a period in your state abbreviation or something.
12:48: Yeah, you got the right idea, but it’s not quite that literal.
12:51: It does this smart thing first called normalization, which is kind of like cleaning up all those little data entry inconsistencies, you know? Oh, tell me about it.
13:00: Everyone’s got their own style.
13:02: Oh yeah.
13:03: So like if one system has my phone number with dashes and then another one has it without dashes.
13:09: Data Cloud can tell it’s the same number.
13:11: OK?
13:11: So it’s not going to be like, oh those are different mismatch.
13:14: Exactly.
13:14: It gets rid of those little differences, you know, the spacing, the capitalization, even common abbreviations, like, before it even compares anything. That makes sense, because otherwise it would be a mess, we’d have so many false mismatches.
13:27: Exactly.
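Normalize first, then compare exactly: that is the idea the hosts just described, and a minimal sketch makes it concrete. The rules below (strip non-digits from phone numbers, collapse whitespace and ignore case in text) are simplifying assumptions for illustration, not Data Cloud's actual normalization logic, and the function names are made up.

```python
import re


def normalize_phone(raw: str) -> str:
    """Keep digits only, so dashes, spaces and brackets no longer matter."""
    return re.sub(r"\D", "", raw)


def normalize_text(raw: str) -> str:
    """Collapse runs of whitespace and ignore capitalization."""
    return " ".join(raw.split()).casefold()


def exact_normalized_match(a: dict, b: dict, fields=("first_name", "last_name", "email")) -> bool:
    """True only if every listed field is identical after normalization."""
    return all(normalize_text(a[f]) == normalize_text(b[f]) for f in fields)


print(normalize_phone("555-123-4567") == normalize_phone("5551234567"))  # True
```

With this in place, `"  John   SMITH "` and `"john smith"` compare as equal, while a genuinely different value such as a typo still fails the exact comparison.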
13:28: Ok.
13:28: That’s a relief.
13:29: But what about those times when?
13:31: Ok, it’s not just formatting.
13:33: What about when there’s like a genuine difference?
13:37: Like if somebody uses a nickname or let’s be honest, someone made a typo, that’s where fuzzy matching comes in.
13:43: Exactly.
13:44: Fuzzy matching is where Data Cloud gets to be a real detective.
13:48: Ok.
13:48: I like it.
13:49: It’s like, OK, these records aren’t exactly the same, but they’re pretty darn close.
13:54: Let’s see if there’s enough here to say it’s the same person.
13:56: So instead of needing that perfect match, it’s looking for what, like a high probability.
14:03: Yeah, exactly.
14:03: It uses these algorithms, they get pretty complex, to look at things like how similar do these sound, you know, phonetically.
14:11: Are there any characters that got switched around?
14:14: Like a classic typo?
14:15: Even like cultural variations in names?
14:18: So it’s like, hm, is there a chance that Catherine also goes by Katie or that somebody spelled Smith with a Y at some point?
14:25: Exactly.
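To get a feel for "close enough" matching, here is a minimal sketch using Python's standard `difflib`. It is a stand-in technique (character-level similarity), not the algorithms Data Cloud actually uses, and the 0.85 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().casefold(), b.strip().casefold()).ratio()


def is_fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two values as the same when they are close enough (threshold is arbitrary)."""
    return similarity(a, b) >= threshold


print(is_fuzzy_match("Jon Smith", "John Smith"))   # True
print(is_fuzzy_match("John Smith", "Mary Jones"))  # False
```

Character similarity catches typos and swapped letters, but it would not link a nickname such as Katie to Catherine; that needs phonetic or dictionary-based techniques. Lowering the threshold finds more matches at the risk of merging profiles that should stay separate.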
14:25: That’s pretty smart.
14:26: It’s all about finding that sweet spot, you know, accurate, but also flexible.
14:30: Right?
14:31: And that’s where again, being smart about our credits comes in because if we’re not careful with those fuzzy matches, it can get, we could end up merging things that shouldn’t be merged.
14:43: Exactly.
14:44: More chaos.
14:45: Chaos. That’s why testing your rules with a sample data set first is so important, because then you can see, OK, how are these rules actually playing out, you know, in a safe environment, and you can tweak them before you unleash them on your entire database.
14:59: Right?
15:00: It’s like the dress rehearsal for the big opening night.
15:03: Ok.
15:03: Speaking of fine tuning, we got to talk about reconciliation rules too.
15:07: These kick in when, ok, Data Cloud’s found a couple of profiles it thinks should be merged but there’s conflicting info, right?
15:15: What do you do?
15:16: It’s the tiebreaker.
15:17: Which piece of data wins.
15:19: Right?
15:19: Exactly.
15:20: So for example, maybe you’ve got two different addresses for the same person.
15:24: Your reconciliation rule could say always go with the most recent one or maybe go with the address from this specific system like our CRM.
15:34: OK.
15:34: So match rules are deciding if you’re merging and reconciliation rules are like OK, but how are we going to merge when there’s a conflict?
15:42: Yeah, exactly.
15:43: They work together.
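The two reconciliation strategies mentioned in the conversation (most recent value wins, or a preferred source wins) can be sketched as follows. The `Candidate` type, the function names, and the source names are hypothetical and only illustrate the idea of a tiebreaker.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """One conflicting value for a field, with where and when it came from."""
    value: str
    source: str
    last_updated: date


def most_recent(candidates: list[Candidate]) -> str:
    """Tiebreaker 1: the most recently updated value wins."""
    return max(candidates, key=lambda c: c.last_updated).value


def by_source_priority(candidates: list[Candidate], priority: list[str]) -> str:
    """Tiebreaker 2: the value from the highest-priority source wins.

    Sources missing from the priority list rank last.
    """
    rank = {source: i for i, source in enumerate(priority)}
    return min(candidates, key=lambda c: rank.get(c.source, len(rank))).value


addresses = [
    Candidate("1 Old Road", "crm", date(2023, 1, 10)),
    Candidate("9 New Street", "loyalty", date(2024, 6, 2)),
]

print(most_recent(addresses))                            # 9 New Street
print(by_source_priority(addresses, ["crm", "loyalty"]))  # 1 Old Road
```

The same pair of records yields different winners under the two rules, which is why the choice of reconciliation rule matters as much as the match rule.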
15:44: This is powerful stuff.
15:45: It is.
15:46: But you know, as with any powerful tool, we got to talk about optimization: how do we use this whole identity resolution thing in Data Cloud as smartly and efficiently as possible? And by efficiently, you mean, I mean, how do we save our credits? Credit conscious, credit conscious.
16:03: Yes, we’ve talked about a few things already, like starting with a sample data set, being really thoughtful about those match and reconciliation rules.
16:11: What else should people be thinking about?
16:13: So, one really simple thing, people forget about this all the time, make sure you’re regularly reviewing and disabling any identity resolution rule sets that you’re not actually using because even if they’re not like actively running, they can still eat up your credits.
16:31: It’s an easy way like turning off the lights when you leave the room.
16:34: Exactly.
16:35: It doesn’t take much effort but it makes a difference.
16:37: I like it.
16:38: And speaking of things that make a difference, never underestimate the power of just having clean data to begin with. The cleaner and more consistent
16:47: your data is before you bring it into Data Cloud,
16:50: the easier time those identity resolution algorithms are going to have. So garbage in, garbage out, as they say. Exactly.
16:57: Data cleansing is so important.
16:58: People skip over it but you’ve got to do it.
17:01: OK?
17:01: Any other words of wisdom before we start to wrap things up?
17:04: OK?
17:05: So as you’re going through this whole Data Cloud adventure, you know, implementing everything, getting it all set up, keep in mind that Data Cloud has this really cool feature now called the digital wallet.
17:16: And basically it lets you track your credit usage in real time.
17:20: So you don’t have to wait until the end of the month, right?
17:23: No more surprises to find out if you’ve gone over budget.
17:25: Exactly.
17:26: You can course correct as you go.
17:27: That’s huge.
17:28: It’s like having a fuel gauge for our Data Cloud usage.
17:31: Exactly.
17:32: Ok.
17:32: Well, we have covered so much today.
17:36: We talked about how Salesforce Data Cloud tries to solve this problem of like customer data being everywhere.
17:42: We got into the weeds on identity resolution.
17:45: We talked about those all important DMOs, how those keep everything organized, data mapping, match and reconciliation rules and of course, how to not, you know, blow through all our credits on day one.
17:57: I mean, this has been an insightful, deep dive to say the least.
18:01: But before we go, we want to leave you with something to really think about considering like your own data in your business.
18:09: What are those data sources that are like just begging to be unified waiting?
18:15: And then what strategic doors could that open up for you?
18:18: What could you do once you have that single customer view?
18:22: Yeah, that’s the question.
18:23: That is the question, that’s what we want you to take away.
18:25: And on that note, I think we’ll wrap up this deep dive.
18:28: Yeah, this was great.
18:29: Thanks for joining us everyone.
18:30: We will see you next time for another fun exploration of all things.
18:35: Tech.




