We held the fourth session of our LiveRamp for Developers webinar series on July 27th, 2021. This was the second identity session, and focused on the Retrieval API. Kevin Wei, Lead Identity Product Manager at LiveRamp, and Adam Isom, AbiliTec Backend Engineering Lead – the product and engineering heads for the Identity APIs – presented for the second time.
If you missed the first identity session, which focused on the AbiliTec API, I highly recommend checking it out. Kevin and Adam delved into some of the philosophical and technical foundations of LiveRamp’s identity graph, and several of those concepts help in understanding LiveRamp’s concept of identity overall, as well as how it relates to the associated APIs.
The team covered three Retrieval use cases of the Identity API: RampID Retrieval, Envelope Decryption, and RampID Transcoding. They first addressed the problem LiveRamp solves with our API’s output of RampIDs, then covered a little bit about RampIDs themselves, and then walked through each API function’s specific input identifier and associated use cases. Lastly, they finished off with a demo covering each use case.
Identity and Retrieval API Background
Identity is already complicated enough, and after some customer feedback, the identity product team has started to think of our APIs as data APIs. By that we mean that, as opposed to other APIs where you orchestrate or string together systems, we position our Identity APIs as a function of clear inputs and outputs: you feed us an identifier, and we’ll give you a corresponding identifier associated with that input. The outputs for the use cases are all the same, namely RampIDs, and really the only thing that changes with each API function is the input identifier. Our Identity APIs are really just the gateway to the rest of the LiveRamp ecosystem. The identifiers are unique in and of themselves, but their true potential is unlocked when you leverage them for connectivity purposes.
The term “Identity” is thrown around so much without actually being defined that downstream applications and definitions often get muddled, and the “why” of our product is lost if we’re not on a shared foundation of context. From a LiveRamp point of view, identity can’t be defined without talking through its two important dimensions: representation and perspective.
Representation is all about how identity shows up in different shapes and forms. To add a bit of RampID pseudonymous flavor to this one — for any given person, their identity is composed of multiple email addresses, 3rd party cookies, mobile advertising ids, and so on.
But when we bring in the concept of perspective, remember that those countless representations are scattered across a countless number of perspectives (think publisher sites, loyalty lists, mobile apps, etc.). Not only that, but everyone has their own custom business rules about how to treat identity.
At LiveRamp, we have the same reach as the walled gardens, of up to 250 million consumers in the US. That reach is built on a foundation of offline data that is stored in a non-discoverable, secure offline data graph, and that is exclusively used to tie online match data to a pseudonymous, persistent identifier that can operate across the internet:
- It is connected to Platform Customer IDs and premium publishers
- To the universe of cookies that have been received by LiveRamp’s paid match network over the past 90 days
- To approximately 300 million Android and Apple Devices that are currently active
- To Connected TVs that are shared by multiple users within households across the US
- And ultimately to 250 million people who we can confidently reach on multiple devices.
So what problem does RampID solve? It’s LiveRamp’s mission to make it easy for companies to use data effectively, but it’s impossible to do that when everyone speaks a different data language. We all need to make sure we talk in the same data language, and RampIDs are one of the tools we use to accomplish that.
So now let’s actually change the picture from a person to an actual RampID. Whereas you can only connect AbiliTec Links with “offline” identifiers such as Postal Addresses, Emails, and Phone Numbers, RampIDs have the advantage of being able to connect both offline identifiers as well as online, pseudonymous identifiers such as 3rd party cookies, mobile advertising IDs, CTVIDs, and other custom partner identifiers. This is super powerful as it allows RampID users to build a complete 360 view of their consumer, both in the known offline and pseudonymous online space.
While RampIDs are person-based, the true value and use case for RampIDs is for connectivity purposes. To be super explicit, RampIDs are LiveRamp’s pseudonymous identifier that enables our clients to use their data safely and effectively across the ecosystem.
Two things in this picture are of utmost importance: the dotted line, and the one-way arrows from the “offline” data connecting to RampID, which represent a one-way trip to pseudonymity. Since RampID is a key that is connected with “anonymous” online data such as 3p cookies, MAIDs, etc., consumers have an expectation that the data tied to those devices remains anonymous. LiveRamp has strict policies around preventing re-identification using any of our identifiers.
We were one of the first companies to have a Chief Privacy Officer, and we take data ethics really seriously here. LiveRamp has strict rules around who can store RampIDs, and where and how they can be used. We employ both technical and contractual controls to ensure that RampIDs remain pseudonymous.
The anonymizer is a piece of logic embedded within our Identity API that helps prevent re-identification. If any of your inputs may be associated with offline data (offline data itself, or identity envelopes), the Identity API will trigger this logic check.
And there are two aspects of this to share:
1. Batches of requests cannot contain fewer than 100 unique people. For requests to be properly returned, there must be at least 100 unique requests of these types or none at all.
a. If this requirement is not met, the API returns an HTTP 400 status code with the error: “Too few records in request. Must be at least 100.”
2. The IDAPI randomizes the output in order to prevent a mapping from PII input to RampID output.
a. This randomization behavior, in combination with the other anonymizer rules, ensures that there is enough data to randomize, and prevents IDAPI users from sending in a small set of DAD and rotating them in/out to derive the associations.
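The minimum-batch rule can be checked on the client side before a request is sent. The following is only a minimal sketch under stated assumptions: `check_batch_size` is a hypothetical helper, not part of any LiveRamp SDK, and it counts unique normalized input strings, which merely approximates the "unique people" that the API itself evaluates. The trimming and lowercasing step is also an assumption made for illustration.

```python
from typing import Iterable, List

# Threshold taken from the error message quoted above.
MIN_UNIQUE_RECORDS = 100


def check_batch_size(records: Iterable[str]) -> List[str]:
    """Return the de-duplicated records, or raise if the batch looks too small.

    Hypothetical client-side pre-check: a batch must hold either zero
    offline-derived records or at least MIN_UNIQUE_RECORDS of them.
    """
    # Assumed normalization: trim whitespace and lowercase before comparing.
    normalized = (r.strip().lower() for r in records)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = [r for r in dict.fromkeys(normalized) if r]
    if 0 < len(unique) < MIN_UNIQUE_RECORDS:
        raise ValueError(
            f"Batch has {len(unique)} unique records; "
            f"need at least {MIN_UNIQUE_RECORDS} or none at all."
        )
    return unique
```

A check like this only avoids an obviously doomed request. The API remains the authority, since two distinct inputs may resolve to the same person.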
RampIDs are a concept that’s loaded with context, and it’s hard to pick a place to cut off the information flow. As already mentioned, from an API perspective, RampIDs are the outputs, and that means a RampID is always associated with and generated based on some kind of input value.
From a technical ‘what is it’ definition, there are three main components that comprise a RampID: a prefix, a partner encoding, and a unique value.
The prefix describes a particular RampID’s association with the AbiliTec graph. RampIDs have a foundation that is rooted in AbiliTec technology, but that’s a topic for another time. What you need to know about prefixes is that there are two kinds: maintained and derived.
- Maintained links are returned if input PII matches to a node in the AbiliTec graph. For example, if you ask the API to turn [email protected] into a RampID and the API recognizes that entity in the AbiliTec graph, the RampID you’ll receive will begin with an XY.
- Derived IDs are returned if input PII is not recognized within the AbiliTec graph. Derived IDs are algorithmically derived based on input data — upon the same input, the API will consistently output the same derived link. The use case for a derived link is to maximize connectivity even if an identifier is not recognized by the LiveRamp graph. This allows clients to still join on input data sets for consistent linking across files.
It’s really important for LiveRamp to maximize the security of our clients’ data as well as our own identifier, so LiveRamp encodes its RampIDs to be specific to each client. The Partner Encoding is a 4-digit value that is specific to each client and ensures there’s no such thing as a “universalID” that represents a person, which can get pretty dicey. For example, the RampID that represents Kevin Wei will look completely different to each API client.
Lastly, the unique value following the Partner Encoding of a RampID is based on the original input data and its quality. RampID is person-based, so if you send in both kevinwei+225bushst and [email protected] to the API which would recognize both entities as the same person, you would receive an identical RampID for both values. This person-based capability of RampID is critical for consolidation and analytics purposes.
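To make the three components concrete, here is a small sketch that splits a RampID string into its parts. It assumes a fixed layout that the recap only implies: a two-character prefix (based on the "XY" example), followed by the four-character Partner Encoding, followed by the unique value. The helper `split_ramp_id` and the sample ID are made up for illustration and are not a LiveRamp API or a real identifier.

```python
from dataclasses import dataclass

PREFIX_LEN = 2    # assumption, inferred from the "XY" example
ENCODING_LEN = 4  # the Partner Encoding is described as a 4-digit value


@dataclass(frozen=True)
class RampIdParts:
    prefix: str
    partner_encoding: str
    value: str


def split_ramp_id(ramp_id: str) -> RampIdParts:
    """Split a RampID into prefix, partner encoding, and unique value.

    Hypothetical helper; the fixed-width layout is an assumption.
    """
    header_len = PREFIX_LEN + ENCODING_LEN
    if len(ramp_id) <= header_len:
        raise ValueError("RampID is too short to contain all three components")
    return RampIdParts(
        prefix=ramp_id[:PREFIX_LEN],
        partner_encoding=ramp_id[PREFIX_LEN:header_len],
        value=ramp_id[header_len:],
    )


# Made-up sample value, for illustration only.
parts = split_ramp_id("XY1234abcDEF")
print(parts.prefix, parts.partner_encoding, parts.value)
```

Because the Partner Encoding differs per client, RampIDs issued to two different clients for the same person would not be expected to match, so parsed values should only be compared within one client's encoding.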
Retrieval API Demo
Kevin went on to explain the different types of use cases for the different retrieval types, and touched a bit on the Authenticated Traffic Solution (ATS), and how it relates. Note that on August 17th, 2021, Ian Meyers will be presenting the ATS API, in the last session of this webinar series. Afterward, Adam provided a demo of some of the different use cases utilizing the API.
If you’re interested in learning more about using the Retrieval API, or want the answers to some of the questions discussed in this recap, you can check out the session on demand by registering. In addition, if you’d like to take a deeper dive, please reach out to us via the contact form to get in touch with experts who can lead you through discovering your specific challenges, as well as the solutions we have to offer.
The next session in the series will be given by the Activation product and engineering team, who will talk about the Activation API on Tuesday, August 3rd at 12 PM PDT.
I encourage you to explore the developer portal at developers.lrmultistaging.wpengine.com and take a look at the Activation API prior to next week, so you can come prepared with any questions you may have and to help familiarize yourself with what we’ll be talking about.