Balancing Member Privacy and Data Impact: Piloting Clean Rooms with Social Impact Partners at LinkedIn
At LinkedIn, we’re fortunate to work with one of the world’s most complete, detailed, and real-time datasets on people’s experiences in the labor market. This data offers a powerful lens into how individuals navigate work, develop skills, and progress in their careers.
LinkedIn’s Economic Graph Research Institute – the team I sit on – and our Social Impact team are also fortunate to work with a wide range of mission-driven partners (government agencies, multilaterals, nonprofits, academic institutions, and other researchers) who want to use this data to improve outcomes for workers. Whether it’s understanding the impact of workforce programs, designing more inclusive policies, or measuring labor market mobility, these efforts align closely with LinkedIn’s vision: to create economic opportunity for every member of the global workforce.
But as much as we want to share this data to drive impact, we operate under a core principle at LinkedIn: members come first, and that means we prioritize their privacy above all else.
Bridging the Gap: Data Sharing and Member Privacy
Over the past few years, we’ve explored multiple approaches to align these two imperatives – sharing high-value labor market data with trusted partners while fully protecting member privacy:
1. In-house analysis:
In this model, LinkedIn handles all analysis internally. We either conduct full research projects (which are highly resource-intensive and not easily scalable), or we provide partners with aggregate statistics through a standardized pipeline. The Data for Impact program follows this model, but it would need significantly more bandwidth to support anything beyond top-tier government inquiries.
2. External researcher secondments:
In some cases, we bring trusted external researchers under contract and “inside the firewall.” They undergo the same privacy training and use the same privacy-preserving infrastructure as LinkedIn employees. While effective, this approach is also resource-intensive and difficult to scale.
3. Clean Rooms (our newest and most promising model):
We recently piloted clean rooms, a model that enables secure, privacy-preserving analysis across datasets held by LinkedIn and our partners.
Clean Rooms: How It Works
Platforms like LiveRamp and Azure Confidential Computing let us build a data pipeline that joins our data with our partners’ data while upholding each party’s privacy requirements, so that together we can securely analyze the merged data without exposing sensitive information from either side.
Here’s how the process works:
1. Daily Base Table Creation: On our side, we construct a member-level base table every day that includes a couple dozen key labor market outcomes. The daily rebuild keeps the data current and drops any members who have closed their LinkedIn accounts.
2. Encryption and Storage: The data is encrypted and securely stored on a LinkedIn-owned Azure server.
3. Clean Room Integration: The encrypted data is automatically mapped into the LiveRamp Clean Room each day.
4. Partner Data Upload: Our partners upload their own datasets – typically lists of program participants with email addresses and/or LinkedIn handles.
5. Privacy-Preserving Join and Analysis: From within the clean room, we write SQL queries to analyze the joined data. We currently support exact joins on email and/or LinkedIn handle (fuzzy matching is not yet implemented).
6. Query Approval Process: Before any query runs:
   - It must pass automatic syntax validation.
   - It is reviewed for privacy compliance and quality assurance by a separate LinkedIn teammate.
   - Our legal team verifies that the output meets privacy thresholds (e.g., k-anonymity minimums or differential privacy noise).
   - Both LinkedIn and the partner manually approve the use of their uploaded data in the query.
Only after all these checks is the query executed within the clean room. In this way, not only is member-level PII protected, but even aggregate statistics that might be sensitive are safeguarded from release.
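The join and threshold steps above can be sketched in a few lines. This is a minimal illustration, not LinkedIn’s or LiveRamp’s actual pipeline: it uses Python’s built-in sqlite3 module as a stand-in for the clean room, and the table names, column names, helper functions, and the minimum group size of 10 are all hypothetical.

```python
import sqlite3

K_MIN = 10  # hypothetical minimum group size; the real threshold is not stated in this post


def normalize(value):
    """Trim and lowercase an identifier so an exact join is not defeated by case or whitespace."""
    return value.strip().lower() if value else None


def industry_counts(members, participants, k_min=K_MIN):
    """Exact-join participants to members on email or handle, then return
    per-industry member counts, dropping any group smaller than k_min."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE members (member_id INTEGER, email TEXT, handle TEXT, industry TEXT)"
    )
    con.execute("CREATE TABLE participants (email TEXT, handle TEXT)")
    con.executemany(
        "INSERT INTO members VALUES (?, ?, ?, ?)",
        [(mid, normalize(e), normalize(h), ind) for mid, e, h, ind in members],
    )
    con.executemany(
        "INSERT INTO participants VALUES (?, ?)",
        [(normalize(e), normalize(h)) for e, h in participants],
    )
    rows = con.execute(
        """
        SELECT m.industry, COUNT(DISTINCT m.member_id) AS n
        FROM participants AS p
        JOIN members AS m
          ON m.email = p.email OR m.handle = p.handle  -- exact match only; NULL keys never match
        GROUP BY m.industry
        HAVING COUNT(DISTINCT m.member_id) >= ?        -- suppress small groups
        ORDER BY m.industry
        """,
        (k_min,),
    ).fetchall()
    con.close()
    return dict(rows)


# Made-up sample data: 12 members in Healthcare, 3 in Education.
members = [
    (i, f"user{i}@example.com", f"user-{i}", "Healthcare" if i < 12 else "Education")
    for i in range(15)
]
# The first 12 participants supply only an email, the last 3 only a handle.
participants = [(f"USER{i}@example.com", None) for i in range(12)]
participants += [(None, f"user-{i}") for i in range(12, 15)]

print(industry_counts(members, participants))  # prints {'Healthcare': 12}
```

With this sample data, only the Healthcare group (12 matched members) clears the threshold; the Education group (3 matched members) is suppressed rather than reported, which is the behavior a k-anonymity-style minimum is meant to enforce.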
Why This Approach Is Promising
This model offers several advantages:
- Scalability: While onboarding each partner still requires training and setup, once our base data flow is active, it can support multiple clean rooms and queries without increasing compute costs on our side.
- Privacy Protection: Partners never see LinkedIn’s raw member data, and we never access their identifiable data directly – only aggregate outputs are reviewed and shared.
- Enabling New Insights: This approach unlocks the ability to analyze subgroups, track outcomes over time, and answer nuanced questions about workforce programs that would be nearly impossible with traditional methods like surveys.
FY2025 Pilot: A Case Study
In FY2025, our Economic Graph Research Institute team partnered with LinkedIn’s Social Impact team and our LinkedIn Marketing Solutions team (who manage the LiveRamp accounts and pipelines) to pilot this clean room model with three organizations:
- Marshall University: A public research university in West Virginia that serves a large population of first-generation and rural students, with strong programs in workforce development and applied research.
- Upwardly Global: A nonprofit dedicated to helping immigrant and refugee professionals rebuild their careers in the U.S. through job training and employer partnerships.
- CodePath: A nonprofit that partners with universities to deliver industry-informed computer science education and career support to historically underrepresented students in tech.
The pilot was a resounding success. We found match rates of:
- CodePath (matching on LinkedIn handle only): 63%
- Marshall (matching on email only): 23%
- Upwardly Global (matching on email and handle): 46.7%
Together, these matches covered over 50,000 members – a scale that would have been extremely costly and time-consuming through traditional methods like phone or mail surveys.
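For reference, a match rate of this kind is the share of uploaded partner records that joined to a member record. The sketch below shows the arithmetic only; the function name and the sample counts are made up for illustration (this post reports the rates, not the underlying counts), and deduplicating the upload first is an assumption rather than a documented step.

```python
def match_rate(uploaded_keys, matched_keys):
    """Return the fraction of distinct uploaded keys that found a match."""
    uploaded = set(uploaded_keys)  # assumption: duplicate uploads count once
    if not uploaded:
        raise ValueError("partner upload is empty")
    matched = set(matched_keys) & uploaded  # ignore matches not in the upload
    return len(matched) / len(uploaded)


# Hypothetical example: 1,000 uploaded participants, 630 of them matched.
uploaded = [f"participant-{i}" for i in range(1000)]
matched = uploaded[:630]
print(f"{match_rate(uploaded, matched):.1%}")  # prints 63.0%
```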
Partner Insights: Our partners were able to generate valuable insights into program alumni’s job mobility, industry retention, and employment outcomes – metrics that can be used to refine programming, attract funders, and better support participants. For example:
- 25% of Upwardly Global’s participants secure employment in the tech industry following program completion
- More than 1 in 3 Marshall alumni contribute to smarter, healthier communities by working in the Healthcare or Education industries
- CodePath’s participants are 10% more likely to work in tech than non-participants with similar profiles
What’s Next?
Following the success of the pilot, we’re expanding the program to include six new partners in the coming year. Our focus is on scaling efficiently by applying the lessons learned from the first cohort and leveraging the existing infrastructure we’ve built.
We’re also continuing to explore ways to streamline onboarding and data-merging processes, with the long-term goal of making this a repeatable, sustainable program that can support a wide range of research and impact work. This work is an exciting step forward in responsible data sharing for social good – unlocking real-time, real-world insights while putting privacy first.
Acknowledgments
This work wouldn’t have been possible without collaboration across many teams. Special thanks to:
- LinkedIn: Cory Boatwright, Casey Weston, Cammie Erickson, Piper Sutherland, Erran Berger, Meg Garlinghouse, Benjamin Schechner, Alex Macht, Subrat Mohapatra, Amit Nathani, Hannah Amundson, Michael Nguyen, Po-Chia Lai, Ruchi Patel
- LiveRamp: Jordan Lang, Jonah Sohn
- Partners: Teletha McJunkin, Mike Woodridge, Brandon Dennison, Dave Traub