The Data Drop Panel: July 2021
Updated: Jan 8
Our host and self-confessed ‘data protection contrarian’ Carey Lening takes a deeper dive into some of the most important, concerning, and downright fascinating data privacy and data protection items covered by the Data Drop News podcast in recent weeks.
Pro tip: get The Data Drop on your phone by subscribing to our podcast.
Carey: Hi and welcome to this month's episode of the Data Collaboration Alliance's Data Drop Panel. We'll be digging into two hot topics this week. First is the rise and fall of Google's planned efforts to ditch third-party cookies. And the second is the proposed European Commission regs on AI. I'm really excited to introduce this week's guests, who will be able to contribute lots of knowledge and insight on these topics.
First is Dan DeMers, the President of the Data Collaboration Alliance and CEO and co-founder of Cinchy, the pioneering Dataware platform. Second, we have Jeff Jockisch, the CEO of PrivacyPlan, where he does independent data privacy research and creates really awesome privacy-centric data sets. And finally, we have Kelly Finnerty, who is the Director of Brand and Content at Startpage, which dubs itself the world's most private search engine. So with that, let's kick it off.
Google Gives Up on the FLoC
Carey: So in 2020, for those of us who may have been following, Google announced that it would phase out the use of third-party cookies in Chrome. They were estimating that this would be done by around 2022. And if you don't know because you've been living under a rock for a while, third-party cookies are used by advertisers and social media companies like Facebook and innumerable others to track our movements across the internet. The announcement wasn't particularly big news at first, and it certainly wasn't groundbreaking, because Firefox and Safari have already given users the ability to opt out of third-party cookies for some time, and search engines like Startpage don't use first- or third-party cookies at all.
Google drew controversy nonetheless because they announced that they would be replacing third-party cookies with this new system called FLoC. And then they were also going to be trialing it on millions of users and browsers over the next few months, largely without their permission. FLoC, or Federated Learning of Cohorts, replaces the individualized data collection of third-party cookies with a data set of groups or cohorts.
It still lets advertisers track us, though maybe with a little less granularity and a little less directness than before. Now privacy advocates and the ad industry and regulators all kind of collectively lost their minds, and they expressed a lot of concerns that this was essentially privacy washing, especially since Google was careful to say that they weren't going to get rid of first-party cookies, because that's how they make their money.
But many were also displeased about how Google was forcing people to opt out rather than opt in when it came to being a guinea pig in this test. And then last week Google took it all back.
They said that third-party cookies weren't going away on the original timeline after all, and that the phase-out would instead be rolled out very gradually starting in Q2 2023. And that the "cluster FLoC", as it were, would undergo much more review and regulatory scrutiny at a more responsible pace. So what does this mean for privacy in search, guys? Let's talk first about third-party cookies. I think everybody kind of collectively hates third-party cookies. No one gets excited about them except for maybe advertisers and Facebook. But many privacy advocates, including the EFF, have stated that FLoC doesn't really solve the problem. In fact, it's still an invasive form of targeting, even if it's just at the cohort level.
So I have to ask you first, Kelly, since you work with Startpage, and Startpage has managed to successfully do all of their services without first- or third-party cookies: is this FLoC thing a solution in search of a problem?
Kelly: Ah, thanks, Carey. First of all, I love the phrase "cluster FLoC". That makes me laugh. Thank you for that one. Yes, so internally when we were talking about FLoC at Startpage, you know, we were saying really it's a transition from cookies to categories. And you know, is that just changing the name versus the actual practice? Have either or any of you gone into your Google profile, where it shows you where you can kind of turn on and off different ad personalization?
Yeah. So, right. So that's a really interesting place to go to understand what information is being collected on you and how you're being profiled online. To me, that's the clearest way to understand how these categories are being drawn out. How you're being put into these different cohorts. I, before this call, went on just to see what does that look like for me right now?
And you know, I, while I search on many different browsers, I like to have kind of a mix just to spread out my work, and usually, I'm on different sites on different browsers, but anyways, I was really interested to see what was on this Google profile for me yesterday and some of them completely ring true.
And then some were just total outliers. Like, in terms of locations that I was interested in and searching about, it had Kuwait. Seems unlikely, maybe, right?
I really racked my brain and I could not think of what that could have been from. So anyway, what I'm getting to is that we're moving from cookies to categories. There are different ways to track you. And while, you know, it might not be all around Kelly Finnerty, my digital profile, I think an outlier like Kuwait, right?
As a part of whatever cohort I'm in, that does still start to make it possible that someone's going to be able to identify me as an individual. So there's just always that possibility. And then, you know, as you teed up in this kick-off: first-party, third-party. As a business, Startpage doesn't collect any data, neither first-party nor third-party.
And it's like, we're now at 2021, where the behemoths of tech, Google, Apple, Microsoft, are saying, "We're going to stop collecting third-party data," but they're the largest first-party data collectors. So they almost, at this point, don't need to. It's just something that we kind of want to remind our friends who aren't as well versed in privacy: yes, third-party is being talked about, but this first-party data is still highly invasive and knowledgeable of your everyday actions.
Carey: Yeah, exactly. Exactly. What do you think about that, Dan? Is this, in fact, a solution in search of a problem? Is there any benefit to FLoC and federated learning as a tool for privacy protections?
Dan: Yeah, it's interesting. I look at this as maybe an early spark of maybe even some good news: the fact that they're even doing anything about it. Personally, I'd rather be grouped as part of a cohort than be personally identifiable, and that's putting aside the worst-case scenarios of using data that can personally identify you. Whereas today that's not hard to do at all. The ability to be fingerprinted is trivial with today's technology. So I appreciate the fact that there's a step in that direction, but it's interesting, because there's kind of two extremes. There's the extreme where I want everyone that I'm interacting with online to know exactly everything about me, versus I want to be completely anonymous.
And the reality is most people don't want either of the two extremes. You want some level of personalization, but you don't want to be personally identified, and you want that personalization to actually be appropriate. Like, Kelly, you probably don't want to see ads for travel to Kuwait, as an example. It's probably not going to be all that interesting to you. So, I think the ultimate solution is clearly one where the individual person has that control and they are able to release access to that data to whoever they're dealing with, based on the level at which they trust that organization.
And that's where FLoC just doesn't do it, right? It doesn't go nearly far enough. The fact that you're grouped into cohorts introduces some risk, because you're going to be placed into the wrong cohort, and so on and so forth. But I still think it's better than being personally identifiable. But I think we can skip all these intermediate steps and go right to the end game, which is that the individual person needs to be able to have control and basically throttle the level of personalization and the experience that they're going to get. I think that's the only way forward.
Carey: Good, good point. What about you, Jeff? What do you think?
Jeff: Yeah, I think Dan probably has it right there. Right. You know, FLoC sort of in a vacuum is a pretty great idea, but in practice, it starts to sort of break down. You know, I think Google sort of took their shot and missed, which is sort of sad because Facebook, or not Facebook, sorry, Apple, you know, took their shot and sort of, I guess, had a big win. It's almost sort of a comparable situation, but maybe they marketed it better.
Carey: That's usually the case. Like, it's kind of funny. I have a lot of friends that are at Google, and the number of times that Google manages to step on itself in the marketing or sharing of some new technology is phenomenal. Anyway, sorry. I didn't mean to cut you off there.
Jeff: No. Right. I mean, it's not apples to apples there, right? It's just a different situation. But yeah, I had to make that comparison. So, one of the real problems here is with the browser fingerprinting, right? Google really was not able to figure out, I think, a workaround for this browser fingerprinting problem. So they've got this cohort issue which Kelly brought up, right? It's putting you into cohorts, which is better than individually identifying you, but they don't have a way to keep these other organizations from still digitally fingerprinting you. And the combination of putting you into a cohort and letting these other organizations still digitally fingerprint you is the worst possible situation. Right now they've got both. And the combination is sort of worse than where we are now in some ways.
Okay. So you've got that problem. And then you've also got the problem that, even though they're only giving out the cohort IDs, over time you can sort of build up those cohort IDs and be able to figure out who someone is anyway. Right. It doesn't really solve the problem that way.
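Jeff's point about cohort IDs building up over time can be illustrated with a small sketch. This is not how FLoC or any real tracker is implemented; the user names, cohort numbers, and function names below are all made up for illustration. It only shows the general idea that each extra week of observed cohort IDs shrinks the set of people who could match.

```python
from typing import Dict, List, Set

# Hypothetical data: the cohort ID each made-up user was assigned in weeks 1-3.
WEEKLY_COHORTS: Dict[str, List[int]] = {
    "alice": [101, 207, 330],
    "bob":   [101, 207, 315],
    "carol": [101, 212, 330],
    "dave":  [145, 207, 330],
}


def candidates(observed: List[int], history: Dict[str, List[int]]) -> Set[str]:
    """Return the users whose cohort history starts with the observed sequence."""
    n = len(observed)
    return {user for user, cohorts in history.items() if cohorts[:n] == observed}


def narrowing(observed: List[int], history: Dict[str, List[int]]) -> List[int]:
    """Return the candidate-set size after each additional week of observations."""
    return [
        len(candidates(observed[:week], history))
        for week in range(1, len(observed) + 1)
    ]


if __name__ == "__main__":
    seen = [101, 207, 330]  # cohort IDs one site observed for the same browser
    print(narrowing(seen, WEEKLY_COHORTS))   # how fast the crowd shrinks
    print(candidates(seen, WEEKLY_COHORTS))  # who is left at the end
```

In this toy data set, a single cohort ID still leaves several possible users, but the sequence of three IDs matches only one. Real cohorts would be far larger, so how quickly the crowd shrinks in practice is an open question that this sketch does not answer.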
Carey: I think that's actually a good point, and I want to transition to the next question real quick. You did touch on something that I'm going to bring up, which is that FLoC was part of a larger Privacy Sandbox initiative that Google had, and fingerprinting and fraud issues and spam and all this other stuff were part of that. And that's actually, I think, laudable. If you actually dig into the Privacy Sandbox stuff, there is some really cool stuff there.
And of course, it's not just Google working on this. You have Apple, Mozilla. I don't know if Startpage is necessarily involved with that. That would also be awesome, because I think these are good standards to be thinking about. They all oddly have bird names, but, you know, don't hold that against them. As a cat person, I'm a little offended, but I won't hold that against people too much.
But it does point out that we are going, as I think Dan said and as you said, in the right direction when it comes to privacy and giving control and power back to the users. So, in a similar vein, I want to ask all three of you: what is some innovative privacy-enhancing tech that you guys are seeing that you would like to highlight or share with everybody?
And I'll start with Dan cause I think I know where this is going to go.
Dan: Yeah, well, that's one of the reasons why, under the Alliance, we're working on the Zero Copy Integration standard: to solve this not only for individual citizens, but even for companies as well, because they still do own the data that they produce.
But there are other technologies that are out there that are of the same mindset. So first of all, I would choose to be anonymous versus to be personally identifiable if I had to choose between those extremes. So anything that allows that anonymity is better in a world where there's very few such alternatives.
But if you look at the Solid project as an example, and the idea of having personal data pods and rethinking how applications interact with my personal data, where they're requesting permission to that: any technology that is shifting the decisioning away from the browser, or the website that I'm visiting, or the Googles of the world, the search engine, to me as the end user is a technology that I'm going to be a fan of.
Carey: And what about you, Jeff? I know we talk periodically on LinkedIn about different privacy things, and you've spoken at length about different privacy technologies. What's rocking your world right now?
Jeff: Well, there are a lot, you know. There are data unions, which I think are a pretty interesting idea that's growing now.
Carey: But what's a data union? I'm curious.
Jeff: Well, so a data union is a way for individuals to sort of aggregate their data up into an organization so that they have more sort of buying power, right? So, you can own your data individually, and there are organizations now that let you sort of encapsulate your data and sell it so that you can make a return on the data, the way Google sort of makes money on your data. It's hard to make money individually, right, when you don't have any buying power. A data union is a vehicle where you can aggregate your data with a bunch of other people so that you've got more power in the market to be able to sell, for instance, your DNA data to a pharmaceutical company as part of a larger group.
That might make you more money than if you were to individually try to sell your DNA data to a pharmaceutical company. You're probably not going to get much money for it as an individual, as opposed to maybe a group of a million people or a hundred thousand people.
Carey: I always get a little skeptical when I hear selling of data, but that's me and my curmudgeonly privacy-ness. So what about you, Kelly? What do you think is a good piece of privacy technology that's coming out there?
Kelly: To me, what I pay a lot of attention to and get excited about really is, like, what hasn't been invented yet. Like, in terms of technology, it's all out there, but what I'm interested in is these collaborations to make non-private tech private. And so, our product teams are fantastic at working with the different partners and understanding how their product works, how their technology works, and then how we can apply that Startpage privacy to it.
So this year we've been able to do that with launching a new currency converter and a stock feature. Before, on Startpage, your search was sort of limited, and sometimes you had to go elsewhere to search for, you know, the latest information on, like, cryptocurrency exchange rates. But now we've worked with a partner to be able to use their full API but still make it so you can search for and do exchange rates in the moment online without ever having that data passed over. So to me, it's not what's new, it's what exists but is now private and available.
And so like our goal at Startpage is to make sure that we build ourselves out. So we're as robust as a traditional search engine. So that really, there's never that need where you're like, okay, I'm going to have to sacrifice some data, some personal information here to get this answer for me.
So yes, I think that's what I find really interesting: turning over those stones of traditional tracking online, and then trying to turn that into something more private. And I mean, like Signal, they just did such a great job of doing that from a messaging perspective. I think it was uncovered that WhatsApp just was not private at all. And then they were like, voila, we have this very similar solution that can help you out.
Carey: That's actually really easy to use. I think that ease of use is the biggest thing moving privacy technologies forward. Like, I'm old, I remember way back in the day trying to get PGP set up, or GPG if you were doing it on the Linux side. Dan's like, yeah, man, I remember that. But it was a pain in the ass, to be honest. It wasn't easy. It was a challenge to manage, you know, to keep your keys in order and everything. And it was just like "ugh". You really had to be kind of committed to privacy. But then they have ProtonMail now. And with ProtonMail, it's like I could teach my mom to use ProtonMail. Anyone who's used to using email could use ProtonMail, and that's a huge advance. And I think that's kind of touching on what you're saying. Instead of inventing a new wheel, they're building on and improving what's out there and making it easier to use, making it more discernible to individuals.
You Spy On My Guys with your Little AIs
Carey: So now my second favorite topic here, which I came up with the title for because I like clever titles and puns. So the title for this is You Spy On My Guys with your Little AIs. And what this is: in mid-April, the European Commission came down with a sweeping new set of regulatory proposals that would introduce a very comprehensive set of frameworks and actual regulatory laws around AI in the EU.
This included a lot of different things, but specifically things like setting up a risk-based, market-led approach that would require providers, users, importers, and distributors of AI systems and technology in the EU, regardless of where they're based, to actually, like, check their work. It included testing for harm and risk, and looking at these risks both at the initial design stage, so at the creation of, say, Clearview AI, and also by data controllers looking at it for their specific use cases. So if Clearview AI is selling their AI to cops, the cops need to start looking at it in terms of the biometric uses for their specific purposes. And I think that marks a huge step forward.
Also, in the European Commission regs, there were certain categories of AI that were classified as high risk. In other words, if the use of the AI threatened the health, safety, or wellbeing of individuals or the fundamental rights under the European code of human rights, they were considered high risk or, in some cases, even banned.
Other initiatives under the regulatory framework include prohibiting certain types of AI, including social scoring systems (think like what China is using for their questionable practices), and also limiting the use of biometric identification in public spaces. And then they also set clear transparency and accountability requirements on manufacturers and users of AI, to include things like privacy by design and data minimization.
And finally, for the first time, they're imposing fines. So it's a little GDPR-like. A little bit higher, though: 30 million euros or up to 6% of annual worldwide turnover. Nothing says scary quite like fines. So the EU data protection regulatory bodies came out with their own proposal because they thought that the European Commission proposal didn't go far enough.
I kind of agree. They actually would rather ban a whole swath of different AI tools, particularly the social scoring. They would ban it across the board, so even in private systems. Like, say, COVID tracking in airports and on social media networks or social networks and things like that.
And it would have a huge potential impact on society and on Europe. So, my question for all three of you lovely folks: many have argued that the AI regulations mark a good first step, but don't go far enough, particularly as most of the European Commission's banned list is kind of minimal, honestly. It's like the killer robots basically are the things that are banned, and not much else. More importantly, they glaringly exclude many private sector uses. So, what do you think? And I want to start with Jeff here, because Jeff gave a really good talk at PrivSec Global, if I remember correctly, and he was talking all about this. I'll start with you, Jeff.
Jeff: I like the regulation. I think it's a good first swing. It's seminal. I think it's very similar to GDPR in that it's going to be sort of a foundational piece of legislation that a lot of companies, and really the world is going to have to wake up to and wrap their minds around.
But it's also something that we've got to do. We have to do something, right? And I think it's a good first step. It's broad in its definition, but it solves, I think, some of the definitional problems by putting in those different layers, which you sort of talked about: having low risk and high risk and outright banning some things.
And then really most of the regulation centers around conformity assessments, which people are going to be talking a lot about. That's going to be a whole new challenge, figuring out how you staff up for those kinds of things. That's going to be a bit controversial because all of this regulation really is self-assessment at this point. And how is that really gonna work out? Because even if you're doing a self-assessment and then you're certifying that, how are these European boards going to be able to certify those results? Do they have the competence to be able to do that? Are they going to have the manpower to be able to even understand what the hell it is they're certifying?
So there are a lot of questions to be answered there. But I think it is a pretty great first step. So I'll have to see where it goes from there.
Carey: I looked at it and went "yeah, more opportunity for consultants."
Jeff: Oh, definitely. That's true.
Carey: All right, Dan, I saw a degree of skepticism or maybe it's just, you're sick of listening to my voice. I don't know. Or maybe you're really excited about the regulation. So what do you think about all this?
Dan: Well, I largely agree with Jeff. First of all, this isn't my area of deep expertise. But I do think that it feels like a step in the right direction, for sure. The idea of a risk-based model to determine the level of transparency, which is mandated with financial fines as a great stick (not so much a carrot, but a stick), to me makes a ton of sense.
And I personally would kind of evaluate this as if you remove the AI from the equation for a second and just think of any entity, which could even be an actual intelligent human, making decisions. There's going to be certain decisions where I would expect there to be transparency into the decision process. I don't want to be refused a life-saving operation because of my gender or because of my race or because of these other considerations, whether that's a human making that decision or a traditional algorithm making that decision or a trained model making that decision.
I want that to have the relevant transparency behind it. But, you know, the ability for a clerk in a variety store to refuse to sell me gum maybe doesn't need the same level of transparency. The idea of the transparency requirements being based on the risk of what is being decided and what the impact is, to me, just makes sense. So it feels like it's moving in the right direction.
Carey: So it was really funny today, just randomly, I stumbled upon the newest version of utter horror, which is that Amazon has a system in place that is automated. So it is a little bit of AI, where it tracks drivers and it decides somehow through magic, that the driver is or is not complying with the requirements that Amazon has set in terms of delivery times or in terms of picking up and dropping things off at the right spots or whatever it is. And people are actually getting fired through text messages and email with no human interaction whatsoever, which is just bananas to me.
It's like, what is going on? So I don't know. What do you think about that, Kelly? What do you think about the Eldritch horror that is Amazon AI and getting people fired without any humans whatsoever?
Kelly: So with both the EU's regulation and Amazon spying on their workers, there's always this trade-off of privacy and security. It's like, "okay well, if we're tracking drivers, we can also know if one of them goes off and does something horrible." Or how about whether your package gets delivered? That's probably a better route to go with that.
And that's the same with government surveillance, right? You saw it with the Patriot Act. After 9/11, the Patriot Act gets passed and people let that happen because they say it's for our security. But then, later on, you realize that they're monitoring and collecting data on just everyday citizens. So you're like, "is that for my security or is that a total invasion of my privacy?" So I feel like there's always that trade-off, and so, to Dan's point, this legislation is about making it more transparent to people outside of the ones watching us what they're watching us do and how they're interpreting it.
And so I'm always impressed by the EU being willing to stick their necks out there and put out some legislation. It's always a bumpy road on the first pass. With GDPR, I think they're still working out some of the issues. But hey, now we have things like CCPA that are being passed, and the closer we get to some sort of global agreement on privacy standards, the better.
So I say kudos, keep doing it. It's not going to be pretty always. But it's a step in the right direction.
Carey: Totally agree. Yeah. I think the idea of having standards like Jeff mentioned earlier is going to be huge, especially because this is something that implicates - or this is something that affects not just EU companies. It's even broader than the GDPR because if you're bringing in any kind of AI technology, you need to do your homework, you need to do your due diligence and show that the risk is understood and factored in.
And that's huge. That really touches on the transparency and the accountability aspects that are outlined in the GDPR, and that are vitally important. And I think that will directly or indirectly set the stage maybe for other practices in other countries, with the EU sort of leading the vanguard.
I feel like it's California in the US and the EU in the rest of the world.
Sorry, Canada. You guys were starting out. You had PIPEDA really early on in Canada and it was awesome, and then it's kind of just been sitting there!
I don't know, I'm just giving Dan a hard time.
So given that this has the impact - this has the potential to impact organizations around the world - what do you think will actually happen? I mean, do you think people are going to care, or do you think, like what Kelly pointed out, it's going to be a little bumpy starting out and maybe a lot of organizations are going to blow this off? I mean, I think GDPR had the effect of scaring the crap out of everyone, but I noticed that the lack of regulatory initiative by the data protection commissions in all these different countries (Ireland) has kind of stymied some of the interest, or stymied some of the effectiveness, in terms of pushing forward proper privacy controls.
And I'm always conscious of this now. It wasn't super clear in the commission report or in the commission proposal, but it was more clear later, in the follow-up from the data protection regulators, that this would be something that's handled by the European Data Protection Supervisor.
So that's the individual regulators in each of the countries. And that's not really a big group of people. So, I mean, I'm dubious, I dunno. Someone be more positive, Jeff or Kelly, someone who's got a more positive outlook. Like, do you think it's actually gonna make a difference or not?
Kelly: I was just going to say, I mean, I think it comes down to the fines and, you know, whether anyone is really forced to pay some sizable fines that they really feel.
Dan: Personally, I think it is going to be a bumpy road. But the fact that it's ultimately machines making the decisions, machines that are instructed by humans who create the software, actually gives us a unique opportunity to extract that transparency and to mandate that transparency. That would have been harder to do without machines playing a role. So, like, Amazon would have been able to fire people with or without that technology. But with the use of the technology, if the right carrot slash stick is in place, you can now better understand the rules for determining how someone gets promoted or terminated, and so on and so forth. Whereas in the old world that would have been harder, because you're extracting it out of individual people's decision-making process, which is their own algorithm. So this is gonna be a very iterative process, but if we play our cards right and we just continue to move forward, then it will be much better than a world without machines augmenting and making decisions.
But IF we play our cards right. That's the key.
Jeff: Yeah. I think what this is doing is it's going to shine a brighter spotlight on these companies that are, that are doing things wrong. And sometimes they didn't even know that they're doing the wrong things. I mean, I would bet that the Amazon people that developed that algorithm probably didn't even think about that outcome.
They probably didn't even imagine that was going to happen. They certainly didn't imagine they were going to get a newspaper reporter writing those kinds of articles about them. I don't think that was probably in their thought process. So, when somebody writes that article and says bad things about you, that's one thing. When they say bad things about you and they say that you violated a regulation, that's worse. And when they say that you may have to pay a multi-million dollar fine, that's even worse again. So that's why this proposed regulation matters, right?
Because it makes the spotlight brighter. And whether those fines materialize, or how well it's enforced, is not quite as important, I think, as the fact that the spotlight is brighter.
Carey: I love it. Okay. I was going to ask a question about ethical AI, but I think honestly, we've touched on all of these things and we touched on the aspects of AI.
We've touched on the fact that this is going to open up more transparency. It's going to force people to think. And as Dan said, maybe disclose what's behind the algorithm in some cases. I know that at least the Data Protection Board proposals actually talked about the idea of sharing and exposing how the AI works, which I think is huge.
So with that, guys, I think I'm going to call it. I'd like to just thank you all for attending and speaking with me today. And I'll close by reminding everyone listening to follow us here at the Data Collaboration Alliance. Go ahead and visit datacollaboration.org/iown, that's datacollaboration.org/iown. Thanks and have a great day.
The Data Drop is a production of the Data Collaboration Alliance, a nonprofit advancing meaningful data ownership and inclusive innovation through open research and free skills training. To learn more about our partnerships, the Information Ownership Network, or the Data Collaboration University, please visit datacollaboration.org.