Plausible Analytics: GDPR Compliance w/o Cookie Consent Banner

145 points by bugfactory a year ago

jwr a year ago

Serious question: what is the value of web analytics for people?

I run a SaaS business and I dropped Google Analytics a long, long time ago. Primarily because of the tracking, but also because I really couldn't see the value of the data.

In the old days, you could at least use the "Referer" (sic) header to know where people came from and what they searched for. But that is long gone, and the only source of that data is Google/Bing search console.

Page visits are a vanity metric: they tell me nothing about my business. The only things that actually matter for a SaaS are signups and MRR. Measuring your business by page views is like measuring the business performance of a Walmart by counting cars on the freeway nearby. Yes, the numbers are somewhat related, but you can't draw any conclusions.

I made it a point not to include any third-party JavaScript on my site, but even if I were to make an exception for these analytics, I can't really see the point, unless you are running an ad-driven site where pageviews are king.

vasco a year ago

This seems contrarian just for the sake of being contrarian, given how much literature there is about this, and the fact that it's almost self-evident. Tracking impact of your changes, seeing if your users are getting lost after changing something, understanding where they spend the most time, etc.
Say for example, if all your users start spending 30% more time in your reset password page after you pushed out some changes. How would you know? What could be causes of that? Could something be broken with the login? Apply this to everything.
Not having analytics is literally not caring about what they do in your product, so you're either never changing the product and 100% confident it'll always work, or you're probably giving them a worse experience than you could.
How you do this tracking is another story, but there are ethical ways to do it.
- zelphirkalt a year ago
  
  > Tracking impact of your changes, seeing if your users are getting lost after changing something [...]
  The change of adding obnoxious tracking of course accounts for some user loss itself, which it cannot measure. On some of those "modern" websites, that show me a whitescreen without JS, I check my uBlockOrigin and see the domain of that website and some Google shit? Tab closed. No thank you, I will go elsewhere.
  - chii a year ago
    
    > adding obnoxious tracking
    normal people will not see the tracking. It's when laws force the cookie banners that it starts to become an item in people's minds, because that cookie banner is annoying.
    
    dgroshev a year ago
    
    Laws don't force the cookie banners, laws force requiring consent for personalised tracking. Banners as we know them are malicious compliance. There's a difference.
    
    nickpp a year ago
    
    That is simply false. Talk to a lawyer. They advise cookie banners as a good precaution against disproportionate punishment mandated by law.
    
    dgroshev a year ago
    
    I'm a bit confused. You're claiming what I'm saying is false, but you're just referring to someone advising something as a precaution? Do you have a primary source for a legislation mandating cookie banners? (Also, is there a cookie banner on apple.com?)
    There is no "disproportionate punishment" under GDPR in practice, unless you're doing something egregious, and even then (see Facebook). I'm very familiar with the UK regulator, they publish their enforcement actions [1]. I'm not aware of a single case of a cautionary letter, much less "disproportionate punishment", that they sent over a cookie banner on its own. Are you?
    Besides, you correctly hinted at the incentive structure. Your lawyer might advise you to slap a cookie banner just because they have zero incentive not to, they don't care about your users' experience. You might care though. Personally I consulted multiple external DPOs and lawyers, as well as primary sources, before forming my opinion.
    [1]: https://ico.org.uk/action-weve-taken/enforcement/
    
    nickpp a year ago
    
    I take my legal advice from lawyers, not the internet. They are the ones defending us in court if need be.
    Their position was simple: my team uses 3rd party analytics tools (no ads or anything) so IPs will be passed and cookies will be stored. We don’t control them, we don’t know what kind, if they can be considered personal info or not (GDPR is intentionally vague - classic bad law). So we need to be extra careful since our regulator is not a sane one like the UK’s. Thus: follow the common practice - cookie banner. End of story.
    
    dgroshev a year ago
    
    > We don’t control them, we don’t know what kind, if they can be considered personal info or not
    If I were you, I'd consider changing my lawyers. This is explicitly forbidden by GDPR (art 28), you have to know what your contracted data processors are doing, and you have to have processes in place to assure data subjects rights (eg remove their data from your contracted third parties on request). Cookie banners have nothing to do with this, and you're in breach of GDPR cookie banner or not. If your lawyers didn't stop you from breaching art 28 but recommended slapping a cookie banner "to be extra careful", that's a major red flag.
    
    nickpp a year ago
    
    That “we” was the lawyer’s “we”. But their point stands: tools change and even if we understand and trust their specs and descriptions now, those change too inevitably in the future.
    A bad law, an ambiguous law compels you to be defensive and take precautions. Cookie banners are one of many such defenses and everybody seems to be doing it, validating our strategy.
    Thanks for your advice, but unless you are willing to defend me in court and put your money where your mouth is, with all due respect, I will consider its value to be exactly how much I paid for it.
    
    dgroshev a year ago
    
    GDPR is not in any way ambiguous there, take a look for yourself [1]. Keeping an eye on those changes is a part of your responsibilities as a data controller, it's your vendors' responsibility to inform you of any changes, and it's your responsibility to vet vendors for GDPR compliance. Again, if your lawyers didn't explain this to you (and you haven't read the law yourself), I'd be very cautious of those lawyers.
    On the other hand they probably realise there's zero chance for substantial review of your GDPR practices by the regulator (much less seeing them in court), so they can recommend sticking a useless plaster (opt-in has to be specific, and how can it be specific if you collect it for unknown future changes) and keep you in the dark about more substantial requirements.
    GDPR is a very good and clearly stated law, you can read through it yourself in about half an hour to an hour, a negligible time investment for such an important piece of legislation. The purported ambiguity is a psyop by people who don't want to comply.
    [1]: https://gdpr-info.eu/art-28-gdpr/
    
    nickpp a year ago
    
    The only way GDPR is unambiguous is if you interpret it in the strictest sense. Which we actually did - you truly have to, in a business-hostile place like the EU.
    For example, consider IP addresses as PII. (This is of course not clearly specified by the GDPR). Then analytics processing them needs consent. Thus cookie popup.
    Anything else is interpretation unproven in court.
- raverbashing a year ago
  
  I think you're talking about different things. Yes, user tracking inside your site/app is definitely useful; still, it can be anonymous.
dustedcodes a year ago

Sales is driven through traffic. No traffic == no sales.
Understanding what drives traffic to your SaaS website is such an important piece of information. For instance, if you write two articles, one describing how to use your product to achieve a certain thing which customers want to do, and another article which compares your product to a competitor product and one of the two articles creates 50x more traffic than the other then you'd certainly want to know this, because then you know what articles give you the biggest return on your time writing them.
Just one of so many examples how web analytics is such an important tool to being a good sales person.
- diffeomorphism a year ago
  
  That sounds like a non-example. Why do you need invasive, personalized surveillance for that? Traffic and aggregate data are an entirely different question.
jonplackett a year ago

Do you not at least want to know the ratio of page views to signups so you can see the conversion rate? Or do you have a different / better way to do things like testing a new design / price?
cryptonym a year ago

Only page view? That's not really useful and you already got that with backend logs.
With true analytics, understanding typical session helps you optimising users workflow, making sure relevant features are easily discovered at the right place.
It really helps when you want to work on user experience. You may need metrics such as LCP, INP and CLS with details per type of page, ability to drill down data and get that in real time.
ROI of such script depends on what you do with the data. If that's vanity or not even looked at, you are emitting CO2 for nothing.
- pickledoyster a year ago
  
  >optimising users workflow, making sure relevant features are easily discovered
  >work on user experience
  These are qualitative improvements which are extremely unlikely to stem from quantitative metrics, especially when the sample size is not significant (which is the case for the vast majority of pages in existence).
  - cryptonym a year ago
    
    You can group similar pages for this. When you work on ecommerce: plp, pdp, hp, search, cart, checkout
    That can apply to most businesses.
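
As a rough illustration of the grouping idea above, here is a minimal Python sketch that buckets URL paths into page types and takes a per-type median of some metric. The URL prefixes, the `page_type` and `median_by_type` helpers, and the sample values are all hypothetical; a real shop would map its own routes.

```python
from collections import defaultdict
from statistics import median

# Hypothetical URL prefixes; a real shop would use its own routing rules.
PAGE_TYPES = [
    ("/category/", "plp"),
    ("/product/", "pdp"),
    ("/search", "search"),
    ("/cart", "cart"),
    ("/checkout", "checkout"),
]


def page_type(path: str) -> str:
    """Map a URL path to a coarse page type so small pages can be pooled."""
    if path == "/":
        return "hp"
    for prefix, name in PAGE_TYPES:
        if path.startswith(prefix):
            return name
    return "other"


def median_by_type(samples):
    """samples: iterable of (path, metric_value) pairs, e.g. LCP in ms."""
    grouped = defaultdict(list)
    for path, value in samples:
        grouped[page_type(path)].append(value)
    return {name: median(values) for name, values in grouped.items()}


# Made-up measurements: three product pages pooled into one "pdp" bucket.
samples = [
    ("/product/red-mug", 1800),
    ("/product/blue-mug", 2200),
    ("/product/poster", 2600),
    ("/category/mugs", 1500),
    ("/", 900),
]
print(median_by_type(samples))
```

Each individual product page has a single sample here, but the pooled `pdp` bucket has three, which is the point of grouping.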
_heimdall a year ago

In around 8 years of web development, mostly focused on consulting and focusing on ecommerce, I've never seen a net gain from using analytics on a site. If the end goal is to produce data for the sake of data, well sure that will work. Rarely does anyone analyze the data though, and I've never seen anyone dig into the validity of the data and ensure that Google Analytics is in fact accurate and reliable for them.
One of the most disappointing client experiences I had was after building a custom shop for a company that was heavily focused on graphic art. We optimized the hell out of their site, getting performance scores of 97+ when every page was image heavy and included a product grid designed for a masonry grid look similar to Pinterest.
A few days before launch they asked us to add their Google Pixel script. The next day they had included 7 or 8 different third party scripts and blown performance scores into the mid 50s. It's their site and they can do what they want with it, but I sure could have saved a lot of dev time if performance didn't matter at all.
XCSme a year ago

Something like this helps me a lot to understand if the visits I get are useful, and where to focus my marketing efforts: https://s3.amazonaws.com/i.snag.gy/cCdZa9.jpg
that_guy_iain a year ago

Referer works standalone with search consoles.
Page visits tell you how many people you get. If you then look at how many signups you get, you have a conversion rate. That’s an important figure. Page visits can also tell you if your marketing efforts have worked. Imagine doing all the marketing work and not knowing if it did anything.
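
For readers who want the arithmetic spelled out, a minimal Python sketch of the conversion rate mentioned here (signups divided by visits). The `conversion_rate` helper and the numbers are made up for illustration.

```python
def conversion_rate(signups: int, visits: int) -> float:
    """Fraction of visits that ended in a signup; 0.0 when there were no visits."""
    if visits <= 0:
        return 0.0
    return signups / visits


# Hypothetical numbers: the same signup count over two different traffic levels.
before = conversion_rate(signups=40, visits=2_000)
after = conversion_rate(signups=40, visits=4_000)
print(f"before: {before:.1%}, after: {after:.1%}")  # before: 2.0%, after: 1.0%
```

The signup count alone is identical in both periods; only the ratio shows that the extra traffic did not convert.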
- pickledoyster a year ago
  
  OP clearly stated that signups and MRR are the really important figures for SaaS. Not incidentally, those two metrics also tell you if your marketing efforts are working.
  - that_guy_iain a year ago
    
    > Not incidentally, those two metrics also tell you if your marketing efforts are working.
    No, they don't. They don't tell you if visits are up, if more people heard of you or anything. They just tell you that x number of people signed up. We can guess that marketing is going better but maybe it's the time of year where more people need the service. If signups go down, maybe you just had downtime or something on your page was broken.
    If you look at any number in isolation you're never going to get the full picture.
    And your MRR can go up without any marketing. You can just do sales.
    
    michaelt a year ago
    
    > No, they don't. They don't tell you if visits are up, if more people heard of you or anything. They just tell you that x number of people signed up.
    In my experience, it's extremely cheap and easy to get a load of fake page impressions from bots, or to buy your US-only company loads of pageviews from low-cost-of-living countries, or to expand the top of your sales funnel with weak prospects who'll never convert to sales.
    Seems to me only a fool would pat themselves on the back for doing so.
    
    that_guy_iain a year ago
    
    What? Who is honestly doing that? Are you just making random stuff up?
    Imma make my analytics look really good when they're crap because??? People buy fake followers because others can see it. No one else is looking at your analytics. And you sure as hell don't want to increase your page views since your conversion rate would tank and that's the most important metric.
    
    pickledoyster a year ago
    
    >They don't tell you if visits are up, if more people heard of you or anything.
    Again, if you don't care about visits, you don't care if they're up. OP said it best: signups and MRR.
    People hearing about you: do you seriously believe that website analytics are suitable tools that provide reliable metrics for brand/product awareness, recognition, product-market fit, etc.?
    >maybe it's the time of year where more people need the service. If signups go down, maybe you just had downtime or something on your page was broken.
    Exactly, seasonality and website uptime / page functionality are important. They should be measured. At the same time, website analytics have nothing valuable to add to these measurements.
    >And your MRR can go up without any marketing. You can just do sales.
    I think you are circling around it: all those analytics metrics are just a means to justify the existence of useless 'marketers' who have no idea how to actually measure brand visibility, recognition, or any qualitative metric. These 'specialists' can't even fathom (heh) that business seasonality is something that shows up in a north-star metric and have no imagination or technical ability to set up a website monitoring service or a crawler, use a CRM for attribution, etc.
    
    that_guy_iain a year ago
    
    > OP said it best: signups and MRR.
    Oh, they did. My bad. OP is a god, they can't be wrong. Oh wait, I'm saying OP is narrow-minded and missing out.
    > People hearing about you: do you seriously believe that website analytics are suitable tools that provide reliable metrics for brand/product awareness, recognition, product-market fit, etc.?
    Why are you bringing up PMF when it comes to analytics. BUT! Yes, can. If your users are using your shit all the time and you got analytics all over your app, you've probably got a better
    But remember when I said earlier looking at a single stat in isolation is bad? Ssh.
    > I think you are circling around it: all those analytics metrics are just a means to justify the existence of useless 'marketers' who have no idea how to actually measure brand visibility, recognition, or any qualitative metric. These 'specialists' can't even fathom (heh) that business seasonality is something that shows up in a north-star metric and have no imagination or technical ability to set up a website monitoring service or a crawler, use a CRM for attribution, etc.
    "Useless marketers"...
    Anyways, you're complaining about other people being useless while you're saying all data except for your north star metrics is useless.
    Imo, this is arrogance and ignorance mixed together.
troyvit a year ago

The value of web analytics for our organization lies in the same realm as the value of Plausible over any third party analytics: The Funnel.
We're a membership driven organization, and by "membership" I mean we rely on donations to fund our content creation (Though whether you're a member or not you have the same level of access to our content). We care about raw traffic numbers, because it relates directly to our mission of informing people. It tells us how many people we inform day to day.
So yeah we care about those raw numbers, and those numbers are difficult to get w/out JavaScript r/n because of caching and the terrible log retention of our hosting providers.
Raw traffic numbers only tell part of the story though. We want to know the path people take from first landing on the site to becoming a donating member so that (in theory) we can do more of the things that promote that behavior in more people. That's The Funnel, and that's where orgs like Plausible are best. They're first party tracking, so the data stays with us. Also since they're first party tracking we can track a person's overall relationship with our site, from the first news story they read to the moment they first hit our donation page 3 years into the relationship or whatever.
We should be able to do that with our GA set up, but one of the reasons I want us to shift to Plausible is for its simplicity.
pickledoyster a year ago

You got quite a few seo garbage-level nonsense replies. In my experience, you are right, and most tracking metrics have long since become the (vanity) goal to justify the existence of these 'digital marketers'.
It's funny that they spout nonsense about better UX or how you wouldn't be able to do CRO when you'd just laid out two metrics that are actually important and don't require any website analytics to track.

Nursie a year ago

Cool. Perhaps companies and governments (gov.uk I'm looking at you) could consider using this stuff instead of forwarding all their public interactions to an unaccountable US corp.

XCSme a year ago

Or better if they choose a self-hosted solution.

ChrisArchitect a year ago

Anything new here with this editorialized title?

Their most recent blog post:

Things I hate about GA4

https://plausible.io/blog/things-i-hate-about-GA4

(https://news.ycombinator.com/item?id=40904139)

bugfactory a year ago

No, nothing new. I recently discovered Plausible myself and was more than happy to delete a few Cookie Consent Banners. That's all ;)

shafyy a year ago

Let me also plug my free, open-source and self-hosted event-based analytics solution: Fugu (https://github.com/shafy/fugu). Fugu does not track unique visitors (not even daily like Plausible does) and is made for event-based tracking. Comes with included Docker config to make it easy breezy to self-host.

pogue a year ago

I'm planning on running a small niche WordPress blog that I would like to monetize with adsense & possibly an affiliate program. I see there's a lot of choices for analytics available listed by users in this post. Does Adsense require Google Analytics or could I use one of these more privacy friendly ones?

that_guy_iain a year ago

I'm confused why you would care for a privacy-friendly option when you're already willing to give Google all the data anyways.
- pogue a year ago
  
  Is there a better option than adsense that pays out as well?
NicuCalcea a year ago

AdSense and Google Analytics are two separate products, you can use either, both or neither. If you have AdSense though, you're already allowing Google to track your users, so I don't think ditching Analytics would make the blog any more private.

jsheard a year ago

Obligatory GoatCounter plug: https://www.goatcounter.com

It's also cookieless, the hosted version is free to use within reason, and it's extremely lightweight if you choose to self-host it. It doesn't even need a separate database, it can run self-contained with SQLite (or Postgres if you prefer). A good fit for small sites where the big industrial-grade solutions are overkill.

Sephr a year ago

This service claims to not track personal data, yet their docs admit to storing hash(siteID + User-Agent + IP) + seen_paths on their backend for session tracking.[1]
Sites can track sessions without tracking personal data.
1. https://www.goatcounter.com/help/sessions
- inhumantsar a year ago
  
  right below that the docs also say that this hash is not persisted, only cached in memory and mapped to a UUIDv4. The UUIDv4 is what persists between sessions.
  > The IP address and User-Agent are never stored to the database or disk, and there is no conceivable way to trace the random UUID back to this.
  >
  > It’s only stored in memory, which is needed anyway for basic networking to work.
  I can't say whether that is GDPR compliant but it's definitely not storing the hash
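
To make the scheme described in this comment concrete, here is a minimal Python sketch, not GoatCounter's actual code: the fingerprint hash lives only in a process-local dict and is mapped to a random UUIDv4, and only that UUID would be handed to storage. The `SessionMap` class, the `|` separator and the choice of SHA-256 are assumptions; the thread does not say which hash is used or how entries are evicted.

```python
import hashlib
import uuid


class SessionMap:
    """Keeps the fingerprint hash in process memory; callers persist only the UUID."""

    def __init__(self):
        # hash -> random UUID. This dict is never written to disk in this sketch.
        self._by_hash = {}

    def session_id(self, site_id: str, user_agent: str, ip: str) -> str:
        # SHA-256 is an assumption; the thread does not name the hash function.
        digest = hashlib.sha256(f"{site_id}|{user_agent}|{ip}".encode()).hexdigest()
        if digest not in self._by_hash:
            # uuid4 is random, so it cannot be computed back from the inputs.
            self._by_hash[digest] = str(uuid.uuid4())
        return self._by_hash[digest]


sessions = SessionMap()
first = sessions.session_id("site1", "Mozilla/5.0", "203.0.113.7")
second = sessions.session_id("site1", "Mozilla/5.0", "203.0.113.7")
other = sessions.session_id("site1", "Mozilla/5.0", "203.0.113.8")
print(first == second, first == other)  # True False
```

Whether the in-memory step still counts as processing personal data is the legal question the rest of the thread argues about; the sketch only shows the mechanics.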
- yunohn a year ago
  
  > Sites can track sessions without tracking personal data.
  Could you detail how that would work?
  - JimDabell a year ago
    
    Fetch an empty resource that is privately cacheable, set to max-age=0, and has an ETag containing the current timestamp and a random session id. The browser will consider its cached copy always stale.
    When you next fetch that resource, because it is stale, the browser will revalidate it by passing an If-None-Match header containing the ETag. Update the ETag to include the original timestamp and the current timestamp.
    So on every page load (or whichever other event you want to measure), you will be told when that session started, the session id and when that visitor was last seen.
    To set the maximum session duration, reset the ETag if the last seen timestamp passed to you in If-None-Match is too long ago.
    This can even work without JavaScript by using an img element.
    The only data tracked with this is the session start time, last seen time, and a random session id. Since the session id isn’t related to any of your business logic, it cannot be used to identify an individual.
    To further isolate this data, locate the tracking resource on a different hostname. The browser’s SOP will prevent any cookies from being sent with the request, so your analytics backend can’t record identifying information even if it wanted to. This will also prevent you from tracking which page is being visited, though you can override that with the no-referrer-when-downgrade referrer policy.
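
The ETag steps above can be sketched server-side roughly as follows. This is a minimal Python illustration under stated assumptions, not a tested production handler: the `handle_request` function, the 30-minute `SESSION_TIMEOUT` and the `"<session>.<started>.<last_seen>"` ETag layout are all hypothetical. The response that carries the returned ETag would also need `Cache-Control: private, max-age=0`, as described above.

```python
import secrets
import time

SESSION_TIMEOUT = 30 * 60  # seconds; hypothetical maximum gap between hits


def _etag(session_id, started, last_seen):
    # ETag header values are quoted strings.
    return f'"{session_id}.{started}.{last_seen}"'


def _parse(header):
    """Return (session_id, started, last_seen) or None if the header is unusable."""
    if not header:
        return None
    value = header.strip()
    if value.startswith("W/"):  # tolerate a weak-validator prefix
        value = value[2:]
    parts = value.strip('"').split(".")
    if len(parts) != 3:
        return None
    try:
        return parts[0], int(parts[1]), int(parts[2])
    except ValueError:
        return None


def handle_request(if_none_match, now=None):
    """Handle one hit on the tracking resource.

    Returns (new_etag, session_id, started, last_seen); last_seen is None
    when this hit starts a new session.
    """
    now = int(time.time()) if now is None else int(now)
    parsed = _parse(if_none_match)
    if parsed is not None:
        session_id, started, last_seen = parsed
        if now - last_seen <= SESSION_TIMEOUT:
            # Same session: keep the start time, bump last-seen to now.
            return _etag(session_id, started, now), session_id, started, last_seen
    # No usable ETag, or the session expired: start over with a random id.
    session_id = secrets.token_hex(8)
    return _etag(session_id, now, now), session_id, now, None


etag, sid, started, last_seen = handle_request(None, now=1000)
etag, sid, started, last_seen = handle_request(etag, now=1100)
print(started, last_seen)  # 1000 1000
```

As the replies below point out, this stores a random identifier on the visitor's device just as a cookie would, so the sketch shows the mechanism rather than settling the consent question.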
    
    Tabular-Iceberg a year ago
    
    That's just a cookie. And then you're back to the annoying consent banners.
    
    yunohn a year ago
    
    You just reinvented analytics cookies. You’d be surprised, but they don’t store PII either. It’s usually just a randomized session ID and timestamps, like you’re suggesting.
    
    stavros a year ago
    
    Why do all this when you can set a cookie with a random session ID?
  - Sephr a year ago
    
    In browsers, it's as simple as:
    if (!sessionStorage.sessionReported) { reportSession(); sessionStorage.sessionReported = 1; }
- justmedep a year ago
  
  > In comparison, in the context of the European GDPR, the Article 29 Working Party[6] considered hashing to be a technique for pseudonymization that “reduces the linkability of a dataset with the original identity of a data subject” and thus “is a useful security measure,” but is “not a method of anonymisation.”[7] In other words, from the perspective of the Article 29 Working Party, while hashing might be a useful security technique, it is not sufficient to convert personal data into deidentified data.
  https://www.gtlaw-dataprivacydish.com/2021/03/what-is-hashin...
  - number6 a year ago
    
    I am a DPO. The claims Plausible makes won't hold up to scrutiny.
    It's a simple trick: declaring all data collected to technical data, when in fact it is linkable to a data subject.
    Thus collection of the data requires consent, because a subject is identified at least for the session.
    If you can identify unique visitors you are clearly identifying individuals.
    
    makach a year ago
    
    Indeed you are correct. Plausible it is not. They should put their cookie consent back up, and need to inform their users how they are indeed processing the data collected from personal users.
    
    Symbiote a year ago
    
    hash(daily_salt + website_domain + ip_address + user_agent)
    That's what they do. Within 24 hours the daily salt is gone, and the data is anonymous.
    https://plausible.io/data-policy#how-we-count-unique-users-w...
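
A minimal Python sketch of the formula quoted above, to show why the identifier stops being linkable once the salt is rotated. It is an illustration only: SHA-256, the `DailySalt` class, the NUL separator and the example inputs are assumptions, since the comment does not name the hash function or how the salt is stored.

```python
import hashlib
import secrets


class DailySalt:
    """Holds one random salt per calendar day and forgets the old one on rotation."""

    def __init__(self):
        self._day = None
        self._salt = None

    def for_day(self, day: str) -> bytes:
        if day != self._day:
            # Overwriting the old salt is what makes yesterday's ids unlinkable.
            self._day, self._salt = day, secrets.token_bytes(16)
        return self._salt


def visitor_id(salt: bytes, website_domain: str, ip_address: str, user_agent: str) -> str:
    # SHA-256 is a stand-in; the comment above does not name the hash function.
    h = hashlib.sha256()
    h.update(salt)
    for part in (website_domain, ip_address, user_agent):
        h.update(b"\x00" + part.encode())
    return h.hexdigest()


salts = DailySalt()
args = ("example.com", "203.0.113.7", "Mozilla/5.0")
today = visitor_id(salts.for_day("2024-07-09"), *args)
again = visitor_id(salts.for_day("2024-07-09"), *args)
tomorrow = visitor_id(salts.for_day("2024-07-10"), *args)
print(today == again, today == tomorrow)  # True False
```

Within one day the same visitor maps to the same id, which is what allows unique-visitor counts and is also what the DPO comments in this thread object to.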
    
    makach a year ago
    
    The problem is that this is what they say they do; there are too many examples of companies being noncompliant with their own policies and regulations. They should explain the abovementioned algorithm in their data privacy declaration published online. Also, even a hash can be considered private and personal data unless it has been protected sufficiently, so you need to inform your users anyway.
    
    number6 a year ago
    
    Good approach. IP Addresses are personal data. So the data and the hash is subject to GDPR.
    You still need consent to collect it - well or some other kind of legal shenanigans. The intent is to track a person, it is not technically necessary. You might have a legitimate interest - but in the end you still have to consider the GDPR to use this tool.
    https://europa.eu/youreurope/business/dealing-with-customers...
    
    omnimus a year ago
    
    Turns out that many officials believe this is fine. Companies using Plausible, Matomo and similar services have been under scrutiny.
    An IP address is required for the site to function - your server can't not collect it. Plausible also only processes it for uniqueness and doesnt save it as is. Interestingly, most webservers/firewalls will have to keep track of IP addresses, so they will be saved in access logs and caches, making them more problematic than Plausible. Yet it's most likely fine because the intent is not to track individual users but to improve the service/keep it running. Plausible's intent is also not to track individual users but to collect visitor counts, which is something used for improving the service too.
    I think you might be prematurely spreading fear.
    
    JimDabell a year ago
    
    > Turns out that many officials believe this is fine.
    Who has gone on record with this, and in which jurisdictions?
    
    omnimus a year ago
    
    I have experience from state-funded projects in Central European countries. Afaik what they battle/hate most is what goes against the spirit of the law. So mainly popups that are hyperdesigned to be confusing so people are forced, tricked or annoyed into accepting everything. Another thing they battle is how long data is saved and where the data is shared. If you self-host a service like Plausible or Matomo that does everything possible to be compliant, then it's fine.
    I think there is a marketing tactic that ad/analytics companies and marketers use against services like Plausible. They say these services also require a cookie popup and won't give you as much detailed info, so why would you use them? Most websites would be fine with the limited data Plausible provides, but that breaks the ad/analytics industry's business plan.
    
    number6 a year ago
    
    I can understand your frustration regarding these tactics. Other services should outright be banned.
    
    number6 a year ago
    
    > Plausible also only processes it for uniqueness and doesnt save it as is
    That's exactly the point. Processing of personal data to identify a unique person.
    Regarding firewalls and logs: It's argued that this is legitimate interest as it is stated in Recital 49 of the GDPR. So they got a free pass, for better or worse.
    > I think you might be prematurely spreading fear
    Don't get me wrong, I like the approach. But it's not a get out of GDPR free card.
    
    omnimus a year ago
    
    > That's exactly the point. Processing of personal data to identify a unique person.
    Not sure that's what I said. They cannot identify a unique person. They identify unique legitimate visits per day.
    If logs and firewalls count as legitimate interest because you have to give the server your IP address for everything to work, then the same can be said about Plausible, especially since the IP address is immediately thrown away, unlike with firewalls, where the main point is to keep a record of bad actors.
    It is very different from Google Analytics, where the whole point is to pinpoint repeat visitors, their behaviour etc. You simply can't do that with a service like Plausible. What you can do is know how many legitimate visits you had and what was visited. For most websites that is enough. At the same time, I would be surprised if knowing how many people visited your site were not a legitimate requirement for the service to function.
    
    janosdebugs a year ago
    
    Legitimate interest still requires the data subject to be informed under Art 13. Not sure how that would be accomplished without at least an info banner. (This goes for server logs too.)
    
    number6 a year ago
    
    If you have a website you have to write this in your Privacy Policy and most do.
    Firewalls are a curious case. It is argued that the data is not collected but transmitted to the controller. Almost as if you get a letter with personal data and now have to deal with it.
    Yes, it's a stretch. Not happy with it but I don't see any practical solution either...
    
    janosd a year ago
    
    AFAIK it's not enough to write it in your privacy policy. Art 21 of the GDPR makes this explicit:
    > (4) At the latest at the time of the first communication with the data subject, the right referred to in paragraphs 1 and 2 shall be explicitly brought to the attention of the data subject and shall be presented clearly and separately from any other information.
    I am not a lawyer, but as far as I can tell, there is no legal way to collect PII (including IP address) or place tracking identifiers on the user's device without at least informing the user explicitly under the GDPR and the ePrivacy Directive.
    
    number6 a year ago
    
    You are correct. In early days of the GDPR people thought about a page in front of the original page without any data collection presenting only the privacy information.
    But soon there was an agreement that Art 13 lit. 4 could be interpreted that as long as you don't have any data collection beyond server logs this would be deemed as sufficient. Or in other words if you won't invoke the Art 21 lit. 1 of the GDPR.
    But since everybody wants to track you on basis of their legitimate interest the web became full of cookie banners
    
    dgroshev a year ago
    
    That's a bit simplistic. IP addresses are not unequivocally personal data. Let's rewind back a bit, GDPR Art. 4:
    > ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;
    IP addresses only allow to identify a natural person when combined with other data, such as ISP data or a profile built over dozens of websites. This is not the same kind of personal data as a name + address, Breyer notwithstanding (note the bit about the ISP in the judgment).
    GDPR is not about identifying an abstract entity, it's about identifying a natural person. Doing the former for long enough/with enough data allows the latter, but especially with time-limited in-memory hashes that's a non-existent window of opportunity.
    In practice this'd probably need to be resolved in court, and I'm sure not a single SME using Plausible or similar will even get a stern letter, much less fined.
    
    number6 a year ago
    
    > In practice this'd probably need to be resolved in court, and I'm sure not a single SME using Plausible or similar will even get a stern letter, much less fined.
    Agreed.
    Plausible just makes false claims like:
    > All the site measurement is carried out absolutely anonymously. Cookies are not used and no personal data is collected. There are no persistent identifiers.
    That's a heavy statement and it is simply not true, as you quoted:
    > an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person
    hash(daily_salt + website_domain + ip_address + user_agent) will fall under this definition.
    But again, you are right, it's better than anything any other service does
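
For readers who want to see what such an identifier looks like in practice, here is a minimal sketch of the formula quoted above. SHA-256, the null-byte separator, the 16-byte salt and the example values are all assumptions for illustration; the thread does not say which hash function or salt handling Plausible actually uses.

```python
import hashlib
import secrets


def visitor_id(daily_salt: bytes, website_domain: str, ip_address: str, user_agent: str) -> str:
    """Hash the four inputs named in the comment into one hex digest.

    SHA-256 and the separator byte are illustrative assumptions.
    """
    h = hashlib.sha256()
    h.update(daily_salt)
    for part in (website_domain, ip_address, user_agent):
        h.update(b"\x00")  # separator so adjacent fields cannot run together
        h.update(part.encode("utf-8"))
    return h.hexdigest()


# Hypothetical salts: a fresh random value per day, discarded afterwards.
salt_today = secrets.token_bytes(16)
salt_tomorrow = secrets.token_bytes(16)

a = visitor_id(salt_today, "example.com", "203.0.113.7", "Mozilla/5.0")
b = visitor_id(salt_today, "example.com", "203.0.113.7", "Mozilla/5.0")
c = visitor_id(salt_tomorrow, "example.com", "203.0.113.7", "Mozilla/5.0")
```

Within one day the same visitor maps to the same value (`a == b`), which is what makes counting unique visitors possible; once the salt rotates, the value changes (`a != c`). Whether that within-day stability already makes the value an "online identifier" is exactly what this subthread is arguing about.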
    
    Dylan16807 a year ago
    
    Which part is simply not true?
    The lack of persistence is one of the main design points.
    If you're saying it's collection, that gets complicated because that data has to be there for the server to work at all.
    
    aspect0545 a year ago
    
    What’s your thought on the approach adjust.com takes? They say you can claim legitimate interest
    
    newusertoday a year ago
    
    What are your thoughts on aggregated data? You can still identify unique visitors, but it's aggregated data, so you can't link it back to the individual.
    I doubt that just identifying unique visitors would also identify individuals. Would their current approach of creating a random ID that is unique for 24 hours violate GDPR, or not?
    
    number6 a year ago
    
    You begin at a point where you have data to aggregate. This data is linked to individuals.
    Anonymisation of data is itself data processing, and some argue that it is subject to a privacy impact assessment, on the grounds that, if done poorly, it has serious negative consequences for individuals who can be deanonymized.
    The duration itself does not change the outcome.
    That said, the approach Plausible takes is much better than using any cookie.
    
    anonzzzies a year ago
    
    I think you can argue if this holds up: you cannot retrieve the ip from the hash (and residential IPs are usually dynamic). The short lifetime together with never storing the hash makes it so you cannot de-anonymise the user.
    No one will get fined for not asking consent for this. Our DPO just said ‘don’t be silly’ when I asked him. But we will see if it gets tested (my bet: it won’t).
    
    ralferoo a year ago
    
    > I think you can argue if this holds up:
    Sadly, reckons don't hold up in court.
    > you cannot retrieve the ip from the hash
    You don't need to retrieve the ip to make it PII, the hash itself is PII.
    You might not think of it as containing actual "personal information", but its sole purpose is to attempt to uniquely identify a person. That makes it PII.
    > (and residential IPs are usually dynamic)
    This actually makes the short lifetime more suitable as a PII, because it reduces the likelihood of the same IP being used by a different person being tracked as the same person.
    > The short lifetime together with never storing the hash makes it so you cannot de-anonymise the user.
    That also doesn't matter, because the lifetime of the token is long enough to track the user through an entire typical session, maybe several.
    The stupid thing in all these shenanigans is that collecting the data isn't itself the problem, it's not getting the user's consent. Just tell the user what you're doing, and it's not a problem - if it's a "technically required" cookie they can make an informed choice to use your site or not, if it's an "optionally required" cookie, they can choose whether to accept or not. Most users won't care and will click on the biggest, most obvious buttons. The ones that do care are likely atypical and would skew your metrics anyway.
    
    JimDabell a year ago
    
    > you cannot retrieve the ip from the hash
    You can as long as you have IPv4 visitors, because the search space is small enough to brute-force. There are only four billion IP addresses. The user-agent complicates things a little but there aren’t many of those, so you could retrieve the IP addresses of most visitors from the hash if you wanted to.
    > residential IPs are usually dynamic
    Usually isn’t good enough. I’ve had residential IPs that are on public record belonging to me personally. IP addresses can be personally identifying information, so they need to be treated that way.
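
The brute-force argument can be sketched as follows, under loud assumptions: the attacker is someone who holds the salt while it still exists (for example the operator), the hash is SHA-256 over a simple concatenation, and the candidate user agents are known. To keep it runnable in an instant, the search covers a single /24 documentation range rather than the full 2^32 IPv4 space.

```python
import hashlib
import ipaddress


def digest(salt: bytes, domain: str, ip: str, user_agent: str) -> str:
    # Illustrative construction only; not taken from any real product.
    return hashlib.sha256(salt + f"|{domain}|{ip}|{user_agent}".encode()).hexdigest()


def recover_ip(target, salt, domain, user_agents, network):
    """Try every address in `network` with every candidate user agent."""
    for addr in ipaddress.ip_network(network):
        for ua in user_agents:
            if digest(salt, domain, str(addr), ua) == target:
                return str(addr)
    return None


salt = b"known-to-the-operator"  # hypothetical salt value
target = digest(salt, "example.com", "198.51.100.77", "Mozilla/5.0")
found = recover_ip(target, salt, "example.com",
                   ["curl/8.0", "Mozilla/5.0"], "198.51.100.0/24")
```

Here `found` is the original address. Scaling the outer loop to all of IPv4 multiplies the work by about 16.7 million, and each extra candidate user agent multiplies it again, but it stays a finite enumeration. Without the salt, the same search is not possible, which is the other side of the argument made elsewhere in the thread.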
    
    ralferoo a year ago
    
    > the search space is small enough to brute-force
    I get what you're saying - in that if you know the IP address, then you can often easily discover who the individual is. I'd counter that actually, for most people this isn't the case - for many companies, only the ISP, Google, Apple, Facebook etc know who the real user of an IP is... (incidentally, the people most keen to force analytics on you, but that's another issue).
    However, that is all kind of moot. The hash itself is PII, because it can be used to track an individual. PII isn't about the difficulty of determining the specific identity of a user, it's about the difficulty in identifying a specific user. The distinction is subtle, but important.
    Take an example: people are using a wireless hotspot somewhere, maybe you own a coffee shop, and over the course of a few weeks you're alerted to the fact that someone has been accessing some illegal content that could get your business in trouble. You've been careful to comply with the GDPR, and your logs only include the time and hostname of the server accessed. On its own, there is no PII there. But combine that with, say, credit card transactions or video footage, and find out who was in the coffee shop every time this happened. Then boom! Suddenly, your time has become PII. Maybe not uniquely correlated to a single person, but to a group of people. With every instance correlating that person with a group of random people, it doesn't take many to narrow it down to a specific individual.
    This is why, to actually comply with GDPR, you need to only store logs for as short a time as is technically required (legally beyond a month is hard to justify, ideally a few days at most) and then you should aggregate into groups where individuals cannot be isolated. If your aggregations result in groups of people that are too small, you need to change the aggregation groups, or report an empty group. It's totally fine to store data like "on this day, n people went from this page to this page, average linger time blah seconds" if n is 10 or more. If n is 1 or close to it, that data is still identifying.
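
The "drop groups that are too small" rule from the paragraph above can be sketched like this. The threshold of 10 comes from the comment; the event format (one page-to-page transition per visitor) and the function name are assumptions made for the example.

```python
from collections import Counter

MIN_GROUP_SIZE = 10  # threshold taken from the comment above


def aggregate_transitions(transitions, min_group_size=MIN_GROUP_SIZE):
    """Count (from_page, to_page) pairs and suppress groups below the threshold."""
    counts = Counter(transitions)
    return {pair: n for pair, n in counts.items() if n >= min_group_size}


# Hypothetical day of data: 12 visitors took one path, a single visitor another.
events = [("/", "/pricing")] * 12 + [("/", "/account/reset")] * 1
report = aggregate_transitions(events)
```

`report` keeps only the `("/", "/pricing")` group with a count of 12; the single-visitor path is omitted rather than reported as 1. The raw `events` list would still need to be deleted on the schedule the comment describes.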
    
    JimDabell a year ago
    
    > I get what you're saying - in that if you know the IP address, then you can often easily discover who the individual is.
    That’s not what I said. I said if you have the hash, you can derive the IP address from it in most cases.
    
    ralferoo a year ago
    
    That part was responding to where you said "Usually isn’t good enough. I’ve had residential IPs that are on public record belonging to me personally. IP addresses can be personally identifying information, so they need to be treated that way."
    My point is that whether you can determine the IP address from the hash or not doesn't matter. The hash itself is PII.
    
    number6 a year ago
    
    You would still have to produce the paperwork for this.
    Most websites don't get fined using GA. Plausible is a huge step in the right direction, but their claims are very strong and not backed up by the GDPR if you take a closer look.
    Regarding fines: most offices will give you a warning instead of a fine, you adjust your cookie banner and you are good to go
    
    anonzzzies a year ago
    
    We don't (and won't) have a consent banner at all; if Plausible would incur a warning, we'll just remove it completely instead.
yreg a year ago

I like Umami: https://umami.is/
- LorenDB a year ago
  
  Currently using Umami, but I've considered switching to Plausible due to Umami's less-than-stellar development performance (e.g. breaking the site details page for a few days recently).
  - qingcharles a year ago
    
    I switched to Umami for now because the Plausible developers were totally disinterested in fixing bugs that silently dropped data.
    These are still JavaScript solutions, so if their JS code is broken then you just don't get the data. You end up with unknown unknowns.
    The only truly reliable data you can get is from your server logs, and obviously you are limited by whatever the browser gives you in the request.
  - gerenuk a year ago
    
    Check out Usermaven.com.
8organicbits a year ago

Also happily using hosted GoatCounter. Last year I noticed some occasional operational hiccups, like brief service downtime, but this year it's been completely stable as far as I can tell.

Ameo a year ago

I'm a very happy self-hosted Plausible user for years now. Solid, simple, and easy to maintain.

dustedcodes a year ago

How are you self hosting it? I find its requirements extremely heavy for a simple analytics solution. It requires a PostgreSQL and Clickhouse database. I don't find self hosting Clickhouse particularly easy. Wish they had an option to just use SQLite as an alternative.
- ayuhito a year ago
  
  I completely agree that the self-hosting story for Plausible is overkill for most websites.
  So much so that I made my own that focuses on self-hostability using SQLite and DuckDB (no external dependencies, can run on a 256MB VM): https://github.com/medama-io/medama
  - dustedcodes a year ago
    
    This looks really nice, I am going to give it a try straight away!
- Ameo a year ago
  
  I use the docker-compose setup they provide. There are only two containers iirc, Clickhouse and their web server.
  Stuck it behind a NGINX frontend and it works just fine.
bugfactory a year ago

Did you use the hosted version, too? I wonder if it is possible to seamlessly switch between hosted and self-hosted.
- Ameo a year ago
  
  I've not tried their hosted version no. I doubt that there would be a seamless way to switch between them since all the data lives in Clickhouse, but I could be wrong.

a1o a year ago

Can I use Plausible in a desktop application? I would like to have an idea of exactly which versions of an open source desktop app I maintain are being actively used, so I know what to pay attention to and where to invest effort, as I would like my users to be constantly migrating forward - we do have like 20 years of backwards compatibility so we push things forward very slowly.

ioseph a year ago

I don't see any reason why not: https://plausible.io/docs/events-api although you'd have to come up with your own user-agent
clone1018 a year ago

I was able to do it pretty easily with a mobile app, should be just as easy on desktop. You could even register custom “pages” for various parts of the desktop app.
https://github.com/Glimesh/glimesh_app/blob/main/lib/track.d...

pdyc a year ago

Let me plug my tool as well. Please give it a try: https://easyanalytics.win/en (it does not require a cookie consent banner).

kstrauser a year ago

Plausible is very nice, but it lacks much of the information from Matomo (like “after viewing /foo, visitors tend to view…”). Matomo is very nice, but it lacks the free Google Search Console integration (“people are currently finding you from these Google searches: …”) from Plausible.[0]

I’m vain and curious enough to want to see the Google data, but not so much as to pay $160/yr for the Matomo plugin for my personal blog.

[0] This isn’t the same as Google Analytics. You can get this information without installing a tracker on your site.

jgalt212 a year ago

you can fill in this hole with Google Search Console.
https://search.google.com/search-console/about
It's not perfect, but it is free.
- kstrauser a year ago
  
  Yeah, that’s what I do now. It’s not as convenient as having it integrated directly into the stats, but free sure beats $160 for my hobby blog.
bugfactory a year ago

Plausible can fetch and show the used "search terms" if you connect it to your Google Search Console.
- kstrauser a year ago
  
  Right, but Matomo can’t. Matomo has more information otherwise, but Plausible does that part well.

rickette a year ago

When you host your static site on Cloudflare Pages you'll also get Cloudflare Analytics which is cookieless.

Smar a year ago

You still need to ask permission for storing identification data, regardless whether user's browser is told to save cookies or not.
- iamacyborg a year ago
  
  Yep. The legislation has nothing to do with cookies, it’s about storage of unique identifiers. The actual method of storage is largely irrelevant.
  - swores a year ago
    
    For GDPR yes, but there is also a separate and older EU regulation that is specifically about cookies (among other things).
    https://en.m.wikipedia.org/wiki/EPrivacy_Directive
    
    iamacyborg a year ago
    
    Yep, I mentioned PECR in another comment here

mediumsmart a year ago

I opened the site but would it be plausible for their analytics to tell them that Orion does not load the css? Safari does it without consent.

Zaheer a year ago

We use Plausible but have found it quite slow for our needs.

newusertoday a year ago

Can you elaborate? How many views do you have per month?
pdyc a year ago

initial rendering is slow or filtering is slow? are you doing something special with it? what is your typical data size?

phyzix5761 a year ago

In my opinion Cookie Consent Banners have made using the internet an overall worse experience.

ayuhito a year ago

Obligatory plug for Medama, which focuses on easy self-hostability: https://github.com/medama-io/medama

I think Plausible’s self-hosting is not simple, requiring unnecessarily heavy databases like ClickHouse, which can be overkill for the average website owner. Comparatively, this project can effectively run on a 256MB VM for most small websites with no external dependencies.

D13Fd a year ago

I used to like simpleanalytics.com but their site has been very slow lately.

theanonymousone a year ago

I was always confused by GDPR. What are the minimum requirements to avoid the banner? Anonymising the IPs and not keeping anything else, or you can keep anything as long as you don't share them with third-parties?

shafyy a year ago

Essential cookies (e.g. a cookie that saves the cart's content in an e-commerce app) are fine. PII (personally identifiable information) is never fine (this includes IP addresses, email addresses, more or less exact geolocations) - so anonymized IP is ok.
- gmokki a year ago
  
  What is an anonymized IP? And how would that be useful?
  I think a simple hash(IP) is only pseudonymization and can be reversed with a bit of work. And thus it cannot be stored without consent.
  Of course mapping each IP to random id and not storing the mapping should be completely ok.
  And legitimate reasons allow storing the mapping for a short period for debugging and attack detection.
  - ralferoo a year ago
    
    > Of course mapping each IP to random id and not storing the mapping should be completely ok.
    If it was a different random id for every request, then sure, OK.
    If it's the same random id used on multiple requests, then it becomes PII, as its purpose is to uniquely identify an individual. It should not be logged or stored.
    
    omnimus a year ago
    
    Services like Plausible add time into the mix. So you know that someone visited these 5 pages in 20 min. But you wont know about returning visitors. I think thats pretty significant difference.
    But if what you are saying is true then it's impossible to know how many people visited your website unless you have banner. What about logs then? Sounds like everybody is happily using those because they are "legitimate interest" because servers couldn't work without them but its way more identifying data than what Plausible saves.
    
    ralferoo a year ago
    
    > Services like Plausible add time into the mix. So you know that someone visited these 5 pages in 20 min. But you wont know about returning visitors. I think thats pretty significant difference.
    That doesn't make it any less PII. Also, the 20 minutes thing is just a number you plucked out of thin air - it's actually valid for 24 hours.
    > But if what you are saying is true then it's impossible to know how many people visited your website unless you have banner.
    No, that's not what I'm saying at all. First of all, that claim is clearly false. If your web server logged only the URL and nothing else, no time, nothing, you would have accurate usage counts for every single part of your site.
    For the record, I actually think Plausible attempts to do a good job - it's clear they are trying their best to be privacy focused, not log anything, only provide data in aggregate - that's all good stuff. However, I'm not sure their stance that they don't require consent is valid, because the hash itself is PII. The reason I think the hash is PII is because of how it is being used - to identify an individual user.
    Oh, and servers can work perfectly fine without logs. People like logs, but they're by no means necessary.
    Logs by themselves aren't necessarily a problem if you have a clear data policy in place, and there is a legitimate use for them. The point is disclosure of the data use, and timely deletion of any data that isn't strictly necessary for the business use. So, you can keep PII around relating to billing for as long as they have a subscription, or as long as you are legally required to keep customer records for. After that, they need to be deleted. Anything like access logs that you can justify a business need for can be kept, perhaps a few days or ideally hours until you extract aggregate data, but again you need to state that in your privacy policy, and they should be promptly deleted as soon as reasonably possible.
    And as I said before, all you need to do to comply with the law is to make sure you have the user's consent before tracking them. It isn't really that onerous. The question is, if you don't want the user to know how you're tracking them, why not? What are you hiding?
    
    omnimus a year ago
    
    > And as I said before, all you need to do to comply with the law is to make sure you have the user's consent before tracking them. It isn't really that onerous. The question is, if you don't want the user to know how you're tracking them, why not? What are you hiding?
    This is a super weird spin on what I said. I work on content-heavy media sites that are not ad-driven. They are either funded by grants (research or journalism) or they present commercial work. Architects, design studios, publishers, writers… All of these clients want to have ballpark numbers of how many people visited the site. Nobody processes or sells this data. It's 10s to 100s of visitors a day. We try to use the most private way we know of.
    Its crazy that because of the sick practices of this industry i am suddenly the one suspicious. Some kind of nothing to hide fallacy huh? No we are not hiding anything. We just dont want annoying consent because of visitor counter. The ones hiding something are the ones with tricky psycho designed multi step consent banners. We just dont want to be in same bunch just because few basic stats.
    
    ralferoo a year ago
    
    > All of these clients want to have ballpark numbers of how many people visited the site.
    You don't need cookies for that.
    Again, as I've said before, you can for instance log data for technical reasons, e.g. wanting to post-mortem a failure or attack, as long as the data is deleted promptly as a matter of course. You shouldn't use the PII in that log for analysis without the user's consent (so for a log file, that means you probably should never use the IP address except for endpoints that are only accessible to logged in users), but the URL they accessed isn't PII (unless you start putting identifying tokens in it).
    If you just want ballpark numbers, just extract the URL field only, and count how many times each appears. Obviously, this will give you metrics on how popular each page / asset is, not how many unique users you have. To do that, you have to identify unique users, and to do that you need to have their consent.
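
A minimal sketch of "extract the URL field only and count", assuming access logs in the common/combined log format where the request line is the first double-quoted field. The sample lines and the choice to strip query strings (to avoid keeping any tokens that might sit in them) are assumptions for illustration.

```python
from collections import Counter


def count_paths(log_lines):
    """Count hits per path, ignoring IP, timestamp and everything else."""
    counts = Counter()
    for line in log_lines:
        try:
            request = line.split('"')[1]           # e.g. 'GET /about HTTP/1.1'
            _method, path, _protocol = request.split(" ")
        except (IndexError, ValueError):
            continue                               # skip malformed lines
        counts[path.split("?", 1)[0]] += 1         # drop the query string
    return counts


# Hypothetical log lines using documentation IP ranges.
sample = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /about HTTP/1.1" 200 2326',
    '198.51.100.4 - - [10/Oct/2024:13:56:01 +0000] "GET /about?ref=abc HTTP/1.1" 200 2326',
    '203.0.113.7 - - [10/Oct/2024:13:57:12 +0000] "GET / HTTP/1.1" 200 512',
    'garbage line without a request field',
]
hits = count_paths(sample)
```

This yields per-page hit counts (`/about` twice, `/` once) and, as the paragraph above says, nothing about how many distinct people made those requests.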
    > We just dont want annoying consent because of visitor counter.
    But the law requires you to get their consent.
    > The ones hiding something are the ones with tricky psycho designed multi step consent banners.
    To be fair, I agree with you. They are deliberately designed to be awful in the hopes that the user will just take the path of least resistance and accept their terms. However, it is still a choice. In the cases when I see such a consent form, I either just close the window or I re-open it in incognito mode so I won't get a persistent cookie if it's something I really want to read.
    The point is that the regulatory line needs to be drawn somewhere. The law at the moment says the line is: If PII is required for your site to function, then you must ensure the user knows you're doing it. If PII isn't strictly required for your site to function, but it provides a benefit to your company (usually re-framed as how it ultimately helps the customer), then you must request consent. Both of these cases are covered by the usual kind of popup, but that's why you'll see some that you can disable (like sharing data with partners) and some you can't (like cookies for logging in). But you still need consent.
    > We just dont want to be in same bunch just because few basic stats.
    Then just collect basic stats like how many hits each page got. That's fine, you don't need cookies or PII for that. Number of active users isn't a basic stat though, as it clearly requires you to distinguish between different users and any process you use to do that creates PII.
    Perhaps you should consider just explaining why you want the cookie in your popup. If you word it in such a way that explains that you're only using daily active users as a metric to justify continued funding, you'll probably find most people are totally happy to click accept. A message plus simple ACCEPT / DECLINE is fine, as long as the message makes clear what you're doing. Note that you can set an "essential cookie" in response to them clicking DECLINE as long as you've explained that the website uses essential and non-essential cookies, but obviously it shouldn't contain anything other than a simple accept/decline result.
    
    omnimus a year ago
    
    Nobody is setting any cookie. You know these services are cookieless; instead they use an ip+salt+time hash of what the client sends. The problem with server-side metrics (and why Google Analytics became so popular) is, first, that they pick up lots of noise visits from bots. But more importantly, it's often not possible to implement them because the hosting is handled by an unable/unwilling third party.
    I will not jump the gun just yet. We will keep being in this gray zone until i see the authorities have problems with approach of matomo/plausible. I have seen the opposite. If they did we would remove the analytics entirely because there is nothing worse than cookie banner which instantly annoys users and puts you on level with any other mainstream site that does fingerprinted tracking.
    
    shafyy a year ago
    
    It's not a clear-cut case with IPs. As you say, if your servers log IPs, that seems to be classified as "legitimate interest" (for security reasons). But if you use that data to track unique users for product dev, marketing etc. reasons, that's not "legitimate interest" anymore. At least, this is my understanding.
    For example, it would make stopping a DDoS attack much harder if you would need to anonymize IPs.
    Here's some interesting discussion on this very topic: https://law.stackexchange.com/questions/28603/how-to-satisfy...
    
    ralferoo a year ago
    
    Yeah, great point. It's how you process and store the data that's important.
    One of the key rights individuals have is to request that ALL PII about them is deleted from all of your records, and you have to comply with this request within a certain timeframe, and a maximum of 30 days. This includes backups, logs, everything.
    Obviously, it's impractical to try to edit old backups to remove PII, so you have to be careful how you deal with logs in the first place - you might want them to be backed up on another machine with a maximum lifetime of a few days, you might want to not back them up at all and only backup your aggregated data, etc.
    But keeping logs for a few days can be justified for, as you say, DDoS mitigation, post-failure root-cause analysis, etc. The default should be to delete that data as soon as it's no longer useful for that purpose, which for most companies will be a couple of days, maybe another couple for the weekend. You can keep it longer, for instance for active analysis, but the default should be to delete it as soon as possible.
    
    shafyy a year ago
    
    Exactly. The best PII is no PII. If you need PII for security reasons, keep it as short as possible. Don't collect PII for marketing, product dev etc.
- theanonymousone a year ago
  
  Many thanks. What about "fingerprints" such as browser user agent, country, ...? They are not exactly PII, are they?
  And regarding anonymisation, is it enough to remove the last two parts of an IPv4 IP, or it must be more?
  - shafyy a year ago
    
    It probably depends on how far you go with the fingerprinting. If it's only user agent, I would guess it's ok. If you start adding more and more info to the fingerprint, it will become PII at one point.
    Not sure about how much of IPv4 must be anonymized. If you want to be sure, just anonymize the whole thing. Important to make it random, and not use a hashing function that always gives the same output for the same input IP (in that case, it counts as pseudonymized and can be PII).
    Also, IANAL, just a dude who is passionate about online privacy.
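
The two options discussed in this exchange (dropping the last two parts of an IPv4 address, or replacing it with something random) can be sketched as below. Function names and example addresses are made up for illustration, and nothing here says how much truncation is legally sufficient.

```python
import ipaddress
import secrets


def truncate_ipv4(ip: str, keep_bits: int = 16) -> str:
    """Zero everything after the first `keep_bits` bits (16 = drop the last two octets)."""
    net = ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False)
    return str(net.network_address)


def random_request_id() -> str:
    """A fresh random value per request, not derived from the IP at all."""
    return secrets.token_hex(8)


coarse = truncate_ipv4("203.0.113.77")        # last two octets removed
finer = truncate_ipv4("203.0.113.77", 24)     # only the last octet removed
rid = random_request_id()
```

Truncation is deterministic, so every visitor in the same /16 collapses to the same value (`203.0.0.0` here), while the random id carries no link to the address and cannot be used to recognise a returning request.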
    
    theanonymousone a year ago
    
    Thanks. I see many frameworks use hash(ip+ timeFrame). I thought it is to detect sessions, but it seems also to be about anonymisation.
blkhawk a year ago

Well, AFAIK a simple session cookie doesn't need a banner. Also, I think if you do everything local to your system you don't need one either. The point where you need one is if you use any system that utilizes third parties to track the user.
So if you store and analyze everything "locally" on your server, you don't need cookies and therefore no banner, no matter how much you "track", since it's all requests made to your own server, whose telemetry you merely use.
You can't share that data without consent, but that's a separate data protection thing from the cookie banners.
- blkhawk a year ago
  
  Oh, and the GDPR is mostly confusing because it was interpreted with malicious compliance by the whole industry, at least in effect if not intention. It is simply easy for upper management to take a "better safe than sorry" approach, and by now the banners have reached a degree of dark-pattern development that is horrifying in its relentlessness.
  - omnimus a year ago
    
    So much this. The whole ad industry is afraid that most websites would switch to simpler more private compliant alternatives which would break their business (of reselling snooping data). So they are on marketing campaign to paint these alternatives as non compliant and requiring banners too. Basically every fart now needs a consent banner and when you already have a banner why not have this most invasive visitor screen recorder analytics that we send to our 743 partners in real time.
  - yardstick a year ago
    
    Yup.
    What’s that? We need users consent for ad cookies? Ok let’s also make them consent to the session cookie too as a way to confuse them or get them to lazily just click the accept all cookies button rather than find the exact cookies the site needs to run without ads.
ben_w a year ago

IANAL, but for me the thing is mostly clear* and the only question is "what counts as 'legitimate interest'".
https://gdpr.eu/cookies/
1) If it's strictly necessary, e.g. logging in or legal obligation, you're fine and don't need to ask
2) If the data can be associated with a specific human, and it isn't covered by #1, then ask
3) ??? legitimate interest ???
* but I know from experience that this means "don't trust my own feelings of clarity, ask a lawyer"
- bryanrasmussen a year ago
  
  legitimate interest - anything to make your application function.
  you have an online mail service, you have to save email accounts of emails you receive so you can respond to those.
  you allow people to forward their emails received to other email addresses, you need to save those other email addresses.
  These would sit in your databases for that purpose. But if you have third-party marketing analytics, having a legitimate interest in saving an email address to make the application work doesn't mean you can pass that address into the third-party marketing analytics. That is not legitimate interest.
  if you have a newsletter service and someone signs up to receive newsletter then you need to save their email to send that newsletter. you don't need to ask, they have implicitly given you permission by asking you to send them the newsletter.
  If you have a process for removing users from service for violation of terms then you probably need to be able to keep information about them otherwise they can just say get rid of info and then sign on again - this would come into the parts of the Digital services acts about obligations to users and appeals process for removal etc. but different thing, if you have removed someone you need to be able to identify when they try to come on again.
  - troupo a year ago
    
    > legitimate interest - anything to make your application function.
    Plus the data that you're required to retain by other laws. E.g. banks/financial institutions might be required to retain a lot of data for several years for audit and compliance purposes.
    
    bryanrasmussen a year ago
    
    I figured the parent poster already covered that with > If it's strictly necessary, e.g. logging in or legal obligation, you're fine and don't need to ask
- timeon a year ago
  
  This is a bit OT, but that site states:
  > Allow users to access your service even if they refuse to allow the use of certain cookies
  Does it mean that sites like https://www.spiegel.de, are not GDPR compliant?
  - ben_w a year ago
    
    I (weakly) believe it is not compliant, based on the Facebook case regarding the "Pay or OK" model: https://noyb.eu/en/statement-edpb-pay-or-okay-opinion
    But again, IANAL, so don't take my word on that.
iamacyborg a year ago

The banner has nothing to do with GDPR, it’s more to do with PECR.