Firefox team stops collecting data to ensure user privacy

The Firefox team decided this week to stop collecting unique identifiers that link crash reports from the same user.

In a somewhat heated debate during an extended session of its weekly meeting, opponents said the practice violates user privacy, while proponents said having the data visible could help them fix bugs and solve bottlenecks faster, even though they acknowledged never having used it before.

Opponents won the debate by arguing that user privacy trumps any development issue. After the meeting, engineering chief Mike Beltzner summed up the issue this way:

"The discussion at the end of the meeting was around what data we should and shouldn't be collecting with crash reports, whether or not that data becomes publicly visible on our Crash Reporter developer website," Beltzner wrote in response to questions submitted by ZDNet. "The questions in the discussion centered around the value in keeping unique identifiers that allow us to associate two crashes from the same user.

"While there is value in being able to do this easily, the potential cost to user privacy felt high, and so some were arguing that we shouldn't have the crash reporter client on user's machines send these unique identifiers," he wrote. "That argument prevailed, and the change will be made such that unique identifiers will no longer be sent. We'll also purge the database of the ones we've collected (but not actually even used) to date and instead find new ways of drawing the correlations required for data analysis which don't have as high a risk to user privacy."

Talkback

39 comments
  • Privacy should always be the primary concern

    In today's Internet world, it is becoming more and more difficult to maintain privacy and security. No matter what the justification, it should never be permitted to collect personally identifiable data on users without their express permission. Even something as seemingly benign as crash data could be a risk to privacy if that data is correlated to a specific user.

    I understand and agree with the voluntary submission of crash data, as long as the user is given the choice to opt out.

    -Mike D
    http://www.daileymuse.com
    daileyml
    • Privacy is good...

      But I can also see a lot of merit in linking crashes together. If one machine is regularly crashing, it could be a corrupt registry or settings file, for example. And if the crashes from that machine are all over the map, or it crashes constantly on start, the team can re-categorise the reports appropriately.

      The problem is, how do you link the crash reports together, without actually identifying the machine...

      I don't have an answer off the top of my head, so I am happy that they put privacy first.
      pico_D
  • Kudos to the Firefox Development Team...

    ...for making the right choice in a tough decision. They've staked out the moral high ground on this issue. This is another (albeit rather esoteric) reason why I'm happy to be using Firefox under Windows and Linux. Thanks for the heads-up, Paula.
    Curbuntu
  • RE: Firefox team stops collecting data to ensure user privacy

    If Microsoft had done the same thing, the talkback column would have easily garnered over 300 responses.

    Firefox make an initial goob decision and everyone "understands" their position.
    DarienHawk67
  • "Unique Identifier" vs "Personally Identifiable"

    I'm glad that the FF team is worrying about user privacy, but an install-time hash code or some other (to others) meaningless number doesn't necessarily "invade privacy."

    Any troubleshooting in a network environment should begin with "Are you the only one having this problem, or are others having it too?" When the fuse blows on the network printer, no amount of workstation troubleshooting is going to resolve it. On the other hand, if only one user is having the problem, especially repeatedly, then one needs to spend a little more time reviewing that user's workstation environment.

    100 occurrences from the same workstation is not the same as 100 different users experiencing the same crash. I always assume that if I'm sending a crash report, some amount of identifiable information is being sent.

    FF always prompts "Do you want to report the crash?" and gives a REAL Yes/No - if you say "no" then nothing is sent. That was always what I considered the "privacy" option. I just assume that if I say "yes", then some machine-identifiable environment information will be sent, to help analyze the problem. Not name/address/credit card or the like, but relevant environment information about the machine that led to the crash. Why make life more difficult for people who are trying to help us?

    Bottom line is that kudos go to the Mozilla team for making their own work harder, but I for one don't see any major "privacy issue." I work a help desk and have several "frequent flyers" - and I've learned over time to look first at the things those users do wrong most often, when they call. ("Is there a red "X" on the network drive icon?") It saves a lot of time, for those who cause their own problems.

    And thanks, Mozilla, for a GREAT browser.

    PS - when I clicked "Submit" the first time, NoScript flagged a ZDNET Cross-Site-Scripting (XSS) attempt, and blocked the request. That's a perfect demonstration of why I LOVE FF! Thanks again!
    oldbaritone
    • well put

      Ditto on the thank you. Everyone I support runs NoScript and has cookies set to ask every time. There goes 80% of all your mayhem.

      The only thing I can see along the lines of 'invasion' might be the ability to geographically locate someone. For example it is determined (via IP) the user spends certain hours at home and others at some other location. But that's a stretch.

      But I'm not going to second guess those folks. Always better to err on the side of greater respect for privacy anyway.
      pgit
    • You're the only guy so far who understands.

      Couldn't agree more. A violation of privacy is only possible when you are able to associate data with a real person. If I send all sorts of embarrassing facts about some anonymous person, but there's no information in the collection that would allow someone to identify the person being talked about, then no violation of that person's privacy has occurred.

      I assume this is why a UID was used in this case: it's anonymous and can't be associated with a real person. No name, no address, no phone number, no IP. Just a UID. Some sort of identifying information is necessary to correlate bug reports, for exactly the reason that Oldbaritone described, and if they can make it a unique identifier that preserves the user's anonymity, that's pretty much the Platonic ideal.

      Any maintenance programmer who deals with bug reports (such as myself) would understand that. The Firefox team probably understands that just fine. They didn't do this to "preserve clients' privacy," because nobody's privacy can be violated by the method they were using. They did it to shut up the ignorant "privacy advocates" who, like most of the people who have replied to this thread so far, don't have the technical competence to understand what they're talking about and scream "Privacy violation!!!" as a knee-jerk reaction to just about anything. Accurately explaining the true situation requires a calm, rational explanation that takes a few paragraphs, and that just can't compete with a provocative sound bite. Sad, but true.
      masonwheeler
      • You have all missed the point

        The more data points you have, the easier it is to determine who the data comes from. When you start stringing all the data points together through a UID, it is only a matter of time before a customer's identity can be determined.

        AOL made the same mistake:

        http://en.wikipedia.org/wiki/AOL_search_data_scandal

        If one of the pieces of data in the bug report is a memory dump, a cookie or a URL where the crash took place, then you are well on your way to discovering the user's identity.
        jim@...
        • Re: You have all missed the point

          "If one of the pieces of data in the bug report is a memory dump, a cookie or a URL where the crash took place, then you are well on your way to discovering the user's identity."

          If one of the pieces is a memory dump, then that's all they need. The UID provides NOTHING compared to a memory dump. And your name is likely to show up in a memory dump, so I don't see how it matters.

          Regardless, the risk of a loss of privacy is minimal, while troubleshooting an issue will now be harder.
          notsofast
      • He's not the only one

        I was going to make a similar post... thought I'd be flamed to hell by the tin foil hatters though, lol.
        AzuMao
    • you are correct

      I hope the FF team is listening
      bobinbc
    • True that.

      I couldn't agree more. I used to work the helpdesk at several large institutions (gov., education, healthcare, software companies, etc.) and did years of desktop support. The situations you described were very real and happened daily and, indeed, some users become the "frequent flyers".

      The privacy issues are vital but you have totally nailed it with "relevant environment information about the machine that led to the crash. Why make life more difficult for people who are trying to help us" - "environment" being the details about OS, hardware, other software, and many strictly technical nuances, none of which are directly lighting up the red blinking light over anyone's head in particular. No personal information is disclosed/collected, just the technical environment in which the crash(es) happened.

      Yes, privacy is very important, but many people totally overdo it, and not only in the IT-related world. It's like a person going to see a doctor and then refusing to say where it hurts, LOL. The tin-hatters will always be paranoid and scream bloody murder, so I'm glad the Firefox team has applied the good faith approach. They can collect my data if they want, I have nothing to hide :)
      harry.n
    • I couldn't agree more

      It's a number. If I'm crashing left and right, then that information is important to know. Maybe the reason Firefox is crashing is because my machine just messed up.

      Now they could get 30 crashes and they won't know if that's 30 different machines or a single machine (though I suspect they could extrapolate that information by looking at the data sent back).

      I'm a staunch privacy advocate, but this doesn't give me any more privacy... it just makes it harder to troubleshoot problems.
      notsofast
  • RE: Firefox team stops collecting data to ensure user privacy

    As a university student on my way to receiving my Ph.D. in computer forensics, I see Internet security as the focus of the future, and it is a primary concern for 98% of the customers I have serviced in the past. Companies should pay for all information they gather since they sell it to others for profit. As a former private investigator, I know the value of buying and selling information.
    paulhalonen
    • Earth to paul

      Crash information is for fixing crashes, not selling..
      AzuMao
    • LOL!

      "crash reports"! hello!!!!
      FF is not buying /selling users' information. They are trying to FIX FIREFOX CRASHES.
      You wanted to sound so high and mighty that you totally missed the point.
      harry.n
  • RE: Firefox team stops collecting data to ensure user privacy

    I personally downloaded Firefox, Thunderbird, and SeaMonkey, KNOWING they were in testing stages, and I don't mind if they use personal info to determine if the problems come from the same computer. I have been on these things since 1984 and previously used Netscape. I thought AOL had won and I had lost Netscape forever, so I was very tickled to get this product. In all my years I don't remember hearing of security problems for people that use Netscape or Firefox compared to Microsoft. Maybe that's because the hackers don't believe it's a threat, but since Micro went to Vista and now 7, they should rethink. ANYWAY, I bought the exact same system for my daughter and one for myself, and yet they both behave in different ways, so I can see how it would be helpful to gather this info.
    My opinion is go ahead, I'll let ya.
    dixie45
  • RE: Firefox team stops collecting data to ensure user privacy

    In this case I would rather have results than privacy.
    slemenda@...
  • RE: Firefox team stops collecting data to ensure user privacy

    You're kidding, right? You want Firefox to PAY for the diagnostic data it takes to keep their product running in the complex, messy web world? Or what?! Just have their customers live with the problems? Get real. Read the posting just above yours. This is really just a political move to quiet non-technical screamers. I want my browser to work. ALL of the time. They can gather as much info from me as they want to help them make that happen.
    roger@...
  • RE: Firefox team stops collecting data to ensure user privacy

    I don't care if the FF team collects my data - I would like to not have FF crash 5-7 times a DAY!!!!!!!!!!!!

    This is annoying, disruptive, and makes me look at Chrome with loving eyes....
    the_zamalek