Rambling about passwords

Common Passwords

What you’re looking at there is a list of the 32 most common passwords from among the set of more than 32 million users of RockYou. The top item, ‘123456’, was used by more than 300,000 users.

We don’t normally get to look at actual user data in sets this large, but one benefit of the recent privacy breach at RockYou is that researchers get a look at the data (and we get to see some of the aggregate results).

Of course, some caveats apply when drawing conclusions from the data. For example, while the site apparently has 32 million users (or at least 32 million accounts, which isn’t quite the same thing), I think we can assume that the nature of the site (as far as I can tell from a quick inspection, it’s for making slideshows that you can embed into social media, or something like that) results in some self-selection effects in the pool of users. It wouldn’t surprise me to find that the site skews heavily towards a particular demographic, so that kind of thing would need to be taken into account when reaching any conclusions about the data.

It’s also likely that people would consider a password to the site to not be a “high security” item, as compared to, say, a banking site, or a blog/facebook/whatever. So for at least some users the password selection algorithm would only speak to their behaviour at sites they perceive as “low risk”–you can’t draw conclusions about the behaviour of users in all situations from what they do at RockYou.

Even with those caveats in place, though, the list says a few things. Most of the passwords are painfully simple. None of them contain symbol characters or mixed case. The ones that do mix letters and digits are the single dumbest examples of a password you could make by doing that. None of that particularly surprises me–it’s what I would expect. Most of the users probably open attachments in email from people they don’t know, and some of them probably think they’re getting a bunch of illicit money out of Nigeria.

What does surprise me a little is the set of first name passwords, and their rankings. I might have expected to see those, but I think I would have expected them to occur in the same order that the names occur in the population–they’d be either the user’s name, or the name of someone special to them, presumably, and over a data set this large that should tend towards the same rankings as the names. So why is “nicole” the first one? Why “daniel” next?

I wish I thought “rockyou” being in the top ten was a general result–people giving the devil horns to computers–but I’m afraid that at a site called RockYou that one’s probably a gimme.

“chocolate” I can accept–it aligns with my understanding of women. “FRIENDS” I suspect is again site-specific. But what’s up with “babygirl” and “monkey”? Lots of new parents making slideshows for the family? “soccer” one assumes is also a ‘functional’ choice–used for soccer slideshows?

“tigger”, I admit, confuses me.

Apparently to crack a site like this I won’t need rainbow tables. A dictionary of 5000 common passwords would get me into over 6 million of the accounts:

More disturbing, said Mr. Shulman, was that about 20 percent of people on the RockYou list picked from the same, relatively small pool of 5,000 passwords.

The explanation for this that the researchers offer isn’t a shock:

Security experts suggest that we are simply overwhelmed by the sheer number of things we have to remember in this digital age.

“Nowadays, we have to keep probably 10 times as many passwords in our head as we did 10 years ago,” said Jeff Moss, who founded a popular hacking conference and is now on the Homeland Security Advisory Council. “Voice mail passwords, A.T.M. PINs and Internet passwords — it’s so hard to keep track of.”

I guess that makes sense. Particularly for what’s probably viewed as a “low risk” site by anyone in the user population who even passingly thinks about security. Still, I’m a tinfoil hat guy, so for me passwords need to look like ‘6D@HOyf]PoF’ or I get nervous.

And I need to have a different password for every site I’m going to use as more than a one-off–as a paranoid I would worry that if I used the same password in different places, system operators from one place could misuse the information to access my resources at another site. I actually saw this happen a fair bit in the old dial-up BBS days before the Internet [1].

Of course I can’t remember passwords like that for the hundreds of places I have accounts, so I make use of Password Safe (there are lots of equivalent tools–this is just the one I happen to have started using years ago, and it works). It’s a pain in the ass, but that’s the deal with security–it’s a trade off between risk and ass-pain.

While I understand that most people wouldn’t want to be bothered with something like that, I do wonder why we haven’t seen more widespread adoption of the site-specific password generators that use a master key to hash up a password for any site based on the site name. People don’t have to understand how the things work, or what a hash algorithm is–all they would need to know is that whenever they press the “password fill” button they need to enter their master password, and the computer will fill the field with the site-specific password for that site. To the end user it’s functionally equivalent to using the same password everywhere, but the level of security is vastly improved. There are Firefox plugins that implement this now–things like Password Hasher or PwdHash–but I wonder why something like this isn’t just built into the browsers.

If it were, a leak like RockYou’s wouldn’t have presented the same kind of issues–yes, the malefactors would still have everyone’s RockYou passwords, but there wouldn’t be any issue of being able to reuse those passwords to get access to the users’ resources on other systems.

And we wouldn’t have this embarrassing list to look at.
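For the curious, here is a rough sketch of the site-specific generator idea described above: hash a master password together with the site name and use the result as that site’s password. This is not the actual algorithm used by Password Hasher or PwdHash; the function name `site_password`, the choice of PBKDF2, the iteration count, and the output alphabet are all my own assumptions for illustration.

```python
import base64
import hashlib


def site_password(master: str, site: str, length: int = 16,
                  iterations: int = 100_000) -> str:
    """Derive a deterministic, site-specific password (illustrative only)."""
    # 32 derived bytes encode to 43 base64 characters plus one '=' of padding,
    # so cap the length to stay clear of the padding character.
    if not 1 <= length <= 43:
        raise ValueError("length must be between 1 and 43")
    # Normalize the site name so "RockYou.com " and "rockyou.com" match.
    salt = site.strip().lower().encode("utf-8")
    # PBKDF2 is deliberately slow, which makes guessing the master password
    # from a single leaked site password more expensive.
    raw = hashlib.pbkdf2_hmac("sha256", master.encode("utf-8"), salt,
                              iterations, dklen=32)
    # Base64 draws from upper case, lower case, digits, and two symbols
    # ('@' and '#' stand in for the usual '+' and '/').
    text = base64.b64encode(raw, altchars=b"@#").decode("ascii")
    return text[:length]


if __name__ == "__main__":
    # Same master password, different sites, unrelated-looking passwords.
    print(site_password("my master password", "rockyou.com"))
    print(site_password("my master password", "example.org"))
```

The same master password and site name always produce the same output, so nothing needs to be stored, and a leak at one site reveals only that site’s derived password. A real tool would also have to cope with sites that insist on particular character classes or lengths, which this sketch ignores.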

  1. Yes, I have been a geek a long time.

  2 comments for “Rambling about passwords”

  1. January 22, 2010 at 7:31 am

    Tigger is a pretty popular self-nickname. The character is pretty popular if Disney sweatshirts are any guide.

    In addition to the number of passwords, I wouldn’t be surprised to find the range of security levels merging in the typical mind. That is, I imagine it’s hard for users to assess level of threat and they’ll probably converge. That’s not so bad if I “over”-protect myself for the danger here, but it more likely means I under-assess the danger at other sites. I think there’s a danger of a sort of security fatigue lurking in the grass.

  2. January 23, 2010 at 1:18 am

    That security-merging issue is particularly interesting in the context of “low risk” sites. If you think “what’s the worst that could happen with a slideshow site” the analysis goes one way. If you think “what other resources could someone potentially access with information gleaned by getting access to this one” the calculus probably changes… but I suspect most people don’t get that far. I mean most probably don’t even think explicitly about risk beyond “money-related accounts must be secure”, but even the ones who do probably don’t think about the larger context.


This work by Chris McLaren is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Canada.