Every week I have at least one conversation with a security decision maker explaining why a lot of the hyperbole about passwords – “never use a password that has ever been seen in a breach,” “use really long passwords”, “passphrases-will-save-us”, and so on – is inconsistent with our research and with the reality our team sees as we defend against 100s of millions of password-based attacks every day. Focusing on password rules, rather than things that can really help – like multi-factor authentication (MFA), or great threat detection – is just a distraction.
Because here's the thing: When it comes to composition and length, your password (mostly) doesn't matter.
To understand why, let's look at what the major attacks on passwords are and how the password itself factors into the equation for an attacker. Remember that all your attacker cares about is stealing passwords so they, or others, can access accounts. That's a key difference between hypothetical and practical security – your attacker will only do really wacky, creative stuff you hear about at conferences (or wherever) when there's no easier way and the target of the attack justifies the extra effort.
| Also known as… | Frequency | Difficulty: mechanism | User assists attacker by… | Does your password matter? |
|---|---|---|---|---|
| Breach replay, list cleaning | Very high – 20+M accounts probed daily in MSFT ID systems | Very easy: Purchase creds gathered from breached sites with bad data at rest policies, test for matches on other systems. List cleaning tools are readily available. | Being human. Passwords are hard to think up. 62% of users admit reuse. | No – attacker has exact password. |
| Man-in-the-middle, credential interception | Very high. 0.5% of all inbound mails. | Easy: Send emails that promise entertainment or threaten, and link user to doppelganger site for sign-in. Capture creds. Use Modlishka or similar tools to make this very easy. | Being human. People are curious or worried and ignore warning signs. | No – user gives the password to the attacker. |
| | | Medium: Malware records and transmits usernames and passwords entered, but usually everything else too, so attackers have to parse things. | Clicking links, running as administrator, not scanning for malware. | No – malware intercepts exactly what is typed. |
| Dumpster diving, physical recon, network scanning | | Difficult: Search user's office or journal for written passwords. Scan network for open shares. Scan for creds in code or maintenance scripts. | Writing passwords down (driven by complexity or lack of SSO); using passwords for non-attended accounts. | No – exact password discovered. |
| Blackmail, insider threat | Very low. Cool in movies though. | Difficult: Threaten to harm or embarrass human account holder if credentials aren't provided. | | No – exact password disclosed. |
| Guessing, hammering, low-and-slow | Very high – accounts for at least 16% of attacks. Sometimes 100s of thousands broken per day. Millions probed daily. | Trivial: Use easily acquired user lists, attempt the same password over a very large number of usernames. Regulate speed and distribute across many IPs to avoid detection. Tools are readily and cheaply available. See below. | Using common passwords such as qwerty123 or Summer2018! | No, unless it is in the handful of top passwords attackers are trying. |
| Database extraction, cracking | | Varies: Penetrate network to extract files. Can be easy if target organization is weakly defended (e.g. password-only admin accounts), more difficult if appropriate defenses of database, including physical and operation security, are in place. Perform hash cracking on password. Difficulty varies with encryption used. See below. | | No, unless you are using an unusable password (and therefore, a password manager) or a really creative passphrase. See below. |
Only in password spray and cracking attacks does the password have any bearing at all on the attack vector. Go turn on MFA if you haven't, then let's drill into those to see what makes a “good password” for those cases.
Password Spray
Ok, this one is easy. Your job is to have a password that isn't easily guessed. But when I say easily, I mean *easily*. In the password spray attacks detected by our team in the last year, we found that most attackers tried about 10 passwords (some as few as 2, some as many as 50) over the duration of the attack.
The thing about password spray is that it is detectable, and once detected the login server can shut it down. The faster the criminals go, the faster they are detected, so low and slow is the order of the day. That means each guess is somewhat “precious” – attackers know they need to maximize their impact before they are detected, so they build histograms from existing leaks and use them to generate their attacks.
The graph below shows a recent pair of password spray attacks by way of example. Each color represents the hash values of failed password requests; the “hills” indicate multiple failures on the same hash value in that timeframe. There are two distinct attackers – the lower “hills” try 45 distinct password pulses over 22 hashes at around 4,000 accounts per hour; the higher “spikes” try 15 at around 10,000 accounts per hour, running over two weeks. Note the overlap in hashes used by the attackers.
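To make the "hills" idea concrete, here is a minimal sketch of how failed sign-ins could be bucketed by password hash and hour to surface a spray pattern. This is an illustration only, not the detection logic our systems run: the `spray_candidates` function, the event tuple layout, and the `min_accounts` threshold are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict
from datetime import datetime


def spray_candidates(failed_logins, min_accounts=3):
    """Flag password hashes that fail against many distinct accounts in one hour.

    failed_logins: iterable of (timestamp, username, password_hash) tuples.
    min_accounts:  hypothetical threshold, picked only for this example.
    """
    accounts_per_bucket = defaultdict(set)
    for timestamp, username, password_hash in failed_logins:
        # Truncate the timestamp to the hour so failures land in hourly buckets.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        accounts_per_bucket[(hour, password_hash)].add(username)
    # Keep only buckets where the same hash failed on enough distinct accounts.
    return {
        bucket: len(users)
        for bucket, users in accounts_per_bucket.items()
        if len(users) >= min_accounts
    }


events = [
    (datetime(2019, 7, 1, 9, 5), "alice", "h1"),
    (datetime(2019, 7, 1, 9, 17), "bob", "h1"),
    (datetime(2019, 7, 1, 9, 42), "carol", "h1"),
    (datetime(2019, 7, 1, 9, 50), "alice", "h2"),  # one user mistyping, not a spray
]
flagged = spray_candidates(events)
print(flagged)
```

A single user fumbling their own password produces many failures on one account; a spray produces failures for one hash across many accounts, which is the signal this grouping isolates.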
If your password is not in the exact list your attacker is trying, then you are out of harm's way. The attackers are mostly working from the same lists, so they try the same passwords. Here are the top 10 we are seeing in guessing attacks on our system:
There's nothing fancy here. Users pick these passwords mostly because their high-order bit is simplicity – they can just run their fingers along the keyboard for the top passwords. Attackers try them because statistics of existing breaches tell them to. More targeted or advanced attackers may exploit your complexity and expiration rules to good effect; “Summer2019!” will satisfy most complexity requirements, and is easy to remember if you are making your users follow complexity and expiration rules (don't require either; both are demonstrably harmful – read https://aka.ms/passwordguidance to understand why). Attackers may also be clever about trying things your employees, specifically, may use – against Microsoft, for example, attackers might try “Office2019” or “Azure19” or “XboxOne.”
But again, the average attacker is moving so slowly in response to detection systems that they only get a few guesses in. So, your password only matters if it's included in that short list the attacker is trying. As an admin, you want to prevent use of these commonly attacked passwords when passwords are created or changed. We have been using this approach for years in Microsoft account (our consumer identity system which supports Xbox, Skype, Outlook, OneDrive, etc.) and at this point, we have effectively mitigated password spray as a mechanism on active accounts (we have the same system in Azure Active Directory called Password Protection).
So, as far as password spray is concerned – your password doesn't matter – as long as it isn't in the “most common passwords” top 50 list!
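As a rough illustration of blocking commonly attacked passwords at creation or change time, here is a minimal sketch. It is not how Azure Active Directory Password Protection is implemented; the `BANNED` set, `is_banned`, and `check_new_password` are hypothetical names, and the list is seeded only with the examples quoted in this post. A real deployment would use a maintained list of the passwords attackers are actually trying.

```python
# Hypothetical banned list, seeded only with examples mentioned in this post.
BANNED = {
    "qwerty123",
    "summer2018!",
    "summer2019!",
    "office2019",
    "azure19",
    "xboxone",
}


def is_banned(candidate: str, banned=BANNED) -> bool:
    """Return True if the candidate matches a banned password, ignoring case."""
    return candidate.casefold() in banned


def check_new_password(candidate: str):
    """Return (accepted, reason) for a password being set or changed."""
    if is_banned(candidate):
        return (False, "password is on the commonly attacked list")
    return (True, "ok")


print(check_new_password("Summer2019!"))
print(check_new_password("a long phrase nobody sprays"))
```

The check is a plain set lookup, so it adds no meaningful cost to a password change, and the list only needs to cover the short set of guesses attackers can afford to make.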
Database Extraction and Cracking
Ok, that leaves the last case, the one that gets people into creating really wacky password rules. That is the “what if the database is extracted?” case. This is a popular, scary attack to talk about. We don't have good numbers on how often it happens to Active Directory domain controllers because most organizations are understandably tight-lipped when breached. We have lots of sensors deployed to detect extraction of our cloud credential hashes (we won't discuss the specifics of those sensors for security reasons). We have no evidence of extraction from our cloud systems, but the high-profile breaches of other big systems over the years teach us to stay humble – and cautious. It *can* happen.
We're going to wallow into a little crypto here to summarize how we store passwords. A couple of definitions first:
- Hashing means creating a one-way, non-recoverable transformation of the password. We use SHA256, which transforms ANY string into a 256-bit sequence from which the original password cannot be mathematically recovered.
- Iterating means repeating that algorithm, which just burns the same amount of compute horsepower for each turn of the crank.
- Salt means adding something to the password before we hash so that the same password doesn't result in the same hash for different users. Without salt, each password the attacker breaks gives them access to all accounts using that password; with salt, they have to attack one account at a time because the hash can't be precomputed and looked up across the database (Salt isn't a “secret” and is usually stored with the hash – an attacker who has the hash also has the salt).
Azure Active Directory uses 1000 iterations of SHA256 over the salted password to generate our per-user, per-password hash. If the incoming password is synchronized from on-premises, we receive a hash of that on-premises password, then re-hash it using the same scheme. What this means is that we never store the password directly; by repeating the algorithm, we can tell if the password we're checking at login is the same as the password that was set by comparing the generated hashes. In addition to this, the database in which the passwords are stored is encrypted (encryption also scrambles data, but the data can be recovered with a key), and then stored on an encrypted drive using BitLocker.
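The three definitions above (hash, iterate, salt) fit in a few lines of code. The sketch below is an assumption-laden illustration, not the actual Azure Active Directory construction: how the salt is combined with the password, the 16-byte salt size, and the function names are all choices made for the example. Production systems generally rely on a vetted key-derivation function such as Python's `hashlib.pbkdf2_hmac` rather than a hand-rolled loop.

```python
import hashlib
import hmac
import os

ITERATIONS = 1000  # iteration count quoted in the post


def hash_password(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Salt the password, hash it, then re-hash the digest (assumed construction)."""
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    for _ in range(iterations - 1):
        digest = hashlib.sha256(digest).digest()
    return digest


def new_record(password: str):
    """Create the (salt, hash) pair that would be stored; the salt is not secret."""
    salt = os.urandom(16)  # assumed salt size for this example
    return salt, hash_password(password, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the hash at sign-in and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt), stored)


# Two users choosing the same password end up with different stored hashes.
salt_a, hash_a = new_record("Summer2019!")
salt_b, hash_b = new_record("Summer2019!")
print(hash_a != hash_b, verify("Summer2019!", salt_a, hash_a))
```

Because each record carries its own salt, cracking one user's hash tells the attacker nothing about another user's hash, even when both picked the same password.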
To extract the data from an on-premises AD environment, the attacker needs to extract the files from a domain controller. Typically, this means the attacker has achieved domain admin status in the network, though variations in security posture can change this. This takes a certain amount of work. To extract the data from the Azure Active Directory cloud environment, the attacker would need access to the environment, the ability to decrypt the database and, if using physical theft, the ability to break the BitLocker keys. This requires considerably more work. Why put out so much effort when you can just find the password (reuse), guess it (spray) or just ask nicely (phish)?
But let's say the bad guys have secured a database full of hashes – how do they proceed?
- Get a cracking rig. The cryptocurrency markets have driven costs here waaaay down and it is now feasible to build a rig capable of cracking in excess of 100B (yes, that's a B) passwords per second against SHA256 for $20,000 (as of July 2019). Organized criminals and governments can blow that budget away, and quantum may or may not vastly accelerate even these numbers.
- Do the homework to figure out the algorithm, salt and any organization specific password rules (min/max length, complexity, etc.). This is usually straightforward, and in our case, we even publish it. But even if it isn't known, assume the attacker controls at least one account in the system, and can insert a known password from which they reverse engineer the algorithm.
- Build an initial list of passwords to try. They start by taking the >500M passwords which have been disclosed in any breach, phish, or spray attack. Think of this as “every password anyone has ever thought of, ever.” Some guidance says to ban all passwords on this list. Try that and see how successful your users are at choosing passwords at all.
- Try every password on that list against the target account. Statistically, this will break >70% of user passwords. 500M is ½ a billion, home rigs can run 100B guesses a second – so that complete list just takes 5ms to try. An attacker can run the complete list against 200 accounts every second in a rig like this. Yes, this means most accounts fall almost instantly, and that database extraction detection is super important, as is login anomaly detection, and most of all – turn on MFA!
- In the unlikely event that the target account's password still isn't cracked, build a list of all popular phrases, song lyrics, news headlines, whatever they can think of to pick up from search engines, Wikipedia, popular articles, etc. These are available pre-canned in the hash-breaker communities. This may pick up another 5-7% of user passwords.
- Still no luck? The attacker can run every allowable password going as far as the rig and time will allow. Assuming 96 easily typable characters (without any fancy tricks), we get 96 possible values for each position in the password. Brute forcing like this becomes prohibitively expensive quickly, even on our 100B passwords per second rig. In practice, with this rig, the attacker can try every password up to 8 characters in a day, 9 characters in 3 months, 10 characters in 21 years – each additional character takes 96 times longer, so the attack caps out in practical terms at 9 characters with this rig. All of this buys you only perhaps another 5% of passwords broken:
| Password length | Possible permutations | Time in seconds | Time in minutes | Time in hours | Time in days |
|---|---|---|---|---|---|
| 6 | 782,757,789,696 | 8 | 0.13 | 0.002 | 0.00009 |
| 7 | 75,144,747,810,816 | 751 | 12.52 | 0.21 | 0.01 |
| 8 | 7,213,895,789,838,340 | 72,139 | 1,202.32 | 20.04 | 0.83 |
| 9 | 692,533,995,824,480,000 | 6,925,340 | 115,422.33 | 1,923.71 | 80.15 |
| 10 | 66,483,263,599,150,100,000 | 664,832,636 | 11,080,543.93 | 184,675.73 | 7,694.82 |
(While we're here, let's point out that increasing the iterations here is basically a linear slowdown for the attack. Going from 1,000 to 10,000 iterations doesn't even buy one additional character. We'd need to go to 100,000 to move each row by one character – at the cost of 100 times the servers, rack space, and energy consumption.)
- Finally, the attacker can use predictable patterns (e.g. always start with a capital letter, follow with 3-6 lower case letters, 2-4 numbers and add an exclamation mark at the end) to create a high-ish probability subset of guessable passwords out to perhaps 12 characters. Returns are now vanishingly small, a few percent.
- Because of the salt, all that was for ONE account (but with about 85% probability of having succeeded). The attacker must now start over at step one for the next account whose password they want.
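The arithmetic behind the steps above is easy to reproduce. The sketch below uses only the figures quoted in this post (100B guesses per second, 96 typable characters, a 500M-entry list, and the capital / lowercase / digits / exclamation pattern); the function and constant names are hypothetical, and the results are back-of-the-envelope estimates rather than benchmarks.

```python
import math

RATE = 100_000_000_000   # 100B guesses per second, the rig quoted in the post
ALPHABET = 96            # easily typable characters
LIST_SIZE = 500_000_000  # size of the breached-password list


def keyspace(length: int, alphabet: int = ALPHABET) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet ** length


def seconds_to_exhaust(guesses: int, rate: int = RATE) -> float:
    """Time to try every guess at the given rate."""
    return guesses / rate


def pattern_keyspace() -> int:
    """Capital letter, 3-6 lowercase letters, 2-4 digits, trailing '!'."""
    lowers = sum(26 ** n for n in range(3, 7))
    digits = sum(10 ** n for n in range(2, 5))
    return 26 * lowers * digits


def extra_characters(slowdown: float, alphabet: int = ALPHABET) -> float:
    """How many characters of length a linear slowdown is worth to the defender."""
    return math.log(slowdown) / math.log(alphabet)


list_seconds = seconds_to_exhaust(LIST_SIZE)
days_by_length = {n: seconds_to_exhaust(keyspace(n)) / 86_400 for n in range(6, 11)}
pattern_seconds = seconds_to_exhaust(pattern_keyspace())

print(f"breach list: {list_seconds * 1000:.0f} ms per account")
for n, days in days_by_length.items():
    print(f"length {n}: {days:,.5f} days")
print(f"pattern subset: {pattern_seconds / 60:.1f} minutes")
print(f"10x iterations buys {extra_characters(10):.2f} characters")
```

Every added character multiplies the work by 96, while multiplying the iteration count by ten is worth only about half a character, which is why length beats iteration count in this scenario.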
The point is – your password, in the case of breach, just doesn't matter – unless it's longer than 12 characters and has never been used before – which means it was generated by a password manager. That works for some, but is prohibitive for others. If you are using a password manager, use the maximum possible length – there's no usability downside if you are already cutting and pasting.
Password managers have their own issues (usability, single high value target, etc.) but in this case a password manager makes a meaningful difference (against this unlikely event) by generating a long, random string.
Or you could just enable MFA. Ultimately, compromise via database extraction and cracking ends up being similar to guessing, phish, or replay – the attacker must try logging in with the compromised password, and at that point MFA is your safeguard.
The Inevitable Punchline
Your password doesn't matter except for password spray (avoid the top guessed passwords with a dictionary checker of some kind) or brute force (use more than 8 characters, or use a password manager if you are *really* nervous). That's not to say your password isn't terrible. It's *definitely* terrible, given the likelihood that it gets guessed, intercepted, phished, or re-used.
Your password doesn't matter, but MFA does! Based on our studies, your account is more than 99.9% less likely to be compromised if you use MFA.
With the increase in sophisticated MFA phishing and bigger cracking rigs (including quantum) what we *really* need is a cryptographically strong credential bound to the client hardware that stores a benign artifact online – which makes the inevitable punchline better creds (like FIDO2). But the assessment of current and next generation creds is the subject for another blog.
Stay safe out there!