What Are MD5, SHA-1, and SHA-256? A Developer Guide to Hashing

Hash functions are one of those foundational concepts in computing that show up everywhere once you know to look for them. The 40-character string you see next to a software download is a hash. The gibberish in a URL that identifies a Git commit is a hash. The reason your password can be verified without the service storing your actual password involves hashing. File deduplication in cloud storage uses hashing. Content delivery networks use hashes to identify cached resources.

Understanding the three most common hash functions - MD5, SHA-1, and SHA-256 - means understanding not just what they are, but why some are still used and some have been retired from security use, and what the right choice is for different situations.

What a Hash Function Does

A cryptographic hash function takes an input of any size - a single character, a sentence, a gigabyte file - and produces a fixed-length output called a hash or digest. The output length is determined by the algorithm: MD5 always produces 128 bits (32 hex characters), SHA-1 always produces 160 bits (40 hex characters), SHA-256 always produces 256 bits (64 hex characters).
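The fixed output length is easy to check with Python's standard hashlib module. The following is a minimal sketch; the helper name digest_lengths and the sample inputs are illustrative choices, not part of any library.

```python
import hashlib

def digest_lengths(data: bytes) -> dict:
    """Return the hex digest length of each algorithm for the given input."""
    return {
        name: len(hashlib.new(name, data).hexdigest())
        for name in ("md5", "sha1", "sha256")
    }

# Inputs of very different sizes still produce digests of identical length.
for sample in (b"a", b"a sentence of ordinary length", b"x" * 1_000_000):
    print(len(sample), digest_lengths(sample))
```

Each line of output should report 32, 40, and 64 hex characters regardless of how large the input was.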

Two properties make hash functions useful. First, the same input always produces the same output. Hash a file today and hash it again a month from now on a different computer - you get the same hash. This lets you use hashes as a form of fingerprint. Second, any change to the input, no matter how small, produces a completely different output. Change one character in a million-character document and the hash changes entirely. This property is called the avalanche effect.
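Both properties can be observed directly. The sketch below, assuming Python's hashlib, hashes two strings that differ by one character and counts how many of the 256 output bits differ; the function name sha256_bits_changed and the sample strings are made up for illustration.

```python
import hashlib

def sha256_bits_changed(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")

original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cog"  # one character changed

# Determinism: hashing the same bytes twice gives the same digest.
print(hashlib.sha256(original).hexdigest() == hashlib.sha256(original).hexdigest())

# Avalanche effect: a one-character change typically flips about half the bits.
print(sha256_bits_changed(original, modified), "of 256 bits differ")
```

A count somewhere near 128 is what you would expect from a well-behaved hash function: each output bit has roughly even odds of flipping.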

Hash functions are also one-way: given a hash value, there's no known practical way to compute the original input from it. The only general way to find an input that produces a given hash is to try many inputs until you find one that works.
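To make the "try many inputs" idea concrete, here is a small sketch that recovers a 4-digit PIN from its SHA-256 hash by exhaustive search. The function name find_pin and the PIN value are hypothetical. The search only succeeds because the input space is tiny (10,000 candidates); the same loop is hopeless against long, unpredictable inputs.

```python
import hashlib
from itertools import product
from string import digits
from typing import Optional

def find_pin(target_hex: str, length: int = 4) -> Optional[str]:
    """Try every numeric PIN of the given length until one hashes to target_hex."""
    for candidate in product(digits, repeat=length):
        pin = "".join(candidate)
        if hashlib.sha256(pin.encode()).hexdigest() == target_hex:
            return pin
    return None  # no PIN of this length produces the target hash

# Pretend we only know the hash, not the PIN that produced it.
target = hashlib.sha256(b"4821").hexdigest()
print(find_pin(target))
```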

What Collision Resistance Means

A hash function is collision-resistant if it's computationally infeasible to find two different inputs that produce the same hash output. This property is crucial for security applications. If you can find two documents with the same hash, you can potentially forge digital signatures or bypass integrity checks.

Every hash function has collisions - since it produces fixed-length output from arbitrary-length input, there must be different inputs that share the same hash, by the pigeonhole principle. The question is whether those collisions can be found in practical time. For a strong hash function, finding a collision should take longer than the age of the universe with current computing power. When researchers find ways to produce collisions efficiently, the function is considered 'broken' for security purposes.
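The relationship between output size and collision cost can be seen with a deliberately weakened hash. The sketch below truncates SHA-256 to a small number of bits (a toy construction invented for this example, not a real algorithm) and searches for two inputs that collide. The names truncated_hash and find_collision are illustrative.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int = 24) -> int:
    """Keep only the top `bits` bits of SHA-256: a deliberately weak toy hash."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def find_collision(bits: int = 24) -> tuple:
    """Hash successive counter strings until two inputs share a truncated hash."""
    seen = {}  # truncated hash value -> first input that produced it
    for i in count():
        data = str(i).encode()
        h = truncated_hash(data, bits)
        if h in seen:
            return seen[h], data
        seen[h] = data

a, b = find_collision(24)
print(a, b, truncated_hash(a) == truncated_hash(b))
```

With 24 bits, a collision usually turns up after a few thousand attempts. Every additional output bit raises the expected work by a factor of about the square root of two, which is why a full 256-bit output is out of reach for this kind of generic search.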

MD5: Still Useful, No Longer Secure

MD5 was designed in 1991 and was the dominant hash function for most of the 1990s and early 2000s. In 2004, researchers demonstrated a practical collision attack - the ability to produce two different inputs with the same MD5 hash in a few hours. By 2008, the attack had been refined to the point where a fake SSL certificate could be created by exploiting MD5 collisions.

For any security application, MD5 is no longer appropriate. Don't use it for password hashing, digital signatures, certificate validation, or any context where collision resistance matters. But MD5 is still widely used for non-security checksums. The reason is simple: it's fast, it's universally supported, and for verifying that a file wasn't corrupted during download (as opposed to detecting malicious tampering), collision resistance isn't required. If the file hash matches the expected hash, the file almost certainly arrived without accidental corruption.
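A download integrity check along these lines might look like the following sketch. It reads the file in chunks so large files do not need to fit in memory. The function names file_md5 and matches_checksum are hypothetical, and the check only guards against accidental corruption, not deliberate tampering.

```python
import hashlib

def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare a file's MD5 against a published checksum, ignoring case and whitespace."""
    return file_md5(path) == expected_hex.strip().lower()
```

Swapping hashlib.md5 for hashlib.sha256 gives the same check with a collision-resistant hash, which is the better choice whenever the expected value comes from a trusted source and tampering is a concern.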

SHA-1: Officially Deprecated

SHA-1 produces a 160-bit hash and was the primary replacement for MD5 through the 2000s. It was used in SSL/TLS certificates, code signing, and many security protocols. In 2017, researchers at Google and CWI Amsterdam demonstrated the first practical SHA-1 collision (called the SHAttered attack), which required enormous computational resources but proved that the long-known theoretical weakness was exploitable.

SHA-1 has been formally deprecated from all security uses. Major browsers stopped accepting SHA-1 certificates years ago. Any new security implementation should use SHA-256 or stronger. Like MD5, SHA-1 is still occasionally encountered in legacy systems and older file checksums, but no new use is appropriate.

SHA-256: The Current Standard

SHA-256 is part of the SHA-2 family published by NIST in 2001. It produces a 256-bit hash and is currently considered cryptographically strong - no practical attacks against it are known. It's used in TLS certificates, code signing, Bitcoin's proof-of-work, and most modern security protocols. Git also supports SHA-256 as an optional object format, although Git commit hashes are still SHA-1 by default, which is why they are 40 hex characters long.

SHA-256 is fast enough for general-purpose use on large files and data streams. Brute-forcing a long, random input from its SHA-256 hash is impractical, but that protection comes from the size of the input space, not from the speed of the function. For passwords, which are usually short and predictable, SHA-256 is too fast for direct use. Modern hardware can compute billions of SHA-256 hashes per second, which means an attacker with a leaked password hash database can try very large dictionaries quickly.

Why Password Storage Needs Something Different

Storing passwords as SHA-256 hashes in a database is significantly better than storing them in plaintext, but it's not the right solution. The problem is speed: SHA-256 is designed to be fast, and that's the wrong property for password hashing. An attacker with your hash database can try millions or billions of candidate passwords per second.
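You can get a rough feel for this speed on your own machine. The sketch below estimates single-threaded SHA-256 throughput in pure Python; the function name sha256_rate and the candidate strings are made up for the example. Treat the result as a loose lower bound, since dedicated cracking tools running on GPUs are far faster than an interpreted loop.

```python
import hashlib
import time

def sha256_rate(seconds: float = 0.5, batch: int = 1000) -> float:
    """Estimate single-threaded SHA-256 hashes per second for short inputs."""
    hashed = 0
    start = time.perf_counter()
    while time.perf_counter() - start < seconds:
        # Hash a batch between clock reads so timing overhead stays small.
        for i in range(hashed, hashed + batch):
            hashlib.sha256(b"candidate-%d" % i).digest()
        hashed += batch
    return hashed / (time.perf_counter() - start)

print(f"about {sha256_rate():,.0f} SHA-256 hashes per second on one core")
```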

Purpose-built password hashing functions like bcrypt, scrypt, and Argon2 are intentionally slow. They introduce a configurable cost factor that can make each hash computation take tens or hundreds of milliseconds on purpose. You can increase the cost factor as hardware gets faster to keep pace. This means an attacker trying to crack passwords has to spend orders of magnitude more time per guess than with a fast hash like SHA-256. Bcrypt, scrypt, and Argon2 are the correct choices for user password storage in any current application.
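As a minimal sketch of what this looks like in practice, the example below uses scrypt, which Python's standard library exposes as hashlib.scrypt (available when Python is built against a recent OpenSSL). The cost parameters, the 16-byte random salt, and the function names hash_password and verify_password are assumptions chosen for illustration, not tuned recommendations; bcrypt and Argon2 need third-party packages and are not shown.

```python
import hashlib
import hmac
import os

# Illustrative scrypt cost parameters (assumed values; tune for your hardware).
N, R, P = 2**14, 8, 1

def hash_password(password: str) -> tuple:
    """Return (salt, derived_key); store both alongside the user record."""
    salt = os.urandom(16)  # fresh random salt for every password
    key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
    return salt, key

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=32)
    return hmac.compare_digest(key, expected)

salt, key = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, key))
print(verify_password("wrong guess", salt, key))
```

Raising N increases both the time and the memory each guess costs, which is the configurable cost factor in action.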

Online Quick Tools provides a free hash generator that computes MD5, SHA-1, and SHA-256 for any input instantly in your browser. It is useful for file integrity checking, debugging, testing, and verifying hash values - all without your data leaving your device.