What is a hash function?
A cryptographic hash function takes an arbitrary input — a string, a file, a 4 GB ISO image — and produces a fixed-size output called a hash, digest, or fingerprint. The MD5 hash of "hello" is the same length as the MD5 hash of the entire Linux kernel source: 128 bits, written as 32 hex characters. Hash functions have three properties that make them ubiquitous in software:
- Deterministic: the same input always produces the same output.
- Fast to compute, infeasible to reverse: a one-way function. Given a hash, you cannot recover the original input.
- Avalanche: flipping a single bit of input changes ~50% of the output bits. "hello" and "hellp" produce completely different hashes.
Cryptographic hashes additionally aim for collision resistance — it should be computationally infeasible to find two different inputs that produce the same hash. When this property breaks (as it has for MD5 and SHA-1), the algorithm is considered broken for security purposes.
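The determinism and avalanche properties listed above can be checked directly. The sketch below is illustrative only: the helper names are made up for this example, and it uses SHA-256 from Python's standard library. Note that "hello" and "hellp" differ by one character, which is several input bits rather than exactly one.

```python
import hashlib


def sha256_bits(data: bytes) -> int:
    """Return the SHA-256 digest of data as a 256-bit integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 output bits differ between two digests."""
    return bin(sha256_bits(a) ^ sha256_bits(b)).count("1")


# Deterministic: hashing the same bytes twice gives the same digest.
same = hashlib.sha256(b"hello").digest() == hashlib.sha256(b"hello").digest()
print("deterministic:", same)

# Avalanche: a one-character change flips roughly half of the output bits.
changed = differing_bits(b"hello", b"hellp")
print(f"bits changed: {changed} of 256")
```

The exact count varies from pair to pair, but it should land near 128.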
Hash functions show up everywhere in computing: file integrity checksums (verifying a downloaded ISO matches the publisher's value), Git commit IDs (SHA-1 / SHA-256), Bitcoin transaction IDs (double-SHA-256), digital signatures (sign the hash, not the document), Content-Addressable Storage (IPFS, deduplication), HMAC for API authentication, ETags for HTTP caching, and partition keys for distributed databases.
Hash algorithm comparison — MD5 vs SHA-1 vs SHA-256 vs SHA-512
Picking the right algorithm matters. Use the wrong hash for password storage and you'll end up in a breach disclosure. Use SHA-512 for ETags and you waste cycles. Here's the matrix:
| Algorithm | Output size | Status (2026) | Speed | Use for | Don't use for |
|---|---|---|---|---|---|
| MD5 | 128 bits (32 hex) | Broken (collisions) | Very fast | Non-security: cache keys, ETags, deduplication, file naming | Signatures, certificates, integrity against attackers, password storage |
| SHA-1 | 160 bits (40 hex) | Deprecated (SHAttered, 2017) | Fast | Legacy Git history, legacy systems | New code. Migrate to SHA-256. |
| SHA-256 | 256 bits (64 hex) | Secure | Fast (with hardware acceleration) | Default modern choice. Signatures, certificates, blockchain, integrity, content addressing. | Password storage (use bcrypt/argon2 instead — see below) |
| SHA-512 | 512 bits (128 hex) | Secure | Faster than SHA-256 on 64-bit CPUs | High-security signatures, when 256 bits feels insufficient, on 64-bit servers | Resource-constrained devices; situations where 256-bit is enough |
| SHA-3 / Keccak | 224, 256, 384, 512 bits | Secure (different math from SHA-2) | Slower than SHA-256 in software | Defense-in-depth (different design than SHA-2) | When SHA-256 is sufficient and ecosystem support matters |
| BLAKE3 | 256 bits (extensible) | Secure | Fastest cryptographic hash (parallelizable) | Big files, high-throughput systems, modern apps | Compatibility with legacy systems (not in Web Crypto API) |
Why MD5 and SHA-1 are "broken"
"Broken" means researchers have demonstrated collision attacks: they can construct two different inputs that produce the same hash. (Finding a second input that matches a given, fixed input is a harder problem, called a second-preimage attack, and it remains impractical for both algorithms.) For MD5, collisions are produced in seconds on a laptop (since 2008). For SHA-1, Google's 2017 SHAttered attack demonstrated practical collisions. Once collisions are practical, attackers can forge signatures, swap files in supply chains, and break integrity guarantees. For security-sensitive use cases, use SHA-256 or SHA-3 in 2026 and beyond.
For non-security use (cache keys, change detection, deduplication where attackers can't influence input), MD5 and SHA-1 are still fine and faster. Git still uses SHA-1 by default for commit IDs, but is migrating to SHA-256.
Cryptographic vs non-cryptographic hashes
Not every hash function aims for cryptographic security. Knowing which kind you need saves performance:
| Type | Examples | Speed | Collision resistance | Best for |
|---|---|---|---|---|
| Cryptographic | SHA-256, SHA-512, SHA-3, BLAKE3 | Slower (millions of ops/sec) | Designed to resist deliberate attacks | Signatures, certificates, content addressing, security checks |
| Non-cryptographic | xxHash, MurmurHash, CityHash, FNV, CRC32 | 10–100× faster (billions/sec) | Random collisions only — attackers can craft collisions | Hash tables, bloom filters, network checksums, deduplication where input is trusted |
Rule of thumb: if an attacker could control or influence the input, use a cryptographic hash. Otherwise, faster non-cryptographic hashes are usually a better fit.
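One way to see why attackers can craft collisions for a non-cryptographic checksum: CRC32 is linear over XOR, so the checksum of a combination of messages can be predicted from the individual checksums without any search. The sketch below is a minimal demonstration using zlib.crc32 from the standard library; the three messages are made-up examples and must have equal length.

```python
import hashlib
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


# Three hypothetical 13-byte messages.
a, b, c = b"transfer $100", b"transfer $900", b"refund   $050"
combined = xor_bytes(a, b, c)

# CRC32: the checksum of the XOR equals the XOR of the checksums.
crc_is_linear = zlib.crc32(combined) == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)

# SHA-256: no such relation holds between the digests.
sha_xor = xor_bytes(*(hashlib.sha256(m).digest() for m in (a, b, c)))
sha_is_linear = hashlib.sha256(combined).digest() == sha_xor

print("CRC32 linear:", crc_is_linear)
print("SHA-256 linear:", sha_is_linear)
```

That predictable structure is what makes CRC32 good at catching random transmission errors and useless against someone who edits the data on purpose.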
⚠️ Don't use these hashes for password storage
This is the most-misunderstood point about hashing. SHA-256 and SHA-512 are far too fast for password storage. A modern GPU computes 7+ billion SHA-256 hashes per second. A leaked database of SHA-256-hashed passwords can be cracked in hours.
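The "hours" figure is simple arithmetic. The sketch below assumes a single GPU at the 7 billion hashes per second quoted above and passwords of exactly 8 characters; real cracking rigs, wordlists, and password policies will shift the numbers considerably.

```python
GPU_HASHES_PER_SECOND = 7e9  # figure quoted in the text; varies by hardware


def seconds_to_exhaust(alphabet_size: int, length: int,
                       rate: float = GPU_HASHES_PER_SECOND) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / rate


for label, alphabet in [("lowercase", 26), ("alphanumeric", 62), ("printable ASCII", 95)]:
    hours = seconds_to_exhaust(alphabet, 8) / 3600
    print(f"{label:16s} 8 chars: {hours:10.2f} hours")
```

Under these assumptions the alphanumeric case finishes in under nine hours on one GPU. Full printable ASCII takes roughly eleven days, which a multi-GPU rig brings down to hours.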
For password hashing, use a slow, memory-hard, salted algorithm:
| Password hash | Year | Status | Recommendation |
|---|---|---|---|
| argon2id | 2015 (PHC winner) | Best | Default for new applications. Use defaults; raise time/memory cost as hardware improves. |
| scrypt | 2009 | Strong | Solid alternative to argon2id. Used by major cryptocurrencies. |
| bcrypt | 1999 | OK | Battle-tested. 72-byte input limit. Slightly weaker than argon2/scrypt against GPU attacks. |
| PBKDF2 | 2000 (RFC 2898) | Adequate | Use only when FIPS-140 compliance forces it. Set ≥ 600,000 iterations for SHA-256. |
| SHA-256 / SHA-512 alone | — | Don't | Too fast. Use bcrypt/argon2 instead. |
| MD5 / SHA-1 alone | — | Never | Both broken AND too fast. Don't. |
For real password storage, use an argon2id or bcrypt library on the server side. Need a strong random password to hash? Use the password generator first.
Why bcrypt and Argon2id replace SHA-256 for passwords
SHA-256 was designed for speed, which is exactly the wrong property for password storage. Modern GPUs evaluate ~7 billion SHA-256 hashes per second, so an exhaustive search of 8-character alphanumeric passwords finishes in hours. Argon2id (Password Hashing Competition winner, 2015) and bcrypt are deliberately tuned to take milliseconds per hash instead of nanoseconds. Argon2id is also memory-hard: each hash needs a configurable amount of RAM (typically tens of megabytes or more), which blunts GPU and ASIC parallelism. bcrypt uses only a few kilobytes, which is why it is somewhat weaker against GPUs. Library implementations of both generate and store a unique per-user salt automatically, so never roll your own salting on top of SHA-256. For tokens that need to look like a hash but originate from a session, see the JWT generator for signed tokens, and the UUID generator for unique non-secret identifiers.
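As a rough illustration of salted, slow password hashing, here is a sketch using hashlib.scrypt. scrypt is used only because it ships in Python's standard library (argon2id and bcrypt need third-party packages). The cost parameters are illustrative assumptions, not tuned recommendations, and the function names are invented for this example.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (about 16 MiB of memory per hash).
# Treat these as placeholders and tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; the salt is not secret."""
    salt = os.urandom(16)  # fresh random salt for every password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

In production, a maintained argon2id or bcrypt library is usually the better choice, since it also encodes the salt and cost parameters into a single stored string.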
Hashing in 8 programming languages
Same SHA-256 input, same output — every language. Below are minimal, copy-paste snippets for each runtime. Pair these with the Base64 encoder when you need to encode raw digest bytes for HTTP headers or JSON payloads.
JavaScript / Browser (Web Crypto API)
// SHA-256 of a string (Web Crypto is async)
async function sha256(text) {
  const bytes = new TextEncoder().encode(text);
  const hash = await crypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(hash))
    .map(b => b.toString(16).padStart(2, '0')).join('');
}
await sha256("hello world");
// → "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
// Available algorithms: SHA-1, SHA-256, SHA-384, SHA-512
// Note: MD5 is NOT in Web Crypto; use a JS lib like crypto-js if needed
Node.js
import { createHash } from 'node:crypto';
import { createReadStream } from 'node:fs';
// String → hex hash
const hash = createHash('sha256').update('hello world').digest('hex');
// "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
// Available: 'md5', 'sha1', 'sha256', 'sha384', 'sha512', 'sha3-256', etc.
// Streaming a large file (memory-efficient): feed each chunk to update(),
// then read the digest once the stream has ended
const h = createHash('sha256');
createReadStream('big-file.iso')
  .on('data', (chunk) => h.update(chunk))
  .on('end', () => console.log(h.digest('hex')));
Python
import hashlib
# String → hex digest
hashlib.sha256(b"hello world").hexdigest()
# 'b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9'
# Available: md5, sha1, sha224, sha256, sha384, sha512, sha3_256, blake2b, blake2s
# Hash a large file in chunks (don't load into memory)
def sha256_file(path):
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()
PHP
// String → hash
hash('sha256', 'hello world');
// "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
// File → hash (memory-efficient)
hash_file('sha256', '/path/to/big-file.iso');
// List supported algorithms
print_r(hash_algos());
// Password storage — use password_hash, NOT raw hashing
$secure = password_hash($plain, PASSWORD_ARGON2ID);
password_verify($plain, $secure); // true / false
Java
import java.security.MessageDigest;
import java.nio.charset.StandardCharsets;
byte[] bytes = "hello world".getBytes(StandardCharsets.UTF_8);
byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
StringBuilder hex = new StringBuilder();
for (byte b : digest) hex.append(String.format("%02x", b));
// hex.toString() = full SHA-256 hex
Go
import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)
hash := sha256.Sum256([]byte("hello world"))
fmt.Println(hex.EncodeToString(hash[:]))
// "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
// Streaming for large files
file, err := os.Open("big-file.iso")
if err != nil {
    panic(err)
}
defer file.Close()
h := sha256.New()
if _, err := io.Copy(h, file); err != nil {
    panic(err)
}
fmt.Println(hex.EncodeToString(h.Sum(nil)))
Rust
use sha2::{Sha256, Digest};
let mut h = Sha256::new();
h.update(b"hello world");
let result = h.finalize();
println!("{:x}", result);
// b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
// BLAKE3 (faster) via blake3 crate
let h = blake3::hash(b"hello world");
Bash / shell
# File checksums
md5sum file.iso
sha1sum file.iso
sha256sum file.iso
sha512sum file.iso
# String hash (note: trailing newline from echo!)
echo -n "hello world" | sha256sum
# b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9 -
# Verify a checksum
sha256sum -c sha256sums.txt # checks all files listed
Common use cases — when to hash what
File integrity verification (checksums)
Download a Linux ISO and want to verify it wasn't corrupted in transit? Compare your locally-computed SHA-256 to the published value. The chance of accidental corruption producing the same hash is 1 in 2^256 — astronomically small. Use this tool's "drag a file" mode against a known-good hash.
Git commit IDs and content addressing
Git uses SHA-1 (migrating to SHA-256) to identify every commit, tree, and blob. Two files with identical content have identical hashes — automatic deduplication. IPFS, Docker layers, and Cargo registries all rely on the same property.
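To make content addressing concrete, the sketch below reproduces how Git derives a blob ID in a SHA-1 repository: it hashes a short header ("blob", the byte length, a NUL byte) followed by the file content. The in-memory store and the put helper are hypothetical, added only to show deduplication.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# A toy content-addressed store: the key is derived from the content itself.
store: dict[str, bytes] = {}


def put(content: bytes) -> str:
    """Store content under its hash; identical content maps to the same key."""
    key = git_blob_id(content)
    store[key] = content
    return key


print(git_blob_id(b""))  # the well-known empty-blob ID
# → e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Calling put twice with the same bytes leaves one entry in the store, which is the deduplication property described above.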
HTTP ETags and cache busting
Web servers compute a hash of file contents and send it as an ETag header. Browsers store it; on the next request, they include If-None-Match: <etag>. If the server's current hash matches, it returns 304 Not Modified — no body, instant cache hit. MD5 is fine here (no security threat from clients).
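The ETag round trip can be sketched in a few lines. This is a simplified illustration and not a real server: respond is a hypothetical handler that returns a (status, headers, body) tuple, and it ignores details such as weak validators, "*", and comma-separated lists in If-None-Match.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (MD5 is fine here: no adversary)."""
    return '"' + hashlib.md5(body, usedforsecurity=False).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple[int, dict, bytes]:
    """Return 304 with no body when the client's cached ETag is still current."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""
    return 200, headers, body


# First request: no cached copy, so the full body comes back with an ETag.
status, headers, payload = respond(b"<h1>hi</h1>", None)
# Second request: the client echoes the ETag and gets an empty 304.
status_again, _, payload_again = respond(b"<h1>hi</h1>", headers["ETag"])
print(status, status_again)
```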
Digital signatures
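You don't sign a 1 GB document; you sign its hash. RSA and ECDSA signature schemes hash the message first, then sign the (much smaller) hash, and Ed25519 hashes the message internally with SHA-512. The hash algorithm is part of the signature scheme: SHA-256 for RS256, SHA-384 for RS384, etc. JWTs do exactly this for RS256 and ES256. HS256 is different: it is an HMAC keyed with a shared secret, not a public-key signature.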
HMAC — message authentication codes
HMAC combines a secret key with a hash to produce a tag that proves both integrity (message wasn't tampered with) and authenticity (sender knows the key). Used in API request signing (AWS, Stripe webhooks), TLS, IPSec. HMAC-SHA256 is the modern default.
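A minimal HMAC-SHA256 sketch using Python's hmac module follows. The secret and payload are placeholder values, and the sign/verify helper names are made up for this example. Real APIs (AWS, Stripe) each define their own canonical message format and header names, which are not modeled here.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message, hex-encoded."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(secret, message), tag)


secret = b"example-shared-secret"           # placeholder key
payload = b'{"event": "payment.succeeded"}'  # placeholder webhook body
tag = sign(secret, payload)

print(verify(secret, payload, tag))           # True: untampered, right key
print(verify(secret, payload + b" ", tag))    # False: message was altered
print(verify(b"wrong-secret", payload, tag))  # False: sender lacks the key
```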
Password reset tokens, deduplication, partition keys
Hashing user emails for partition keys (privacy-preserving sharding); hashing image bytes to deduplicate uploads; hashing reset tokens for storage so a database leak doesn't expose live tokens. SHA-256 is the safe default.
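Two of these patterns are sketched below under stated assumptions: the function names are hypothetical, lowercasing and trimming the email before hashing is an assumed normalization step, and the partition count is whatever your system uses.

```python
import hashlib
import secrets


def new_reset_token() -> tuple[str, str]:
    """Return (token_for_email, hash_for_database).

    Only the hash is stored, so a database leak does not expose usable tokens.
    """
    token = secrets.token_urlsafe(32)
    stored = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, stored


def token_matches(presented: str, stored: str) -> bool:
    """Hash the presented token and compare it with the stored hash."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return secrets.compare_digest(candidate, stored)


def partition_for(email: str, partitions: int) -> int:
    """Map an email to a stable partition number without storing it in the key."""
    normalized = email.strip().lower()  # assumed normalization
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions
```

A fast hash is acceptable for reset tokens because the token is long and random, unlike a human-chosen password.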
SHA-256 file checksum verification on Linux, macOS, and Windows
Every major OS ships a built-in SHA-256 verifier — no third-party tool needed. On Linux, sha256sum file.iso outputs the digest; pair with sha256sum -c SHA256SUMS to verify against a published manifest. On macOS, shasum -a 256 file.iso behaves identically. On Windows, certutil -hashfile file.iso SHA256 works in any cmd or PowerShell session, or use Get-FileHash -Algorithm SHA256 file.iso in PowerShell 5+. Drop the file onto the "File Hash" tab above to compute the hash directly in the browser — useful when you don't trust the local CLI environment or want to compare against a clipboard value. For binary-safe encoding of the result, pipe through the Base64 encoder.
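Where a scripted, cross-platform check is preferable, the same verification can be sketched in Python. The code assumes a manifest in the format sha256sum writes (hex digest, whitespace, optional "*" binary marker, file name) with file names relative to the manifest's directory. verify_manifest is a hypothetical helper, not part of any library.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_manifest(manifest_path: str) -> dict[str, bool]:
    """Check every file listed in a sha256sum-style manifest."""
    manifest = Path(manifest_path)
    results: dict[str, bool] = {}
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        if name.startswith("*"):  # '*' marks binary mode in sha256sum output
            name = name[1:]
        results[name] = sha256_file(manifest.parent / name) == expected.lower()
    return results
```

For example, verify_manifest("SHA256SUMS") returns a mapping from each listed file name to True or False.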
MD5 vs SHA-256 collision resistance — when each fails
MD5 collisions were first demonstrated by Wang et al. in 2004 and can now be produced in seconds on a laptop; SHA-1 fell to Google's SHAttered attack in 2017, which took roughly 6,500 CPU-years plus 110 GPU-years of compute and is now reproducible far more cheaply. SHA-256 has no known practical collision attacks in 2026: the best published cryptanalysis breaks reduced-round variants, not the full 64-round function. If your threat model includes attacker-controlled inputs (signatures, certificates, supply-chain artifacts), SHA-256 or BLAKE3 is the floor. For pure deduplication or cache keys with no adversary, even MD5 is acceptable and faster.
Hash function best practices for 2026
- Use SHA-256 unless you have a specific reason not to. It's the modern default, hardware-accelerated on most recent x86 and ARM CPUs (Intel SHA Extensions, ARMv8 Cryptography Extensions), and supported in every browser that implements the Web Crypto API.
- Never use MD5 or SHA-1 for security-critical work. Both have practical collision attacks. Fine for cache keys, ETags, and non-adversarial integrity checks.
- Always salt password hashes. A salt is a per-user random value mixed into the password before hashing. bcrypt and argon2id handle salting automatically; raw SHA-256(password) is broken even with a salt because it's too fast.
- Use constant-time comparison for security-sensitive equality checks. === in JavaScript or == in Python compares byte-by-byte and exits early, leaking timing information about how many bytes matched. Use crypto.timingSafeEqual (Node) or hmac.compare_digest (Python) instead.
- Hash files in chunks. Loading a 10 GB file into memory before hashing exhausts RAM. Every language's hash library supports streaming via update() calls or pipe operations.
- Verify the encoding before hashing. "héllo" in Latin-1 produces a different hash than "héllo" in UTF-8. When publishing checksums, specify the encoding (or the file is binary and encoding doesn't apply).
- Strip trailing newlines. Shell commands like echo "text" add a newline, producing a different hash than the same text without it. Use echo -n or printf.
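Several of these pitfalls are easy to reproduce. The short sketch below shows the encoding and trailing-newline effects on SHA-256, plus a constant-time comparison. The helper names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 of raw bytes."""
    return hashlib.sha256(data).hexdigest()


# Encoding: the same text yields different bytes, and so different hashes.
utf8_hash = sha256_hex("héllo".encode("utf-8"))
latin1_hash = sha256_hex("héllo".encode("latin-1"))

# Trailing newline: what `echo "hello world"` feeds to sha256sum.
with_newline = sha256_hex(b"hello world\n")
without_newline = sha256_hex(b"hello world")


def digests_match(a: str, b: str) -> bool:
    """Compare two hex digests without leaking how many characters matched."""
    return hmac.compare_digest(a, b)


print(utf8_hash == latin1_hash)         # False
print(with_newline == without_newline)  # False
print(without_newline)
# → b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
```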