Every time you visit a website, your browser performs a security ritual that most people never notice. It checks the server's TLS certificate, verifies the certificate chain back to a trusted root authority, and establishes an encrypted connection. This happens in milliseconds, on every page load, billions of times per day across the internet. The primary purpose is security — preventing eavesdropping and man-in-the-middle attacks. But the same mechanism proves something else entirely: the identity of the server you are talking to.
That proof of identity is the foundation of how Burnt verifies data from any website.
What does HTTPS actually guarantee?
HTTPS is HTTP over TLS (Transport Layer Security). The TLS protocol does two things: it encrypts the connection so third parties cannot read or silently alter the data in transit, and it authenticates the server's identity so you know who you are communicating with. The second property, server authentication, is the one that matters for verification.
Here is how server authentication works. When you connect to chase.com, the server presents a TLS certificate issued by a Certificate Authority (CA) such as DigiCert or Let's Encrypt. That certificate binds the domain name "chase.com" to a specific public key, and the CA issued it only after verifying that the requester controls the domain. Your browser checks the certificate's validity period, confirms that it chains to a trusted CA, and verifies each signature in the chain. During the handshake, the server also proves that it holds the private key matching the certificate. If everything checks out, the browser has strong cryptographic assurance that it is communicating with a server authorized to operate as chase.com.
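To make the browser's checks concrete, here is a minimal sketch using Python's standard `ssl` module, which performs the same chain and hostname validation. It is an illustration only, not Burnt's implementation; the function names `make_verifying_context` and `fetch_verified_certificate` are hypothetical.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a client context that validates the chain and the hostname."""
    # create_default_context() loads the system's trusted root CAs and
    # turns on both certificate verification and hostname checking.
    return ssl.create_default_context()


def fetch_verified_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to `host` and return its certificate, or raise if validation fails."""
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake raises ssl.SSLCertVerificationError if the chain does
        # not lead to a trusted root or the certificate does not cover `host`.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `fetch_verified_certificate("chase.com")` needs network access. It returns the certificate's parsed fields only when validation succeeds, and otherwise the handshake raises before any application data is exchanged.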
This chain of trust is built on a hierarchy of Certificate Authorities. Root CAs are embedded in your operating system and browser. They sign intermediate CA certificates, which in turn sign server certificates. The entire chain is verifiable by anyone. The system has been running at internet scale for over two decades, and it processes billions of certificate validations every day.
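The hierarchy can be pictured as a walk from the server certificate up to a root the client already trusts. The toy model below checks only the issuer links. It deliberately omits signature, expiry, and revocation checks, which a real validator must perform, and the `Cert` type and CA names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Set


@dataclass(frozen=True)
class Cert:
    subject: str  # who the certificate identifies
    issuer: str   # who signed it


def chain_reaches_trusted_root(chain: Sequence[Cert], trusted_roots: Set[str]) -> bool:
    """Follow issuer links from the server certificate toward a trusted root.

    `chain` is ordered leaf first. This is a structural check only: no
    signatures, validity periods, or revocation status are examined.
    """
    if not chain:
        return False
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False  # broken link: the next certificate did not issue this one
    # The last certificate must have been issued by a root the client trusts.
    return chain[-1].issuer in trusted_roots


chain = [
    Cert(subject="chase.com", issuer="Example Intermediate CA"),
    Cert(subject="Example Intermediate CA", issuer="Example Root CA"),
]
print(chain_reaches_trusted_root(chain, {"Example Root CA"}))  # True
```

Swapping in a root that the client does not trust, or breaking any issuer link, makes the function return `False`, which mirrors how a browser rejects a chain it cannot anchor.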
What the browser already knows
When you log into your bank over HTTPS, your browser has already verified that you are talking to the real server. The data the server returns — your account balance, transaction history, account status — is guaranteed to come from that authenticated server. Not from a phishing site. Not from a man-in-the-middle. From the server that the Certificate Authority confirmed controls that domain. This is the same guarantee that protects online banking, e-commerce, and every other sensitive web interaction.
How does Burnt use TLS for verification?
Burnt builds on top of the trust that TLS already establishes. When a user needs to verify data from a web source (say, that their bank balance exceeds a threshold, or that their employer's HR portal shows active employment), the process works like this:
- The user authenticates. The user logs into their account on the relevant website (their bank, employer portal, insurance carrier, government agency) through a secure browser session. Their credentials never leave the browser. Burnt does not see, store, or transmit login information.
- Burnt verifies the server identity. During the session, Burnt verifies the TLS certificate chain for the website. This confirms that the data is coming from the server that is cryptographically bound to that domain. If the certificate chain does not validate — if someone tried to substitute a different server or intercept the connection — the verification fails.
- The data is extracted and verified. From the authenticated session, Burnt extracts only the specific data point needed: account balance above threshold, employment status active, insurance coverage valid. The raw page content is processed in memory and discarded. Only the verification result persists.
- A cryptographic proof is produced. The verification result includes a proof tying the verified fact to the authenticated source. This proof is independently verifiable — a third party can confirm that the data was extracted from a session with a cryptographically authenticated server.
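The extraction step above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `VerificationResult` fields, the `Balance: $…` page format, and the function names are hypothetical, and the certificate fingerprint is only a stand-in for the proof, whose actual construction is not described here. The point of the sketch is that the raw page text is used in memory and never stored in the result.

```python
import hashlib
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class VerificationResult:
    domain: str
    claim: str
    satisfied: bool
    cert_sha256: str  # fingerprint of the server certificate seen in the session


def extract_balance(page_text: str) -> Optional[Decimal]:
    """Pull the first 'Balance: $1,234.56'-style amount out of the page text."""
    match = re.search(r"Balance:\s*\$([\d,]+(?:\.\d{2})?)", page_text)
    if match is None:
        return None
    return Decimal(match.group(1).replace(",", ""))


def verify_balance_at_least(
    domain: str, cert_der: bytes, page_text: str, threshold: Decimal
) -> VerificationResult:
    """Return only the yes/no fact; the page text is not kept in the result."""
    balance = extract_balance(page_text)
    if balance is None:
        raise ValueError("no balance found in page")
    return VerificationResult(
        domain=domain,
        claim=f"balance >= {threshold}",
        satisfied=balance >= threshold,
        cert_sha256=hashlib.sha256(cert_der).hexdigest(),
    )
```

A caller would pass the DER-encoded certificate obtained from the already validated TLS session. The result carries the threshold outcome and the fingerprint, not the balance itself.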
The critical distinction is provenance. The data is not just extracted from a website. It is extracted from a website whose identity has been verified through the TLS certificate chain. The fact that the data came from the claimed source is cryptographically proven, not assumed.
How is this different from screen scraping?
Screen scraping and TLS-based verification may appear similar on the surface — both involve extracting data from a website. But the difference is fundamental, and it matters for anyone relying on the output.
A screen scraper loads a web page and copies the visible content. It might parse the HTML, extract text, and report what it finds. But it makes no assertion about where that data came from. The scraped content could be from the real website, from a cached copy, from a locally modified version, or from a completely fabricated page. The scraper has no way to tell, and neither does anyone consuming its output.
Burnt's approach is different in three ways:
- Server identity is verified. The TLS certificate chain confirms that the data came from a server authorized to operate as that domain. This is not a heuristic check (like looking at the URL bar). It is a cryptographic verification against the same Certificate Authority infrastructure that secures all of HTTPS.
- Session integrity is maintained. The data is extracted from a live, authenticated session where the user has logged in. The data reflects the user's actual account state, not a public page or a cached version.
- A verifiable proof is generated. The output includes a cryptographic proof that ties the extracted fact to the authenticated session. A downstream consumer can independently verify that the data came from the claimed source.
The analogy is the difference between photocopying a document and verifying it at the issuing office. A scraper produces a copy. Burnt produces a verified fact with provenance.
Every HTTPS connection already proves you are talking to the real server. Burnt uses that proof to verify the data the server returns. The trust infrastructure was already there — we just built on top of it.
What does this unlock beyond email verification?
DKIM-based email verification, which we covered in a previous post, is powerful but bounded. It works for any data source that sends email — which is most organizations — but it is limited to the information that appears in transactional emails. A booking confirmation contains a flight number and date. A pay notification contains a pay amount and period. But the email does not contain everything the source knows about the user.
HTTPS/TLS-based verification removes that boundary. It covers any data visible in a user's authenticated web session. This opens up verification sources that email cannot reach:
- Government portals. DMV records, tax filing status, benefits eligibility, voter registration. Government agencies run web portals but rarely send detailed transactional emails.
- Bank and brokerage dashboards. Account balances, investment holdings, transaction history. These are available through authenticated sessions but are not typically emailed.
- Employer HR systems. Employment status, job title, tenure, benefits enrollment. HR portals contain rich employment data that goes far beyond what appears in a payroll notification email.
- Insurance carrier portals. Policy status, coverage details, claims history. Carrier websites display the complete policy picture, not just the summary in a confirmation email.
- Utility and service accounts. Account status, usage history, payment standing. Utility companies maintain portals with detailed account information.
Together, DKIM and HTTPS/TLS cover virtually every digital data source. DKIM handles any organization that sends email. HTTPS handles any organization that runs a website. Between them, the gap in verification coverage effectively disappears.
The infrastructure for web-based verification has been hiding in plain sight. TLS certificates have been authenticating server identities for decades. Every browser performs this verification automatically, on every connection, without the user or the server doing anything special. The chain of trust from Certificate Authorities to server certificates is well-established, universally deployed, and cryptographically sound.
It was built to secure web connections. It also happens to provide exactly the trust foundation needed to verify data from any website on the internet.
Frequently asked questions
How does Burnt verify data from a website?
HTTPS uses TLS certificates to prove the identity of the server you are communicating with. When a user logs into their bank's website over HTTPS, the TLS certificate chain proves the data is coming from the real bank server. Burnt verifies this certificate chain and extracts specific data points from the authenticated session.
Is this the same as screen scraping?
No. Screen scraping copies visible content without verifying where it came from. Burnt verifies the TLS certificate chain to confirm the server's identity, then extracts data from a cryptographically authenticated session. The difference is provenance: Burnt can prove the data came from the claimed source.
Which websites can serve as verification sources?
Any website that uses HTTPS and has a user login can serve as a verification source. This includes banks, employers, government portals, insurance carriers, airlines, utilities, and any other organization that provides user accounts. Over 95% of the web is now served over HTTPS.
How does HTTPS/TLS verification relate to DKIM email verification?
DKIM verifies data from email sources, while HTTPS/TLS verifies data from web sources. Together they cover virtually every digital data source. DKIM handles any organization that sends email. HTTPS handles any organization with a website. The combination provides near-universal verification coverage.