
BeyondMemory · Threat Intelligence for SOC and CTI Teams


LLM (google/gemini-3.1-flash-lite-20260507) summary:

  • Documented Exposure: A comprehensive dataset of over 200,000 public pastes gathered from various developer utility websites over seven years was analyzed and categorized.
  • Data Sensitivity: Extracted documents frequently contained highly sensitive information such as live cloud infrastructure credentials, database connection strings, social security numbers, and internal service desk records.
  • Emergence Of AI Tooling: The prevalence of leaking data through utility services has increased due to current workflows involving artificial intelligence coding assistants that encourage pasting production errors into external tools.
  • Methodology Transparency: Researchers performed systematic, unauthenticated scraping of the provider's publicly available recent links feed to demonstrate the lack of security surrounding stored professional workflows.
  • Turkish Sector Impact: Specific analysis of Turkish data found records containing national identification numbers, bank account details, and private insurance policy documents accessible on public URLs.
  • Platform Insecurity: The service utilized for the analysis exhibits a critical stored cross-site scripting vulnerability, allowing attackers to execute code in the browser of any user viewing contaminated pastes.
  • Institutional Necessity: Regulatory bodies are encouraged to classify the use of third-party public paste services as a regulated data processing activity to ensure proper oversight of organizational information.
  • Mitigation Recommendations: Proposed security strategies include restricting network access to public formatting tools, adopting offline developer utilities, and treating artificial intelligence prompts as potential data egress points.

Seven years inside the public "Recent Links" feeds of a family of JSON and code "beautifier" tools. What engineers pasted; whose data it was; what the rise of the AI coding assistant changed; and what a Turkish data controller is supposed to do about the TCKNs and IBANs sitting on a stranger's server right now. And the part we did not go looking for: the formatter itself carries a stored cross-site-scripting flaw, so the service holding all of this data can be made to run an attacker's code in your browser.


A morning at the keyboard, somewhere

It is mid-afternoon at a tax-preparation company in the United States. A document-delivery service keeps failing, so an engineer copies a JSON callback out of the debugger to clean it up: one client's filing, caught mid-pipeline. The record carries the client's name, their Social Security number in the clear, and, a few fields down, a live access key for the cloud queue that ships the document. The engineer pastes it into a public JSON formatter. The formatter saves the paste under a six-hex identifier and adds the resulting URL to its public "Recent Links" feed, where, months later, our scraper retrieves it.

That paste is one of the documents in our corpus. The client is one of several thousand people whose most private records have been routed, in pieces, through this single public service over the years. The vendor does not know. The client does not know. The company's information-security team almost certainly does not know either, because if it did, the paste would not still be retrievable as we write this.

Now move the same scene to Istanbul, or Ankara, or Izmir. A different engineer, a different debugger, the same instinct. The payload that comes out is a retail bank customer's full credit limit and outstanding debt, balances in Turkish lira. Or a taxpayer's invoice lifted straight off the national e-Fatura rails. Or a company's entire member table, the whole roster in one file. The instinct is identical, the tool is identical, and so is the outcome: the data is gone the moment it leaves the laptop.

The pattern is not new. What has changed in recent years is the shape of who is doing it, what they are pasting, and, increasingly, why. This is a report on that change, and on Türkiye's specific, measurable place inside it.


Why a JSON formatter, and why now

There is nothing remarkable about a JSON formatter as a piece of software. It accepts a blob of text, indents it, and offers a button to save the result under a shareable URL. The remarkable part is that millions of engineers, every year, choose to do their debugging through one, and that the service, by default in many cases, publishes the saved paste on its own public listing page.

watchTowr Labs documented this surface in November 2025 on a sample of roughly 80,000 documents. Our work extends and quantifies theirs: around 200,000 documents, more than double their sample, collected over roughly seven years ending May 2026, then read and sorted by what each one actually held. That means personal data, sector context, and a category that did not exist in the pre-LLM corpus, the workflow exhaust of human-AI interaction.

And <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> is only one storefront. The same operator runs <a href="http://codebeautify.org" rel="nofollow">codebeautify.org</a> and a family of sibling "beautifier" tools that share a single save backend and a single pool of saved pastes, so a link saved through one is retrievable through the others. We harvested that shared pool across its sibling tools (<a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a>, <a href="http://codebeautify.org" rel="nofollow">codebeautify.org</a>, and the rest), reaching back roughly seven years. The documents this report analyses are the validated, deduplicated core of that harvest; the raw multi-tool surface behind them is larger still.

The headline number: at least 1,078 documents in this corpus carry a high-confidence flag for one or more named credentials, identifiers, or live secrets, and a further 2,167 carry the same flags at medium confidence. If you have ever debugged a production payload in a formatter on the open internet, the corpus probably contains your work. The point of this report is to argue that this is a structural problem, to show its shape in 2026 specifically, and to put a number on what it means for one country's regulated sectors.


How we collected it (and what we did not do)

Every document referenced here was retrieved by issuing the same HTTP request the operator's own front-end issues to render the paste-viewer page. We did not bypass any authentication, because there was none to bypass; the endpoint is unauthenticated and the listing surface enumerates the identifiers. A residential proxy budget and a Saturday afternoon will replicate the corpus we describe at single-figure dollar cost. The hard problem was never retrieval. It was making the retrieved data legible.

Concretely: the "Recent Links" page lists saved pastes ten at a time, paginated by a plain offset: /recentLinksPage/json/0, then /10, then /20, onward for as long as you care to walk it. Each row resolves to a six-hex identifier. Hand that identifier to the same endpoint the site's own viewer calls, and the original paste falls straight back out:

POST /service/getDataFromID
Content-Type: application/x-www-form-urlencoded

urlid=<six-hex-id>&toolstype=json

No token, no session, no rate limit. The only real obstacle is the operator's Cloudflare front door, so we drove a genuine browser engine over the Chrome DevTools Protocol with no automation flags set, rotated through mobile and residential IPs, and moved the cursor and scrolled on human timing, enough to read as the ordinary visitor we technically were. It is the kind of stack a competent team assembles in a weekend.

The harvester walking the public listing ten links at a time, then pulling each paste back through the viewer endpoint:

The harvest script paginating the public listing and retrieving pastes

Reading a corpus this size by eye is not possible, so we did it in passes. One looked for the things that should never be in a paste at all: live tokens, keys, connection strings, card numbers. A second looked for Turkish personal data and validated it rather than guessing. A TCKN counts only once it passes the official national-ID checksum, an IBAN only once it passes MOD-97. A third tried to place each document behind a real Turkish signal: a .tr host, Turkish text, lira amounts, a Turkish bank code.
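For readers who want to see what "validated rather than guessed" can look like, here is a minimal sketch of the two checks named above. It is not the pipeline used for this report; the function names are ours for illustration, and the Turkish IBAN layout (26 characters, the last 16 treated as alphanumeric) is an assumption of the sketch.

```python
import re


def is_valid_tckn(value: str) -> bool:
    """Check the two trailing check digits of an 11-digit TCKN candidate."""
    # Eleven digits, and the first digit may not be zero.
    if not re.fullmatch(r"[1-9][0-9]{10}", value):
        return False
    d = [int(c) for c in value]
    odd = sum(d[0:9:2])   # digits in positions 1, 3, 5, 7, 9
    even = sum(d[1:8:2])  # digits in positions 2, 4, 6, 8
    # Tenth digit: (7 * odd - even) mod 10.
    if (odd * 7 - even) % 10 != d[9]:
        return False
    # Eleventh digit: sum of the first ten digits mod 10.
    return sum(d[:10]) % 10 == d[10]


def is_valid_tr_iban(value: str) -> bool:
    """Check length, country prefix and the MOD-97 remainder of a TR IBAN."""
    iban = re.sub(r"\s+", "", value).upper()
    # Assumed layout: 'TR', 2 check digits, 6 digits, 16 alphanumerics.
    if not re.fullmatch(r"TR[0-9]{8}[0-9A-Z]{16}", iban):
        return False
    # Move the first four characters to the end, map A..Z to 10..35.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

Both functions only say that a string is well-formed. Neither says that the number belongs to a real person or account, which is why a passing checksum is a signal to open the document, not a finding on its own.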

We did not run a canary-token experiment. watchTowr did, and observed retrieval of a planted canary by an unrelated party within 48 hours of submission. That experiment, independently published, is sufficient evidence for the threat model we describe in this report.


The leaks themselves, redacted

None of what follows is Turkish; Türkiye gets its own section later. The problem is global, and the worst of it lands in regulated industries everywhere. Most of it we show as it appeared in the formatter; a few documents we would rather only describe than screenshot, further down.

The only thing we changed before publishing is redaction. Every live secret, identifier, payment card, IBAN, name, phone number, email, address, geo-coordinate and internal hostname has been replaced with a bracketed <REDACTED-X> marker that names the class of value that sat there. Everything else is exactly as the engineer left it.
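A class-naming redaction pass of this kind can be sketched in a few lines. The pattern set below is hypothetical and deliberately small; it illustrates the bracketed-marker convention, not the redaction tooling actually used on the corpus.

```python
import re

# Hypothetical pattern set: each entry pairs a class label with a regex.
# A real redactor needs many more classes plus per-class validation.
PATTERNS = [
    ("PRIVATE-KEY", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.S)),
    ("AWS-ACCESS-KEY-ID", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("STRIPE-LIVE-KEY", re.compile(r"\bsk_live_[0-9A-Za-z]{10,}\b")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def redact(text: str) -> str:
    """Replace every match with a marker naming the class of value removed."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"<REDACTED-{label}>", text)
    return text
```

Regex redaction is a coarse first pass. Names, addresses and internal hostnames have no fixed shape, so anything published still needs a human read afterwards.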

Service configuration, with four bundled credentials

One server-config object carrying four different live secrets at once (an AWS key, a database connection string, a Google AI key, and a SendGrid key), almost certainly pasted by someone fixing a typo in a config file.

Service configuration with four bundled credentials, redacted

A live session token, with the user attached

A login-success response carrying a fresh token, the user's contact details, and, for good measure, a MongoDB connection string inside the same store_config object.

Live session token with the user attached, redacted

We decoded a sample of these tokens to confirm they were the genuine article and not test stubs. The payloads carried real issuer and subject claims and expiry timestamps that placed the token, at the moment the engineer pasted it, inside its valid window. We did not test a single one against a live endpoint, and we reproduce no decoded claim here. Structural validation was enough to know what they were.
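Structural validation of this sort needs no network access and no signing key. The sketch below is an illustration under assumptions, not our tooling: it assumes the tokens are JWTs, the function names are hypothetical, and `pasted_at` stands for whatever timestamp is associated with the paste. It decodes the payload segment only and never verifies the signature or contacts an endpoint.

```python
import base64
import json
from typing import Optional


def decode_jwt_claims(token: str) -> Optional[dict]:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    try:
        # base64url segments drop their padding; restore it before decoding.
        padded = parts[1] + "=" * (-len(parts[1]) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad base64, bad UTF-8 and bad JSON
        return None
    return claims if isinstance(claims, dict) else None


def was_inside_validity_window(token: str, pasted_at: int) -> bool:
    """True if the token names an issuer and subject and had not yet expired
    at the Unix time `pasted_at`."""
    claims = decode_jwt_claims(token)
    if claims is None or "iss" not in claims or "sub" not in claims:
        return False
    exp = claims.get("exp")
    if isinstance(exp, bool) or not isinstance(exp, (int, float)):
        return False
    return pasted_at < exp
```

A token that passes this check is well-formed and was unexpired at paste time. Whether it was ever revoked server-side cannot be learned without testing it against a live endpoint, which is the line we did not cross.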

A data-protection vendor's Google Cloud key

A cloud backup vendor's own infrastructure-registration payload, pasted while debugging a Google Cloud onboarding. It carries the service account's client email, its privateKeyId, and the full clientPrivateKey, the credential that authenticates to the cloud project named three fields away.

Data-protection vendor's Google Cloud key, redacted

A background-check dossier on a private citizen

A commercial skip-tracing report, the kind a debt collector, landlord or investigator pulls, beautified to read it more easily. One named subject tied to dozens of neighbours, phone numbers, voter registrations, professional licences, property assessments and prior addresses, with property valuations attached. Nobody was breached; a paying subscriber ran the lookup and saved it to a public URL. It is the most invasive single profile in the corpus.

A commercial background-check dossier on a private citizen, redacted

A tax filing, with the SSN and a live cloud key in one paste

A document-delivery callback from a US tax-preparation platform. It carries a filer's name and Social Security number in the clear and, a few fields away, a live Azure Service Bus key for the queue that ships the document. Identity and infrastructure, leaked in the same object.

A US tax filing with an SSN and a live Azure key, redacted

A global payroll platform's private API key

An integration payload from a global employer-of-record platform, captured mid-handshake. It carries the full client certificate and the matching private key for one of the company's internal private APIs, the credential that authenticates its service-to-service traffic.

A global employer-of-record platform's private API key, redacted

An EU regulator's internal ticketing system

A single issue exported from the internal service desk of a European Union regulatory agency: issue keys, workflow states, linked tickets and reviewer fields, served from the agency's own Jira host. Not a credential, but a clear window into how a supranational regulator runs its casework, sitting on a public URL.

A European Union regulator's internal service-desk issue, redacted

A major bank's internal Jira issue

A single issue exported from a major US custody bank's internal tracker, dense with several hundred populated custom fields: project keys, reviewer names, workflow history and internal URLs, all addressed to the bank's own Jira host. One of the highest sensitive-field counts of any document in the corpus.

A major US bank's internal Jira issue, redacted

A few we describe, not show

Not every document needs a screenshot to land, and a few we would rather not reproduce even redacted. Three more from the global, non-Turkish pile, told rather than shown.

A US cable provider's customer file. A customer-record export, dropped into the corpus by a routine engineering paste: named addresses, ZIP codes, communication-preference flags, and the account identifiers tying them together. There is no credential in it, and it does not need one. A customer-record dump is exactly the post-incident artifact a regulator's investigator arrives looking for. This one is already public.

A military hospital's HR record. An employee record pasted line by line: name, date of birth, national identification number, home address, organizational unit, and, in a field the engineer probably never looked at twice, a 280 KB base64-encoded portrait photo. The paste is in the hospital's own language and names the hospital in the payload. Someone was debugging on the way back from lunch, and the security policy did not travel as far as their IDE.

A global bank's repository inventory. An internal Bitbucket-inventory feed (repository names, project keys, "Dev"/"Prod" markers, and the underlying Jira project URLs) tied to a major US bank's internal hostnames. Small, prosaic, and recurrent: the same event reappears across submissions, so it is a scheduled job, not a one-off slip. For an attacker mapping the bank's internal structure, it is a starting point that needs no exploit at all.

The bulk of it

The fragments above are a small, readable sample of a much larger pile, and the pile is the part that is hard to believe. The same scan returns live secrets across nearly every class an attacker would want, and it rarely finds them one at a time. Single documents carry dozens of distinct secrets; the richest carry several hundred, a whole environment's worth of keys, tokens and credentials beautified into one paste and saved to a public URL.

A redacted tour of what sits in the corpus, by class:

  • Cloud keys. AWS access-key and secret pairs in config blobs ("accessKeyId": "<REDACTED>", "secretAccessKey": "<REDACTED>"), with Cloudflare, Datadog, PagerDuty and Twilio tokens beside them.
  • Private keys. PEM-encoded RSA keys (-----BEGIN RSA PRIVATE KEY-----<REDACTED>) and ssh-rsa AAAAB3...<REDACTED> authorized-keys, pasted straight out of job configs and cluster settings.
  • Database credentials. Connection strings with the password in the clear ("password": "<REDACTED>"), MongoDB URIs and AMQP strings among them.
  • Payment and SaaS keys. Stripe live keys (sk_live_<REDACTED>), SendGrid keys, and payment-partner credentials (key IDs and secrets) sitting in lender and checkout configs.
  • Workflow secrets. Atlassian/Jira and Bitbucket payloads, one of them carrying several hundred distinct secret tokens in a single document, plus OAuth refresh tokens and decoded JWTs with the user still attached.
  • LLM provider keys. OpenAI and Google AI keys, the everyday exhaust of the AI coding assistant.
  • Identity and PII. US Social Security numbers in populated customer records, passport and national-ID payloads, Active Directory and Kerberos config, and Turkish TCKNs and IBANs by the file.

We reproduce none of the live values. The counts behind these classes are the kind you re-run twice because you do not trust them the first time, and the originals remain a single unauthenticated request away on the operator's listing surface as we write.


Türkiye on the clipboard

We are a UK-based firm, and much of our team is Turkish, so Türkiye is close to home for us. When this work began, we expected to anchor the whole report on a Turkish critical-infrastructure deep-dive. We owe the reader, and the regulators we want to act on this, the same honesty we would want from anyone writing about somewhere they are this close to: the dataset does not sustain a "Türkiye is hemorrhaging more than everyone else" story, and we are not going to manufacture one. What it sustains is something more useful and, for a Turkish data controller, more actionable.

What the numbers say, and what they don't

The blunt sweep looks alarming. 2,087 documents in Turkish. 800 carrying a personal-data signal. 262 with at least one checksum-valid TCKN. 40 valid Turkish IBANs, 31 of them in a single paste. 18,019 Turkish-format phone numbers.

Then you open the documents, and the honesty tax comes due. A validated TCKN is a strong signal, not a perfect one. The paste that scored highest in the entire corpus, 46 "national IDs" and 61 "phone numbers", turned out to be Apple's end-of-day market data: CUSIP, CIK, and tax-ID strings that happen to pass the checksum. The one with over a thousand "phone numbers" was a delivery-dispatch queue. A 30-TCKN hit was Apple again. A 407-"phone" hit was a Spanish department store's category tree. The country tagger was no better: it filed an Indian pharma company's travel-booking system under "Turkish," and a US meal-kit app too, because one of its food groups is named "Turkey." The license-plate detector alone returned 622,100 matches, which is roughly the moment any honest researcher quietly deletes the license-plate detector.
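The arithmetic behind those false positives is worth spelling out. A TCKN has two check digits, each fully determined by the digits before it, so every nine-digit prefix has exactly one valid two-digit completion out of a hundred. An arbitrary eleven-digit string with a non-zero first digit therefore passes about 1 time in 100. The sketch below, with illustrative function names of our own, estimates that rate by sampling.

```python
import random


def tckn_check_digits(prefix9: str) -> str:
    """Return the two check digits that complete a nine-digit prefix."""
    d = [int(c) for c in prefix9]
    d10 = (sum(d[0::2]) * 7 - sum(d[1::2])) % 10
    d11 = (sum(d) + d10) % 10
    return f"{d10}{d11}"


def passes(value: str) -> bool:
    """True if an 11-digit string satisfies the TCKN check digits."""
    return value[0] != "0" and tckn_check_digits(value[:9]) == value[9:]


def estimate_pass_rate(samples: int, seed: int = 0) -> float:
    """Fraction of uniformly random 11-digit numbers that pass the check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        candidate = str(rng.randint(10**10, 10**11 - 1))
        hits += passes(candidate)
    return hits / samples
```

Run over a large sample, the estimate settles close to 0.01. In a corpus full of long numeric identifiers from other systems, a one-percent pass rate is more than enough to produce confident-looking hits in market-data files and dispatch queues.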

So we did the slow thing. We separated the genuinely-Turkish documents by hand, using Turkish text, .tr hostnames, lira amounts, and Turkish bank codes, and threw the coincidences back. What is left is far smaller than the sweep, and it is entirely real. Seven of them are below, and they reach from one citizen's wallet into the government's own e-invoicing rails and a company's entire member table, redacted to the bone.

Seven real ones

Each of these is confirmed Turkish. Each is described at the sector level, with live values removed.

A citizen, whole, in a single paste (automotive / consumer finance). A response from a Turkish vehicle-trade app, saved to a formatter to "validate the JSON." Skip the throwaway session name at the top; what sits underneath is one real person's life. A named individual (name, surname, TCKN) in Adana, listed beside their car (a 2013 diesel, the plate, status: "Onay bekleniyor", awaiting approval), and in the very same object their bank cards in the clear: the full card number, the CVV, the expiry month and year, and the name on the card, one tagged Ana Ödeme Kartı (primary card), another Onaylandı (approved). Two working payment cards, a national ID and a vehicle record, on one public URL. Everything you would need to be this person at a checkout.

A citizen's full identity in a single paste, redacted

A professional chamber's certified tradesmen (vocational licensing). A dataset of vocational-qualification certificates issued through one of Türkiye's professional chambers: MYK/TÜRKAK-accredited Çelik Kaynakçısı (steel welder) records from a chamber of the Union of Chambers of Turkish Engineers and Architects. Each record pairs a real tradesman's name and TCKN with the chamber president as the signing authority. This is not a template: the names and national IDs are populated, record after record.

A professional chamber's certified tradesmen list, redacted

A private citizen, doxxed by their own chat export (personal data). Not a corporate system this time, one person. Someone saved an export of their Instagram direct messages, and partway through the thread a shoe order runs the usual course: the seller asks for name, shoe size and address, and the customer types all three. What lands on the public URL is her full name, her mobile number, and her home address down to the building number and neighbourhood, in Ağrı in the east of the country. No credential, no breach, no API, just a named human being's front door, left retrievable by anyone who walks the six-hex space.

A private citizen's chat export, redacted

An insurance policy, mid-issuance (insurance). A Turkish insurer's policy-issuance response, and not just once: the same object recurs across several pastes, the signature of a job running on a schedule. Premium 7,664.64 TL, commission and tax broken out, a one-year term, a sum insured of 246,979 TL, the agent code, CC_MAIL_ORDER as the payment method, and a policyPartners block that names the insured outright: national ID (TCKN), name, surname, and role INSURED.

An insurance policy mid-issuance, redacted

A taxpayer's invoice, off the national e-invoicing rails (government / tax). A record from Türkiye's Revenue Administration e-invoice system (GİB e-Fatura): an approved FATURA with its state-issued document number, the seller's saticiVknTckn tax-identity number and name, the invoice date, and the ETTN, the unique invoice identifier the state assigns. One taxpayer's invoice, lifted off the national e-invoicing rails onto a public URL.

A taxpayer's e-invoice from the national system, redacted

A bank customer's limit and debt (retail banking). A major Turkish private bank's mobile card app, caught mid-session. The response spells out the customer's credit limit (MÜŞTERİ LİMİTİ ₺10.000,00), their total balance (TOPLAM BAKİYE −₺5.507,15, i.e. in the red), their usable limit (₺3.746,89), and, in the cards array, the card number and type. The bank's own …isbank…/maximummobil asset host names it; we don't.

A bank customer's limit and debt, redacted

A company's entire member table (multi-person PII). Not one citizen this time. A Turkish hosting company's member export: 47 customer records in a single paste, each with a full name, a personal email, a mobile number, a registration date and, in plain text, the account's two-factor AuthenticatorKey. A whole customer database, the 2FA seeds included.

A company's entire member table, redacted
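Why a leaked two-factor seed matters can be shown with the standard algorithm alone. The sketch below assumes the AuthenticatorKey field holds a base32-encoded TOTP secret in the RFC 6238 style, which is our reading of the field and not something confirmed with the controller. Under that assumption, anyone holding the seed can compute the same six-digit code the account owner's authenticator app displays.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32: str, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1)."""
    key = base64.b32decode(secret_b32.upper().replace(" ", ""))
    # The moving factor is the number of `step`-second intervals since epoch.
    counter = struct.pack(">Q", unix_time // step)
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects 4 bytes.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)
```

The function takes nothing but the seed and the clock. That is the whole point of the finding: once the seed is public, the second factor adds no protection until it is re-enrolled.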

Behind these seven sits the long tail: once the coincidences are stripped out, a few hundred genuinely-Turkish documents still carry citizens' national IDs, phone numbers, addresses, and plates, in customer-service tickets, municipal device logs, and short development blobs. We are not reproducing those.

Why this is a KVKK question, not a curiosity

Here is the framing we want a Turkish data controller and the KVK Kurumu to take from this section. If your engineering team has ever used <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> or any of its analogues, whether your processing activity is compliant under KVKK Article 12 reduces to one question: do any of your pastes contain personal data? For the controllers behind the seven documents above, the answer is already yes, the data is already public, and neither fact is in dispute. "Paste to a third-party web service" is a covered processing activity in fact; the only open question is whether your guidance and your egress controls treat it as one.

Are you a data controller who needs to know? We will run a private aggregate query against the corpus, at no cost, on a written request from an officer of any controller, Turkish or otherwise, and tell you whether your data is in it. Write to <a href="mailto:info@beyondmemory.io">info@beyondmemory.io</a>.


The LLM era: what an AI coding assistant actually leaks

The most distinctive subset of the corpus, against any pre-2024 baseline, is the set of pastes whose shape is unmistakably the input or output of a large-language-model workflow. There was no equivalent surface in 2022, a much smaller one in 2024, and by 2026 it is a paste class in its own right.

System prompts, captured mid-flight. We identified 487 documents at medium-or-high system-prompt likelihood and 131 at the strictest threshold. A meaningful fraction are not toy prompts; they are production system prompts with named individuals, internal product names, and example PII embedded in the few-shot demonstrations. In one, a user opens a chat with an "assistant", asks for a summary, and then pastes several thousand characters of what reads like a family-court witness statement, naming the opposing party and a minor child. The chat application held that document in memory and never touched a public service. The user, separately, saved their own copy on a formatter while preparing the prompt. The two systems never communicated. The data leaked anyway.

Retrieval contexts, leaked one chunk at a time. Thirty documents carry a confirmed RAG-output shape: chunk_id, source, metadata, sometimes paired with embedding arrays. A RAG output is, by definition, material the system retrieved from inside the organization's own knowledge base. Paste it to clean it up and you have published the retrieval target in full. We name none of the thirty.
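A shape test for this class can be sketched as follows. It is an illustration only: the key set is taken from the description above, the function names are hypothetical, and the classifier actually used on the corpus applied further confirmation that is not reproduced here.

```python
import json
from typing import Any, Iterator

# Keys named in the text as the signature of a retrieval chunk.
RAG_KEYS = {"chunk_id", "source", "metadata"}


def iter_objects(node: Any) -> Iterator[dict]:
    """Yield every JSON object found anywhere inside a parsed document."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_objects(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_objects(item)


def looks_like_rag_output(text: str) -> bool:
    """True if any object in the document carries all of the chunk keys."""
    try:
        doc = json.loads(text)
    except ValueError:
        return False
    return any(RAG_KEYS.issubset(obj) for obj in iter_objects(doc))
```

A match only flags a candidate. Generic field names such as source and metadata appear in plenty of unrelated payloads, so each hit still has to be read before it is counted.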

"Paste this for the assistant to fix." Thirty-one documents are recognizable as an engineer copying a production error into a chat with an AI assistant: a stack trace, plus an Authorization header, plus an internal hostname, plus a messages array. The intent is to ask the assistant to debug the error. The side effect is to publish the cleaned-up error, through the same browser, on a service the assistant never touched. One is a perfect ASP.NET yellow-screen whose title is the immortal "Padding is invalid and cannot be removed."

AI provider keys, in passing. Ninety-eight documents contain at least one live-shaped LLM-provider API key: 72 Google AI, 23 in the composite bucket (Mistral, Cohere, Replicate, Groq, OpenRouter, Together), 3 on OpenAI's prefix. The standout is a hand-written Node script for an AI cricket-betting tipster bot that leaks a matched set (a live Telegram bot token, the operator's chat ID, and an OpenAI key) all declared as friendly consts under a comment reading // Set your values.


Why an unauthenticated public formatter still has secrets in 2026

Several arguments are worth surfacing.

First: engineers do not perceive the formatter as a third-party service. They perceive it as software running in their own browser tab, even when the Save button persists their input to the operator's database and exposes the URL on a public feed. The mental model is wrong, and the wrongness is structural, not personal. Calling individual engineers careless does not move the problem.

Second: secret-scanning tooling in the IDE catches secrets at commit time, not at paste time. The egress paths an engineer takes while debugging are not the egress paths corporate security has instrumented.

Third: AI coding assistants have, for many engineers, formalized "paste a payload somewhere clean and ask the assistant for help" as a workflow. The "somewhere clean" is, in practice, often a formatter. The formatter is shared, indexed, harvested.

Fourth: the operator of the formatter has neither the incentive nor, under its own terms of service, the obligation to remediate. The paste was authorized by the engineer who saved it. The operator's only legitimate remediation is to switch off the public listing surface, which would do nothing to recall the harvested copies already in circulation.

The architecture was never designed against a 2026 threat model. There is no reason to expect it to defend against one without intervention.
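The formatting step itself has never needed a server. As a minimal sketch of the offline alternative, using only the Python standard library (the function name is ours), the same indentation a public formatter offers can be produced without the payload leaving the machine:

```python
import json


def beautify_json(text: str, indent: int = 2) -> str:
    """Pretty-print a JSON string locally; nothing is sent over the network."""
    # ensure_ascii=False keeps non-ASCII text (Turkish, for example) readable.
    return json.dumps(json.loads(text), indent=indent, ensure_ascii=False)
```

The standard library also ships the same behaviour as a command-line tool, `python -m json.tool`, which reads a file or standard input and prints the indented result. Most editors and IDEs have an equivalent built in.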


The adversary already has this

watchTowr Labs' November 2025 canary-token experiment is the cleanest public evidence we can cite. They planted a token-shaped string on the same operator's "Recent Links" feed and observed automated retrieval by an unrelated party within 48 hours. That party was not them. Their conclusion, which we adopt, is that the operator's listing surface is a routine open-source-intelligence feed for at least one third-party harvester whose intentions are not characterizable.

Beyond watchTowr's direct evidence, the architectural argument stands on its own. Six-hex identifiers are walkable. The endpoint is unauthenticated. The listing page enumerates the identifiers as a service to humans. Any party with a residential proxy budget and a Saturday afternoon can replicate the corpus we describe at single-figure dollar cost. The marginal effect of this publication on the threat model is zero. The marginal effect on the operator's incentives, and on the regulator community's awareness, is the point of writing it.


A stored XSS in the formatter itself

Everything above treats <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> as a passive warehouse: engineers put sensitive data in, third parties take it out. While assembling this report we found the warehouse has a second problem of its own making. The page that renders a saved paste writes the paste's contents back into the document without encoding them for safety, so a paste can carry its own JavaScript. Save the right string, hand someone the link, and your code runs in their browser inside the <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> origin. That is a stored, persistent cross-site-scripting flaw, on a tool that ranks near the top of the results for "json formatter" and is opened millions of times a month.

The site does gesture at a defence: it blocklists a few obvious tokens. That is the weakest class of XSS control, and it falls to the oldest trick there is, which is to never spell the word you are forbidden to spell. Our proof-of-concept writes none of the blocked tokens; it assembles them at runtime:

"><svg onload=alert(self['docu'+'ment']['domain'])>

The "><svg onload=…> breaks out of the surrounding attribute and executes with no <script> tag at all; self['docu'+'ment']['domain'] reaches document.domain without the string document ever appearing. The alert is deliberately harmless: it pops the origin to prove whose security context the code runs in. Swap the body for something useful and the same hole does real work.
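The gap between a keyword blocklist and output encoding is easy to demonstrate. In the sketch below the blocklist contents and the surrounding input element are hypothetical, since the exact tokens the site filters and its markup are not listed here. The payload is the harmless proof-of-concept shown above.

```python
import html

# Hypothetical blocklist standing in for the site's filter.
BLOCKLIST = ("<script", "document", "javascript:")

# The proof-of-concept from the text: it never spells a blocked token.
POC = "\"><svg onload=alert(self['docu'+'ment']['domain'])>"


def blocklist_allows(value: str) -> bool:
    """A keyword filter: accept the value unless it contains a listed token."""
    lowered = value.lower()
    return not any(token in lowered for token in BLOCKLIST)


def render_title_attribute(value: str) -> str:
    """Encode the value for an HTML attribute context before writing it out."""
    return f'<input title="{html.escape(value, quote=True)}">'
```

With this blocklist, blocklist_allows(POC) returns True, so the filter waves the payload through. The encoded rendering contains no raw quote or angle bracket from the payload, so the attribute is never closed and no new element is created. Encoding does not need to recognise the attack in order to neutralise it, which is why it is the control and a blocklist is not.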

We confirmed it firing straight from a paste's title, a field the site reflects with no sanitisation at all, so the payload runs for anyone who merely browses the public "Recent Links" listing, no shared link required:

Stored XSS firing from an unsanitised paste title, popping the jsonformatter.org origin

The same payload live on the public Recent Links listing

Why this is worse here than on an average site: by this report's own measurement, that origin is a standing pile of other people's credentials, tokens, customer records and national IDs. A stored XSS on it lets an attacker run code that can read what a victim's page can read, ride a logged-in session, silently rewrite a paste to attack whoever opens it next, or phish under a domain developers already trust, at the scale of a tool the whole industry pastes into. A warehouse with a broken lock is one thing. A warehouse with a broken lock and a trip-wire already on the door is another.

Disclosure. Unlike the data exposure, which is a by-design property of the service with no patch to ship, this is a concrete, fixable bug. We reported it to the operator on 3 June 2026, before publishing, with the proof-of-concept above and a fix recommendation: encode output for its HTML context (or write it with textContent), add a Content-Security-Policy, and stop treating a keyword blocklist as the control. We publish the detail now, the operator notified, for the same reason as the rest of this report: the people most exposed are the millions who paste into this origin every month, and they are better served knowing than not. The proof-of-concept is benign (it reads only its own origin), and we accessed no user data in confirming the bug.


On publication without coordinated disclosure

The next five paragraphs are written for our lawyers, the operator, and every affected organization, and they are meant to be read straight.

No active credential, identifier, or other live secret appears anywhere in this report. Every figure presented is an aggregate. Every example is described, not quoted. Beyondmemory retains the underlying corpus solely on infrastructure under our exclusive control, behind access controls commensurate with the sensitivity of the data, and will destroy the corpus at the conclusion of this research program. Our retention policy is available on request to any regulator, operator, or affected organization with standing to ask.

We have deliberately chosen not to name any organization whose data appears in the corpus. Where a finding could be identified to a specific entity, we have described it at the sector level only. Public attribution of named victims would compound harm rather than reduce it and is not necessary to support the analytic claims of this report. Organizations who suspect their data may be in the corpus, or who wish to confirm exposure for incident response purposes, may contact <a href="mailto:info@beyondmemory.io">info@beyondmemory.io</a> for a good-faith confidential check.

Every document referenced in this report was retrieved by issuing the same HTTP requests that <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> itself issues to its own server when a visitor lands on its "Recent Links" page and opens an individual paste. No authentication was bypassed because none was present. No rate-limit or access control was circumvented because the operator does not impose access controls on the surface in question. The data was, and remains, publicly retrievable to any party with a browser and patience.

Prior independent research has demonstrated that third parties harvest public paste services on an ongoing basis. watchTowr Labs, in November 2025, deployed canary tokens onto <a href="http://jsonformatter.org" rel="nofollow">jsonformatter.org</a> and observed automated retrieval by an unrelated party within 48 hours of submission. Our publication does not introduce a new threat. It documents an existing one at a scale, and across a population of affected organizations, that the public record does not yet reflect. The marginal risk created by this report is zero; the marginal awareness it creates is the point of writing it.

This research is published because the audience that can act on it sits across multiple Turkish institutions: the Bilgi Teknolojileri ve İletişim Kurumu (BTK), the Bankacılık Düzenleme ve Denetleme Kurumu (BDDK), the Enerji Piyasası Düzenleme Kurumu (EPDK), the Ulusal Siber Olaylara Müdahale Merkezi (USOM), the Kişisel Verileri Koruma Kurumu (KVK Kurumu), and the CISOs of the organizations whose data appears in the corpus. Beyondmemory will share, at no cost, our methodology, our detection rules, and, where a lawful basis to do so can be agreed, indicator lists and per-document attribution evidence with any of the above bodies on written request. The public-facing version of this research is intentionally redacted; the private-cooperation version is not.


What to do about it

Six interventions, in increasing order of difficulty.

1. Stop allowing developer workstations to reach public paste services. This is one line in an egress proxy configuration. Most mature enterprise networks already have the proxy; they have simply never pointed it here.

2. Move the formatting to local tooling. Modern IDEs format JSON with no network call. Offline browser extensions exist. There is no defensible reason for a developer at a regulated organization to send a production payload to a stranger's server in 2026 to add indentation.

3. Treat the AI coding-assistant workflow as a sensitive egress channel. When an engineer copies a payload to feed an assistant, make sure the destination is the assistant, not the assistant and the formatter and whichever screenshot tool was open at the time. Policy is the cheapest intervention here; tooling is the second cheapest.

4. For regulators (the KVK Kurumu, BDDK, BTK, EPDK, and equivalents elsewhere): treat "paste to a third-party web service" as the covered processing activity it already is. The number of documents in this corpus that would satisfy a KVKK Article 12 breach-notification threshold is non-trivial; the number whose controllers know about the exposure today is near zero. The gap between those two numbers is the entire point.

5. For national CERTs (USOM and equivalents): treat public paste tooling as a routine OSINT feed. The cost is a few weeks of engineering. The yield is continuous indicator-of-compromise visibility for a constituency that does not currently report this class of exposure, because it does not know the exposure occurred.

6. For the operator: the public listing is a policy choice; the stored XSS is not. The "Recent Links" surface is a design decision you are entitled to make, even if we think it is the wrong one. The cross-site-scripting flaw we found is not a decision, it is a defect. Encode paste output for its HTML context, add a Content-Security-Policy, retire the keyword blocklist, and confirm that <a href="http://codebeautify.org" rel="nofollow">codebeautify.org</a> and the sibling tools sharing the same viewer are fixed in the same pass.
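
Intervention 2 needs very little code. As a minimal sketch, assuming Python is available on the workstation, the standard library formats JSON with no network call; the `format_json_locally` helper name and the sample payload are illustrative only.

```python
import json


def format_json_locally(raw: str, indent: int = 2) -> str:
    """Pretty-print a JSON string on the local machine; nothing leaves the host."""
    parsed = json.loads(raw)  # raises json.JSONDecodeError on malformed input
    return json.dumps(parsed, indent=indent, ensure_ascii=False)


# Illustrative payload, not taken from the corpus.
sample = '{"service":"billing","retries":3,"tags":["a","b"]}'
print(format_json_locally(sample))
```

The same result is available from a shell with `python -m json.tool payload.json`, which is also part of the standard library.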


Closing

The corpus exists. It will keep growing for exactly as long as the operator leaves its listing surface public, which by every indication it intends to do. The threat model has been demonstrated to be active by at least one party other than ourselves. The interventions that close the exposure are not novel, not expensive, and have been available for years.

For Türkiye specifically, the honest finding is not that the country is uniquely exposed. It is that the country is exposed in exactly the same way as everyone else: a citizen's ID and bank cards in one paste, a welder's certificate and national ID, a brokerage balance, and a few hundred more documents carrying citizens' national IDs. The legal framework that already governs all of it, KVKK Article 12, is not yet being applied to this channel, because nobody had been measuring the channel. We measured it.

The question for the institutions reading this is not whether the exposure exists; the corpus answers that. The question is whether the response begins now, or after the next research firm, ours or someone else's, publishes the same paragraph, with your data in it, six months from now.


Private cooperation channel: <a href="mailto:info@beyondmemory.io">info@beyondmemory.io</a> (monitored).


Exclusive | Real-Time Satellite Intelligence Is Making Ukraine’s Drone Strikes Deadlier Than Ever - WSJ


LLM (google/gemini-3.1-flash-lite-20260507) summary:

  • Technological Integration: frontline units now receive high definition satellite images directly on mobile devices to facilitate military operations.
  • Operational Speed: use of commercial satellite data shortens the sensor-to-shooter cycle by as much as ninety percent compared to traditional methods.
  • Collaborative Development: the satellite delivery system results from a partnership involving companies from the United States, the Netherlands and Ukraine.
  • Strategic Utility: advanced software tools allow for the comparison of historical and current images to identify tactical changes in Russian infrastructure.
  • Enhanced Accuracy: satellite coordinates maintain a margin of error within five meters providing sufficient guidance for long range drone attacks.
  • Resource Efficiency: soldiers depend less on short range reconnaissance drones which face high risks of interception or jamming by opposing forces.
  • Decentralized Intelligence: direct access to data bypasses centralized review processes in Kyiv, enabling faster battlefield decision making.
  • Defense Innovation: the application of commercial orbital sensors demonstrates a new tactical framework for identifying and targeting logistics hubs and depots.

Video taken by a drone captures a strike carried out by the Ukrainian army, based on Vantor satellite intelligence delivered to the front lines. CREDIT: Bravo1Alpha

June 4, 2026 10:00 pm ET

The small unit of the Ukrainian Armed Forces, stationed about 10 kilometers from the front line in the country’s southeast, knew there was something afoot in a building obscured by thick tree cover. The spring foliage hid its outline but not the signals from the electronic devices within.

The team launched a reconnaissance drone, which couldn’t see much through the trees. But the soldiers had another card to play: high-definition, near-real-time images taken by commercial satellites, delivered directly to their phones, tablets and laptops.

The satellite sensors showed the thick, metal frames of armored vehicles—the type used by senior Russian military officials—parked around the building. After three days of surveilling the site from orbit, the unit determined it was a Russian meeting spot for planning operations, members said. Then they struck the building and vehicles with an attack drone, one of the members said.

“It was good work,” he said. “We made problems for our enemy.”

Over the past six months, during small-team missions to test the technology, images from commercial satellites operated by Colorado-based Vantor have improved the speed and precision of Ukraine’s drone attacks. The rapid delivery to soldiers of geospatial intelligence has shortened by as much as 90% the time it takes to locate and strike Russian assets, according to the technology providers and people involved in the missions. Augmenting the images is software that lets users identify and investigate targets in detail.

Members of the Ukrainian Armed Forces viewing Vantor software displaying geospatial intelligence. CREDIT: Bravo1Alpha

In this grinding war now well into its fifth year, Ukraine continues to spark new and unexpected technology innovation that its weary military hopes might provide an edge against the Russians. After a brutal winter, Ukraine has emerged this spring with a tactical and technological advantage over Russia. Part of that is being driven by Ukraine’s improvements in midrange strikes on Russia’s logistics hubs, warehouses and air defenses. Using faster, more accurate satellite imagery to guide strikes is part of Ukraine’s strategy for launching more precise attacks from a distance. 

The Ukrainian military’s deployment of the program marks the first known instance of unclassified, commercial satellite imagery going directly to a soldier to guide real-time battle decisions, according to the companies and military analysts. The same satellites used to monitor illegal fishing and update Google Maps have found a new and deadly application.

The technology is a trans-Atlantic collaboration between Vantor, Dutch geospatial intelligence company Bravo1Alpha, U.S.-based Persistent Systems and Ukrainian defense firm Burevii. 

The Ukrainian involved in the strike on the Russian planning site said the new technology helps preserve Kyiv’s two scarcest assets: “It is money, it is time,” he said. With access to the satellite images, his team didn’t have to rely on surveillance drones that can be expensive and are more easily jammed or shot down by the Russians.

During a springtime mission, called Starfall II, a Ukrainian unit spent 2½ weeks destroying billions of dollars in Russian assets. Among the targets was a Russian ammunition depot in occupied Ukraine that soldiers had identified after pulling a satellite image of the structures, which had once been used to store grain, members of the team said. Comparing the new image with older photos of the property dating back to before the Russian invasion, soldiers identified changes that convinced them it was no longer an agricultural operation and spotted fresh tire tracks that were consistent with military vehicles unloading ammunition. Members of Ukraine’s Brigade 422, a midrange strike team, dispatched attack drones.

“Every ammunition depot you destroy is at least a couple of Ukrainian soldiers’ lives you save,” said one member of the operation, a technical adviser assisting the armed forces.

A member of the Ukrainian Armed Forces interacts with Vantor satellite data on a mobile phone. CREDIT: Bravo1Alpha

The satellite intelligence has allowed them to do within hours what used to require weeks, either because of a lag in getting intelligence out to the front, or the relative slowness of launching a drone and waiting for it to survey large areas, often made slower by fog or snow. 

“Compressing the sensor-to-shooter cycle is the defining trend of this war at the tactical level,” said Franz-Stefan Gady, a military analyst and founder of defense advisory firm Gady Consulting.

As with every technology, satellites have their limitations: They are not particularly helpful on days of thick cloud cover, which is much of the winter in Ukraine, and can’t loiter over a moving target.

Satellite imagery itself is nothing new in war. Commercial and government satellite operators have long been key intelligence sources. Vantor published satellite images of Russian tanks and troops in position near the border of Ukraine before the war began. Ukraine has been heavily dependent on U.S. intelligence sources to conduct strikes.

Vantor’s push into defense helped it reach $900 million in annual recurring revenue last year, when the company, which is owned by private equity, also added more than 10 European defense and intelligence customers. Part of what those agencies are seeking, Vantor said, is the capability now being used by Ukraine.

Vantor’s images go directly from the satellite to the soldier’s tablet, phone or laptop in as little as 15 minutes, bypassing a centralized review in Kyiv that has tended to slow down the flow of intelligence to the front line by hours or days. 

One Ukrainian fighter said intelligence received from human sources on the location of Russian targets required at least two days of review time in Kyiv. A former soldier in Ukraine said geospatial intelligence was sometimes so stale by the time it reached the units at the front line, soldiers couldn’t act on it. Military analysts say the turf war over access to satellite images between Ukraine’s government and military branches has hindered the dissemination of intelligence.

The press office for the Armed Forces of Ukraine declined to comment. The military intelligence unit didn’t respond to a request for comment.

The Vantor software allows soldiers to compare a current satellite image with historical images, as Brigade 422 did with the ammunition depot, and see infrastructure changes or movement. Artificial intelligence monitors large areas and detects when a target shifts. The software generates 3-D renderings that soldiers can use to simulate the best flight path for a drone.

A drone launches to conduct a strike after soldiers used geospatial intelligence to determine the target and the best flight path. CREDIT: Bravo1Alpha

Vantor’s 10 satellites cover 7 million square kilometers of Earth a day, hitting any one point on the globe 12 to 15 times, said Will Cocos, Vantor’s chief transformation officer and a former Navy SEAL. Typically, the coordinates on Vantor’s images are within 5 meters (16 feet) of an object’s real position, plenty accurate for a 50-kilogram (110-pound) explosive, the Ukrainian users said.

Ukraine is now previewing for much of the West what’s possible when the chain of intelligence gets compressed, said intelligence analysts. U.S. Special Operations Command last year added new software to provide near-real-time commercial satellite images on soldiers’ mobile devices, a Socom spokeswoman said.

Army spokesman Maj. Sean Minton said the service doesn’t yet send satellite intelligence directly to soldiers’ devices, but is working toward it through a broader effort to create a high-speed information system that gives soldiers of all ranks access to satellite data “free from headquarters reviews.” 

Removing some of these intermediaries responsible for vetting might speed things up, but it also raises the risks that soldiers get wrong information—and act on it, said Nand Mulchandani, former chief technology officer for the Central Intelligence Agency and the Defense Department’s artificial intelligence office.

“There are processes in place that slow things down, but there are processes in place for a reason,” Mulchandani said.

Copyright ©2026 Dow Jones & Company, Inc. All Rights Reserved.

Heather Somerville is a reporter at The Wall Street Journal in San Francisco covering technology and national security. Her articles explore the national-security implications of emerging technology, U.S. efforts to counter China's rise as a technology power, and the relationship between Silicon Valley and the U.S. defense complex.

Heather joined the Journal in 2019 to cover venture capital and technology companies. Before that, she wrote about venture capital and Silicon Valley startups for Reuters and the Mercury News. She was previously a reporter for the Fresno Bee and the Charlotte Observer and wrote about national security for outlets in Washington, D.C.


U.K. Regulator to Let Publishers Opt Out of Google AI Search Features - WSJ

1 Comment

LLM (google/gemini-3.1-flash-lite-20260507) summary:

  • Regulatory Intervention: the UK Competition and Markets Authority mandates that Google allow publishers to exclude their content from AI search features.
  • Market Control: bureaucratic regulators are attempting to empower publishers to dictate the terms under which their content is utilized by data aggregators.
  • Antitrust Scrutiny: the European Commission is conducting an inquiry into whether Google unfairly leverages its search dominance to obtain privileged access to data.
  • State Mandates: the UK government intends to utilize new legislation to classify massive digital corporations as entities subject to strict conduct requirements.
  • Corporate Compliance: Google is initiating a testing phase for tools designed to satisfy regulator demands while maintaining control over search traffic distribution.
  • Traffic Dependence: the company explicitly reminds publishers that opting out of AI-driven search functions will result in a loss of associated web traffic.
  • Strategic Designation: the Competition and Markets Authority continues to monitor Google under the pretext of its strategic market status to enforce arbitrary operational changes.
  • Bargaining Pretenses: government officials claim intervention is necessary to force companies to provide publishers with more leverage during commercial negotiations.

Updated June 3, 2026 5:09 am ET


Google has developed its own AI platform, Gemini, and recently rolled out AI features in its traditional search engine. CREDIT: Annegret Hilse/Reuters

U.K. antitrust regulators said they would allow publishers to opt out of feeding their content to power artificial-intelligence features in Google’s online searches.

The Competition and Markets Authority said Wednesday that the move aims to give publishers control over how their content is used by AI and put them in a stronger position to negotiate with Google.

The tech giant has developed its own AI platform, Gemini, and rolled out AI features in its traditional search engine.

Regulators have grown increasingly concerned with how Google powers its AI tools, which essentially aggregate information on the internet to answer users’ queries in addition to Google Search’s links to other websites. The European Commission started in December an antitrust investigation into whether Google imposes unfair terms on publishers or is hurting competition online by giving itself privileged access to third-party content for its AI features.

It is also the next step in the U.K. watchdog’s bid to enforce the Digital Markets, Competition and Consumers Act, which seeks to level the playing field for businesses online. Under the law, tech companies like Google are designated as having strategic market status due to their control over dominant platforms like search engines. Once a company is designated, the CMA can impose so-called conduct requirements for it to follow.

Mrinalini Loew, general manager of Google’s Search Ecosystem, said in a blog post Wednesday that the company is listening to publishers and engaging with the CMA. She said Google is starting to test a new tool that lets website owners manage how their content appears in AI search features. Sites that choose to opt out of appearing in AI search results will not receive traffic from them, and how sites use the tool won’t factor into how their content is ranked outside of its AI services.

“We are beginning to roll these features out to a subset of website owners in the UK, allowing for thorough testing before rolling them out to website owners globally,” Loew said.

The CMA said it is monitoring the changes Google implemented and their implications for businesses. The CMA deems Google’s search service to have strategic market status, which allows the regulator to introduce targeted requirements on how it operates.

“With features like AI Overviews rapidly reshaping online search, it is crucial that content publishers, including news organizations, have appropriate bargaining power over how their content is used,” CMA Chief Executive Sarah Cardell said.

News Corp, owner of The Wall Street Journal, has a commercial agreement to supply content on Google platforms.

Copyright ©2026 Dow Jones & Company, Inc. All Rights Reserved.

Adrià Calatayud is deputy breaking news editor for the EMEA desk at Dow Jones Newswires in Barcelona. Adrià covers European telecommunications, media and logistics companies. He was previously a market insights writer and a reporter at the U.K. desk in Barcelona.

Before joining the company in 2017, he worked for Spanish news agency EFE as a correspondent in China and the U.S.

Edith Hancock covers European competition enforcement for Dow Jones Newswires and The Wall Street Journal from Brussels. Before joining WSJ, Edith worked as a competition reporter for Politico Europe. She holds a master's degree in Business and Economics journalism from Columbia University in New York and in Interactive Journalism from City University in London.



Comment from bogorad: uk is done. has been for quite some time. ok then.

Why ‘Nvidia Inside’ Can Work in the PC Market - WSJ


LLM (google/gemini-3.1-flash-lite-20260507) summary:

  • Strategic Pivot: Nvidia extends its market dominance by launching central-processing units for personal computers to displace established x86 providers.
  • Market Speculation: financial markets reacted with predictable frenzy as shares of Nvidia and affiliated hardware manufacturers rose on artificial intelligence hype.
  • Competitive Displacement: the plan targets the weakening grip of Intel and AMD, which currently struggle to manufacture excitement for their existing hardware products.
  • Industry Stagnation: despite corporate narratives of a revolutionary artificial intelligence boom, global personal computer sales remain tepid and below historical peaks.
  • Forced Adoption: recent sales of hardware with artificial intelligence features occur mostly because consumers lack viable alternatives at preferred performance levels.
  • Substantial Revenue: while PC-related business grows, it remains merely a secondary enterprise compared to the vast sums extracted from data center hardware sales.
  • Technical Obstacles: transition to Arm-based architectures faces significant friction due to entrenched software dependence on legacy x86 instruction sets.
  • Brand Cultism: the strategy relies heavily on the hope that the current artificial intelligence cachet will replicate the historical success of marketing slogans like “Intel Inside.”

Nvidia CEO Jensen Huang introduced the RTX Spark laptop during his keynote speech at Computex 2026 in Taipei. CREDIT: I-Hwa Cheng/AFP/Getty Images

The “Intel Inside” marketing campaign made Intel a household name and a ubiquitous personal-computer chip supplier in the 1990s.

These days, “Nvidia Inside” has a lot more selling power.

That is what Nvidia is betting with its new line of PC chips, set to be in Windows-based computers launching later this year. With its artificial-intelligence cachet, it is very likely that Nvidia will succeed, potentially upending an order in the PC world that has prevailed for pretty much the past five decades.

Investors are optimistic. Nvidia’s shares jumped more than 6% on Monday, while Windows-maker Microsoft rose more than 2%. Shares of PC makers Dell Technologies and HP both surged more than 8%. Arm Holdings, which licenses the basic blueprints that Nvidia uses in its PC chips, jumped more than 15%.

The direct impact of Nvidia’s PC play on its finances will likely be limited, given the enormous size of the company’s business selling AI chips for data centers. But the move does put Nvidia in a position to supercharge the market for AI-enabled computers and disrupt incumbents in the process. 

Nvidia isn’t exactly a new entrant to the market. It has been a big player in PCs for decades through its graphics chips, which produce sharper and smoother images on computer monitors, a capability videogamers prize. But before detailing its latest chips at a trade show in Taiwan, it hadn’t made the central-processing units at the computational hearts of PCs. Intel and Advanced Micro Devices dominate that market.

Nvidia’s new chips, which combine a CPU with the company’s wildly popular AI-computing hardware, come at a time of weakness for Intel. Intel remains the biggest supplier of CPUs for traditional Windows PCs, with a market share of about 64% in the final quarter of last year, according to Mercury Research. But Intel and other players in the PC market haven’t been able to convince huge numbers of consumers or companies to buy new computers because of the AI capabilities of their chips.

Around 270 million PCs were sold last year, according to Gartner, up by around 9% from 2024. That isn’t a stellar increase amid an AI boom that is supposed to transform how people work and live. The total is still below its Covid-era peak of about 340 million in 2021.

So far, most sales of AI-enabled PCs have been by default. Computer makers are adding neural-processing chips that enable some on-device AI functions to all of their higher-performing product lines.

“The buyers choosing AI PCs today aren’t necessarily doing so for the AI,” said Jitesh Ubrani of market research firm IDC, which tracks PC sales. “They’re doing so because, at a certain performance tier, there’s no alternative.” 

For Nvidia, PCs are a small part of its business now. But the company boasts a strong appeal in the segment, which bodes well for its latest effort. Nvidia’s PC-related revenue jumped 41% in the fiscal year ended January to a little over $16 billion, thanks in part to the introduction of new videogaming chips under the company’s popular Blackwell brand. Total PC unit sales grew only 8% during the calendar year, according to IDC data. 

Chart: Global PC-unit shipments per year, 2015 to 2025, in millions of units. Source: IDC

Whether Nvidia makes further inroads with its PC chips will be a test of the power of its brand and its close association with the AI boom. It will likely be much easier for Nvidia to sell people and companies on new AI-ready computers than it has been for Intel or AMD, or for that matter Apple, which also has a popular line of computers that use homegrown chips.

Nvidia’s rise won’t come without challenges—the largest of which may be the software stickiness that has built up around Intel’s and AMD’s processors. They use a basic chip architecture called x86. But software that works on x86 processors needs to be adapted to work well on Nvidia’s or Apple’s Arm-based chips.

There is now a version of Microsoft’s Windows operating system for Arm-based chips, and software engineers have been developing more programs for Arm as the number of Arm-based PCs grows. But Arm as a PC platform remains at a software-development disadvantage vis-à-vis x86, including in areas such as gaming, which remains an important end market for Nvidia.

Ultimately though, Nvidia’s entrance into CPUs for PCs is likely to further erode the remaining x86 advantage, especially in the world of Windows-based PCs, where it has been difficult for new Arm-based players to gain a foothold.

“Intel Inside” worked for many years. But in tech, even memorable marketing slogans have a shelf life.

Copyright ©2026 Dow Jones & Company, Inc. All Rights Reserved.

Asa Fitch is a writer covering technology for The Wall Street Journal’s Heard on the Street column, based in New York. He also co-writes the Journal's weekly AI & Business newsletter. Asa previously reported on semiconductor companies from the Journal’s San Francisco and New York bureaus, where he covered Nvidia’s rise amid the AI boom and Intel’s struggles to turn itself around.

Prior to that, Asa spent a decade as a foreign correspondent in the Middle East. He joined the Journal in Dubai, where he initially covered business and finance before shifting to covering regional politics and conflicts. He covered the Gaza War in 2014, the military campaign against Islamic State in Iraq, the Yemeni civil war, and Iranian elections and politics, including the country’s nuclear deal in 2015.

Asa began his career as a general-news reporter in Connecticut and a personal finance reporter in New York. He is a graduate of Carleton College and Columbia University's journalism school.

Dan Gallagher is a columnist for The Wall Street Journal’s Heard on the Street, where he covers the technology and media industries. His work for the Heard column spans the businesses of artificial intelligence, semiconductors, internet, software, hardware, advertising, streaming TV and film. Dan also co-writes WSJ's AI & Business newsletter.

He joined the column in 2013 after nearly two decades covering tech as a news reporter—starting well before smartphones got smart. 

Dan previously spent 10 years at MarketWatch, where he built a data news team and served as technology editor. He is based in San Francisco.



Trump’s Border Triumph // A recently revised estimate from the Congressional Budget Office finds 1.5 million fewer illegal immigrants in the country than would have been the case had Biden’s policies continued.

  • Policy shift impact: Congressional Budget Office (CBO) data indicates a reversal in immigration trends, shifting from a projected increase of 1.1 million to a net decrease of 360,000 illegal aliens in 2025.
  • Administrative enforcement: The sharp decline is attributed to executive actions implemented in January 2025, specifically the reinstatement of Migrant Protection Protocols and the termination of categorical parole and CBP One app facilitation.
  • Border control effectiveness: Over 90 percent of the reduction in illegal immigration results from stricter border policies rather than interior deportations, with releases between ports of entry dropping by 96 percent and releases at ports of entry by 94 percent compared to 2024.
  • Deterrence of illegal crossings: Estimates show an 83 percent reduction in unauthorized entries that evaded capture, following the end of policies perceived as incentives for migration.
  • Historical context: The swing in the illegal alien population between the final three years of the previous administration and the first three years of the current one is estimated at 6.7 million people, surpassing the combined populations of Phoenix and Los Angeles.
  • Legal mandate fulfillment: The observed reductions validate the efficacy of existing federal statutes, such as the Immigration and Nationality Act's mandate that asylum-seekers be detained during proceedings, contradicting assertions that legislative reform was a prerequisite for border security.

The establishment political consensus has long held that it’s impossible to rein in illegal immigration until Congress passes “comprehensive immigration reform.” The Congressional Budget Office has recently made clear that all it really takes is a president willing to enforce federal immigration laws, as he is constitutionally required to do.

The CBO recently revised its estimates concerning the number of illegal aliens entering or leaving the United States. More than a year ago, just before Inauguration Day, the CBO estimated that a net 1.1 million “other foreign nationals”—those lacking “a legal immigration status”—would be added to the U.S. population in 2025. Now the CBO has revised that estimate downward by a whopping 1.5 million illegal aliens—from an increase of 1.1 million to a decrease of 360,000. This downward revision of 1.5 million people is equivalent to the population of San Antonio.


What caused this “significant reduction?” The congressional scorekeeper’s answer is clear. The decline was “driven largely by administrative actions taken since January 20, 2025.” The CBO particularly emphasizes an action that President Donald Trump took on his first day in office: “Executive Order 14165 reinstated the policy of Migrant Protection Protocols, which require people who want to apply for asylum in the United States to return to the territory from which they came.” The CBO adds, “That order also ended all categorical parole programs and ceased the use of the CBP One app as a method of paroling or facilitating the entry of people into the United States.”

Under President Joe Biden, illegal aliens who “arrived between official ports of entry”—that is, along the open (southwestern) border—“were generally released into the United States,” writes the CBO. It adds, “People could also use the CBP One app to schedule an appointment at a port of entry” and then “be released into the United States.”

The Trump administration almost immediately stopped the Biden administration’s lawless practice of releasing illegal aliens into the U.S. interior. This accounts for the CBO estimate that 1.5 million fewer illegal aliens were living in the U.S. at the end of 2025 than would have been the case with a continuation of Biden’s policies. The 46th president’s “equity”-based policies brought about a border crisis by design.

While the mainstream press emphasizes ICE raids and deportations, the Trump administration’s removal of aliens from the U.S. interior accounted for less than one-tenth of the CBO’s downward revision in net illegal immigration—it estimates that 120,000 people were removed from the U.S. interior in 2025. Far from being driven by deportations, more than 90 percent of the Trump administration’s success in reducing illegal immigration has come from limiting border crossings, per the CBO’s figures.

The CBO estimates that 540,000 illegal aliens arrived along the open border in 2024, between the ports of entry, and were released into the U.S. by the Biden administration. In 2025, such releases dipped to 20,000—a 96 percent decrease. Similarly, 960,000 illegal aliens arrived at the ports of entry and were released into the U.S. in 2024, compared with only 60,000 in 2025—a 94 percent drop. (Many of those released under Trump were unaccompanied minors, required by law to be released to sponsor families.)
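The two percentages follow directly from the counts quoted in this paragraph. A minimal Python sketch of the arithmetic, using only the rounded figures cited above (the helper name `percent_decrease` is illustrative, not something from the CBO report):

```python
def percent_decrease(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100

# Releases of people who arrived between ports of entry: 2024 vs. 2025
between_ports = percent_decrease(540_000, 20_000)

# Releases of people who arrived at ports of entry: 2024 vs. 2025
at_ports = percent_decrease(960_000, 60_000)

print(f"Between ports of entry: {between_ports:.1f}% decrease")
print(f"At ports of entry: {at_ports:.1f}% decrease")
```

Rounded to whole numbers, these come out to the 96 percent and 94 percent drops stated in the text.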

What’s more, only about half of the 2025 releases occurred from February through December. Most of the releases at the ports, and many of the releases along the open border, occurred in January, when Biden was still in office for most of the month. In fact, roughly the same number of illegal aliens were released into the country during three weeks of Biden as during 49 weeks of Trump.

In addition, the Trump administration dramatically cut the number of people who evaded capture and snuck across the border. The CBO estimates that about 300,000 people escaped across the border in this manner in 2024 but only about 50,000 did so in 2025—an 83 percent reduction. In a 2023 immigration case, U.S. District Court Judge T. Kent Wetherell said the Biden administration’s “actions were akin to posting a flashing ‘Come In, We’re Open’ sign on the southern border.” When the Trump administration effectively unplugged that sign, the number of people trying to sneak across the border dropped dramatically.

In all, the CBO estimates that about 80,000 people lacking “a legal immigration status” were released into the U.S. last year (roughly half of them during the 20 days of Biden), while 50,000 snuck across the border and 260,000 overstayed their visas. Meanwhile, 400,000 decided to leave voluntarily, 120,000 were removed from the interior of the U.S., and 225,000 attained permanent legal status. That amounts to a one-year reduction in the illegal alien population of about 360,000, whereas the CBO had previously projected an increase of about 1.1 million.
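The net figure can be checked by tallying the components listed above. This is a rough sketch using the rounded numbers as quoted in this article; the dictionary labels are illustrative shorthand, and because the inputs are rounded the total lands near, not exactly on, the CBO's figure of about 360,000.

```python
# Additions to the population lacking legal status in 2025 (rounded, as quoted)
inflows = {
    "released into the U.S.": 80_000,
    "evaded capture at the border": 50_000,
    "overstayed visas": 260_000,
}

# Subtractions from that population in 2025 (rounded, as quoted)
outflows = {
    "left voluntarily": 400_000,
    "removed from the interior": 120_000,
    "attained permanent legal status": 225_000,
}

total_in = sum(inflows.values())
total_out = sum(outflows.values())
net_change = total_in - total_out

print(f"Inflows:    {total_in:,}")
print(f"Outflows:   {total_out:,}")
print(f"Net change: {net_change:,}")
```

The rounded components sum to a net decline of 355,000, consistent with the "about 360,000" reduction cited in the text.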

The CBO now projects that over the first three years of the Trump administration, the number of illegal aliens living in the U.S. will decrease by 1 million (with reductions of 360,000 in 2025, 330,000 in 2026, and 330,000 in 2027). Over the last three years of the Biden administration, the CBO estimates a 5.7 million increase (2 million in 2022, 2.4 million in 2023, and 1.3 million in 2024) in the number of illegal aliens living in the U.S. That 6.7 million swing—from 5.7 million to negative 1 million—exceeds the combined populations of Los Angeles and Phoenix. That’s the difference between three years of Biden’s policies versus three years of Trump’s, according to the CBO.

The success that the Trump administration has had in reversing the flow of illegal immigration—simply by enforcing existing laws—was broadly thought to be impossible. Less than a year before Trump took office, the Wall Street Journal editorial board wrote that “the President needs Congress to fix the underlying incentives at the border.” A few months later, the Biden White House issued a fact sheet that began, “Since his first day in office, President Biden has called on Congress to secure our border.” When asked about illegal immigration during a 60 Minutes interview in October 2024, Vice President Kamala Harris said that “we need Congress to be able to act to actually fix the problem.”

It turns out that fixing the problem required only executing the laws already on the books—specifically the law requiring that asylum-seekers be detained while their cases are heard, rather than being released into the interior of the country. The Immigration and Nationality Act declares that “if an alien asserts a credible fear of persecution, he or she shall be detained for further consideration of the application for asylum.” Supreme Court Justice Samuel Alito writes that these detention “requirements, as we have held, are mandatory.”

The foreign-born portion of the U.S. population rose from 4.7 percent in 1970 to 16.2 percent in 2023, and probably reached about 16.8 percent in 2024—easily breaking the previous all-time record of 14.8 percent set in 1890, at the height of the great waves of nineteenth-century immigration. It will take many years of enforcing existing immigration laws to see that percentage dip back down to something approaching historical norms. But as the CBO now acknowledges, the Trump administration is making significant headway—even if the congressional scorekeeper didn’t see it coming.

Jeffrey H. Anderson is president of the American Main Street Initiative and served as director of the Bureau of Justice Statistics at the U.S. Department of Justice from 2017 to 2021.


A Weak Justification for Dropping ShotSpotter


LLM (google/gemini-3.1-flash-lite-20260507) summary:

  • Flawed Methodology: The cited analysis from the University of Chicago Justice Project relies on inconsistent timeframes that fail to account for seasonal variations in crime and emergency call volumes.
  • Data Discrepancies: Local reporting highlights that the underlying data used to justify ShotSpotter removal is unreliable and suffers from significant statistical oversights.
  • Subjective Comparisons: The research team cherry-picked a winter period for post-ShotSpotter response times, essentially comparing cold months with statistically lower call volumes to warmer, high-activity months.
  • Hidden Variables: The study conveniently excludes gunshot-related calls from its response time metrics without providing transparent justification for why these were omitted from broader analysis.
  • Contextual Distortion: Chicago's recent fluctuations in crime align with national trends, making it impossible to attribute local improvements specifically to the administrative decision to axe public safety technology.
  • Questionable Impact: Statistical evidence suggests the purported gains in response efficiency are drastically overstated, with the proposed improvements potentially losing half their significance if a proper control group were applied.
  • Implementation Challenges: While proponents frame the removal as a success, actual operational outcomes are obscured by the city's broader staffing deficits and inability to effectively leverage existing policing tools.
  • Ideological Bias: The current municipal rhetoric prioritizes political performance over rigorous evidence, settling for superficial metrics rather than addressing the core complexities of urban public safety.

Courtesy Scott Olson/Getty Images

“The data proves we’re doing what works: investing in communities, improving response times, and giving police officers tools that are effective,” Chicago Mayor Brandon Johnson announced on X last week. “Not wasting taxpayer dollars for walkie talkies on poles that overpolice communities without improving safety.”

He was referring to a brief analysis from the University of Chicago Justice Project, which purported to show a 4.2-minute improvement in response time to “Priority 1” 911 calls—calls that require an immediate response—as well as a decline in violent crime after the city ended its use of ShotSpotter.

ShotSpotter is, basically, a system of microphones that monitor for the sound of gunshots in designated areas and, after human review of any gunfire-like bangs, turns the audio and estimated coordinates over to the cops.

The Justice Project’s analysis is weak sauce, despite the intense coverage it’s received in the Chicago media. The results have little bearing on the question of whether Chicago did the right thing by ending ShotSpotter in late 2024, or whether Johnson’s administration is doing the wrong thing by dragging its feet as it searches for a replacement.

To begin with, as the crime-watchers at CWBChicago promptly reported, the underlying data are less than perfect—the city doesn’t record response times as reliably as it should. Further, there are two significant sleights of hand in the analysis itself, key details the authors buried in fine print at the bottom of the web page where they presented their results.

First, the analysis uses two different time periods. When analyzing crime rates, the authors contrast the first nine months of 2024 and the first nine months of 2025—a sensible way of addressing the fact that crime is seasonal, rising in the summer and plummeting in the (frigid, if you’re in Chicago) winter. They explicitly note this time period not just in the fine print but also in the main text of the analysis.

By contrast, when they look at response times, the main text merely refers to an unspecified period “after ShotSpotter was removed.” Only by reading the fine print do we learn they are comparing “the six months right before the shutdown of ShotSpotter in 2024 and the first six months right after the shutdown.”

Which is to say that they are comparing, basically, a spring and summer with ShotSpotter to a fall and winter without it. This is a problem because, much like crime in general, Chicago’s Priority 1 calls have a strong seasonal pattern:

[Chart: Chicago Priority 1 911 Calls. Source: Chicago Inspector General 911 call data, subset to Priority 1 calls.]

There are simply fewer calls to respond to at colder times of the year. Further, this will have a bigger impact on the higher-crime areas where ShotSpotter was installed—which helps explain why, according to the analysis, ShotSpotter areas had twice the response-time improvement that non-covered areas did.

Incidentally, that same number implies that if the authors had used non-covered beats as a control group, rather than just comparing response times in ShotSpotter areas at two different parts of the year, their headline 4.2-minute result would have been cut in half.
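The "cut in half" claim is a simple difference-in-differences calculation. The Python sketch below assumes that the headline 4.2-minute improvement applies to ShotSpotter areas and that non-covered areas improved by half as much, as the "twice the improvement" ratio implies. The 2.1-minute control figure is derived from that ratio rather than reported directly, and the variable names are illustrative.

```python
# Improvement in Priority 1 response time, in minutes (before minus after)
treated_improvement = 4.2                      # ShotSpotter areas (headline figure)
control_improvement = treated_improvement / 2  # non-covered areas, implied by the 2:1 ratio

# Difference-in-differences: subtract the change seen in the control group,
# which absorbs shared factors such as the seasonal drop in call volume.
did_estimate = treated_improvement - control_improvement

print(f"Naive before/after estimate: {treated_improvement:.1f} minutes")
print(f"Difference-in-differences:   {did_estimate:.1f} minutes")
```

Under these assumptions the estimate attributable to ShotSpotter areas specifically falls from 4.2 minutes to 2.1, and that remainder could still reflect the stronger seasonal swing in higher-crime beats.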

The other issue CWBChicago flagged is that the researchers removed gunshot calls from the data. To be clear, there’s nothing inherently wrong with analyzing high-priority non-gunshot calls specifically. If, for example, ShotSpotter improved responses to gunshot calls but undermined responses to other calls, that would be worth knowing. But this is something readers should have been told up front, rather than finding it buried in fine print.

I’d like to add a few points to those raised by CWBChicago as well.

One is that Chicago’s recent crime decline has mirrored an improvement for the overall U.S.; over the past four years, murders have fallen by nearly half in both. This makes it difficult to tell what’s driving trends in the Windy City specifically, either citywide or in the higher-crime neighborhoods that had ShotSpotter—and it shows why accounting for the annual seasonality of crime trends, while necessary, does not isolate the impact of ShotSpotter itself.

If you thought ShotSpotter magically cut crime by 75 percent everywhere it’s installed, I suppose the similar crime declines in Chicago and the rest of the country should disabuse you of that notion. More realistically, though, the system might modestly reduce crime by getting cops to scenes a bit faster and producing extra evidence, and this analysis can’t distinguish such nuances from broader trends and noise in the data.

The balance of work between Priority 1 911 calls and ShotSpotter alerts is also worth pondering. As depicted in the chart above, Chicago police receive around 30,000 to 50,000 Priority 1 calls per month. Under ShotSpotter in 2024, the department was also getting about 2,000 to 3,000 monthly ShotSpotter alerts that weren’t accompanied by a 911 call about the same noises. The department’s work is divided amongst 11,500 officers.

What to make of such figures? The department has a lot on its plate, its current staffing is not sufficient for the challenges it faces, and responding to ShotSpotter calls certainly can have some impact on response times for other matters. Given that Priority 1 response times exceed ten or even 15 minutes on many beats even in the authors’ second period, these are clearly serious concerns. Nonetheless, the high ratio of top-priority 911 calls to ShotSpotter alerts should make us skeptical of a big role for the latter in overall response times.
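To put that ratio in rough numbers, the sketch below computes the share of the combined monthly workload made up of ShotSpotter-only alerts, using the ranges quoted above. Pairing the low end of one range with the high end of the other is an assumption made here to bound the share; the actual month-by-month pairing is not given in the text.

```python
# Monthly ranges as quoted in the text
priority1_calls = (30_000, 50_000)     # Priority 1 911 calls per month
shotspotter_alerts = (2_000, 3_000)    # alerts with no matching 911 call, per month

def alert_share(alerts: int, calls: int) -> float:
    """Fraction of (alerts + calls) that comes from ShotSpotter-only alerts."""
    return alerts / (alerts + calls)

# Bounds: fewest alerts against most calls, and most alerts against fewest calls
low_share = alert_share(shotspotter_alerts[0], priority1_calls[1])
high_share = alert_share(shotspotter_alerts[1], priority1_calls[0])

print(f"ShotSpotter-only alerts: {low_share:.1%} to {high_share:.1%} of the combined volume")
```

On these figures the alerts make up somewhere between roughly 4 and 9 percent of the combined volume. That share counts incidents only and says nothing about how much officer time each type takes.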

ShotSpotter comes with real tradeoffs, which I laid out in considerable detail in a report last year. In general, it does seem to get cops to the scene of likely gunfire more quickly than 911 calls and boost evidence recovery, and it does sometimes alert cops to incidents they wouldn’t have heard about otherwise.

It also leads cops to scenes where they can’t find physical evidence of gunfire, though, and making the most of it requires manpower that some departments lack, with studies divided at best as to whether it leads to concrete improvements in crime and clearance rates. Eric Piza’s studies of Chicago have shown improved response to gunshots and increased firearm recoveries but no measurable improvement in crime or clearances.

I could buy that Chicago wasn’t making the most of the technology or simply didn’t have the staff to do so. And I’m eager to see more studies on the impact of shutting the system off. But hopefully future studies will feature more serious analysis than what the mayor is pushing.
