Strategic Initiatives

Gemini 3 Pro: the frontier of vision AI

  • Gemini 3 Pro introduction: Multimodal model with advances in visual and spatial reasoning, leading benchmarks in document, spatial, screen, and video understanding.
  • Document perception: Handles messy documents via OCR, derendering to HTML/LaTeX/Markdown, examples include 18th-century ledger table, equations, Nightingale diagram.
  • Document reasoning: Multi-step analysis on 62-page Census report, compares Gini index changes, identifies policy causes, assesses income shares.
  • Spatial understanding: Outputs pixel-precise coordinates, open vocabulary object identification for robotics and AR tasks like sorting trash or pointing to parts.
  • Screen understanding: Perceives and interacts with desktop/mobile UI for tasks like Pivot Table revenue summaries.
  • Video understanding: Processes high FPS for action details, reasoning on cause-effect, converts long videos to code/apps.
  • Applications: Education for diagram problems and homework correction; medical imaging benchmarks; law/finance document workflows.
  • Media resolution: Preserves aspect ratios, tunable resolution for fidelity vs. cost; available in Google AI Studio and developer docs.

Gemini 3 Pro represents a generational leap from simple recognition to true visual and spatial reasoning. It is our most capable multimodal model ever, delivering state-of-the-art performance across document, spatial, screen and video understanding.

This model sets new highs on vision benchmarks such as MMMU Pro and Video MMMU for complex visual reasoning, as well as use-case-specific benchmarks across document, spatial, screen and long video understanding.

[Vision AI benchmarks table](https://storage.googleapis.com/gweb-uniblog-publish-prod/images/Gemini3Pro-AFVAI_Benchmark_RD5_V02.original.png)

1. Document understanding

Real-world documents are messy, unstructured, and difficult to parse — often filled with interleaved images, illegible handwritten text, nested tables, complex mathematical notation and non-linear layouts. Gemini 3 Pro represents a major leap forward in this domain, excelling across the entire document processing pipeline — from highly accurate Optical Character Recognition (OCR) to complex visual reasoning.

Intelligent perception

To truly understand a document, a model must accurately detect and recognize text, tables, math formulas, figures and charts regardless of noise or format.

A fundamental capability is "derendering" — the ability to reverse-engineer a visual document back into structured code (HTML, LaTeX, Markdown) that would recreate it. As illustrated below, Gemini 3 demonstrates accurate perception across diverse modalities including converting an 18th-century merchant log into a complex table, or transforming a raw image with mathematical annotation into precise LaTeX code.

Input image of an old merchant's handbook ledger alongside an output image showing a clearly reconstructed transcription

Example 1: Handwritten Complex Table from 18th century Albany Merchant’s Handbook

Input image of a scan of an equation alongside an output of the model solving the equation

Example 2: Reconstructing equations from an image

Image showing a scanned diagram converted into an interactive chart

Example 3: Reconstructing Florence Nightingale's original Polar Area Diagram into an interactive chart (with a toggle!)

Sophisticated reasoning

Users can rely on Gemini 3 to perform complex, multi-step reasoning across tables and charts, even in long reports. In fact, the model outperforms the human baseline (80.5%) on the CharXiv Reasoning benchmark.

To illustrate this, imagine a user analyzing the 62-page U.S. Census Bureau "Income in the United States: 2022" report with the following prompt: “Compare the 2021–2022 percent change in the Gini index for "Money Income" versus "Post-Tax Income", and what caused the divergence in the post-tax measure, and in terms of "Money Income", does it show the lowest quintile's share rising or falling?”

Swipe through the images below to see the model's step-by-step reasoning.

Pdf image highlighting the numbers -1.2 and 3.2

Visual Extraction: To answer the Gini index comparison, Gemini located and cross-referenced two sources in the report: Figure 3, which states “Money Income decreased by 1.2 percent”, and Table B-3, which states “Post-Tax Income increased by 3.2 percent”.

Pdf image highlighting the ARPA policies lapsing in 2021 and the stimulus payments ending

Causal Logic: Crucially, Gemini 3 does not stop at the numbers; it correlates this gap with the text’s policy analysis, correctly identifying the lapse of ARPA policies and the end of stimulus payments as the main causes.

Pdf highlighting the numbers 2.9 and 3.0 for 2021 and 2022 respectively

Numerical Comparison: To determine whether the lowest quintile’s share was rising or falling, Gemini 3 looked at Table A-3, compared the 2021 and 2022 values of 2.9 and 3.0, and concluded that “the share of aggregate household income held by the lowest quintile was rising.”

Final model response text

Final Model Answer

2. Spatial understanding

Gemini 3 Pro is our strongest spatial understanding model so far. Combined with its strong reasoning, this enables the model to make sense of the physical world.

  • Pointing capability: Gemini 3 has the ability to point at specific locations in images by outputting pixel-precise coordinates. Sequences of 2D points can be strung together to perform complex tasks, such as estimating human poses or reflecting trajectories over time.
  • Open vocabulary references: Gemini 3 identifies objects and their intent using an open vocabulary. The most direct application is robotics: the user can ask a robot to generate spatially grounded plans like, “Given this messy table, come up with a plan on how to sort the trash.” This also extends to AR/XR devices, where the user can request an AI assistant to “Point to the screw according to the user manual.”
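The pointing output described above can be turned into drawable pixel positions with a few lines of code. The sketch below is illustrative only: it assumes the model replies with a JSON list of points in `[y, x]` order, normalized to a 0-1000 range, which this post does not specify. The field names `point` and `label`, the sample reply, and the helper `to_pixels` are all hypothetical; check the developer documentation for the actual output format.

```python
import json


def to_pixels(points_json: str, width: int, height: int) -> list[dict]:
    """Convert normalized [y, x] points to pixel (x, y) positions.

    Assumption (not stated in the post): coordinates are normalized
    to a 0-1000 range and listed as [y, x].
    """
    pixels = []
    for item in json.loads(points_json):
        y_norm, x_norm = item["point"]
        pixels.append({
            "label": item.get("label", ""),
            "x": round(x_norm / 1000 * width),
            "y": round(y_norm / 1000 * height),
        })
    return pixels


# Hypothetical model reply for a 1920x1080 image.
reply = '[{"point": [500, 250], "label": "screw"}]'
print(to_pixels(reply, width=1920, height=1080))
# -> [{'label': 'screw', 'x': 480, 'y': 540}]
```

Keeping the conversion in one helper means a change in the assumed coordinate convention only has to be corrected in a single place.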

Image showing a cluttered box, a bottle, a screwdriver, a pouch and a measuring tape on a table. A line connects a clear path between the measuring tape and the box created by Gemini 3 Pro

A picture of a cluttered kitchen counter with open cabinets. Three lines show the trajectory between the mug, the glass and the bowl and specific spots in the cabinet where they should go, created by Gemini 3 Pro

Picture of a circuit board with each distinct item labeled by Gemini 3 Pro

3. Screen understanding

Gemini 3 Pro’s spatial understanding really shines through its screen understanding of desktop and mobile OS screens. This reliability helps make computer use agents robust enough to automate repetitive tasks. UI understanding capabilities can also enable tasks like QA testing, user onboarding and UX analytics. The following computer use demo shows the model perceiving and clicking with high precision.

Task: Summarize the total revenue for each promotion type in a new sheet (Sheet2) with the promotion names as the column headers using the Pivot Table feature.

4. Video understanding

Gemini 3 Pro takes a massive leap forward in how AI understands video, the most complex data format we interact with. Video is dense, dynamic, multimodal and rich with context.

  1. High frame rate understanding: We have optimized the model to be much stronger at understanding fast-paced actions when sampling at more than 1 frame per second. Gemini 3 Pro can capture rapid details, which is vital for tasks like analyzing golf swing mechanics.

By processing video at 10 FPS, 10x the default sampling rate, Gemini 3 Pro catches every swing and shift in weight, unlocking deep insights into player mechanics.

  2. Video reasoning with “thinking” mode: We upgraded “thinking” mode to go beyond object recognition toward true video reasoning. The model can now better trace complex cause-and-effect relationships over time. Instead of just identifying what is happening, it understands why it is happening.

  3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code.
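To make the high-frame-rate point concrete, the arithmetic below compares how many frames are sampled from a clip at the default rate and at 10 FPS. It is a rough sketch: the 1 FPS default is inferred from the "10x" comparison above, and the 30-second clip length and the helper `frames_sampled` are made up for illustration.

```python
def frames_sampled(duration_seconds: float, fps: float) -> int:
    """Number of frames sampled from a clip at a fixed sampling rate."""
    return int(duration_seconds * fps)


clip_seconds = 30  # hypothetical golf-swing clip
default_frames = frames_sampled(clip_seconds, fps=1)   # assumed default rate
dense_frames = frames_sampled(clip_seconds, fps=10)    # rate used in the demo

print(default_frames, dense_frames, dense_frames // default_frames)
# -> 30 300 10
```

A tenfold increase in sampled frames gives the model far more visual input to process, so higher sampling rates probably make the most sense for short clips with fast motion.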

5. Real-world applications

Here are a few ways we think various fields will benefit from Gemini 3’s capabilities.

Education

Gemini 3 Pro’s enhanced vision capabilities drive significant gains in the education field, particularly for diagram-heavy questions central to math and science. It successfully tackles the full spectrum of multimodal reasoning problems found from middle school through post-secondary curriculums. This includes visual reasoning puzzles (like Math Kangaroo) and complex chemistry and physics diagrams.

Gemini 3’s visual intelligence also powers the generative capabilities of Nano Banana Pro. By combining advanced reasoning with precise generation, the model, for example, can help users identify exactly where they went wrong in a homework problem.

[Image showing input of a handwritten equation on the left and the model's correction annotated on top of the handwritten equation](https://storage.googleapis.com/gweb-uniblog-publish-prod/documents/Gemini3Pro-AFVAI_NB_RD1-V01.png)

Prompt: “Here is a photo of my homework attempt. Please check my steps and tell me where I went wrong. Instead of explaining in text, show me visually on my image.” (Note: Student work is shown in blue; model corrections are shown in red). [See prompt in Google AI Studio]

Medical and biomedical imaging

Gemini 3 Pro stands as our most capable general model for medical and biomedical imagery understanding, achieving state-of-the-art performance across major public benchmarks, including MedXpertQA-MM (a difficult expert-level medical reasoning exam), VQA-RAD (radiology imagery Q&A) and MicroVQA (a multimodal reasoning benchmark for microscopy-based biological research).

[Image showing a stained kidney cortex image on the left and the model prompt and response on the right](https://storage.googleapis.com/gweb-uniblog-publish-prod/documents/Gemini3Pro-AFVAI_Medical2_RD1-V01.png)

Input image from MicroVQA - a benchmark for microscopy-based biological research

Law and finance

Gemini 3 Pro’s enhanced document understanding helps professionals in finance and law tackle highly complex workflows. Finance platforms can seamlessly analyze dense reports filled with charts and tables, while legal platforms benefit from the model's sophisticated document reasoning.

“We’re impressed by Gemini 3's improvements in advanced legal reasoning, especially its ability to understand and edit contracts with complex redlines. This has been particularly valuable for our in-house customers due to the high volume and variability of the legal contracts they handle.”

Harvey.ai

6. Media resolution control

Gemini 3 Pro improves the way it processes visual inputs by preserving the native aspect ratio of images. This drives significant quality improvements across the board.

Additionally, developers gain granular control over performance and cost via the new media_resolution parameter. This allows you to tune visual token usage to balance fidelity against consumption:

  • High resolution: Maximizes fidelity for tasks requiring fine detail, such as dense OCR or complex document understanding.
  • Low resolution: Optimizes for cost and latency on simpler tasks, such as general scene recognition or long-context tasks.

For specific recommendations, refer to our Gemini 3.0 Documentation Guide.
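The fidelity-versus-cost trade-off above can be captured as a small lookup that picks a resolution level per task. This is only a sketch of the decision logic, not a real SDK call: the task labels, the `"high"`/`"low"` strings, and both helper functions are hypothetical placeholders, and the values actually accepted by `media_resolution` should be taken from the documentation guide.

```python
# Hypothetical task labels; the post names dense OCR and complex document
# understanding as examples that benefit from high resolution.
DETAIL_HEAVY_TASKS = {"dense_ocr", "document_understanding"}


def pick_media_resolution(task: str) -> str:
    """Return a placeholder resolution level for a given task label."""
    return "high" if task in DETAIL_HEAVY_TASKS else "low"


def build_request_options(task: str) -> dict:
    """Assemble illustrative request options (not a real API payload)."""
    return {"media_resolution": pick_media_resolution(task)}


print(build_request_options("dense_ocr"))          # -> {'media_resolution': 'high'}
print(build_request_options("scene_recognition"))  # -> {'media_resolution': 'low'}
```

Defaulting to the cheaper level and opting in to high resolution only for detail-heavy tasks keeps token usage predictable.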

Build with Gemini 3 Pro

We are excited to see what you build with these new capabilities. To get started, check out our developer documentation or play with the model in Google AI Studio today.

Read the whole story
bogorad
7 hours ago
reply
Barcelona, Catalonia, Spain
Share this story
Delete

AI-equipped police body cameras in Canada's Edmonton spark ethical concerns | AP News

  • Edmonton pilot project: Police body cameras with AI detect faces of 7,000 high-risk individuals on watchlist.
  • Axon Enterprise involvement: Company conducts early-stage field research outside U.S. to assess performance and safeguards.
  • Watchlist details: 6,341 people flagged for violent, armed, escape risk; 724 with serious warrants.
  • Safety objective: Alerts officers to dangerous persons during investigations or calls for assistance.
  • Axon CEO statement: Testing gathers insights for responsible use, technology now more accurate.
  • Former ethics board chair view: Calls for public debate, testing, evidence of benefits before deployment.
  • Pilot parameters: Limited to 50 officers, daylight hours through December, matches reviewed post-incident by humans.
  • Broader context: Alberta mandates body cameras; Axon leads market, faces accuracy and bias mitigation efforts.

Police body cameras equipped with artificial intelligence have been trained to detect the faces of about 7,000 people on a “high risk” watch list in the Canadian city of Edmonton, a live test of whether facial recognition technology shunned as too intrusive could have a place in policing throughout North America.

But six years after leading body camera maker Axon Enterprise, Inc. said police use of facial recognition technology posed serious ethical concerns, the pilot project, switched on last week, is raising alarms far beyond Edmonton, the continent’s northernmost city of more than 1 million people.

A former chair of Axon’s AI ethics board, which led the company to temporarily abandon facial recognition in 2019, told The Associated Press he’s concerned that the Arizona-based company is moving forward without enough public debate, testing and expert vetting about the societal risks and privacy implications.

“It’s essential not to use these technologies, which have very real costs and risks, unless there’s some clear indication of the benefits,” said the former board chair, Barry Friedman, now a law professor at New York University.

Axon founder and CEO Rick Smith contends that the Edmonton pilot is not a product launch but “early-stage field research” that will assess how the technology performs and reveal the safeguards needed to use it responsibly.

“By testing in real-world conditions outside the U.S., we can gather independent insights, strengthen oversight frameworks, and apply those learnings to future evaluations, including within the United States,” Smith wrote in a blog post.

The pilot is meant to help make Edmonton patrol officers safer by enabling their body-worn cameras to detect anyone who authorities classified as having a “flag or caution” for categories such as “violent or assaultive; armed and dangerous; weapons; escape risk; and high-risk offender,” said Kurt Martin, acting superintendent of the Edmonton Police Service. So far, that watch list has 6,341 people on it, Martin said at a Dec. 2 press conference. A separate watch list adds 724 people who have at least one serious criminal warrant, he said.

“We really want to make sure that it’s targeted so that these are folks with serious offenses,” said Ann-Li Cooke, Axon’s director of responsible AI.

If the pilot expands, it could have a major effect on policing around the world. Axon, a publicly traded firm best known for developing the Taser, is the dominant U.S. supplier of body cameras and has increasingly pitched them to police agencies in Canada and elsewhere. Axon last year beat its closest competitor, Chicago-based Motorola Solutions, in a bid to sell body cameras to the Royal Canadian Mounted Police.

Motorola said in a statement that it also has the ability to integrate facial recognition technology into police body cameras but, based on its ethical principles, has “intentionally abstained from deploying this feature for proactive identification.” It didn’t rule out using it in the future.

The government of Alberta in 2023 mandated body cameras for all police agencies in the province, including its capital city Edmonton, describing it as a transparency measure to document police interactions, collect better evidence and reduce timelines for resolving investigations and complaints.

While many communities in the U.S. have also welcomed body cameras as an accountability tool, the prospect of real-time facial recognition identifying people in public places has been unpopular across the political spectrum. Backlash from civil liberties advocates and a broader conversation about racial injustice helped push Axon and Big Tech companies to pause facial recognition software sales to police.

Among the biggest concerns were studies showing that the technology was flawed, demonstrating biased results by race, gender and age. It also didn’t match faces as accurately on real-time video feeds as it did on faces posing for identification cards or police mug shots.

Several U.S. states and dozens of cities have sought to curtail police use of facial recognition, though President Donald Trump’s administration is now trying to block or discourage states from regulating AI.

The European Union banned real-time public face-scanning police technology across the 27-nation bloc, except when used for serious crimes like kidnapping or terrorism.

But in the United Kingdom, no longer part of the EU, authorities started testing the technology on London streets a decade ago and have used it to make 1,300 arrests in the past two years. The government is considering expanding its use across the country.

Many details about Edmonton’s pilot haven’t been publicly disclosed. Axon doesn’t make its own AI model for recognizing faces but declined to say which third-party vendor it uses.

Edmonton police say the pilot will continue through the end of December and only during daylight hours.

“Obviously it gets dark pretty early here,” Martin said. “Lighting conditions, our cold temperatures during the wintertime, all those things will factor into what we’re looking at in terms of a successful proof of concept.”

Martin said about 50 officers piloting the technology won’t know if the facial recognition software made a match. The outputs will be analyzed later at the station. In the future, however, it could help police detect if there’s a potentially dangerous person nearby so they can call in for assistance, Martin said.

That’s only supposed to happen if officers have started an investigation or are responding to a call, not simply while strolling through a crowd. Martin said officers responding to a call can switch their cameras from a passive to an active recording mode with higher-resolution imaging.

“We really want to respect individuals’ rights and their privacy interests,” Martin said.

The office of Alberta’s information and privacy commissioner Diane McLeod said she received a privacy impact assessment from Edmonton police on Dec. 2, the same day Axon and police officials announced the program. The office said Friday it’s now working to review the assessment, a requirement for projects that collect “high sensitivity” personal data.

University of Alberta criminology professor Temitope Oriola said he’s not surprised that the city is experimenting with live facial recognition, given that the technology is already ubiquitous in airport security and other environments.

“Edmonton is a laboratory for this tool,” Oriola said. “It may well turn out to be an improvement, but we do not know that for sure.”

Oriola said the police service has had a sometimes “frosty” relationship with its Indigenous and Black residents, particularly after the fatal police shooting of a member of the South Sudanese community last year, and it remains to be seen whether facial recognition technology makes policing safer or improves interactions with the public.

Axon has faced blowback for its technology deployments in the past, as in 2022, when Friedman and seven other members of Axon’s AI ethics board resigned in protest over concerns about a Taser-equipped drone.

In the years since Axon opted against facial recognition, Smith, the CEO, says the company has “continued controlled, lab-based research” of a technology that has “become significantly more accurate” and is now ready for trial in the real world.

But Axon acknowledged in a statement to the AP that all facial recognition systems are affected by “factors like distance, lighting and angle, which can disproportionately impact accuracy for darker-skinned individuals.”

Every match requires human review, Axon said, and part of its testing is also “learning what training and oversight human reviewers must have to mitigate known risks.”

Friedman said Axon should disclose those evaluations. He’d want to see more evidence that facial recognition has improved since his board concluded that it wasn’t reliable enough to ethically justify its use in police cameras.

Friedman said he’s also concerned about police agencies greenlighting the technology’s use without deliberation by local legislators and rigorous scientific testing.

“It’s not a decision to be made simply by police agencies and certainly not by vendors,” he said. “A pilot is a great idea. But there’s supposed to be transparency, accountability. ... None of that’s here. They’re just going ahead. They found an agency willing to go ahead and they’re just going ahead.”

—-

AP writer Kelvin Chan in London contributed to this report.

How Stablecoins Can Help Criminals Launder Money and Evade Sanctions - The New York Times

  • Traditional illicit storage: Smugglers, money launderers, and sanctioned individuals used diamonds, gold, and artwork.
  • Drawbacks of luxury goods: Cumbersome to move and difficult to spend.
  • New option for criminals: Stablecoins, cryptocurrencies pegged to the U.S. dollar.
  • Lack of oversight: Stablecoins operate largely outside traditional financial systems.
  • Easy acquisition: Bought with local currency and transferred across borders instantly.
  • Re-entry to banking: Convertible to traditional systems like debit cards, often undetected.
  • Evidence sources: New York Times review of corporate filings, forum messages, and blockchain data.
  • Illicit volume estimate: Chainalysis report shows up to $25 billion in stablecoin transactions last year, used by Russian oligarchs and others.

Smugglers, money launderers and people facing sanctions once relied on diamonds, gold and artwork to store illicit fortunes. The luxury goods could help hide wealth but were cumbersome to move and hard to spend.

Now, criminals have a far more practical alternative: stablecoins, a cryptocurrency tied to the U.S. dollar that exists largely beyond traditional financial oversight.

These digital tokens can be bought with a local currency and moved across borders almost instantly. Or they can be returned to the traditional banking system — including by converting funds into debit cards — often without detection, a New York Times review of corporate filings, online forum messages and blockchain data shows.

A report released in February from Chainalysis, a blockchain analysis firm, estimated that up to $25 billion in illicit transactions involved stablecoins last year. And as more Russian oligarchs, Islamic State leaders and others have begun using the cryptocurrency, the rise of these dollar-linked tokens threatens to undermine one of America’s most potent foreign policy tools: cutting adversaries off from the dollar and the global banking system.

Exclusive | An AI Startup Looks Toward the Post-Transformer Era - WSJ

  • Pathway announcement: Palo Alto startup to reveal Dragon Hatchling architecture running on Nvidia AI infrastructure and AWS cloud.
  • Commercial rollout: Architecture shipped, trained models planned for next year with immediate production compatibility.
  • Memory advantage: Provides AI memory surpassing LLMs, enabling continuous learning and adaptive systems.
  • AGI potential: Positioned as faster route to artificial general intelligence akin to human cognition.
  • Team expertise: Founded 2020 by PhD holders including CEO Zuzanna Stamirowska and Google Brain veteran Jan Chorowski.
  • Funding secured: Over $20 million raised, backed by transformer paper co-author Lukasz Kaiser.
  • Architecture innovation: Uses brain-like synapse updates and unified memory for lifelong learning.
  • Applications: Targets complex tasks in finance, supply chains, fusion research, and innovation acceleration.

One early-stage startup developing a transformer alternative, Palo Alto, Calif.-based Pathway, plans to announce Monday that its “Dragon Hatchling” architecture now runs on Nvidia AI infrastructure and Amazon Web Services’ cloud and AI tech stack.

The company has shipped the Dragon Hatchling architecture, but doesn’t plan to release the commercial models trained on it until next year. Once that happens, its Nvidia and AWS compatibility means companies would be able to put it into production “the next day,” Pathway said.

Dragon Hatchling imbues AI with memory that large language models can’t match, according to Pathway, theoretically enabling a new class of continuously learning, adaptive AI systems. The company also casts its approach as a potentially faster way to get to artificial general intelligence, which some people describe as similar to human-level cognitive ability.

The company isn’t alone in this quest. It regards large and well-established Anthropic as its biggest obstacle. It faces other challenges, too, such as convincing potential users who have just learned one set of AI vocabulary and skills that they should adopt something new.

Regardless of whether Pathway fulfills its ambitions, it will at least get a chance to make its case to the market. Its arrival also reinforces the intense scientific effort driving AI forward, even as big deals, big valuations and big personalities command the attention.

‘Equations of reasoning’

“This is just fun, right?” said Zuzanna Stamirowska, co-founder and chief executive officer at Pathway, when I met with her and another member of the team at Wall Street Journal headquarters in November. She was enthusing about Pathway’s approach, likening it to scientists’ discovery of thermodynamics, which accelerated the Industrial Revolution by shifting society from simply building engines to understanding the laws of heat and energy that govern them.

Pathway has identified what Stamirowska calls equations of reasoning, fundamental mathematical axioms that explain how intelligence emerges from smaller, local interactions in the brain, she said. That means it can explain how and why intelligence works, rather than just observing that it does, which has been a struggle with transformer-based models.

That also helps Pathway address large language models’ typical limits when it comes to building on previous interactions, by strengthening or weakening synapses over time according to their use, said Stamirowska, who holds a Ph.D. in complex systems and has published research on emergent behavior in dynamic networks. She has also received France’s i-Lab innovation prize and been called one of “100 geniuses whose innovation will change the world” by the magazine Le Point.

“Memory is key to intelligence and efficient reasoning,” Stamirowska said. 

Zuzanna Stamirowska, co-founder and CEO of AI startup Pathway and co-author of its Baby Dragon Hatchling architecture.

Pathway co-founder and CEO Zuzanna Stamirowska. Ashley Kaplan

In the transformer, short-term memory and long-term memory are organized in an incompatible manner, with no clear way to transfer from short-term memory to long-term memory, according to Stamirowska. “It is not just a technicality. It is a foundational obstacle,” she said.

Pathway’s architecture organizes short-term memory very differently than the transformer, with an update mechanism that resembles what is found in the brain, and, crucially, has the same storage pattern as long-term memory, according to Stamirowska. “This opens the door to lifelong learning with transfer from short- to long-term memory, and moving smoothly to longer reasoning,” she said.

The company was founded in 2020 by Stamirowska, Chief Operating Officer Claire Nouet, Chief Scientific Officer Adrian Kosowski and Chief Technology Officer and Google Brain veteran Jan Chorowski. The 26-member team includes eight Ph.D.s, including Kosowski, a theoretical computer scientist, mathematician and quantum physicist who received his doctorate at age 20.

The company said it has raised more than $20 million, including more than $16.2 million in venture funding and about $3.8 million in non-dilutive research and development grants. Backers include Lukasz Kaiser, one of eight Google researchers who kicked off the transformer era in 2017 with the paper Attention Is All You Need, and early-stage investor TQ Ventures. The company declined to disclose its valuation.

Stamirowska said the name of Pathway’s architecture is inspired by the dragons in Terry Pratchett’s novel “The Color of Magic,” which appear more frequently as the characters think about them. “For now, we have presented to the world an architecture, hence it’s a hatchling,” she said.

Accelerating innovation

The company expects the architecture to have broad applications in solving problems in business, finance and beyond.

Pathway Chief Commercial Officer Victor Szczerba distinguishes between “commodity” AI tasks such as approving a customer discount and more demanding projects such as end-of-quarter financial planning. “This process spans eight weeks, involves coordination across 10 departments, and requires maintaining context over a long period,” Szczerba said. “Pathway’s architecture is designed to handle this complexity by remembering sequences and consequences over time, rather than resetting with every interaction.”

The technology could be useful in solving complex supply chain variability. For example, a steel manufacturer faced with a sudden shortage of tungsten could apply Pathway’s framework that can learn from limited amounts of private data, without exposing that data to the world. Other potential applications exist in areas such as fusion research, space exploration and optimization of global trading networks, according to Stamirowska.


In all of these examples, the key is a need for real innovation. To come up with a new spaceship design, an AI model can’t just access lots of data on other spaceships and learn. It requires a model capable of generalizing or learning to reason, rather than pattern matching.

“We will speed up innovation cycles dramatically,” Stamirowska said. “The problem with current transformer-based models is that they need a lot of data, and they don’t generalize outside ... of what they have seen.”

While a bubble may have formed around the concept of the large language model, that need not be true of AI itself, according to Martín Farach-Colton, chair of the Computer Science and Engineering Department at NYU Tandon School of Engineering. LLMs face limits when it comes to three areas: determining how models arrive at their answers; their ability to generalize plans beyond the criteria of data they are trained on; and multimodality, which is the ability to process text, images, video and spatial reasoning simultaneously. However, considerable effort is being made in addressing these shortcomings, especially the third.

Pathway’s architecture could help address the first two problems, according to Farach-Colton, who said he has known some Pathway team members professionally but has no financial or commercial ties to the company.

“The market may be overvaluing the current iteration of the technology [LLMs] while potentially underestimating or misunderstanding the necessity of the next architectural leap,” he said.

Write to Steven Rosenbush at steven.rosenbush@wsj.com


Why More Hispanics Are Identifying As White // Latin American immigrants and their descendants are assimilating to the majority culture.

  • 2024 Election Results: Donald Trump secured 48% of the Hispanic vote, three points behind Kamala Harris's share, and won 51% among foreign-born naturalized Hispanic immigrants.
  • Sociological Assimilation: Hispanics are integrating into America's mainstream culture and increasingly identifying as white over generations through intermarriage and cultural blending.
  • Decline of Ethnic Identity: By the third generation, one in four Hispanic descendants no longer identify as Hispanic, rising to half by the fourth, due to high intermarriage rates exceeding 40% with non-Hispanic whites.
  • Rejection of Leftist Policies: Hispanic immigrants, following border issues closely, shifted toward Trump amid concerns over illegal immigration, crime, and distinguishing themselves from border crossers.
  • Education and Intermarriage: College-educated Hispanics are eight to ten times more likely to marry non-Hispanic whites, accelerating assimilation and reducing reliance on ethnic grievance politics.
  • Historical Precedents: Similar to early 20th-century Italians who overcame racial stereotypes to blend into the white majority within three generations through intermarriage and suburbanization.
  • European Ancestry and Language: Many Hispanics report European roots, with over 25% of recent Latin American immigrants proficient in English, up from 15% in the 1970s, fostering mainstream identification.
  • Political Realignment: Assimilating Hispanics embrace family, faith, work, and patriotism, diminishing identity politics and bolstering the American melting pot for greater national unity.

The 2024 presidential election confirmed a political phenomenon that has been building for years: Hispanics are no longer a reliable Democratic constituency. Donald Trump won 48 percent of the Hispanic vote, just three points shy of Kamala Harris’s share. Perhaps even more striking, he carried 51 percent of foreign-born, naturalized Hispanic immigrants—a direct rebuke to the assumption that immigrants instinctively side with the Left.

This shift is not simply about politics. It reflects a deeper sociological reality: Hispanics are assimilating into America’s mainstream and, over generations, becoming white, in both culture and identification. That is not something to fear. It is the story of American integration, repeated again in our time.


Progressives worked for decades to establish “Hispanic” as a distinct ethnic category, hoping Latin American immigrants and their descendants would embrace it as a shared political identity. They assumed that these voters would sympathize with radical pro–illegal immigration policies, such as sanctuary cities that protect criminals, out of a sense of solidarity.

But the opposite is increasingly true. Hispanic immigrants follow news about the border and illegal immigration more closely than native-born Hispanics and shifted more toward Trump probably because of Biden’s open-border disaster—not only because they are more likely to be the victims of illegal-immigrant crime but also because they see themselves as distinct from illegal border crossers, even those from the same home country.

Further, when analysts talk about the “Hispanic vote,” they overlook a key fact: not all people of Latin American ancestry identify as Hispanic. One analysis showed that by the third generation, about one in four descendants of Hispanic immigrants no longer identify as Hispanic. By the fourth generation, half of all descendants no longer do.

What explains this pattern? Hispanics intermarry at high rates with non-Hispanic whites. As of 2016, over 40 percent of all interracial marriages in the U.S. involved one Hispanic and one non-Hispanic white spouse. The number of these couples climbed from 1.4 million in 2000 to 2.4 million by the 2010s and is almost certainly higher today. Children of these marriages often stop identifying as Hispanic, especially if Spanish is not spoken at home, which is often the case by the second and later generations. Moreover, nearly one in five Latin American immigrant married women are married to a native-born American, suggesting quick assimilation.

The ethnic attrition of Hispanics has real political consequences. Hispanics who intermarry and assimilate most quickly are often those who speak English fluently and are better educated. Their children are more inclined to identify as white and to join the cultural mainstream. Hispanics who graduate from college are between eight and ten times more likely to marry a non-Hispanic white person than Hispanics who didn’t complete high school. As Hispanics have become better educated, they have also mixed with whites at a higher rate.

Assimilation reduces the appeal of left-wing, ethnic grievance politics. If you don’t think of yourself as a minority, you’re less likely to vote like one. If you don’t speak Spanish, you care less about calculated political appeals in foreign languages.

As more Hispanics stop thinking of themselves as an identity group, their political preferences will shift toward the national average. That helps explain why Republicans have made inroads with Hispanic voters at every level, from residents in South Texas border towns several generations removed from their Mexican immigrant heritage to the more recent immigrants in Florida’s Cuban and South American communities.

Skeptics may balk at the claim that Hispanics are “becoming white.” But history offers precedents. In the early twentieth century, for example, Italians were considered racially suspect. They were stereotyped as criminals, unassimilable Catholics, and not quite white. Yet within three generations, they had intermarried, moved to the suburbs, and disappeared into the white majority.

Of course, some Italians today think of themselves as Italian-American, just as some Irishmen consider themselves Irish-American. The same will occur with people who today identify with various Latin American nationalities. But these cultural identities, often used to define things like food or sports preferences, do not constitute political loyalties, let alone national ones. They are the kind of diversity that is enriching, not threatening.

More Hispanics are identifying as white in part because a substantial share of them already report European ancestry. This share is larger among more recent Latin American immigrants. Today, third- and fourth-generation Hispanics are following the same path. Many also have substantial European ancestry. In Argentina, for example, a majority of the population descends from Italian and Spanish immigrants, and tens of thousands still hold EU citizenship through family lineage. For their American-born children and grandchildren, blending into the white mainstream is not a major leap.

Many immigrants identify as white on their arrival. One in four immigrants from Latin America identify as white. Among some groups, such as Cubans and Venezuelans, that rate exceeds one in three.

English proficiency is another measure of Hispanic assimilation. New immigrants from Latin America are much more proficient in English than they were decades ago. While barely 15 percent of young Latin American immigrants spoke English well or at a native level in the 1970s, over 25 percent of them do today.

These shifts are not limited to white-identifying descendants of Latin American immigrants. Indeed, the 2024 election reveals that even those who identify as Hispanic are moving beyond identity politics and toward mainstream values: family, faith, work, patriotism. That is good for the United States.

Far from being a permanent minority locked into the Democratic coalition, Hispanics are showing that the American melting pot still works. They are increasingly marrying into the white majority, shedding ethnic labels, and embracing mainstream politics. Indeed, Hispanics’ cultural assimilation means that in the future, ethnic identity will matter less—and America will be more united.

This article is part of a series on political realignment among ethnic groups in the United States.

Daniel Di Martino is a fellow at the Manhattan Institute and a Ph.D. candidate in economics at Columbia University.

Photo by David McNew/Getty Images


The Somali Fraud Story Busts Liberal Myths // Mass immigration, antiracism, and the welfare state lead inexorably to fraud.

  • Somali Fraud Schemes: Decade-long operations in Minnesota stole billions from welfare programs, with funds funneled to Al-Shabaab terrorists in Somalia through sophisticated enterprises exploiting generous state benefits.
  • Media Trajectory: Story gained rapid traction on conservative platforms, escalated to Fox News, prompting calls to revoke Temporary Protected Status for Somali recipients.
  • Trump's Response: President announced revocation of TPS for Somalis, followed by social media attacks on Governor Tim Walz and promises to halt Third World immigration and asylum cases.
  • Liberal Media Reactions: CBS News misrepresented the report and retracted under pressure; New York Times confirmed key facts in a feature but avoided addressing root causes of the fraud.
  • Cultural Incompatibilities: Somali attitudes toward government and civil society clashed with Minnesota's welfare system, using racism accusations to evade scrutiny, as seen post-George Floyd.
  • National Immigration Issues: Biden-era mass migration imported millions, including illegal entrants bringing negative cultural elements, exemplified by Venezuelan gangs in Colorado and Haitian migrants in Pennsylvania.
  • Policy Recommendations: Trump administration should impose bank account proof requirements, heavy remittance taxes on illegals, and accelerate prosecutions of immigrant fraud nationwide.

There is a moment when every news story either achieves lift-off or tumbles back to the earth. Having covered a few that drove national headlines, I’ve discovered there is no universal formula for which ones hit the stratosphere, and which do not.

Our recent story detailing Minnesota’s Somali fraud rings has been one of the lucky ones, achieving liftoff in record time. City Journal reporter Ryan Thorpe and I summarized a decade of Somali fraud schemes that stole billions of taxpayer dollars, some of which ended up with Al-Shabaab terrorists back in Somalia. These were sophisticated criminal enterprises that exploited Minnesota’s generous welfare state, deployed accusations of racism to deter scrutiny, and looted the public treasury until local prosecutors did the hard work to bring them down.


The meta-story—how a news item weaves its way through public discourse—is also worth considering. When we published the story, it quickly dominated the conversation on conservative social media. It filtered upward to primetime Fox News, where, on Laura Ingraham’s program, I summarized the piece and called on President Trump to revoke Temporary Protected Status (TPS) for all Somalis in Minnesota.

Within hours, the president, who had been following the story, announced that he would revoke TPS for all Somali recipients. Then, over the Thanksgiving holiday, Trump raised the stakes with a blistering social media tirade that ripped into Somali fraudsters, accused Minnesota governor Tim Walz of mental deficiencies, and promised to stop all asylum cases and immigration from the Third World. This sequence of events turned the Minnesota fraud into the debate of the moment.

The next step in the process is for the liberal media to respond. Right on cue, CBS News published a story misrepresenting our report and “debunking” that misrepresentation—a claim that it eventually retracted under pressure. The New York Times did somewhat better, publishing a long feature on the Somali fraud, confirming key details, and opening the floodgates for discourse on the center-left. The spotlight thus turned to Governor Walz, who was at the helm when Somali thieves robbed Minnesota of billions.

On the surface, the Times story was an acknowledgment that this was a real scandal that the liberal press had missed. But the paper did not address the underlying narrative about why the fraud happened. Yes, the story is about a criminal enterprise, but it runs deeper than that. The story has touched a nerve because it busts liberal myths about immigration, anti-racism, and the welfare state.

Minnesota has long prided itself on its generous welfare programs and reputation for good governance. But after the mass arrival of the new Somali population—many of whom brought with them different attitudes toward government and civil society—these programs became a weak point. George Floyd’s 2020 death in Minneapolis demonstrated that scrutiny could be deflected by making baseless accusations of “racism” against anyone who raised questions about the missing funds.

The uncomfortable truth for Times readers is that all cultures are not equal. Therefore, not all cultures are compatible with all political systems. In this case, the Somali criminal enterprise is incompatible with a generous welfare state, particularly in the context of a racial politics that intimidates whistleblowers and other honest brokers.

Though this story was particular to Minnesota, disruptive mass immigration is a national phenomenon. During the four years of the Biden administration, America imported millions of foreigners, many illegally. Some of these have brought, or are trying to bring, negative aspects of their home culture to the United States.

Indeed, cultural incompatibility was a campaign theme during the 2024 election. Venezuelan gangs took over apartment buildings in Colorado. Haitian migrants overwhelmed deindustrialized towns in the Rust Belt. The Somali fraud story is another point in this plotline.

The Trump administration claims to be on pace to “shatter” records of forced deportations and so-called self-deportations, but more must be done. The administration should put financial restrictions on illegal immigrants, like requiring proof of legal status for maintaining a bank account; and implement massive remittance taxes to reduce the profitability of illegal immigration and fraud. And it must line up the manpower to turbocharge the prosecution of immigrant fraud, in Minnesota and elsewhere.

The New York Times won’t spell it out in block print, but even devoted liberals are starting to ask questions about the welfare state’s compatibility with mass migration. The shocking scope and scale of the Somali fraud in Minnesota made this a story that could no longer be ignored.

Christopher F. Rufo is a senior fellow at the Manhattan Institute, a contributing editor of City Journal, and the author of America’s Cultural Revolution.

Photo by Jeff Swensen/Getty Images
