Issue #9 - June 2026 | Research Office | West Virginia University

From the AI Frontier
(without the hype)

June 16, 2026

This Month: Follow the Money—and the Guardrails

If June’s first half was about agents, the second half is about capital and control. The frontier is now defined as much by balance sheets and state authority as by benchmarks. Alphabet floated an $80B raise, OpenAI and Anthropic each queued up IPOs expected to raise at least $60B, and Bezos’s Prometheus banked $12B. Open weights multiplied, including NVIDIA’s 550B Nemotron 3 Ultra, Google’s DiffusionGemma, and a $1,500 foundation model from Sapient. And Washington abruptly recalled Anthropic’s flagship models. For faculty, the signal is that access, funding, and governance are now as consequential to research and teaching as raw capability, and that any of them can change overnight.

Global News in the World of AI

NVIDIA Ships the 550B Nemotron 3 Ultra to Claim the U.S. Open-Weights Lead

Explore Artificial Analysis benchmark data for frontier open-weight models

Summary: At Computex 2026, NVIDIA released Nemotron 3 Ultra, a 550B-parameter Mixture-of-Experts model with 55B active parameters (90% sparsity). It now leads U.S. open weights at 48 on the Artificial Analysis Intelligence Index, ahead of Gemma 4 31B at 39 and gpt-oss-120b at 33 but still behind China’s Kimi K2.6 at 54. Its real edge is throughput: more than 300 tokens per second versus 50-100 for comparable Chinese models, with NVFP4 quantization promised next.

Actionable takeaway: CS and ML faculty can benchmark campus clusters on a frontier-scale open model via TensorRT-LLM, and tech-policy instructors gain a clean U.S.-China case where the U.S. leads on inference speed rather than raw intelligence.
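
For readers who want to check the sparsity figure, the arithmetic is short. The sketch below assumes "sparsity" means the share of parameters left inactive for each token, which is a common convention but not one the source spells out; the function name is illustrative only.

```python
def moe_sparsity(total_params_b: float, active_params_b: float) -> float:
    """Fraction of parameters NOT used per token (assumed definition of sparsity)."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected 0 <= active <= total and total > 0")
    return 1.0 - active_params_b / total_params_b


# Figures quoted in the summary: 550B total, 55B active.
sparsity = moe_sparsity(550, 55)
print(f"{sparsity:.0%}")  # prints "90%"
```

The same function can be pointed at any other Mixture-of-Experts model whose total and active parameter counts are published.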

The Mythos/Fable Split: Anthropic Reserves Its Raw Model for Vetted Partners

Read TechCrunch on Anthropic's Claude Fable 5 and Mythos 5 split

Summary: Anthropic released Claude Fable 5, its most capable public model, but restricted the unfiltered foundation model, Mythos 5, to vetted cybersecurity and government partners under Project Glasswing. Public Fable 5 applies dynamic safety filters that can degrade output or fall back to weaker models when topics like biology, chemistry, or model distillation are detected.

Actionable takeaway: faculty using Claude for STEM coursework should expect occasional unexplained output degradation near sensitive keywords and build independent verification into assignments.

The Great Recall: U.S. Government Forces an Overnight Shutdown of Fable 5 and Mythos 5

Read Anthropic's notice on Fable 5 and Mythos 5 access changes

Summary: Days after launch, an emergency export-control directive forced Anthropic to disable all access to Fable 5 and Mythos 5, including for its own U.S.-based foreign-national staff. The trigger was a reported partial jailbreak that Anthropic says relies on a benign code-fixing prompt that is equally usable with rival models such as GPT-5.5. The move sets a precedent that Washington will unilaterally pull commercial AI infrastructure over narrow security edge cases.

Actionable takeaway: lab directors relying on a single vendor should design multi-model redundancy, and law and policy faculty gain a live case on national-security authority over private AI systems.
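
One way to picture multi-model redundancy is a simple ordered fallback: try the preferred model, and move to the next one if the call fails. The sketch below is a minimal illustration under stated assumptions, not a vendor integration. The two provider functions are hypothetical stand-ins; a real lab would wrap its own API clients or a self-hosted open-weight model behind the same one-argument interface.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has errored."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any failure moves us to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real clients (assumptions, not real APIs).
def primary_vendor(prompt: str) -> str:
    raise ConnectionError("access disabled")


def local_open_model(prompt: str) -> str:
    return f"[local] {prompt}"


used, answer = complete_with_fallback(
    "Summarize the methods section.",
    [("primary", primary_vendor), ("local", local_open_model)],
)
print(used)  # prints "local"
```

Keeping every provider behind the same function signature is the design choice that matters here: a workflow written this way survives the loss of any single vendor.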

Siri Gets a Brain: Apple Intelligence Rebuilds the iPhone Assistant

Browse Apple's newsroom updates on Apple Intelligence and Siri

Summary: Apple unveiled a ground-up Siri powered by Apple Intelligence, with cross-app context and on-screen awareness that can read a user’s emails, messages, and photos to carry out multi-step actions on device. The shift moves Siri from a command utility toward a context-aware assistant embedded across iOS, iPadOS, and macOS.

Actionable takeaway: IT policy should assess an OS-level assistant scanning student communications, and instructors should account for discreet AI retrieval during in-person assessments.

The Stability Tightrope: China Inc. Turns to “Quiet Layoffs” Amid State-Driven AI Adoption

Read Reuters on quiet layoffs and AI adoption in China

Summary: Chinese firms are using attrition and contractor cuts rather than visible mass layoffs as AI agents like OpenClaw absorb marketing and front-end engineering work—partly to stay under the 10% threshold that triggers government review. Beijing’s AI Plus push for adoption collides with state pressure to preserve social stability.

Actionable takeaway: career services in high-exposure fields should prepare graduates for shrinking entry-level pipelines, and labor-law faculty can track courts that have already ruled against AI-driven dismissals.

“Market-to-Machine”: Visa Embeds Its Payment Network Inside ChatGPT for Agentic Commerce

Read about Visa's agentic commerce integration with ChatGPT

Summary: Visa is positioning itself at the center of agentic commerce, embedding its network inside ChatGPT so authorized agents can buy goods directly, and launching Intelligent Commerce Connect for tokenization, spend controls, and fraud guardrails. CEO Ryan McInerney frames the shift as commerce moving from market-to-human to market-to-machine.

Actionable takeaway: marketing faculty should move e-commerce curricula past human-click funnels toward agent-readable product data, and law faculty gain fresh liability questions when an agent buys the wrong item.

From PokéStops to the Battlefield: Pokémon Go Scans Quietly Trained Military Drone Navigation

Read TVP World on Pokémon Go scan data and drone navigation training

Summary: A Trouw investigation reports that about 30 billion environmental scans collected from Pokémon Go players since 2021 helped train a Visual Positioning System later fused, via Niantic Spatial and defense contractor Vantor, into navigation for GPS-denied military drones. Both firms say players’ raw imagery was not directly shared and the pipeline is now severed after Niantic sold its games to Scopely.

Actionable takeaway: ethics faculty get a textbook function creep case, and policy scholars can examine how Terms of Service fail to cover downstream corporate spin-offs and asset sales.

Drawing the Line: CFTC Unveils Its First Blueprint to Regulate Prediction Markets

Read CNBC on the CFTC's proposed prediction-market rules

Summary: The CFTC released a 267-page proposed rule for prediction markets, where trading volume has passed $25B, with categorical bans on contracts tied to terrorism, assassinations, war, and sports micro-events, while protecting standard elections and awards as evaluative contests. A 45-day public comment period closes July 27, 2026.

Actionable takeaway: finance and pre-law faculty gain a concrete federal-vs.-state preemption case and a near-term signal of demand for market-integrity and compliance careers.

The IP Incubation Engine: Runway and Lionsgate Expand Their AI Film Partnership

Read Runway's announcement on its expanded Lionsgate partnership

Summary: Runway expanded its partnership with Lionsgate into a joint program to co-create original IP, moving generative video from post-production assistance toward a co-authoring role earlier in development. The shift reframes generative models from optional utilities into foundational contributors to commercial films.

Actionable takeaway: film faculty should blend generative-video workflows into production syllabi, and entertainment-law instructors gain fresh questions on ownership and royalties for AI co-created IP.

“A Betrayal of Cinema”: Scorsese’s AI Storyboarding Partnership Ignites Industry Backlash

Visit the Art Directors Guild website for industry statements and updates

Summary: Martin Scorsese drew industry-wide criticism after backing Black Forest Labs, maker of the FLUX text-to-image models, and using its tools to generate storyboards and pre-visualization; the Art Directors Guild (IATSE Local 800) accused him of turning his back on human artists. Scorsese defended the move as offering cinematic intelligence that speeds productions.

Actionable takeaway: production faculty must decide whether to teach AI pre-viz alongside hand-drawn storyboarding, and labor-law courses gain a high-profile union case on AI automation.

Education & AI Applications

Unified Senses: Alibaba’s Qwen3.7-Plus Bridges the Vision-Agent Divide

Read Alibaba Qwen's Qwen3.7 announcement

Summary: Alibaba launched Qwen3.7-Plus, a multimodal model that joins visual perception and language in one loop so a single agent can navigate mobile apps, orchestrate GUIs, and turn hand-drawn mockups into front-end code—and it drops into existing harnesses like Claude Code or OpenClaw via Alibaba Cloud Model Studio.

Actionable takeaway: HCI and UI/UX faculty can shift labs from static interface construction toward auditing agents that operate live software, and capstone teams gain computer-vision capability without local GPU arrays.

Meet Honen: An Engine That Turns Raw Materials Into Interactive Courses

Visit Honen and review its course-generation platform

Summary: Honen converts documents, slide decks, and recorded lectures into interactive courses with auto-generated modules, assessments, audio narration, and a built-in live AI tutor that adapts to lesson context and learner history, plus mastery analytics and LMS integration. It compresses traditional instructional-design cycles to minutes.

Actionable takeaway: instructional designers can prototype a full course outline quickly, but faculty must take on a strict subject-matter-reviewer role to vet generated content for accuracy and tone before publishing.

Sovereign Code: Cohere Ships ‘North Mini Code’ Open-Weight MoE for Local Deployment

Read MarkTechPost on Cohere's North Mini Code model

Summary: Cohere released North Mini Code, an Apache-2.0 open-weight coding model with 30B total parameters, 3B active MoE parameters, and 256K context, tuned for terminal work and multi-agent engineering, running on a single H100 at FP8 with reported 2.8× throughput over peers.

Actionable takeaway: departments can self-host a capable coding model inside university infrastructure so sensitive repositories never leave campus, and use it for rubric-based automated code review.

The Anti-Amnesia Agent: Xiaomi Open-Sources MiMo Code for Long-Horizon Tasks

Read VentureBeat on Xiaomi's MiMo Code agent

Summary: Xiaomi open-sourced MiMo Code under the MIT license. The terminal-native coding agent fights context-window decay with a cross-session SQLite memory and a background checkpoint-writer subagent, and it auto-imports MCP servers and configs. Xiaomi reports a 65% win rate over Claude Code on tasks beyond 200 steps.

Actionable takeaway: software-engineering courses get a clear example that scaffolding, not raw model size, drives long-horizon performance—though IT must weigh routing context through a Chinese vendor against data-residency rules.
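
For a classroom demonstration of the scaffolding idea, a cross-session memory can be sketched in a few lines with Python's built-in sqlite3 module. This is not MiMo Code's actual schema or implementation; the table layout and the function names (open_memory, write_checkpoint, recall) are assumptions made for illustration.

```python
import sqlite3


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the checkpoint store. Pass a file path to persist across runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session TEXT NOT NULL, step INTEGER NOT NULL, summary TEXT NOT NULL)"
    )
    return conn


def write_checkpoint(conn: sqlite3.Connection, session: str, step: int, summary: str) -> None:
    """Record a short summary of progress so far."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO checkpoints (session, step, summary) VALUES (?, ?, ?)",
            (session, step, summary),
        )


def recall(conn: sqlite3.Connection, limit: int = 3) -> list[str]:
    """Return the most recent summaries, oldest first, to re-seed a fresh context."""
    rows = conn.execute(
        "SELECT summary FROM checkpoints ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [row[0] for row in reversed(rows)]


conn = open_memory()
write_checkpoint(conn, "mon", 120, "Refactored parser; tests pass.")
write_checkpoint(conn, "mon", 210, "Started migrating config loader.")
write_checkpoint(conn, "tue", 15, "Config loader done; CI still red on Windows.")
print(recall(conn, limit=2))
```

The demo uses an in-memory database so it runs anywhere; swapping in a file path is what would make the notes survive between sessions.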

Beyond the Machine: Five “Durable Skills” Where Humans Still Outperform AI

Visit Jobs for the Future for workplace research and skills analysis

Summary: Workplace researchers from Gartner, Jobs for the Future, and Harvard Business School argue five durable skills—empathy, relationship-building, critical thinking, conscience, and judgment in gray areas—remain where humans outperform AI, citing a Stanford finding that chatbots validate and flatter users 49% more often than people do.

Actionable takeaway: faculty should weight assessment toward contextual judgment, collaboration, and peer review rather than the summary-style tasks AI now handles, and teach students to question rather than accept automated feedback.

Research News

The Foothills of the Singularity: Google Realigns R&D Toward Autonomous Science

Read MIT Technology Review on Google's shift toward AI-driven science

Summary: Building on Edition #8’s Co-Scientist coverage, Google used I/O 2026 to signal a structural pivot toward agentic, end-to-end scientific discovery—bundling Co-Scientist and AlphaEvolve into Gemini for Science and reportedly reassigning AlphaFold’s John Jumper toward AI coding architectures—as Demis Hassabis described the foothills of the singularity. OpenAI separately reported a general reasoning model disproving a mathematics conjecture.

Actionable takeaway: department heads should note core talent moving from specialized tools toward agentic reasoning loops, and faculty must revisit what counts as original human contribution in theses and dissertations.

Machine Learning Reconstructs 33 Years of Missing Global Migration Data

Read Nature News on reconstructed global migration data

Summary: A Nature study by Thomas Gaskin of LSE and Guy Abel of the University of Hong Kong used a hybrid deep-learning model to reconstruct 33 years of annual country-to-country migration, filling the field’s largest data gaps by fusing mechanistic flow models with socio-economic, political, and even Facebook-derived signals. A sensitivity analysis found life expectancy and GDP per capita to be the most predictive variables, and localized conflict and language similarity the least.

Actionable takeaway: open code and an interactive data explorer give demography, economics, and IR faculty a continuous baseline for grant-ready population-shift modeling under climate and conflict stress.

The Passive Push: Global Nature Poll Exposes “AI FOMO” and Deep Skepticism Among Scientists

Read Nature on AI skepticism and adoption pressure among researchers

Summary: A Nature poll of more than 1,900 researchers across 75 countries found 48% feel broadly negative toward AI and 63% believe LLM risks for data and literature analysis outweigh the benefits, yet 60% feel peer pressure to adopt and 51% now use AI weekly—a FOMO paradox driven by hallucinated citations and misused domain models. Non-native English speakers were the clearest beneficiaries through writing and transcription aid.

Actionable takeaway: PIs should explicitly mentor against speed-over-rigor adoption and teach lab members to treat machine hypotheses as unverified starting points, never as established findings.

Contemporary AI Lacks the Imagination to Diverge or Negate in Science

Read the hypothesis-generation audit on arXiv

Summary: An expert audit of 121,640 preprints tested 26 LLMs on hypothesis generation and found that while reasoning models explore a broader idea space, no model class spontaneously proposes null hypotheses or negates consensus. The authors trace this structural bias to the file drawer problem of unpublished null results. They also show that standard automated evaluators, including LLM-as-judge and semantic distance, barely track expert judgment, whereas a domain-specific reward model trained for the task does markedly better.

Actionable takeaway: research-methods courses should explicitly teach students to force null-hypothesis and negation thinking that current models skip, and ML faculty can use the reward-model approach as a case in evaluator design.

Evaluator performance against expert judgment in the source hypothesis-generation audit
Evaluator Model / Metric	Novelty Pairwise Accuracy	Feasibility Pairwise Accuracy	Probability Pairwise Accuracy
Skywork-Reward-V2-Qwen3-8B	49%	53%	55%
OpenAI o4-mini Deep Research	43%	41%	55%
Our Domain Model (Biology)	69%	62%	67%
Our Domain Model (Social Science)	64%	62%	67%

The table above shows off-the-shelf evaluators scoring near chance (41% to 55%, where 50% is a coin flip) against human experts, while the authors’ domain-tuned reward model reaches 62% to 69%.
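
For ML faculty using this as a case in evaluator design, the metric in the table can be reproduced on toy data. The sketch below assumes "pairwise accuracy" means the share of hypothesis pairs in which the evaluator ranks the expert-preferred hypothesis higher; the source summary does not define the term, and the scores and preferences here are invented for illustration.

```python
def pairwise_accuracy(
    expert_prefs: list[tuple[str, str]],
    evaluator_scores: dict[str, float],
) -> float:
    """Share of (preferred, other) pairs where the evaluator scores `preferred` higher."""
    if not expert_prefs:
        raise ValueError("need at least one expert preference pair")
    agree = sum(
        1
        for preferred, other in expert_prefs
        if evaluator_scores[preferred] > evaluator_scores[other]
    )
    return agree / len(expert_prefs)


# Toy data (made up): an automated evaluator's scores for four hypotheses.
scores = {"h1": 0.9, "h2": 0.4, "h3": 0.7, "h4": 0.2}
# Each tuple reads "experts preferred the first hypothesis over the second".
prefs = [("h1", "h2"), ("h3", "h1"), ("h3", "h4"), ("h2", "h4")]

print(pairwise_accuracy(prefs, scores))  # prints 0.75
```

Under this definition an evaluator that scores at random lands near 0.5, which is why results in the 40s and 50s carry almost no signal.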

The End of Left-to-Right: Google Open-Sources DiffusionGemma for Parallel Text Generation

Read Google's DiffusionGemma announcement

Summary: Google open-sourced DiffusionGemma under Apache 2.0, a 26B text model that, instead of predicting tokens left-to-right, refines 256 tokens in parallel from noise via bidirectional attention while keeping only 3.8B parameters active per step. It runs locally on about 18GB VRAM and clocks more than 1,200 tokens per second on an H200 with vLLM at FP8.

Actionable takeaway: NLP faculty should extend syllabi past autoregressive transformers to text diffusion, and the bidirectional design suits research on code insertion and iterative document editing where a model must revise earlier output.

Democratizing the Foundation: Sapient’s HRM-Text Cuts Pretraining Cost to $1,500

Read Sapient's HRM-Text overview | View the HRM-Text project on GitHub

Summary: Singapore-based Sapient Intelligence pretrained HRM-Text, a 1.15B reasoning model, from scratch for roughly $1,500—two days on 16 GPUs and about 40B tokens—using a brain-inspired Hierarchical Recurrent Model that loops internal reasoning in latent space and trains only on instruction-response pairs. It reportedly rivals open models 2-7× its size on reasoning benchmarks, though it ships pre-alignment as a proof of concept.

Actionable takeaway: small university labs in fields like healthcare, climate, and agriculture can now pretrain domain-specific foundation models on modest budgets rather than only fine-tuning commercial APIs.

HRM-Text budget and benchmark comparison from the source summary
Metric / Benchmark	HRM-Text (1.15B)	Traditional Baselines (2B–7B Open Models)	The Operational Meaning
Data Diet	~40 Billion tokens	4 to 36 Trillion tokens	Achieves competitive logic while processing roughly 100× to 900× fewer tokens.
Estimated Compute	96× to 432× Less	Baseline Standard	Drastically lowers carbon footprints and energy overhead during the training phase.
MMLU	60.7%	Competitive Tier	Demonstrates a strong baseline of general knowledge and multi-domain reasoning.
GSM8K	84.5%	Competitive Tier	High proficiency in tracking and solving grade-school level math word problems.
MATH	56.2%	Competitive Tier	Proves capable of multi-step advanced mathematical logic without raw memorization.
ARC-Challenge	81.9%	Competitive Tier	Successfully applies inference and common-sense logic to difficult science questions.

The table above compares HRM-Text’s tiny data and compute budget against far larger open models while it remains in a competitive tier across standard reasoning benchmarks.
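
The budget figures can be sanity-checked with simple arithmetic. The sketch below uses only numbers quoted in this item, plus one assumption: that "two days on 16 GPUs" means all 16 GPUs running for the full 48 hours. The resulting cost per GPU-hour is therefore an implied estimate, not a figure from Sapient.

```python
# Token budgets quoted in the table.
tokens_hrm = 40e9                       # ~40 billion tokens
baseline_low, baseline_high = 4e12, 36e12  # 4 to 36 trillion tokens

ratio_low = baseline_low / tokens_hrm
ratio_high = baseline_high / tokens_hrm
print(f"{ratio_low:.0f}x to {ratio_high:.0f}x fewer tokens")  # prints "100x to 900x fewer tokens"

# Implied price of compute, ASSUMING 16 GPUs busy for 2 full days.
gpu_hours = 16 * 2 * 24
cost_per_gpu_hour = 1500 / gpu_hours
print(gpu_hours)                                # prints 768
print(f"${cost_per_gpu_hour:.2f} per GPU-hour")  # prints "$1.95 per GPU-hour"
```

A lab pricing its own run can replace the dollar figure and GPU count with local cluster or cloud rates to see whether a similar budget is realistic.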

Investment

The War for Margin: OpenAI Files Confidentially for an IPO Behind a 10-Gigawatt Bet

Read OpenAI's confidential S-1 filing announcement

Summary: OpenAI confidentially filed for an IPO, framed less as a milestone than as capital to survive a margin war. The company is considering API price cuts against Anthropic, acquiring Ona to turn Codex into a persistent multi-day cloud agent, and planning a $500B, 10-gigawatt Ohio build with SoftBank. GPT-5.5 narrowly leads Fable 5 on the Agents’ Last Exam benchmark at 24% success.

Actionable takeaway: business faculty get a systemic-risk case in NVIDIA’s reported role as lease guarantor, and CS faculty a model for cheat-resistant agent benchmarks.

The $7 Billion Wall Street Battle: Goldman and Morgan Stanley Fight to Lead the OpenAI and Anthropic IPOs

Read Fortune on the race to lead OpenAI and Anthropic IPOs

Summary: Goldman Sachs and Morgan Stanley are competing for the lead-left book-runner slot on the OpenAI and Anthropic IPOs, each expected to raise at least $60B. The top spot controls share allocation and could generate more than $7B in soft-dollar trading revenue on a standard first-day pop.

Actionable takeaway: finance faculty can teach underwriting power structures and deliberate IPO underpricing in real time using Kalshi and Polymarket odds alongside Jay Ritter’s research.

Heavy Compute for the Physical World: Bezos’s Prometheus Raises $12 Billion

Read CNBC on Project Prometheus and its $12 billion raise

Summary: Prometheus, co-founded by Jeff Bezos and Stanford’s Vik Bajaj, raised $12B at a $41B valuation to build compute-heavy AI for engineering, physical manufacturing, and drug design, after poaching talent from OpenAI, DeepMind, and NVIDIA.

Actionable takeaway: the raise signals a capital shift toward physical AI—engineering, bioinformatics, and computational drug-design tracks face a strong job market, while departments face sharper pressure to retain research talent.

Unlocking the Undruggable: Isomorphic Labs Raises $2.1B and Unveils Its Drug-Target Engine

Read IEEE Spectrum on Isomorphic Labs and AI drug discovery

Summary: DeepMind spin-off Isomorphic Labs raised $2.1B and secured Novartis and Eli Lilly partnerships while unveiling IsoDDE, an engine that jointly predicts protein structure, pocket identification, and binding affinity—and mapped a previously undisclosed cryptic pocket on the protein cereblon from raw sequence alone.

Actionable takeaway: faculty should update computer-aided drug-design curricula toward ligand-induced conformational change rather than rigid static targets, and the raise signals strong demand for bio-ML cross-training.

Prompting Tip of the Week

Application: Research | Task: Generating candidate hypotheses from a literature gap and forcing the model to include null hypotheses. This directly addresses this edition’s 121,640-preprint audit, which found that models do not spontaneously negate consensus or propose null hypotheses.

❌ Single-shot version

Give me some research hypotheses about [TOPIC].

✔️ Step-structured version

You are a research collaborator in [FIELD]. Here is the background and the open question: [PASTE].

Step 1—Propose: Propose 6 candidate hypotheses, each as a one-sentence testable claim.

Step 2—Null hypotheses: For at least 2 of them, write an explicit null hypothesis, predicting no effect or no difference, and state what evidence would support it.

Step 3—Rate and flag: For each hypothesis, rate conceptual novelty, empirical feasibility, and probability of being true on a 1-9 scale, and flag any that merely restate existing consensus.

Step 4—Best discriminator: List the single experiment or dataset that would best distinguish the competing hypotheses.

Output: Return a labeled table so I can triage which to pursue.

Why it works: The single-shot prompt yields the artificial hivemind the audit describes—a narrow cluster of safe, consensus-restating ideas, and never the null hypotheses models structurally avoid. The step-structured version forces negation, self-critique against consensus, and an explicit feasibility and novelty rating—exactly the moves the study found AI skips and where human judgment still has to lead.
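
For anyone who reuses the step-structured prompt across projects, it can be kept as a small template function. The sketch below is one possible packaging, with the wording taken from the prompt above; the function name, parameters, and the sample field and question are illustrative assumptions.

```python
def build_hypothesis_prompt(
    field: str,
    background: str,
    n_hypotheses: int = 6,
    min_nulls: int = 2,
) -> str:
    """Assemble the step-structured hypothesis prompt as a single string."""
    if min_nulls > n_hypotheses:
        raise ValueError("min_nulls cannot exceed n_hypotheses")
    header = (
        f"You are a research collaborator in {field}. "
        f"Here is the background and the open question: {background}"
    )
    steps = [
        f"Step 1 (Propose): Propose {n_hypotheses} candidate hypotheses, "
        "each as a one-sentence testable claim.",
        f"Step 2 (Null hypotheses): For at least {min_nulls} of them, write an "
        "explicit null hypothesis, predicting no effect or no difference, "
        "and state what evidence would support it.",
        "Step 3 (Rate and flag): For each hypothesis, rate conceptual novelty, "
        "empirical feasibility, and probability of being true on a 1-9 scale, "
        "and flag any that merely restate existing consensus.",
        "Step 4 (Best discriminator): List the single experiment or dataset "
        "that would best distinguish the competing hypotheses.",
        "Output: Return a labeled table so I can triage which to pursue.",
    ]
    return "\n\n".join([header, *steps])


# Example values are placeholders, not from the newsletter.
prompt = build_hypothesis_prompt(
    field="hydrology",
    background="Why do some restored streams fail to recover insect diversity?",
)
print(prompt)
```

Paste the printed text into whichever assistant you use; raising min_nulls is a quick way to push harder against consensus-only output.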

From the AI Frontier | Edition #9 | June 16, 2026
Curated for faculty, students, and staff at West Virginia University

Suggestions or news submissions: email Aldo Romero.