site:www.marktechpost.com

Artificial Intelligence

In this tutorial, we use zeroentropy/zerank-2-reranker, a 4B Qwen3-based cross-encoder reranker, to improve retrieval quality. We start by setting up the runtime ...

marktechpost

Voice AI

Stability AI has released Stable Audio 3, a family of latent diffusion models for instrumental music and sound effects generation. The release includes open weights for the small and medium variants.

marktechpost

Microsoft Releases Fara1.5: A Family of Browser Computer-Use Agents (4B/9B/27B) That Outperform OpenAI Operator and Gemini 2.5 Computer Use on Online-Mind2Web

Microsoft Research’s AI Frontiers lab released Fara1.5. It is a family of computer-use agent (CUA) models for the browser. The release ships three sizes: Fara1.5-4B, Fara1.5-9B, and Fara1.5-27B. The ...

marktechpost

NVIDIA and the University of Maryland Researchers Released Audio Flamingo Next (AF-Next): A Super Powerful and Open Large Audio-Language Model

The model introduces Temporal Audio Chain-of-Thought — a reasoning paradigm that anchors intermediate reasoning steps to timestamps in long audio — and outperforms Gemini 2.5 Pro on long-audio ...

marktechpost

Google Launches Antigravity 2.0 at I/O 2026: A Standalone Agent-First Platform with CLI, SDK, Managed Execution, and Enterprise Support

Google used its I/O 2026 developer keynote to ship a meaningful architectural shift in how it packages AI-assisted development. The company announced Google Antigravity 2.0 — a standalone desktop ...

marktechpost

Zyphra Releases ZAYA1-8B-Diffusion-Preview: The First MoE Diffusion Model Converted From an Autoregressive LLM With Up to 7.7x Speedup

To understand why this matters, it helps to first understand how most language models generate text today. Standard large language models are autoregressive: they decode one token at a time in ...

marktechpost

Open Source

OmniVoice Studio runs voice cloning, video dubbing, real-time dictation, and speaker diarization entirely on your own hardware. No API keys, no cloud account, and no subscription required. The project ...

marktechpost

AI Agents

Microsoft Research introduces Webwright, a terminal-native browser agent framework that replaces click-trace web automation with reusable Playwright scripts. Using a single agent loop across three ...

marktechpost

Asif Razzaq

Asif Razzaq is an AI Journalist and Cofounder of Marktechpost, LLC. He is a visionary, entrepreneur and engineer who aspires to use the power of Artificial Intelligence for good. Asif’s latest venture ...

marktechpost

Meet ‘Kani-TTS-2’: A 400M Param Open Source Text-to-Speech Model that Runs in 3GB VRAM with Voice Cloning Support

The landscape of generative audio is shifting toward efficiency. A new open-source contender, Kani-TTS-2, has been released by the team at nineninesix.ai. This model marks a departure from heavy, ...

marktechpost

Microsoft AI Introduces LazyGraphRAG: A New AI Approach to Graph-Enabled RAG that Needs No Prior Summarization of Source Data

In AI, a key challenge lies in improving the efficiency of systems that process unstructured datasets to extract valuable insights. This involves enhancing retrieval-augmented generation (RAG) tools, ...

marktechpost

Google AI Releases MedGemma: An Open Suite of Models Trained for Performance on Medical Text and Image Comprehension

Developers can access MedGemma models through Hugging Face, subject to agreeing to the Health AI Developer Foundations terms of use. The models can be run locally for ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results