
Posts

The Context Trap: Why Scaling Laws Can’t Break the Ceiling of Uncertainty

We are currently living through the “Post-Reasoning” phase of the AI hype cycle. By now, models like Gemini 2.0 and its successors have normalized the idea that machines can “think”—or at least simulate a chain of thought that feels indistinguishable from reasoning. But as we push these architectures to their absolute limits, we are starting to see a plateau. It isn’t a plateau of competence; the models are brilliant. It is a plateau of certainty.

In building applications on top of these models, I’ve noticed a recurring pattern. Developers (myself included) often assume that if a model fails to predict the right outcome, it’s a failure of intelligence. We assume we need a larger parameter count, a longer context window, or better fine-tuning. But there is a ghost in the machine that scaling laws cannot exorcise: the fundamental difference between not knowing and not seeing.

The Architecture of Doubt

To understand why our models, even state...
Recent posts

Building Production AI Systems with Free Cloud GPUs

🚀 The Google Colab VS Code Extension: Enterprise AI Without Enterprise Costs

If you've been following my work on LLM-assisted code generation and AI reasoning, you know I'm always looking for ways to democratize AI development. During my recent work on cross-chain smart contract generation, I needed to rapidly prototype different transformer architectures for code translation. Previously, this meant juggling between local development, cloud instances, and Colab notebooks. Now? I'm running everything from VS Code with zero context switching. Here's what changed my workflow:

Python - Loading Large Models on a Free GPU

```python
# Running this on a free T4 GPU in VS Code - no setup required
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load a 7B parameter model - impossible on most local machines
model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLla...
```
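The teaser cuts off mid-snippet, but its headline claim, that a 7B-parameter model is "impossible on most local machines" yet fine on a free T4, comes down to simple memory arithmetic. Here is a quick stdlib-only sketch of that math (my own illustration, not from the original post):

```python
# Back-of-envelope weight-memory math: why a 7B-parameter model
# strains typical local machines but fits a 16 GB T4 at half precision.
# Numbers here are my own illustration, not from the original post.

def weights_gib(n_params: int, bytes_per_param: int) -> float:
    """Raw weight storage in GiB (ignores activations and KV cache)."""
    return n_params * bytes_per_param / 2**30

params = 7_000_000_000
fp32 = weights_gib(params, 4)  # full precision: too big for a 16 GB card
fp16 = weights_gib(params, 2)  # half precision: fits on a 16 GB T4
print(f"fp32: {fp32:.1f} GiB, fp16: {fp16:.1f} GiB")
# prints: fp32: 26.1 GiB, fp16: 13.0 GiB
```

This is also why snippets like the one above typically pass a half-precision dtype when loading on free-tier GPUs: full precision alone would exceed the card's memory before any activations are allocated.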

🎤 SpeakWise: Build an AI Public Speaking Coach with Gemma 3n

Project by: Rabimba. Showcase for: Google Developer Expert (GDE) AI Sprint.

Imagine having a private, real-time AI coach that watches your presentations, listens to your speech, analyzes your slides and gestures, and provides actionable feedback to help you improve confidently. Using Google’s new Gemma 3n, we built exactly that: SpeakWise, an AI-powered public speaking coach that leverages multimodal understanding to transcribe, analyze, and critique your talks—all while keeping your data private. GitHub code.

🚀 Why Gemma 3n?

Gemma 3n is Google’s open multimodal model designed for on-device, privacy-preserving, real-time AI applications. It is uniquely capable of:

📡 Simultaneously processing audio, image, and text, forming a holistic understanding of your talk.
🗂️ Following advanced instructions (“Act as a world-class presentation coach”) and structuring output into clear, actionable insights.
...

From Models to Agents: Shipping Enterprise AI Faster with Google’s MCP Toolbox & Agent Development Kit

This article is an expanded write-up of a talk I recently delivered as a Google Developer Expert in Denver. The full slide deck is embedded below for easy reference.

Why another “agent framework”?

Large language models (LLMs) are superb at generating prose, but production-grade systems need agents that can reason, plan, call tools, and respect enterprise guard-rails. Traditionally, that means:

- Hand-rolling connectors to databases & APIs
- Adding authentication, rate limits, and connection pools
- Patching in tracing & metrics later
- Hoping your YAML jungle survives the next refactor

Google’s new duo, the MCP Toolbox and the Agent Development Kit (ADK), eliminates that toil so you can treat agent development like ordinary software engineering.

MCP Toolbox in one minute ⏳

| What | Why it matters |
| --- | --- |
| Open-source MCP server | Implements the emerging Model Context Protocol; any compliant age... |

Deep Dive into the Google Agent Development Kit (ADK): Features and Code Examples

In our previous overview, we introduced the Google Agent Development Kit (ADK) as a powerful Python framework for building sophisticated AI agents. Now, let's dive deeper into some of the specific features that make ADK a compelling choice for developers looking to create agents that can reason, plan, use tools, and interact effectively with the world.

1. The Core: Configuring the `LlmAgent`

The heart of most ADK applications is the LlmAgent (aliased as Agent for convenience). This agent uses a Large Language Model (LLM) for its core reasoning and decision-making. Configuring it effectively is key:

- name (str): A unique identifier for your agent within the application.
- model (str | BaseLlm): Specify the LLM to use. You can provide a model name string (like 'gemini-1.5-flash') or an instance of a model class (e.g., Gemini()). ADK resolves string names using its registry.
- instruction (str | Callable): This is crucial for guiding the agent's be...
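To make the parameter shapes listed above concrete, here is a minimal stand-in sketch. This is not the real ADK class, just a plain dataclass mirroring the fields the teaser names (name, model, instruction); the actual LlmAgent ships with Google's ADK package and accepts many more options.

```python
# Stand-in sketch only: a plain dataclass mirroring the LlmAgent
# parameters described above. NOT the real ADK class.
from dataclasses import dataclass
from typing import Callable, Union

@dataclass
class LlmAgentSketch:
    name: str                                    # unique id within the app
    model: str                                   # e.g. 'gemini-1.5-flash'
    instruction: Union[str, Callable[..., str]]  # guides the agent's behavior

# Hypothetical example values, chosen for illustration.
agent = LlmAgentSketch(
    name="helpdesk_agent",
    model="gemini-1.5-flash",
    instruction="Answer support questions concisely.",
)
print(agent.name)  # prints: helpdesk_agent
```

Keeping the configuration this declarative is what lets a string model name be resolved later through a registry, as the post describes, rather than binding the agent to a concrete model client up front.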

Build Smarter AI Agents Faster: Introducing the Google Agent Development Kit (ADK)

The world is buzzing about AI agents – intelligent entities that can understand goals, make plans, use tools, and interact with the world to get things done. But building truly capable agents that go beyond simple chatbots can be complex. You need to handle Large Language Model (LLM) interactions, manage conversation state, give the agent access to tools (like APIs or code execution), orchestrate complex workflows, and much more.

Introducing the Google Agent Development Kit (ADK), a comprehensive Python framework from Google designed to significantly simplify the process of building, testing, deploying, and managing sophisticated AI agents. Whether you're building a customer service assistant that interacts with your internal APIs, a research agent that can browse the web and summarize findings, or a home automation hub, ADK provides the building blocks you need.

Core Concepts: What Makes ADK Tick?

ADK is built around several key concepts that make agent development more s...

My Google I/O 2024 Adventure: A GDE's Front-Row Seat to the Gemini Era

Hey tech enthusiasts! Rabimba Karanjai here, your friendly neighborhood Google Developer Expert (GDE), back from an exhilarating whirlwind tour of Google I/O 2024. Let me tell you, this wasn't just your average tech conference – it was an AI-infused extravaganza that left me utterly mind-blown! And you know what made it even sweeter? I had front-row seats, baby! Huge shoutout to the GDE program for this incredible opportunity. Feeling grateful and a tad spoiled, I must admit. 😉

Gemini: The AI Marvel That's Stealing the Show

Now, let's dive into the star of the show: Gemini. This ain't your grandpa's AI model – it's the multimodal powerhouse that's set to redefine how we interact with technology. Imagine an AI that doesn't just understand text, but images, videos, code, and even your wacky doodles. Yep, that's Gemini for you! Google's been cooking up this AI masterpiece, and boy, did they deliver! The keynote demo had us all gawk...