Google Unveils New AI Models and Gemini 1.5 Pro

PLUS: OpenAI’s Advanced Voice Mode, Apple’s AI-Driven iPhone Enhancements, and More AI Innovations from Top Tech Giants

👋 Hey and welcome to AI News Daily. 

Each week, we post AI Tools, Tutorials, News, and practical knowledge aimed at improving your life with AI. 

Read time: 7 minutes

Welcome to our weekly digest, your go-to source for the latest in AI innovations, must-try tools, and essential tutorials to keep you at the forefront of technology. Let's get started.👇

Highlight of the Week

Google recently announced the release of three new open-source AI models, marking a significant step in making advanced AI technology accessible to a broader audience. These models—Gemma 2 2B, ShieldGemma, and Gemma Scope—are designed with an emphasis on safety, efficiency, and transparency.

Source: Google

  • Gemma 2 2B is a lightweight model tailored for generating and analyzing text. It’s capable of running on various hardware, including laptops and edge devices. This versatility makes it suitable for both research and commercial applications. By making this model available on platforms like Google’s Vertex AI model library, Kaggle, and AI Studio, Google aims to foster innovation across different fields.

  • ShieldGemma focuses on safety, equipped with classifiers to detect and filter out toxic content, including hate speech, harassment, and explicit material. Built on the foundation of Gemma 2, it serves as a protective layer, ensuring that generative models produce safe and respectful outputs.

  • Gemma Scope enhances the interpretability of AI models by allowing developers to delve into specific aspects of the Gemma 2 model. This capability makes the inner workings of the model more transparent and understandable, aiding in debugging and fine-tuning its behavior. This feature is particularly beneficial for researchers and developers looking to ensure the reliability and accuracy of their AI applications.

Open and Accessible - You can download the Gemma 2 2B model from Hugging Face, or try it out in Google AI Studio.


Google's Gemini 1.5 Pro is now a major contender in the AI race, rivaling OpenAI's GPT-4. The model features a context window of up to 1 million tokens, allowing it to process very large inputs, including long documents, code repositories, and videos.

Gemini 1.5 Pro uses a Mixture-of-Experts (MoE) architecture, which improves efficiency by activating only the expert pathways relevant to a given input, boosting performance while reducing computational demands. Its multimodal capabilities enable it to handle text, image, audio, and video inputs.
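The routing idea behind MoE can be illustrated with a toy sketch. The snippet below is a hypothetical NumPy illustration of sparse top-k gating — all names (`moe_forward`, `gate_w`, the linear "experts") are invented for this example, and it bears no relation to Google's actual implementation:

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Route an input through only the top-k experts (sparse activation).

    x:       input vector, shape (d,)
    experts: list of (W, b) pairs, each a simple linear "expert"
    gate_w:  gating weights, shape (d, n_experts)
    """
    logits = x @ gate_w                  # one gating score per expert
    top = np.argsort(logits)[-top_k:]    # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts only
    # Only the chosen experts are evaluated; the rest are skipped entirely,
    # which is where the compute savings come from.
    out = sum(w * (x @ experts[i][0] + experts[i][1]) for w, i in zip(weights, top))
    return out, top

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [(rng.standard_normal((d, d)), rng.standard_normal(d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts))
y, used = moe_forward(rng.standard_normal(d), experts, gate_w)
print(f"activated {len(used)} of {n_experts} experts: {sorted(used.tolist())}")
```

The key property is that compute per input scales with `top_k`, not with the total number of experts, so capacity can grow without a proportional increase in inference cost.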

With high benchmark scores in tasks like translation, coding, and reasoning, Gemini 1.5 Pro is a top-tier AI model. Google is also introducing Gemini 1.5 Flash for low-latency tasks and enhancing integration with Google Workspace, reinforcing its commitment to advancing AI technology.  

Weekly Highlights 🌍

OpenAI’s New Advanced Voice Mode

OpenAI has begun rolling out its Advanced Voice Mode for ChatGPT to a select group of ChatGPT Plus users, with plans to make it more widely available this fall. The new feature aims to provide more natural and dynamic interactions, allowing users to interrupt the AI mid-conversation and enabling the system to sense and respond to emotional cues.

However, the rollout hasn't been without its challenges. OpenAI had to delay the initial release due to unresolved safety and performance concerns. During its May demo, the feature faced criticism for using a voice that sounded strikingly similar to actress Scarlett Johansson's, leading to legal scrutiny and the eventual removal of that voice option. This incident underscores the complexities and sensitivities involved in deploying advanced AI technologies in real-world applications.

Despite these hurdles, OpenAI remains committed to refining the Voice Mode. The company is conducting thorough internal checks to ensure the feature can reliably detect and filter out inappropriate content, and is improving its infrastructure to support a large user base without compromising real-time response capabilities.

This development is part of a broader effort by OpenAI to enhance the accessibility and usability of its AI models, demonstrating the potential for more engaging and human-like interactions with technology.

Apple’s AI-Driven iPhone Enhancements

Apple is currently facing two distinct paths for the future of the iPhone, driven by advancements in AI and evolving user demands.

 Justin Sullivan/Getty Images

AI Integration and Enhanced Features

Apple is set to introduce significant upgrades in its upcoming iPhone 16 lineup, particularly focusing on AI capabilities. The new models will feature the A18 Pro chip, which is designed to enhance AI performance. This will support advanced functionalities like a more intuitive Siri, generative AI tools for text and image processing, and smarter search capabilities within iOS 18. These AI-driven enhancements are aimed at providing a more personalized and efficient user experience.

Physical and Design Changes

The iPhone 16 Pro and Pro Max are expected to include larger displays (6.3 inches and 6.9 inches respectively) and a dedicated hardware button for camera controls. These models will also feature significant camera upgrades, including improved low-light performance and a 5x optical zoom. Additionally, the devices will incorporate the latest in AI-driven technology, such as real-time emotional recognition and interaction capabilities.

Hollis Johnson/Business Insider

Market and Consumer Demand

The excitement around the iPhone 16’s AI features has led to bullish forecasts from analysts. Bank of America and Wedbush Securities have raised their price targets for Apple stock, anticipating strong demand for the new models. The integration of AI is expected to drive a significant upgrade cycle, as many consumers still using older iPhone models show increased interest in upgrading.

Future Prospects

Apple’s focus on AI with its Neural Engine in the A18 chip indicates a shift towards more AI-centric devices. This approach aims to enhance on-device processing, aligning with Apple's privacy-first philosophy by minimizing the need for cloud-based data processing. This strategy could extend to other Apple products, including iPads and Macs, suggesting a broader application of AI across the ecosystem.

Yum! Brands Goes AI-First

Yum! Brands, the parent company of Taco Bell, has announced a significant shift towards an "AI-first" strategy to enhance its fast-food operations. This approach will see the integration of artificial intelligence across various aspects of its restaurants, including drive-thru operations, kitchen management, and customer interactions.

Jeff Greenberg | Universal Images Group | Getty Images

Key Developments:

Voice AI for Drive-Thru Orders:

Taco Bell is expanding its use of voice AI technology in drive-thrus, aiming to improve order accuracy and speed. This initiative is currently being tested at over 100 locations across 13 states and is expected to roll out to hundreds more by the end of the year. The goal is to streamline back-of-house operations and elevate the customer experience by reducing wait times and improving service efficiency.

AI-Driven Kitchen and Inventory Management:

Yum! Brands is deploying Dragontail, an AI-driven kitchen management system, to optimize order sequencing for better freshness and accuracy. This system is already in use at over 4,000 locations, with plans to expand to nearly 6,000 more. Additionally, an AI inventory management system is being implemented across KFC and Taco Bell restaurants, set to cover over 3,000 additional locations by 2024. These technologies are designed to streamline inventory processes and reduce food waste, contributing to operational excellence.

Personalized Customer Interactions:

By leveraging customer data, Yum! Brands aims to offer personalized promotions and recommendations through its mobile apps. This customization is expected to drive customer loyalty and increase sales during off-peak times. The company is also exploring image-recognition AI to further enhance drive-thru service efficiency.

OpenAI challenges Perplexity and Google with SearchGPT

In what could become a meaningful threat to Google, OpenAI is entering the search engine market with SearchGPT, an AI-powered search engine with real-time access to information across the internet.

Instead of returning a plain list of links, the search engine organizes results into summaries and detailed explanations. After receiving initial results, users can ask follow-up questions to refine or expand on the information provided. SearchGPT also includes a feature called "visual answers" that presents AI-generated videos, images, and other visual content relevant to the query, powered by OpenAI's own Sora model.

In a blog post, OpenAI said it has partnered with major news organizations to ensure content is accurately attributed and linked back to its original sources.

SearchGPT is currently a prototype, initially accessible to 10,000 test users, and is expected to integrate directly into ChatGPT in the future. It will be free at launch, with monetization strategies to be developed as the service evolves.

Trending AI Tools

  • Imagen 3: Text-to-image generation

  • Flatfile: Data exchange

  • Ylopo: Proptech & marketing

  • Podsum: Summarizes podcasts

  • Thinkific: AI-powered online learning product creator

Stay ahead of the curve by exploring these trending tools! Join our Telegram channel for more updates and insights. Explore more on @ainews_daily!

PS: I curate this AI newsletter every week for FREE, and your support is what keeps me going. If you find value in it, share it with your friends by clicking the share button below!