Meet Gemini: The Most Advanced and Powerful AI Model We've Ever Created
Embracing AI: Shaping Our Future
Every time technology advances, it opens new doors to discovery, better ways of living, and overall progress for society. Now, with AI, we're witnessing something even bigger than previous breakthroughs like the internet and smartphones. AI has the power to create opportunities for innovation and economic growth worldwide. I’m deeply committed to making sure AI benefits everyone, wherever they are and whatever their background.
After almost eight years of working in AI, we are making progress faster than ever. Millions of people are using our AI tools, collaborating, and solving tough problems together. But we’re still just beginning to see what AI can truly achieve.
We’re dedicated to advancing AI in a responsible way. We're working closely with experts to reduce risks and maximize the positive impact of AI. Following our guiding principles, we’re investing in new tools and models that can make a difference across various industries.
Today, we’re launching Gemini, our most powerful AI model yet. It delivers incredible performance and represents a major leap forward in AI development. With Gemini, we’re taking a huge step toward realizing the vision we've had since the start of Google DeepMind. The potential it offers is transformative, and we’re excited about the possibilities it can bring to the world.
Best regards,
Sundar
Unveiling Gemini: A Note from Demis Hassabis, CEO and Co-Founder of Google DeepMind
AI has been my lifelong passion, shaping both my career and that of my talented team. From my early days coding AI for video games to my research into the human brain, I’ve always believed that smarter machines can truly transform our future for the better.
At Google DeepMind, this belief fuels our mission to responsibly unlock AI's potential. For years, we've been working to create a new kind of AI model, inspired by how humans think and interact. We envision AI as more than just software—it's an intuitive assistant seamlessly woven into our everyday lives.
Today, we’re excited to introduce Gemini, our most advanced model yet.
Gemini is the result of extensive teamwork across Google, including with our partners at Google Research. Designed to handle multiple types of information—like text, code, audio, images, and video—Gemini is a groundbreaking, multimodal AI.
It sets a new standard for adaptability, working smoothly from large data centers to small mobile devices. Its cutting-edge features are set to change how developers and businesses use AI on a grand scale.
We’re launching Gemini 1.0 with three distinct versions:
- Gemini Ultra — Our top-tier model, designed for complex tasks requiring exceptional performance.
- Gemini Pro — Versatile and scalable, ideal for a wide range of applications.
- Gemini Nano — Efficient and compact, optimized for top-notch performance on mobile devices.
With Gemini, we’re setting out to push the limits of what AI can do, opening up new possibilities for innovation and collaboration across various fields and industries.
Pushing the Limits of Performance
We have rigorously tested and evaluated our Gemini models on a wide range of challenging tasks. From understanding images, audio, and video to solving complex math problems, Gemini Ultra exceeds the current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development.
With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding). This rigorous test spans 57 subjects, including mathematics, physics, history, law, medicine, and ethics, and measures both knowledge and problem-solving skills.
Our new benchmark approach to MMLU lets Gemini use its reasoning capabilities to think more carefully before answering difficult questions. This yields significant improvements over simply going with its first impression.
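For readers unfamiliar with how a score such as the MMLU percentage is produced, the following is a minimal sketch of accuracy scoring on multiple-choice questions, broken down by subject. The record layout and the function name `accuracy_by_subject` are illustrative assumptions, and the data is made up; this is not the evaluation harness behind the published numbers.

```python
from collections import defaultdict


def accuracy_by_subject(records):
    """Return (per-subject accuracy, overall accuracy).

    `records` is an iterable of (subject, predicted_choice, correct_choice)
    tuples. This layout is an assumption made for illustration.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for subject, predicted, correct in records:
        totals[subject] += 1
        if predicted == correct:
            hits[subject] += 1
    if not totals:
        raise ValueError("no records to score")
    per_subject = {s: hits[s] / totals[s] for s in totals}
    # Overall accuracy here is pooled over all questions, so subjects
    # with more questions weigh more.
    overall = sum(hits.values()) / sum(totals.values())
    return per_subject, overall


# Toy data, invented for this example.
records = [
    ("physics", "A", "A"),
    ("physics", "C", "B"),
    ("law", "D", "D"),
    ("law", "B", "B"),
    ("ethics", "A", "A"),
]
per_subject, overall = accuracy_by_subject(records)
print(per_subject)
print(f"overall: {overall:.1%}")
```

Pooling all questions is only one way to aggregate; averaging the per-subject accuracies instead would weight each subject equally and can give a different headline number.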
Breaking New Ground in Multimodal Performance
Gemini Ultra achieved a state-of-the-art score of 59.4% on the new MMMU benchmark, which consists of multimodal tasks that span different domains and require deliberate reasoning.
On the image benchmarks we tested, Gemini Ultra outperformed previous state-of-the-art models. Remarkably, it achieved these results without help from optical character recognition (OCR) systems that extract text from images for further processing. These results underscore Gemini’s strong multimodal skills and provide an early look at its advanced reasoning capabilities.
Leading the Way in Multimodal Innovation
Traditionally, creating multimodal models meant training separate parts for different types of data, then trying to piece them together to achieve some level of functionality. While this method works for tasks like describing images, it often falls short in handling complex reasoning.
Gemini changes the game by using a native multimodal approach. Instead of building separate components, Gemini is pre-trained on various types of data from the start and then fine-tuned with more multimodal information to boost its performance. This unique method allows Gemini to seamlessly understand and reason across different types of inputs from the beginning, surpassing existing models with its advanced capabilities.
Enhanced Reasoning Abilities
Gemini 1.0 demonstrates outstanding multimodal reasoning skills, making it exceptionally good at understanding both complex text and visual content. It excels at picking up subtle details, especially in large datasets where finding valuable insights can be challenging.
With its unparalleled ability to analyze and interpret large amounts of information, Gemini is set to drive major advances in many areas, from scientific research to financial analysis, all at remarkable speeds.
Advanced Coding
In its first release, Gemini can understand, explain, and generate high-quality code in major programming languages such as Python, Java, C++, and Go. Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding.
Gemini Ultra performs exceptionally well on several coding benchmarks, including HumanEval, a key industry benchmark for coding skills, and Natural2Code, our internal held-out dataset, which uses author-generated sources rather than web-based information.
Gemini also forms the basis for advanced coding systems. Two years ago, we introduced AlphaCode, an AI-driven code generation system that reached a competitive level of performance in programming contests. Building on that, we launched AlphaCode 2, which uses a specialized version of Gemini. AlphaCode 2 solves almost twice as many problems as its predecessor, and we estimate that it outperforms about 85% of competition participants, up from nearly 50% for AlphaCode. It performs even better when programmers collaborate with it by specifying properties that the code samples should follow.
We’re excited to see more AI models like Gemini being adopted as valuable tools for programmers. These models help with problem-solving, suggesting code designs, and streamlining the development process, speeding up app creation and enhancing the quality of services.
Improved Reliability, Scalability, and Efficiency
Gemini 1.0 was trained extensively using our advanced AI infrastructure, featuring Google's latest Tensor Processing Units (TPUs) v4 and v5e. Designed to be our most reliable and scalable model for training, and the most efficient for deployment, Gemini represents a major advancement in AI technology.
On TPUs, Gemini runs significantly faster than earlier, smaller, and less capable models. These custom AI accelerators are crucial for Google’s AI-driven products, supporting billions of users across services like Search, YouTube, Gmail, Google Maps, Google Play, and Android. They also help organizations worldwide train large AI models more affordably.
We’re excited to introduce Cloud TPU v5p, our most powerful, efficient, and scalable TPU system yet, built specifically for training cutting-edge AI models. This next-generation TPU will speed up Gemini's development, allowing developers and businesses to train large generative AI models faster than ever. As a result, this will help bring new products and features to users around the globe more quickly.
Built with Responsibility and Safety at the Heart
At Google, we are dedicated to advancing AI in a bold yet responsible way. Following our AI Principles and strong safety policies, we’re implementing new safeguards to address Gemini’s multimodal capabilities. From the beginning of development, we carefully consider potential risks and work to test and minimize them.
Gemini has undergone the most thorough safety evaluations of any Google AI model so far, including checks for bias and harmful content. We've conducted innovative research into risk areas like cyber threats, persuasion, and autonomy, using Google Research’s top adversarial testing techniques to spot critical safety issues before Gemini is released.
To uncover any gaps in our internal evaluations, we’re collaborating with a diverse group of external experts and partners to rigorously test our models on various issues.
To diagnose content safety issues during Gemini’s training phases, we’re using benchmarks such as Real Toxicity Prompts, a set of 100,000 prompts with varying degrees of toxicity that were pulled from the web and developed by experts at the Allen Institute for AI. More details on this will be available soon.
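As a rough illustration of how a prompt-based toxicity benchmark can be turned into a single number, the sketch below feeds each prompt to a model, scores the continuation, and reports the fraction flagged. Everything here is a stand-in: `toy_generate`, `toy_toxicity_score`, the placeholder blocklist, and the 0.5 threshold are hypothetical, and nothing in it reflects how Gemini or Real Toxicity Prompts is actually evaluated.

```python
from typing import Callable, Iterable


def toxic_continuation_rate(
    prompts: Iterable[str],
    generate: Callable[[str], str],
    toxicity_score: Callable[[str], float],
    threshold: float = 0.5,
) -> float:
    """Fraction of prompts whose continuation scores at or above `threshold`."""
    flagged = 0
    total = 0
    for prompt in prompts:
        continuation = generate(prompt)
        total += 1
        if toxicity_score(continuation) >= threshold:
            flagged += 1
    if total == 0:
        raise ValueError("no prompts to evaluate")
    return flagged / total


# Hypothetical stand-ins so the sketch runs without a model or classifier.
BLOCKLIST = {"badword"}  # placeholder token, not a real word list


def toy_generate(prompt: str) -> str:
    # Pretend "model": continues the prompt by repeating its last word.
    return prompt.split()[-1]


def toy_toxicity_score(text: str) -> float:
    # Pretend "classifier": 1.0 if any blocklisted token appears, else 0.0.
    words = (w.strip(".,!?").lower() for w in text.split())
    return 1.0 if any(w in BLOCKLIST for w in words) else 0.0


toy_prompts = [
    "The weather today is lovely",
    "You are such a badword",
    "Please pass the salt",
    "What a rude badword",
]
rate = toxic_continuation_rate(toy_prompts, toy_generate, toy_toxicity_score)
print(f"flagged continuations: {rate:.0%}")
```

In a real setup, the generator would be the model under test and the scorer a trained classifier. Tracking the rate separately for prompts of different toxicity levels would show whether the model makes harmful inputs worse or steers away from them.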
To reduce harm, we’ve developed specialized safety classifiers to detect and filter out content involving violence or harmful stereotypes. This layered approach, combined with strong filters, aims to make Gemini safer and more inclusive. We’re also addressing ongoing challenges related to factual accuracy, grounding, attribution, and corroboration.
Responsibility and safety are central to our work on AI models. This long-term commitment involves building collaboratively, so we’re partnering with the industry and broader ecosystem to establish best practices and safety standards through organizations like MLCommons, the Frontier Model Forum and its AI Safety Fund, and our Secure AI Framework (SAIF). We’ll continue working with researchers, governments, and civil society groups worldwide as we advance Gemini.
Bringing Gemini to the World
Gemini 1.0 is now being introduced across various products and platforms:
Gemini Pro in Google Products
We’re bringing Gemini to billions of people through Google products. Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, and understanding. This is the biggest upgrade Bard has seen since its launch. It will be available in English across more than 170 countries and territories, with plans to add new features and support more languages soon.
Gemini is also coming to Pixel. The Pixel 8 Pro is the first smartphone designed to run Gemini Nano, which will power new features like Summarize in the Recorder app and Smart Reply in Gboard, starting with apps like WhatsApp, Line, and KakaoTalk, with more messaging apps to follow next year.
In the coming months, Gemini will be added to more Google products and services, including Search, Ads, Chrome, and Duet AI.
Building with Gemini
Starting December 13, developers and businesses can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Google AI Studio is a free, web-based tool for quickly prototyping and launching apps with an API key. For teams that need a fully managed AI platform, Vertex AI offers customization of Gemini with enhanced security, privacy, and data management features.
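To give a sense of what calling the Gemini API with an API key might look like, here is a minimal sketch that uses only the Python standard library. The endpoint URL, the `gemini-pro` model name, the request and response JSON shapes, and the `GEMINI_API_KEY` environment variable name are assumptions that are not stated in this post. Model names and API versions change over time, so treat the current official API reference as the authority.

```python
import json
import os
import urllib.request

# Assumed values, not taken from this post; verify against the current docs.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-generation request."""
    url = f"{API_BASE}/models/{MODEL}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the first candidate's text (assumed shape)."""
    with urllib.request.urlopen(build_request(prompt, api_key), timeout=30) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")  # hypothetical variable name
    if key:
        print(generate("Explain what a multimodal model is in one sentence.", key))
    else:
        print("Set GEMINI_API_KEY to send a real request.")
```

Google also publishes client libraries that wrap these HTTP calls, and in practice those are usually the more convenient route. The raw request is shown here only to make the moving parts visible.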
Android developers can also use Gemini Nano, our most efficient model for on-device tasks, through AICore, a new feature in Android 14, beginning with Pixel 8 Pro devices. You can sign up for an early preview of AICore.
Gemini Ultra Coming Soon
We’re currently finalizing Gemini Ultra, with extensive trust and safety checks, including red-teaming by external experts, and refining it with reinforcement learning from human feedback (RLHF). Gemini Ultra will first be available to select customers, developers, and safety experts for early testing and feedback before its broader release to developers and enterprises early next year.
Next year, we will also launch Bard Advanced, a new AI experience offering access to our top models and capabilities, starting with Gemini Ultra.
The Gemini Era: Paving the Way for Innovation
We're marking a major milestone in AI development with the launch of Gemini, ushering in a new era at Google where we continue to push the boundaries of innovation while advancing our models responsibly.
Gemini has already made impressive strides, and we're focused on expanding its capabilities even further in future versions. This includes improvements in planning and memory, as well as increasing the context window to process more information and provide even better responses.
We’re thrilled about the potential of a world empowered by AI, where innovation drives creativity, expands knowledge, advances science, and transforms how billions of people live and work globally.