Dataconomy

Don’t be fooled by the size of Microsoft’s 1-Bit LLM

With its tiny size, intelligent operation and incredible energy conservation, Microsoft's 1-Bit LLM fits a huge library inside your pocket

by Emre Çıtak
March 6, 2024
in Artificial Intelligence

With its new 1-bit LLM technology, Microsoft may have found a way to build the powerful AI models behind chatbots and language tools so that they fit in your pocket, run lightning fast, and help save the planet.

OK, maybe ditch the planet part, but it is still a really big deal!

Traditional LLMs, the powerful AI models behind tools like ChatGPT and Gemini, typically use 16-bit or even 32-bit floating-point numbers to represent the model’s parameters or weights. These weights determine how the model processes information. Microsoft’s 1-bit LLM takes a radically different approach by quantizing (reducing the precision of) these weights down to just 1.58 bits.


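Despite the "1-bit" name, each weight in Microsoft’s model can take one of three values: -1, 0, or +1. Encoding three possible states takes log2(3) ≈ 1.58 bits, which is where the 1.58 figure comes from.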

This might seem drastically limiting, but it leads to remarkable advantages.
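The bit count can be checked with a few lines of Python. The sketch below computes the information content of a ternary value and shows one possible way to store such weights compactly: five ternary digits fit in a single byte because 3^5 = 243 is below 256. The packing scheme and the names `pack_trits` and `unpack_trits` are illustrative assumptions, not a description of how BitNet b1.58 stores its weights.

```python
import math

# Information content of one ternary weight: log2(3) bits.
bits_per_ternary_weight = math.log2(3)

def pack_trits(trits):
    """Pack five ternary values (-1, 0, +1) into one integer in [0, 242]."""
    assert len(trits) == 5 and all(t in (-1, 0, 1) for t in trits)
    value = 0
    for t in trits:
        value = value * 3 + (t + 1)  # map -1, 0, +1 to base-3 digits 0, 1, 2
    return value

def unpack_trits(value):
    """Inverse of pack_trits: recover the five ternary values."""
    digits = []
    for _ in range(5):
        value, digit = divmod(value, 3)
        digits.append(digit - 1)  # map base-3 digits back to -1, 0, +1
    return digits[::-1]

weights = [-1, 0, 1, 1, -1]
packed = pack_trits(weights)
assert 0 <= packed < 256              # the packed value fits in one byte
assert unpack_trits(packed) == weights

print(f"{bits_per_ternary_weight:.3f} bits of information per ternary weight")
print(f"{8 / 5:.1f} bits per weight when five weights share one byte")
```

Packing five weights per byte costs 1.6 bits per weight, close to the theoretical 1.58.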

Image: Traditional AI models use bulky 16-bit (or more) numbers for calculations, but Microsoft’s 1-bit LLM slims this down drastically.

What’s the fuss about 1-Bit LLMs?

The reduced resource requirements of 1-bit LLMs could enable AI applications on a wider range of devices, even those with limited memory or computational power. This could lead to more widespread adoption of AI across various industries.

Smaller brains mean AI can run on smaller devices: Your phone, smartwatch, you name it.
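To get a rough sense of the savings, the following back-of-the-envelope calculation compares the memory needed to store weights at 16 bits and at 1.58 bits each. The 7-billion-parameter model size is a hypothetical chosen for illustration, and the estimate covers weights only; activations, caches, and any layers kept at higher precision are ignored.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

NUM_PARAMS = 7e9  # hypothetical model size, chosen only for illustration

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
ternary_gb = weight_memory_gb(NUM_PARAMS, 1.58)

print(f"FP16 weights:     {fp16_gb:.2f} GB")
print(f"1.58-bit weights: {ternary_gb:.2f} GB")
print(f"Reduction factor: {fp16_gb / ternary_gb:.1f}x")
```

Under these assumptions the weights shrink by roughly a factor of ten, which is the kind of gap that separates a data center GPU from a phone.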

The simplified representation of weights in a 1-bit LLM translates to faster inference, the process of generating text, translating languages, or performing other language-related tasks.

Simpler calculations mean the AI thinks and responds way faster.

The computational efficiency of 1-bit LLMs also leads to lower energy consumption, making them more environmentally friendly and cost-effective to operate.

Less computing power equals less energy used. This is a major win for environmentally conscious tech and a step toward greener AI.

Apart from all that, the unique computational characteristics of 1-bit LLMs open up possibilities for designing specialized hardware optimized for their operations, potentially leading to even further advancements in performance and efficiency.

Meet Microsoft’s BitNet LLM

Microsoft’s implementation of this technology is called BitNet b1.58. The additional 0 value (compared to true 1-bit implementations) is a crucial element that enhances the model’s performance.

BitNet b1.58 demonstrates remarkable results, approaching the performance of traditional LLMs in some cases, even with severe quantization.

Image: BitNet b1.58 can nearly match the performance of traditional AI models despite the simpler format.

Breaking the 16-bit barrier

As mentioned before, traditional LLMs use 16-bit floating-point values (FP16) to represent the weights in the model. While this offers high precision, it is memory-intensive and computationally expensive. BitNet b1.58 breaks with this paradigm by adopting a 1.58-bit ternary representation for weights.

This means each weight can take on only three distinct values:

  • -1: The corresponding input is subtracted from the output
  • 0: The corresponding input is ignored and has no influence on the output
  • +1: The corresponding input is added to the output
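Because every weight is -1, 0, or +1, multiplying a weight matrix by an input vector no longer needs any multiplications: each output is a sum of some inputs minus a sum of others. The NumPy sketch below illustrates the idea; the function name `ternary_matvec` is made up for this example, and real kernels would be far more optimized.

```python
import numpy as np

def ternary_matvec(W, x):
    """Compute W @ x for W with entries in {-1, 0, +1} using only add/subtract."""
    out = np.zeros(W.shape[0], dtype=x.dtype)
    for i in range(W.shape[0]):
        added = x[W[i] == 1].sum()        # inputs whose weight is +1
        subtracted = x[W[i] == -1].sum()  # inputs whose weight is -1
        out[i] = added - subtracted       # inputs with weight 0 are skipped
    return out

W = np.array([[1, 0, -1],
              [-1, 1, 1]])
x = np.array([2.0, 3.0, 5.0])

# The result matches an ordinary matrix-vector product.
assert np.allclose(ternary_matvec(W, x), W @ x)
print(ternary_matvec(W, x))
```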

Mapping weights efficiently

Moving from a continuous (FP16) to a discrete (ternary) weight space requires care. BitNet b1.58 uses a quantization function, which the paper calls absmean, to perform this mapping. The function takes the full-precision weight values and maps each one to the closest corresponding ternary value (-1, 0, or +1). The goal is to minimize the performance degradation caused by this conversion.

Here’s a simplified breakdown of the function:

  • Scaling: The function first divides the entire weight matrix by its average absolute value. This normalizes the magnitudes so that a typical weight has an absolute value close to 1
  • Rounding: Each scaled weight is then rounded to the nearest integer and clipped to the range -1 to +1. This translates the scaled weights into the discrete ternary system

Image: BitNet b1.58 cleverly uses components similar to the open-source LLaMA model for easy integration.

See the detailed formula in Microsoft’s 1-Bit LLM research paper.
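The two steps above, scaling by the average absolute value and then rounding, can be sketched in a few lines of NumPy. This is a minimal illustration rather than Microsoft's implementation: the function name `absmean_quantize`, the value of the small constant `eps` (added to avoid division by zero), and the example matrix are all assumptions.

```python
import numpy as np

def absmean_quantize(W, eps=1e-5):
    """Map a full-precision weight matrix to ternary values {-1, 0, +1}.

    Returns the ternary matrix and the scale (mean absolute value), which
    is needed to approximately reconstruct the original magnitudes.
    """
    gamma = np.abs(W).mean()          # average absolute value of the matrix
    W_scaled = W / (gamma + eps)      # step 1: scaling
    W_ternary = np.clip(np.round(W_scaled), -1, 1)  # step 2: round, then clip
    return W_ternary.astype(np.int8), gamma

W = np.array([[0.40, -0.05, -1.20],
              [0.02,  0.90, -0.30]])

W_ternary, gamma = absmean_quantize(W)
print(W_ternary)
print("scale:", gamma)

# Approximate reconstruction of the original weights.
W_approx = W_ternary * gamma
```

Small weights collapse to 0, while large ones saturate at -1 or +1, so only the sign and a coarse notion of magnitude survive.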

Activation scaling

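Activations, another crucial component of LLMs, are also scaled in BitNet b1.58. According to the paper, they are quantized to 8 bits and scaled per token to a symmetric range around zero (from -Q to +Q for some bound Q), rather than to a range that starts at zero. A symmetric range needs no zero-point offset.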

This scaling serves two purposes:

  • Performance Optimization: Scaling activations helps maintain optimal performance within the reduced precision environment of BitNet b1.58
  • Simplification: The chosen scaling range simplifies implementation and system-level optimization without introducing significant performance drawbacks
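One common way to scale activations into a fixed symmetric range is per-token absmax quantization, where each token's activation vector is scaled by its largest absolute value. The sketch below assumes an 8-bit integer range of -128 to 127; the function name, the `eps` constant, and the exact clipping bounds are illustrative assumptions rather than details taken from the article.

```python
import numpy as np

def absmax_quantize_activations(x, bits=8, eps=1e-5):
    """Quantize activations per token (per row) to signed integers.

    x has shape (num_tokens, hidden_size). Returns the int8 values and the
    per-token scale needed to map them back to floating point.
    """
    q_b = 2 ** (bits - 1)                              # 128 for 8 bits
    max_abs = np.abs(x).max(axis=-1, keepdims=True)    # one maximum per token
    scale = q_b / (max_abs + eps)
    x_q = np.clip(np.round(x * scale), -q_b, q_b - 1)  # symmetric, no zero point
    return x_q.astype(np.int8), scale

x = np.array([[0.5, -1.0, 0.25],
              [10.0, 0.0, -2.5]])

x_q, scale = absmax_quantize_activations(x)
print(x_q)

# Dequantize to check how much precision was lost.
x_approx = x_q / scale
print(np.abs(x_approx - x).max())
```

Because the range is symmetric around zero, dequantization is a single division by the scale, with no offset to track.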

Open-source compatibility

The LLM research community thrives on open-source collaboration. To facilitate integration with existing frameworks, BitNet b1.58 adopts components similar to those found in the popular LLaMA model architecture. This includes elements like:

  • RMSNorm: A normalization technique for stabilizing the training process
  • SwiGLU: A gated activation function used in the feed-forward layers
  • Rotary Embeddings: A method for encoding token positions within the model
  • Removal of biases: Simplifying the model architecture

By incorporating these LLaMA-like components, BitNet b1.58 becomes readily integrable with popular open-source LLM software libraries, minimizing the effort required for adoption by the research community.
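For readers unfamiliar with these building blocks, here is a minimal NumPy sketch of two of them, RMSNorm and a bias-free SwiGLU feed-forward layer, written in the general LLaMA style. The shapes, parameter names, and `eps` value are assumptions for illustration; the weights here are ordinary floats, not the ternary weights BitNet b1.58 would use.

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    """RMSNorm: divide by the root mean square of the features, then rescale."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * gain

def silu(z):
    """SiLU (also called Swish): z * sigmoid(z)."""
    return z / (1.0 + np.exp(-z))

def swiglu_ffn(x, W_gate, W_up, W_down):
    """Feed-forward layer with a SwiGLU gate and no bias terms."""
    return (silu(x @ W_gate) * (x @ W_up)) @ W_down

rng = np.random.default_rng(0)
hidden, inner = 4, 8                      # hypothetical layer sizes
x = rng.normal(size=(2, hidden))          # two tokens
W_gate = rng.normal(size=(hidden, inner))
W_up = rng.normal(size=(hidden, inner))
W_down = rng.normal(size=(inner, hidden))

y = swiglu_ffn(rms_norm(x, np.ones(hidden)), W_gate, W_up, W_down)
print(y.shape)
```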


Featured image credit: Freepik.

Tags: Featured, Microsoft

COPYRIGHT © DATACONOMY MEDIA GMBH, ALL RIGHTS RESERVED.
