DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

[–] Tachanka@hexbear.net 23 points 8 months ago (1 children)

It is 2025

AI writes code to update itself

I still have to load the dishwasher by hand

I still have to change the baby's diapers

I still have to go to work tomorrow

Do things ever happen?

Some say nothing ever happens.

Others argue that everything always happens.

I love the real movement.

[–] yogthos@lemmygrad.ml 10 points 8 months ago

[–] Hohsia@hexbear.net 21 points 8 months ago (2 children)

A thing I’ve noticed with deepseek is that it operates in a very system-oriented manner (it carefully plans out how to answer your question when you use thinking mode and it’s actually quite interesting) whereas chatgpt just tells you how long it “thought” and ultimately regurgitates an output that is statistically likely. So we actually get to see a bit of the black box in my view

[–] QuillcrestFalconer@hexbear.net 17 points 8 months ago

ChatGPT o1 hides its chain of thought so you don't even really know what the 'reasoning' is

[–] yogthos@lemmygrad.ml 12 points 8 months ago

yeah it's fascinating to see how the sausage is made

[–] dat_math@hexbear.net 17 points 8 months ago* (last edited 8 months ago) (1 children)

the code for the boost was written by R1 itself!

Pretty neat, but this kind of thing will impress me a lot more when it's genuinely new and creative output, not just the result of being prompted to optimize an existing routine in a prescribed way (using SIMD instructions to calculate inner products)

[–] yogthos@lemmygrad.ml 19 points 8 months ago* (last edited 8 months ago)

I'm still impressed because it was able to look at the existing solution, recognize a bottleneck, and write the code to address it. Most code is very boring, you don't need genius solutions in it. And this could be of huge help for developers as well where you could have it analyze code and suggest where improvements can be made. It could be faster than profiling things.

[–] D61@hexbear.net 16 points 8 months ago (2 children)

Ask ChatGPT what to do and it will ask to have a gallon of movie concession stand nacho cheese flavored sauce poured directly onto the server rack.

[–] REgon@hexbear.net 11 points 8 months ago

Tbh that would be for the best, so I'll give it to chatGPT on a technicality

[–] Nakoichi@hexbear.net 4 points 8 months ago

I have no mouth and I must cream

[–] piggy@hexbear.net 10 points 8 months ago* (last edited 8 months ago) (2 children)

I'm going to say 2 things that are going to be very unpopular but kinda need to be heard.

1. DeepSeek is turning this place into /r/OpenAI but red, which is incredibly lame
2. If LLMs are significantly helping your development workflow, you are doing grunt work, you're not improving your skills, and you're not working on problems that have any significant difficulty beyond memorizing multiplication tables type recall but for tech.

This optimization is actually grunt work, it's not a new discovery, it's simply using SIMD instructions on matrices, something that should have been done in the first place either by hand or by a compiler.

[–] yogthos@lemmygrad.ml 7 points 8 months ago (1 children)

The reality is that most code is very boring, and a lot of optimizations are a result of doing really basic things like this. A model being able to look through the code, notice patterns, and then let you know these kinds of obvious improvements is in fact very useful. It's not different than using a profiler to find bottlenecks. Having done development for over two decades, I don't feel like combing through the code to find these kinds of things is a really good use of my time or that it's improving my skills in any way.
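For anyone who hasn't used one, the profiler workflow being compared against looks roughly like this. It is a minimal sketch using Python's standard cProfile and pstats modules; slow_dot, workload and profile_report are hypothetical stand-ins made up for illustration, not code from the PR under discussion.

```python
import cProfile
import io
import pstats


def slow_dot(a, b):
    """Deliberately naive dot product: one multiply-add per iteration."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def workload():
    """Hypothetical workload that spends most of its time in slow_dot."""
    a = [float(i) for i in range(2000)]
    b = [float(i) * 0.5 for i in range(2000)]
    return sum(slow_dot(a, b) for _ in range(50))


def profile_report(func, top=10):
    """Run func under cProfile and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()


report = profile_report(workload)
print(report)
```

The report lists functions by where the time went, so slow_dot shows up near the top. It measures where time is spent; it does not say how to rewrite the hot function.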

[–] piggy@hexbear.net 4 points 8 months ago* (last edited 8 months ago) (1 children)

This type of tooling isn't new and doesn't require AI models. Performance linters exist in many languages. Rubocop perf, perflint in Python, eslint perf rules etc. For C++, clang-tidy and cpp-perf exist.

The only reason LLMs are in this space is because there is a lack of good modern tooling in many languages. Jumping straight to LLMs is boiling the ocean (literally and figuratively).

Not only that, but if we're really gonna argue that "most code is very boring", that already negates your premise: most boring code isn't really highly perf sensitive and unique enough to be treated individually through LLMs. Directly needing to point out SIMD instructions in your C++ code basically shows that your compiler tool chain sucks or you're writing your code in such a "clever" way that it isn't getting picked up. This is an optimization scenario from 1999.

Likewise if you're not looking through the code you're not actually understanding what the performance gaps are or if the LLM is making new ones by generating sub-optimal code. Sometimes the machine spirits react to the prayer protocols and sometimes they don't. That's the level of technology you're arguing at. These aren't functional performance translations being applied. Once your system is full of this kind of junk, you won't actually understand what's going on or how things practically work. Standard perf linters are already not side effects free in some cases but they publish their side effects. LLMs cannot do this by comparison. That's software engineering, it's mostly risk management and organization. Yes it's boring.

[–] yogthos@lemmygrad.ml 4 points 8 months ago (1 children)

While this type of tooling isn't new, what the LLM can do is qualitatively different from your basic linter. The power of the tool comes from being able to identify specific patterns in a particular code base. The linters, as the name implies, simply look for a set of common patterns that were encoded in them.
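To illustrate what a fixed, encoded pattern means in practice, here is a minimal sketch of a single lint rule built on Python's standard ast module. The rule (flag index-based `for i in range(len(x))` loops) and the names SOURCE, is_range_len_call and find_index_loops are assumptions made up for this example; it is not taken from any particular linter.

```python
import ast

# Hypothetical code under inspection; the blank first line makes `def dot` line 2.
SOURCE = '''
def dot(a, b):
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

def dot_zip(a, b):
    return sum(x * y for x, y in zip(a, b))
'''


def is_range_len_call(node):
    """True if node is the expression range(len(<something>))."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "range"
        and len(node.args) == 1
        and isinstance(node.args[0], ast.Call)
        and isinstance(node.args[0].func, ast.Name)
        and node.args[0].func.id == "len"
    )


def find_index_loops(source):
    """Return the line numbers of `for ... in range(len(...))` loops."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.For) and is_range_len_call(node.iter)
    )


hits = find_index_loops(SOURCE)
print(hits)  # [4]
```

The rule only ever matches the one syntactic shape it was written for. That makes its behaviour predictable and easy to document, and it also means anything outside that shape goes unnoticed.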

Meanwhile power consumption, which is the one legitimate criticism of LLMs, is precisely what DeepSeek architecture addresses. And there's every reason to expect that we'll be seeing further progress here. In fact that's already happening as we speak https://www.reuters.com/technology/artificial-intelligence/alibaba-releases-ai-model-it-claims-surpasses-deepseek-v3-2025-01-29/

That already negates your premise, most boring code isn’t really highly perf sensitive and unique enough to be treated individually through LLMs.

No, that doesn't negate my premise at all. Just because code is boring doesn't mean it's easy to optimize or to notice problems. The thing that LLMs do well is looking at large volumes of data and identifying patterns within it.

Likewise if you’re not looking through the code you’re not actually understanding what the performance gaps are or if the LLM is making new ones by generating sub-optimal code.

That's just a straw man, because there's no reason why you wouldn't be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It's so weird to me how people always have this reaction whenever new technology shows up. Yes, there is a lot of hype around LLMs right now, they're not a panacea, but that doesn't mean we should throw the baby out with the bath water. There are legitimate uses for this tech, and it can save you time. Understanding what good uses for it are and the limitations of the tech is far more productive than simply rejecting it entirely. You do you of course.

[–] piggy@hexbear.net 3 points 8 months ago* (last edited 8 months ago) (1 children)

That's just a straw man, because there's no reason why you wouldn't be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It's not a strawman because classifying unperformant code is a different task than generating performant replacement code. An LLM can only generate code via its internal weights + input; it doesn't guarantee that that code is compilable, performant, readable, understandable, self documenting or much of anything.

The performance gain here is coincidental simply because the generated code uses functions that call processor features directly rather than get optimized into processor features by a compiler. LLM classifiers are also statistically analyzing the AST for performance; they aren't actually performing real static analysis of the AST or its compiled version. It doesn't calculate a Big O or really know how to reason through this problem, it's just primed that when you write the for loop to sum, that's "slower" than using _mm_add_ps. It doesn't even know which cases of the for loop compile down to a _mm_add_ps instruction on which compilers and which optimization levels.
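To make the for-loop versus _mm_add_ps contrast concrete: _mm_add_ps is the SSE intrinsic that adds four packed single-precision floats at once. The sketch below only emulates that shape in plain Python with four independent partial sums. It is not real SIMD and not the code from the PR, and dot_scalar and dot_lanes are hypothetical names.

```python
import math
import random


def dot_scalar(a, b):
    """Plain loop: one running sum, products added strictly in order."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_lanes(a, b, lanes=4):
    """Emulate a SIMD-style reduction with `lanes` independent partial sums."""
    n = len(a)
    main = n - n % lanes  # largest multiple of `lanes` that fits in n
    acc = [0.0] * lanes
    for i in range(0, main, lanes):
        for lane in range(lanes):
            acc[lane] += a[i + lane] * b[i + lane]
    total = sum(acc)  # horizontal add of the lane accumulators
    for i in range(main, n):  # scalar tail for the leftover elements
        total += a[i] * b[i]
    return total


random.seed(0)
a = [random.random() for _ in range(1003)]
b = [random.random() for _ in range(1003)]
print(dot_scalar(a, b), dot_lanes(a, b))
print(math.isclose(dot_scalar(a, b), dot_lanes(a, b), rel_tol=1e-9))
```

The lane version adds the same products in a different order, so with floating point the two results can differ in the last bits. As far as I know that reordering is one reason compilers are conservative about vectorizing float reductions unless relaxed floating-point flags are enabled.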

Lastly you injected this line of reasoning when you basically said "why would I do this boring stuff as a programmer when I can get the LLM to do it". It's nice that there's a tool that you can politely ask to parse your garbage and replace with other garbage that happens to use a function that's more performant. But not only is this not Software Engineering, but a performant dot product is a solved problem at EVERY level of abstraction. This is the programming equivalent of tech bros reinventing the train every 5 years.

The fact that this is needed is a problem in and of itself with how people are building this software. This is machine spirit communion with technojargon. Instead of learning how to vectorize algorithms you're feeding your garbage code through an LLM to produce garbage code with SIMD instructions in it. That is quite literally stunting your growth as a Software Engineer. You are choosing to ignore learning how things actually work because it's too hard to parse through the existing garbage. A SIMD dot product algo is literally a 2 week college junior homework assignment.

Understanding what good uses for it are and the limitations of the tech is far more productive than simply rejecting it entirely.

I quite literally pointed out several limitations in the post you replied to and in this post from a Software Engineering perspective.

[–] yogthos@lemmygrad.ml 3 points 8 months ago (1 children)

It’s not a strawman because classifying unperformant code is a different task than generating performant replacement code. An LLM can only generate code via its internal weights + input; it doesn’t guarantee that that code is compilable, performant, readable, understandable, self documenting or much of anything.

Yet, nothing stops you as a developer from reading this code. Furthermore, the real value of DeepSeek R1 is that it's explaining its reasoning for how it arrived at a solution. That itself can be useful to read through and can give you ideas on how you might want to tackle it. Here's a concrete toy example for you:

First, I remember that in Java, there's a class called java.util.Date which represents dates and times. Maybe I can use that in Clojure as well. But wait, how do I call Java classes from Clojure? Oh right, you just refer to them using their fully qualified names.

So, I think I'll need to import java.util.Date or refer to it directly in my function. Then, maybe create a Date object which represents the current time. That should give me something to work with.

But Clojure has its own way of handling dates and times too, right? There's the clojure.java-time library that provides a more idiomatic way to handle dates in Clojure. I wonder if I should use that instead for better readability and modern features.

Assuming I decide to stick with the built-in Java Date class, how do I get the current time? Oh, there's a method called System.currentTimeMillis() which returns the current time in milliseconds since the epoch. Maybe I can use that to create a Date object.

So, putting that together, I might write something like (new java.util.Date()). But then, how do I print this date in a human-readable format? Java has SimpleDateFormat for formatting dates. So, I'd need to import that as well.

Wait, Clojure allows me to use Java's SimpleDateFormat directly. So, I can create a formatter and then format the Date object with it. The pattern string would determine how the date is displayed. For example, "HH:mm:ss" would give me hours, minutes, and seconds in 24-hour format.

Putting this all together into a Clojure function: I'll define a function that creates a Date object representing now, then use SimpleDateFormat to format it, and finally print it out using println.

But what about dependencies? Do I need to add any additional libraries beyond the standard Java libraries? For java.util.Date and SimpleDateFormat, I think those are part of the standard JDK, so no extra dependencies should be needed.

Alternatively, if I choose to use clojure.java-time, which is a more modern approach, I would need to include that library in my project. But for simplicity, maybe sticking with Java's utilities is better since they're already available.

So, step by step:

1. Create a Date object representing the current time.
2. Create a SimpleDateFormat with the desired format string.
3. Format the Date object using the formatter.
4. Print the formatted string to the console.

I should also consider any potential issues, like thread safety with SimpleDateFormat, but since this is a simple function and not part of a highly concurrent application, I think it's acceptable for now.

Testing the function in my mind: when I call it, it should print out the current time in HH:mm:ss format. Does that cover everything? Yes, I think so.

Wait, maybe I can also include some examples or documentation on how to use this function. That way, anyone reading the code will know what it does and how to invoke it.

So, putting it all together, here's what I have in mind for the function:

;; A Clojure function to print the current time
(defn print-current-time []
  (let [;; Create a Date object representing the current time
        current-date (java.util.Date.)
        ;; Create a SimpleDateFormat with HH:mm:ss format
        formatter (new java.text.SimpleDateFormat "HH:mm:ss")]
    ;; Format the date and print it
    (println (.format formatter current-date))))

Example usage:

(print-current-time)

This function will output the current time in the "HH:mm:ss" format when called. You can invoke it by simply calling print-current-time in your Clojure environment.

Lastly you injected this line of reasoning when you basically said “why would I do this boring stuff as a programmer when I can get the LLM to do it”. It’s nice that there’s a tool that you can politely ask to parse your garbage and replace with other garbage that happens to use a function that’s more performant. But not only is this not Software Engineering, a dot product is a fucking solved problem at EVERY level of abstraction from bits all the way up to your favorite interpreted language.

Once again, what I actually said was that LLM can help you identify bits of code that might be interesting, and this is valuable in a large project. This is exactly the same task you'd use a profiler for.

I quite literally pointed out several limitations in the post you replied to and in this post from a Software Engineering perspective.

And I quite literally explained why the tool is still useful. In particular, the argument you keep making that you would just blindly copy/paste code the LLM produces is a complete straw man. A competent engineer will read what the LLM says and use it to inform a solution they understand.

[–] piggy@hexbear.net 3 points 8 months ago (1 children)

Okay let me ask this question:

Who is this useful for? Who is the target audience for this?

[–] yogthos@lemmygrad.ml 3 points 8 months ago (1 children)

It's useful for me, I'm the target audience for this. I'm working on a React project right now, and I haven't touched JS in close to a decade. I know what I want to do conceptually, and I have plenty of experience designing applications. However, I'm not familiar with the nitty gritty of how React works and how to do what I want with it. This tool saves me a ton of time googling these things and wasting hours on sites like Stack Overflow.

[–] piggy@hexbear.net 3 points 8 months ago* (last edited 8 months ago) (1 children)

I know what I want to do conceptually, and I have plenty of experience designing applications.

How does AI help you actually traverse the concepts of React that you admit you don't have nitty gritty knowledge of how they work in terms of designing your application? React is a batteries included framework that has specific ways of doing things that impact the design and concepts that are technically feasible within React itself.

For example, React isn't really optimized to crunch a ton of data performantly, so if you're getting constant data updates over a web socket from multiple points and you want some or all the changes to be reflected, you're gonna have a bad time vs something that has finer grained change controls out of the box, such as Angular.

How does AI help you choose between functional and class based React components? How much of your application is doing typical developer copy-pasta instead of creating HOCs for similar functionalities? How did AI help you with that? How is AI helping apply concepts like SOLID into the design of your component tree? How does AI help you decide how to architect components and their children that need to have a lifecycle outside of the typical change-binding flow?

This in my opinion is the crux of the issue, AI cannot solve this problem for you nor can it reasonably explain it in a technical way beyond parroting the vagaries of what I said above. It cannot confer understanding of complex abstract concepts that are fuzzy and have grey areas. It can tell you something may not work explicitly but it cannot educate you realistically on the tradeoffs.

It seems to me that your answer boils down to "code monkey stuff". AI might help you swing a pickaxe, but it's not good at explaining where the mine is going to collapse based on the type of rock you're digging in. Another way of thinking about it is that you could build a building to the "building code" but it will still collapse. AI can explain the building code and loosely verify that you built something to it, but it cannot validate that your building is going to stay standing nor can it practically tell you what you need to change.

My problem with AI tools boils down to this. Software is a medium of communication. It communicates the base of a problem and the technical process of solving it. Software Engineering is a field that attempts to create strong patterns of communication and practices in order to efficiently organize the production of Software. The software industry at large (where most programmers get exposed to the process of building software) often eschews this discipline because of scientific management (the idea you can simply manage a process through fiduciary/managerial knowledge rather than domain knowledge) and the need for instant development to maintain fictional competitive advantage and fictional YoY growth. The industry welcomes AI for 2 reasons:

1. It can code monkey...eventually. Why pay programmers when you can ask CahpGBT to do it?
2. It can fix the problem of needing to deliver without knowing what you're doing... eventually. It fixes the problem of communication without relying on building up the knowledge and practice of Software Engineering. In essence why have people know this discipline and its practical application when you can continue to have the blind leading the blind because ChadGTP can see for us?

This is a disservice to programmers everywhere, especially younger ones, because it destroys the social reproduction of the capacity to build scalable software and replaces it with, you guessed it, machine rites. In practice it's the apotheosis of Conway's Law in the software industry. We build needlessly complex software that works coincidentally, and soon that software will be analyzed, modified, and ultimately created by a tool that is an overly complex statistical model that also works through the coincidence of statistical approximations.

[–] yogthos@lemmygrad.ml 3 points 8 months ago (1 children)

How does AI help you actually traverse the concepts of React that you admit you don’t have nitty gritty knowledge of how they work in terms of designing your application?

It helps me by showing me the syntax and patterns that map to what I'm trying to do conceptually. By pointing me in the right direction, it saves me time searching for these things. I don't know why that's so difficult for you to understand.

For example React isn’t really optimized to crunch a ton of data performantly so if you’re getting constant data updates over a web socket from multiple points and you want some or all the changes to be reflected you’re gonna have a bad time vs something that has finer grained change controls out of the box such as Angular.

That's not a problem I'm solving, and in practice most UIs don't actually deal with a lot of data because the human user is the limiting factor. I'm working on an application that's doing fairly vanilla things here.

How does AI help you choose between functional and class based React components? How much of your application is doing typical developer copy-pasta instead of creating HOCs for similar functionalities? How did AI help you with that? How is AI helping apply concepts like SOLID into the design of your component tree? How does AI help you decide how to architect components and their children that need to have a lifecycle outside of the typical change-binding flow?

That's not a problem it's solving for me. As I've explained to you, I already have plenty of experience and I know how I like to structure applications. I'm used to using re-frame in Clojure, and I'm just looking how to do similar patterns in React. The AI does an excellent job of helping me discover them.

This in my opinion is the crux of the issue, AI cannot solve this problem for you nor can it reasonably explain it in a technical way beyond parroting the vagaries of what I said above. It cannot confer understanding of complex abstract concepts that are fuzzy and have grey areas. It can tell you something may not work explicitly but it cannot educate you realistically on the tradeoffs.

I don't need it to confer understanding of abstract concepts to me. I need it to show me common patterns within a particular library that map to the concepts I'm already familiar with. I don't need it to educate me on any trade offs.

Meanwhile, the problems you're fixating on are not inherent to AI in any way and have always existed in the software industry. Cargo culting is a term for a reason, no AI has been necessary for people to do that, nor does absence of AI prevent this from happening. So, your whole argument is completely misdirected because AI is not the problem here. People who were going to cargo cult were gonna do that regardless of the tooling.

This is a disservice to programmers everywhere especially younger ones because it destroys the social reproduction of the capacity to build scalable software and replaces it with you guessed it machine rites.

That's absolute nonsense. It doesn't destroy the capacity to build scalable software any more than stack overflow does.

We build needlessly complex software that works coincidentally, and soon that software will be analyzed, modified, and ultimately created by a tool that is an overly complex statistical model that also works through the coincidence of statistical approximations.

You're saying this as if it wasn't the case long before AI showed up on the scene. You're making up a giant straw man of how you pretend software development works which is utterly divorced from what we see happening in the real world. The AI doesn't change this one bit.

[–] piggy@hexbear.net 1 points 8 months ago* (last edited 8 months ago) (1 children)

You're making up a giant straw man of how you pretend software development works which is utterly divorced from what we see happening in the real world. The AI doesn't change this one bit.

Commenting this under a post where an AI has spit out a dot product function optimization for an existing dot product function that's already ~150-250 lines long depending on architectural implementation, of which there are about 6. The PR for which has an interaction that is two devs finger pointing about who is responsible for writing tests. The PR for which notes that the original and new function often don't give the correct answer. Just an amazing response. Chef's kiss.

What a wonderful way to engage with my post. You win bud. You're the smartest. This industry would never mystify a basic concept that's about 250 years old with a 716 line PR through its inability to communicate, organize and follow an academic discipline.

[–] yogthos@lemmygrad.ml 2 points 8 months ago (1 children)

What a wonderful way to engage with my post. You win bud. You’re the smartest.

Amazing counterpoint you've mustered there when presented with the simple fact that all the problems you're describing have already been happening long before AI showed up on the scene. Way to engage in good faith dialogue. Bravo!

[–] piggy@hexbear.net 2 points 8 months ago* (last edited 8 months ago) (1 children)

I've never said that AI is the cause of those problems; those are words you're putting in my mouth. I've said that AI is being used as a solution to those problems in the industry when in reality the use of AI to solve those problems exacerbates them while allowing companies to reap "productive" output.

For some reason programmers can understand "AI Slop" but if the AI is generating code instead of stories, images, audio and video it's no longer "AI Slop" because we're exalted in our communion with the machine spirits! Our holy logical languages could never encode the heresy of slop!

[–] yogthos@lemmygrad.ml 2 points 8 months ago (1 children)

Ok, so if you agree the AI is not the source of those problems, then it's not clear what you're arguing about. Nobody is arguing for using the AI for problems you keep mentioning, and you keep ignoring that. I've given you concrete examples of how this tool is useful for me, you've just ignored that and continued arguing about the straw man you want to argue about.

The slop has always been there, and AI isn't really changing anything here.

[–] piggy@hexbear.net 3 points 8 months ago* (last edited 8 months ago) (1 children)

Nobody is arguing for using the AI for problems you keep mentioning, and you keep ignoring that.

This is absolutely not true. Almost every programmer I know has had their company try to "AI" their documentation or "AI" some process only to fail spectacularly because the basis of what the AI does to data is either missing or doesn't have enough quality. I have several friends at the Lead/EM level who take too much time out of their schedules to talk down a middle manager from sinking resources into AI boondoggles.

I've had to talk people off of this ledge, and a lead who works under me (I'm technically a platform architect across 5 platform teams) actually decided to try it anyway and burn a couple days on a test run, and guess what, the results were garbage.

Beyond that the problem is that AI is a useful tool in IGNORING the problems.

I've given you concrete examples of how this tool is useful for me, you've just ignored that and continued arguing about the straw man you want to argue about.

I started this entire comment thread with an actual critique, a point, that you, in very debate bro fashion, have consistently called a strawman. If I were feeling less charitable I could call the majority of your arguments non sequiturs to mine. I have never argued that AI isn't useful to somebody. In fact I'm arguing that it's dangerously useful for decision makers in the software industry based on how they WANT to make software.

Say a piece of software is a car, and a middle manager wants that car to have a wonderful proprietary light bar on it and wants to use AI to build such a light bar on his wonderful car. The AI might actually build the light bar in a narrow sense to the basic specs the decision maker feels might sell well on the market. However the light bar adds 500lbs of weight so when the driver gets in the car the front suspension is on the floor, and the wiring loom is also now a ball of yarn. But the car ends up being just shitty enough to sell, and that's the important thing.

And remember the AI doesn't complain about resources or order of operations when you ask it to make a light bar at the same time as a cool roof rack, a kick ass sound system and a more powerful engine, and hey if the car doesn't work after one of these we can just ask it to regenerate the car design and then just have another AI test it! And you know what it might even be fine to have 1 or 2 nerds around just in case we have to painfully take the car apart only to discover we're overloading the alternator from both ends.

[–] yogthos@lemmygrad.ml 2 points 8 months ago (1 children)

I'm talking about our discussion here. AI can be misused just like any tool, there's nothing surprising or interesting about that. What I'm telling you is that from my experience, it can also be a useful tool when applied properly.

I started this entire comment thread with an actual critique, a point, that you, in very debate bro fashion, have consistently called a strawman.

I've addressed your point repeatedly in this discussion.

In fact I’m arguing that it’s dangerously useful for decision makers in the software industry based on how they WANT to make software.

And I'm once again going to point out that this has been happening for a very long time. If you've ever worked at a large corporation, then you'd see that they take a monkeys at typewriters approach to software development. These companies don't care about code quality one bit, and they just want to have fungible developers whom they can hire and fire at will. I've seen far more nightmarish code produced in these conditions than any AI could ever hope to make.

The actual problem isn't AI, it's the capitalist mode of production and alienation of workers. That's the actual source of the problems, and that's why these problems exist regardless of whether people use AI or not.

[–] piggy@hexbear.net 3 points 8 months ago* (last edited 8 months ago) (14 children)

The way that you're applying the tool "properly" is ultimately the same way that middle managers want to apply the tool, the only difference is that you know what you're doing as a quality filter, where the code goes and how to run it. AI can't solve the former (quality) but there are people working on a wholesale solution for the latter two. And they're getting their data from people like you!

In terms of a productive process there's not as much daylight between the two use cases as you seem to think there is.

[–] Hohsia@hexbear.net 2 points 8 months ago

I’ve had similar conversations discussing some of your points in this thread (albeit at a much much higher level because I’m basically an analyst) with C-suites at my company, and what truly terrifies me is that they do not give a fuck as long as they get a passable finished product. And the bar for passable is now in hell.

My company doesn’t even have any software engineers because my manager convinced his boss (a c-suite) that we could do all our coding with ChatGPT scripts. Surprise surprise, now we have a ton of busted code in our GitHub because apparently copy and pasting from the LLM of the day is what passes.

[–] QuillcrestFalconer@hexbear.net 10 points 8 months ago (2 children)

My question is who is naming these functions 'qX_K_q8_K'

[–] yogthos@lemmygrad.ml 10 points 8 months ago (1 children)

C devs love cryptic names :)

[–] lilypad@hexbear.net 10 points 8 months ago* (last edited 8 months ago) (1 children)

Writing lisp:

(defun generate-eight-new-magic-numbers-for-system-x ()...)

Writing c:

struct mnums* g8_nmn_sx () {...}

[–] yogthos@lemmygrad.ml 6 points 8 months ago (2 children)

s-exps are the one true syntax and every other syntax was a mistake, I will die on this hill

[–] lilypad@hexbear.net 3 points 8 months ago

lea-w

[–] Nakoichi@hexbear.net 2 points 8 months ago (1 children)

s-exps is when you fuck your Playstation

[–] yogthos@lemmygrad.ml 2 points 8 months ago

🤣

[–] piggy@hexbear.net 1 points 8 months ago* (last edited 8 months ago)

This is a quantization function. It's a fairly "math brained" name I agree, but the function is called qX_K_q8_K because it quantizes a value with a quantization index of X (unknown) to one with a quantization index of 8 (bits) which correlates to the memory usage. The 0 vs K portions are how it does rounding, 0 means it does rounding by equal distribution (without offset), and K means it creates a distribution that is more fine grained around more common values and is more rough around less common values. e.g. I have a data set that has a lot of values between 4 and 5 but not a lot of 10s. I have let's say 10 brackets between 4 and 5 but only 3 between 5 and 10.

Basically it's a lossy compression for a data set into a specific enumeration (roughly correlates with size), so it's a way, given 1,000,000 numbers from 1 to 1,000,000, of putting their values into a range of numbers based on the q level. How using different functions affects the output of models is more voodoo than anything else. You get better "quality" output from higher memory space, but quality is a complex metric and doesn't necessarily map to factual accuracy in the output, just statistical correlation with the model's data set.

An example of a common quantizer is an analog to digital converter. It must take continuous values from a wave that goes 0 to 1 and transform them into digital values of 0 and 1 with a specific sample rate.

Taking a 32 bit float and copying the value into 32 bit float is an identity quantizer.
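The bracket idea above can be sketched in a few lines of Python. This is only an illustration of equal-width versus data-dependent buckets as described in this comment; it is not the actual ggml block format, and uniform_edges, quantile_edges, quantize and the sample data are all made up for the example.

```python
import bisect


def uniform_edges(lo, hi, levels):
    """Split [lo, hi] into `levels` equal-width buckets; return the inner edges."""
    step = (hi - lo) / levels
    return [lo + step * k for k in range(1, levels)]


def quantile_edges(samples, levels):
    """Place inner edges so each bucket holds roughly the same number of samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return [ordered[(n * k) // levels] for k in range(1, levels)]


def quantize(value, edges):
    """Map a value to the index of the bucket it falls in (lossy)."""
    return bisect.bisect_right(edges, value)


# Hypothetical data: lots of values between 4 and 5, only a few up to 10.
samples = [4.0 + 0.01 * i for i in range(100)] + [6.0, 8.0, 10.0]

u_edges = uniform_edges(4.0, 10.0, levels=4)
q_edges = quantile_edges(samples, levels=4)

dense = [s for s in samples if s < 5.0]
print(sorted({quantize(s, u_edges) for s in dense}))  # [0]
print(sorted({quantize(s, q_edges) for s in dense}))  # [0, 1, 2, 3]
```

With equal-width buckets every value between 4 and 5 collapses into bucket 0, so the differences between them are lost. The data-dependent edges spend all four buckets on that dense region, at the cost of lumping 6, 8 and 10 together.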

[+] TheDrink@hexbear.net 2 points 8 months ago* (last edited 8 months ago) (1 children)

[deleted]

[–] Chump@hexbear.net 3 points 8 months ago (2 children)

Have you tried getting it from ollama? Don’t know the details of your error, but I suspect it can at least pick up after a disconnect

[–] yogthos@lemmygrad.ml 2 points 8 months ago (1 children)

ollama doesn't seem to be smart enough to continue if it's interrupted and just restarts the whole download

[–] Chump@hexbear.net 2 points 8 months ago (1 children)

Damn that’s irritating

[–] yogthos@lemmygrad.ml 2 points 8 months ago

Particularly so given that the client's whole job is to download multi gig files.
