• 1 Post
  • 199 Comments
Joined 4 months ago
Cake day: September 9th, 2025

  • From a quick reading of the actual law, here are some of the AI uses it prohibits that will apparently “stifle innovation”:

    …use of an AI system that exploits any of the vulnerabilities of a natural person or a specific group of persons due to their age, disability or a specific social or economic situation

    …to assess or predict the risk of a natural person committing a criminal offence, based solely on the profiling of a natural person or on assessing their personality traits and characteristics

    …the use of an AI system that deploys subliminal techniques beyond a person’s consciousness or purposefully manipulative or deceptive techniques

    …the use of AI systems that create or expand facial recognition databases through the untargeted scraping of facial images from the internet or CCTV footage

    …the use of biometric categorisation systems that categorise individually natural persons based on their biometric data to deduce or infer their race, political opinions, trade union membership, religious or philosophical beliefs, sex life or sexual orientation

  • At least in Star Trek, the robots would say things like, “I am not programmed to respond in that area.” LLMs will just make shit up, which should really be the highest priority issue to fix if people are going to be expected to use them.

    Using coding agents, it is profoundly annoying when they generate code against an imaginary API, only to tell me that I’m “absolutely right to question this” when I ask for a link to the docs. I also generally find AI search to be useless; DuckDuckGo, for example, does link to sources, but those sources often have no trace of the information presented in the summary.

    Until LLMs can directly cite and link to a credible source for every piece of information they present, they’re just not reliable enough to depend on for anything important. Even with sources linked, they would also need to rate and disclose the credibility of every source (e.g., is the study peer-reviewed and reproduced, is the sample size adequate, etc.).

  • Where I find it useful instead is to push me past the initial block of starting something from scratch

    I think this is one of the highly understated benefits. I have to work in legacy codebases in programming languages I hate, and it used to be like pulling teeth to get myself motivated. I’d spend half the day procrastinating, and then finally write some code. Then I’d pull my hair out writing tests, only for CI to tell me I don’t have enough test coverage and there are 30 lint issues to fix. At that point, there would be yelling at the screen, followed by more procrastination.

    With AI, though, I just write a detailed prompt, go get some coffee, and come back to a pile of drivel that is probably 70% of the way there. I look it over, suggest some refactoring, additional tests, etc., then manually test it and have it fix any bugs. If CI reports any lint issues or test failures, I just copy and paste them for the AI to fix.

    Yes, in an ideal world where I didn’t have ADHD and could just motivate myself to do whatever my company needs without procrastinating, I could write better-quality code faster than AI. When I’m working on something I’m excited about, AI just gets in the way. The reality being what it is, though, AI is unequivocally a huge productivity boost for anything I’d rather not be working on.

  • The main thing that has stopped me from running models like this so far is VRAM. My server has an RTX 4060 with 8 GB, and I’m not sure that can reasonably run a model like this.

    Edit:

    This calculator seems pretty useful: https://apxml.com/tools/vram-calculator

    According to this, I can run Qwen3 14B with a 4-bit quant and 15-20% CPU/NVMe offloading and get 41 tokens/s. It seems 4-bit quantization reduces accuracy by 5-15%.
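
    Sanity-checking that with the usual back-of-envelope math (the ~4.5 bits/weight figure is my assumption for a Q4_K_M-style quant, not something the calculator states):

    ```python
    # Rough weight-size estimate for Qwen3 14B at 4-bit quantization.
    params = 14e9            # 14B parameters
    bits_per_weight = 4.5    # Q4_K_M-style quants average a bit over 4 bits/weight (assumption)
    weights_gb = params * bits_per_weight / 8 / 1e9
    print(f"weights alone: ~{weights_gb:.1f} GB")  # ~7.9 GB

    # The KV cache and CUDA buffers add a couple more GB on top, so the weights
    # alone already overflow an 8 GB card -- hence the 15-20% offload.
    ```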

    The calculator even says I can run the flagship model with 100% NVMe offloading and get 4 tokens/s.

    I didn’t realize NVMe offloading was even a thing, and I’m not sure whether it’s actually supported or works well in practice. If so, it’s a game changer.
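
    For reference, the partial offload itself is just a knob in llama.cpp; here’s a minimal llama-cpp-python sketch (the file name and layer split are untested guesses for this card):

    ```python
    # Minimal sketch: keep most layers in VRAM, let the rest run on CPU.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen3-14b-q4_k_m.gguf",  # hypothetical local GGUF file
        n_gpu_layers=32,   # offload as many layers as fit in 8 GB; the rest stay on CPU (guess)
        n_ctx=8192,        # context length; bigger contexts grow the KV cache
    )

    out = llm("Explain mixture-of-experts in one paragraph.", max_tokens=256)
    print(out["choices"][0]["text"])
    ```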

    Edit:

    The llama.cpp docs do mention that models are memory-mapped by default and loaded into memory as needed. I’m not sure if that means a MoE model like Qwen3 235B can run with 8 GB of VRAM and 16 GB of RAM, albeit at a speed an order of magnitude slower, like the calculator suggests is possible.
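
    If I’m reading it right, that behavior maps to these llama-cpp-python flags (the file name is made up, and I haven’t tried this):

    ```python
    # mmap maps the GGUF file into the address space instead of reading it all
    # into RAM; the OS pages tensors in from NVMe as they're touched and can
    # evict cold pages. For a MoE model, only the experts activated for a given
    # token should get faulted in, which is presumably what makes this viable.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen3-235b-q4_k_m.gguf",  # hypothetical GGUF of the MoE model
        n_gpu_layers=0,    # a 235B model won't meaningfully fit on an 8 GB card
        use_mmap=True,     # llama.cpp's default: stream weights from disk on demand
        use_mlock=False,   # don't pin pages, so the OS can evict cold experts
    )
    ```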