They/Them, agender-leaning scalie.

ADHD software developer with far too many hobbies/trades: AI, gamedev, webdev, programming language design, audio/video/data compression, software 3D, mass spectrometry, genomics.

Learning German (B2), Chinese (HSK 3-4ish), French (A2).

  • 2 Posts
  • 65 Comments
Joined 1 year ago
Cake day: June 18th, 2023



  • Note: For this guide, we’ll focus on functions that operate on the scalar preactivations at each neuron individually.

    Very frustrating to see this, as large models have shown that scalar activation functions make only a tiny impact when your model is wide enough.

    https://arxiv.org/abs/2002.05202v1 shows GLU-based activation functions (2 inputs->1 output) almost universally beat their equivalent scalar functions. IMO there needs to be more work around these kinds of multi-input constructions, as there are much bigger potential gains.

    E.g. even for cases where the network only needs static routing (tabular data), transformers sometimes perform magically better than MLPs. This suggests there’s something special about self-attention as an “activation function”. If that magic can be extracted and made sub-quadratic, it could be a paradigm shift in NN design.
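To make the GLU-based construction concrete, here's a minimal sketch of the SwiGLU feed-forward layer from the linked paper (arXiv:2002.05202), using NumPy with hypothetical toy shapes. Each hidden unit is gated by a second linear projection, so the "activation" sees two inputs instead of one scalar preactivation:

```python
import numpy as np

def swish(x):
    # Swish/SiLU: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def ffn_swiglu(x, W, V, W2):
    # SwiGLU feed-forward: the gate path Swish(xW) multiplies the
    # linear path xV elementwise, then projects back down with W2.
    # This is the 2-inputs -> 1-output construction, vs. a scalar
    # activation applied to a single preactivation per neuron.
    return (swish(x @ W) * (x @ V)) @ W2

# Toy shapes (assumed for illustration): d_model=4, d_ff=8, batch of 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4))
W = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
W2 = rng.standard_normal((8, 4))
y = ffn_swiglu(x, W, V, W2)
```

Note the layer has no bias terms, matching the paper's formulation; the extra projection V roughly doubles the parameter count per hidden unit, which the paper compensates for by shrinking d_ff.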



  • You’re right. Everything is suspiciously wordy, substance is sparse, and every headline is clickbaity. It’s like they tuned the content specifically for Google, not human readers…

    EDIT: Because my comment was also lacking substance: e.g. the Steam Deck review in “30 Best Retro Handhelds Of 2024 [All Reviewed]” says “Yes it’s big, and the battery life… pretty terrible”, then gives no further information about size or battery life, which seems extremely relevant to potential buyers. They wrote 8 paragraphs and shared only 3 shallow facts.



  • I’d say it’s more like they’re failing upwards. It’s certainly good for AMD, but it seems like it happened in spite of their involvement, not because of it:

    For reasons unknown to me, AMD decided this year to discontinue funding the effort and not release it as any software product. But the good news was that there was a clause in case of this eventuality: Janik could open-source the work if/when the contract ended.

    AMD didn’t want this advertised or released, and even canned this project despite it reaching better performance than the OpenCL alternative. I really don’t get their thought process. It’s surreal. Do they not want to support AI? Do they not like selling GPUs?





  • The website does a bad job explaining what its current state actually is. Here’s the GitHub repo’s explanation:

    Memory Cache is a project that allows you to save a webpage while you’re browsing in Firefox as a PDF, and save it to a synchronized folder that can be used in conjunction with privateGPT to augment a local language model.

    So it’s just a way to get data from the browser into privateGPT, which is:

    PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power of Large Language Models (LLMs), even in scenarios without an Internet connection. The project provides an API offering all the primitives required to build private, context-aware AI applications.

    So basically something you can ask questions like “how much butter is needed for that recipe I saw last week?” and “what are the big trends across the news sites I’ve looked at recently?”. But eventually it’ll automatically summarize and data mine everything you look at to help you learn/explore.

    Neat.


  • I agree that older commercialized battery types aren’t so interesting, but my point was about all the battery types that haven’t had enough R&D yet to be commercially mass-produced.

    Power grids don’t care much about density - they can build batteries where land is cheap, and for fire control they need to artificially space out higher-density batteries anyway. There are heaps of known chemistries that might be cheaper per unit stored (molten salt batteries, flow batteries, and solid state batteries based on cheaper metals), but many only make sense for energy grid applications because they’re too big/heavy for anything portable.

    I’m saying it’s nuts that lithium ion is being used for cases where energy density isn’t important. It’s a bit like using bottled water on a farm because you don’t want to pay to get the nearby river water tested. It’s great that sodium ion could bring new economics to grid energy storage, but weird that the only reason it got developed in the first place was for a completely different industry.




  • Are you willing to pirate content from that niche 3D printing YouTube creator you enjoy? I’m not.

    I cleanse my conscience by supporting many of them on Patreon.

    Accidentally clicking on clickbait without an adblocker directly results in a spammer getting money, and that just makes me feel like crap. There’s so much spam out there that wouldn’t exist without ads, which makes it harder for quality creators to get attention and fair compensation. I feel I can only engage with the internet ethically by refusing to participate in the ad economy.

    It sucks that alternative payment models like Brave’s “Basic Attention Token” (or a fairer alternative) never got popular. The idea was to track the creators of websites/videos/etc. you visit and automatically split your monthly donation between them. IIRC it was proportional to the number of ads blocked for each creator, but you could tweak creators’ multipliers to deny profit to spam and reward higher-quality creators. I’d also accept microtransactions for individual videos, news articles, etc. but no platforms for these exist because the big players in internet monetization are all so focused on ads.
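The splitting scheme described above can be sketched in a few lines. This is a hypothetical illustration, not Brave's actual implementation: each creator's share is proportional to the ads blocked on their pages times a user-set multiplier, which lets you zero out spam and boost quality creators:

```python
def split_donation(monthly_amount, creators):
    # creators: dict of name -> (ads_blocked, user_multiplier).
    # Weight each creator by blocked-ad count scaled by the user's
    # multiplier, then split the monthly donation proportionally.
    weights = {name: ads * mult for name, (ads, mult) in creators.items()}
    total = sum(weights.values())
    return {name: monthly_amount * w / total for name, w in weights.items()}

# Hypothetical month: a boosted creator, a zeroed-out spam site,
# and a creator left at the default multiplier.
shares = split_donation(10.0, {
    "quality_channel": (120, 2.0),  # boosted 2x
    "spam_site":       (300, 0.0),  # multiplier zeroed to deny profit
    "news_outlet":     (80,  1.0),
})
# quality_channel gets 7.5, spam_site 0.0, news_outlet 2.5
```

The key property is that the spam site can rack up any number of blocked ads and still receive nothing once its multiplier is zero.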