Music046 | Nigeria No1. Daily Updates | Contact Us

Tuesday, 11 March 2025

Show HN: Factorio Learning Environment – Agents Build Factories https://bit.ly/3DAtlxP

I'm Jack, and I'm excited to share a project that has channeled my Factorio addiction recently: the Factorio Learning Environment (FLE).

FLE is an open-source framework for developing and evaluating LLM agents in Factorio. It provides a controlled environment where AI models can attempt complex automation, resource management, and optimisation tasks in a grounded world with meaningful constraints.

A critical advantage of Factorio as a benchmark is its unbounded nature. Unlike many evals that are quickly saturated by newer models, Factorio's geometric complexity scaling means it won't be "solved" in the next 6 months (or possibly even years). This allows us to meaningfully compare models by the order-of-magnitude of resources they can produce - creating a benchmark with longevity.

The project began 18 months ago after years of playing Factorio, recognising its potential as an AI research testbed. A few months ago, our team (myself, Akbir, and Mart) came together to create a benchmark that tests agent capabilities in spatial reasoning and long-term planning.

Two technical innovations drove this project forward: First, we discovered that piping Lua into the Factorio console over TCP enables running (almost) arbitrary code without directly modding the game. Second, we developed a first-class Python API that wraps these Lua programs to provide a clean, type-hinted interface for AI agents to interact with Factorio through familiar programming paradigms.

Agents interact with FLE through a REPL pattern:
1. They observe the world (seeing the output of their last action)
2. Generate Python code to perform their next action
3. Receive detailed feedback (including exceptions and stdout)

We provide two main evaluation settings:
- Lab-play: 24 structured tasks with fixed resources
- Open-play: An unbounded task of building the largest possible factory on a procedurally generated map

We found that while LLMs show promising short-horizon skills, they struggle with spatial reasoning in constrained environments. They can discover basic automation strategies (like electric-powered drilling) but fail to achieve more complex automation (like electronic circuit manufacturing). Claude Sonnet 3.5 is currently the best model (by a significant margin).

The code is available at https://bit.ly/3FjYMx0 . You'll need:
- Factorio (version 1.1.110)
- Docker
- Python 3.10+

The README contains detailed installation instructions and examples of how to run evaluations with different LLM agents. We would love to hear your thoughts and see what others can do with this framework!

https://bit.ly/4hjuE1T March 11, 2025 at 01:02PM
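The observe / generate / receive-feedback loop described in the FLE post can be sketched in a few lines of plain Python. This is only an illustration of the REPL pattern, not FLE's actual API: `run_step`, `repl` and the `policy` callable are hypothetical names, the "world" is just a Python namespace rather than a Factorio server, and a real harness would sandbox the `exec` call.

```python
import contextlib
import io
import traceback


def run_step(code: str, namespace: dict) -> dict:
    """Run one agent-written program; return its stdout and any exception text."""
    stdout = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(stdout):
            # The namespace persists between steps, like a REPL session.
            # NOTE: exec on untrusted code is unsafe outside a sandbox.
            exec(code, namespace)
    except Exception as exc:
        error = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    return {"stdout": stdout.getvalue(), "error": error}


def repl(policy, max_steps: int = 3) -> list:
    """Drive the loop: the policy sees the last feedback and returns new code."""
    namespace = {}
    observation = {"stdout": "", "error": None}  # nothing observed yet
    history = []
    for _ in range(max_steps):
        code = policy(observation)                 # hypothetical agent: feedback in, code out
        observation = run_step(code, namespace)    # feedback for the next step
        history.append(observation)
    return history


if __name__ == "__main__":
    # A scripted stand-in for an LLM: three fixed programs, the last one fails.
    scripted = iter(["x = 2\nprint(x * 21)", "print(x + 1)", "1 / 0"])
    for step in repl(lambda obs: next(scripted)):
        print(step)
```

Returning exceptions as text instead of raising them is the point of the design: the agent gets the error back as an observation and can try to correct itself on the next step.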

Monday, 10 March 2025

Show HN: Seven39, a social media app that is only open for 3 hours every evening https://bit.ly/4ikPhvW

I built this site as a quick test of whether a time-boxed social media experience feels better than an endless one. So far I've just been using it with friends and it feels nice, but it seems like it is time to bring it to a larger audience. Let me know what you think! It is just based on EST for now, sorry. https://bit.ly/3Fgz1gS March 11, 2025 at 02:05AM
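A time-boxed site like this comes down to one check: is "now" inside tonight's window? Here is a minimal sketch of that check. The post only says the app is open for 3 hours every evening and is based on EST; the 19:39 opening time is an assumption guessed from the app's name, and `is_open` is a hypothetical helper, not the site's code.

```python
from datetime import datetime, time, timedelta, timezone

# Fixed UTC-5 offset, matching the post's "based on EST" (no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")
OPEN_AT = time(19, 39)          # ASSUMPTION: opening time guessed from the name
OPEN_FOR = timedelta(hours=3)   # from the post: open for 3 hours


def is_open(now: datetime) -> bool:
    """Return True if `now` (a timezone-aware datetime) falls in the window."""
    local = now.astimezone(EST)
    start = datetime.combine(local.date(), OPEN_AT, tzinfo=EST)
    # Half-open interval: open at 19:39:00, closed again at 22:39:00.
    return start <= local < start + OPEN_FOR


if __name__ == "__main__":
    print(is_open(datetime.now(timezone.utc)))
```

Because the assumed window does not cross midnight, comparing against the start time on the same local date is enough; a window that ran past midnight would also need to check the previous day's start.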

Show HN: Hot Design – Like Hot Reload, but a Runtime Visual Designer https://bit.ly/4bFCHox

Hi HN, Nick here, from the open-source Uno Platform team. You are likely familiar with Hot Reload, pioneered by Flutter. We’ve taken that concept further and built Hot Design. Let me introduce it to you.

Architecturally, the idea behind Hot Design is simple:
1. In your IDE, pause the live, running app at runtime, turning it into a designer.
2. Modify the UI directly on the designer: add elements, adjust layouts, tweak bindings, etc.
3. Resume the app without restarting or losing state.

We built Hot Design to address the frustration of slow iteration cycles when building and tweaking UI or debugging data bindings in apps targeting multiple platforms. Here’s a detailed explanation and a video of Hot Design in action: https://bit.ly/4bFWrsa

I can see potential criticism: it will get killed by AI, it’s another abstraction over code, it is .NET, etc. Happy to respond to those comments if they come; we put a lot of thought into Hot Design and would love to hear it challenged!

Nick

https://bit.ly/4bFWrsa March 11, 2025 at 03:10AM

Show HN: Chrome Extension for ChatGPT to organize conversations into folders https://bit.ly/43EljP5

Hi HN, I'm Alex, a full-stack developer from Toronto, Canada. I recently built a Chrome extension that organizes ChatGPT conversations into folders, allowing users to sort and save important information for easy reference.

The idea for this extension came from a friend who highlighted the lack of good (and affordable) ChatGPT organizers. Many existing tools were either low-quality or overpriced, so I decided to create one that was both reliable and accessible.

I built the extension using plain JavaScript and developed a backend with Express to handle Google authentication. For storage, I used MongoDB, enabling all users with an account to save their folder structures and conversation data.

Initially, I planned to charge $5 per month to cover costs, since the extension was originally intended as a portfolio project addressing a real-world problem. However, just as I finished the main functionality and was about to implement payments, ChatGPT announced an official feature similar to the one my extension provided. Rather than continue competing in a market with an "official" solution, I decided to stop development.

But I didn't want my work to go to waste, so I chose to release it for free and share it with the community. I made some changes to eliminate the backend. Now the extension stores all folder structures and content locally in Chrome storage. Luckily, I had some old code to reuse for this. The extension is now live on the Chrome Web Store.

This project introduced me to a lot of new challenges with technologies I hadn’t used before, but I’m grateful for the experience and the skills I gained along the way. I hope you find it useful!

Links to the extension and its website: https://bit.ly/4iFZRgL... https://bit.ly/3XJbaNd

If you have any questions or suggestions, feel free to reach out in the comments or via email at georgepozdman@gmail.com.
https://bit.ly/3XJbaNd March 11, 2025 at 12:11AM

Show HN: A Comprehensive, Compatible Open Source Alternative to Python Requests https://bit.ly/4bHzPHN

https://bit.ly/4kDYvF4 March 10, 2025 at 08:05AM

Sunday, 9 March 2025

Show HN: Wordazzle – Become eloquent by mastering elegant words, powered by AI https://bit.ly/43xbgev

Wordazzle was born from my desire to learn as many elegant words as possible. The devil truly is in the details, and what constitutes "elegant" isn't exactly trivial to pin down. After a lot of prompt-tweaking, temperature fiddling, and experimenting with different AI models, I'm quite satisfied with the output, which I humbly present to Hacker News (again). Hope someone else finds this useful! https://bit.ly/41t60G8 March 10, 2025 at 12:39AM

Show HN: The first legal AI API https://bit.ly/41Pqmea

Saturday, 8 March 2025

Show HN: I am getting married. Here's my wedding website https://bit.ly/3FgToe8

Hi HN, I am getting married soon, and being a software engineer, I thought a wedding website was a must. So here it is. I have open-sourced the code: https://bit.ly/3FcAsNs . It's a static website built with Astro and Starlight and deployed on Cloudflare Pages. I initially chose GitHub Pages, but then I thought, why not try something new? I also use Umami for very basic analytics. I am pretty bad at CSS and styling, so I hope whatever is there looks just okay. Cheers! https://bit.ly/3FfCTyU March 9, 2025 at 04:48AM

Show HN: Syncing Govee lights with live sports https://bit.ly/4hjd1PK

Hello all! Last week, I made a post about building a website to sync Govee lights with live sports scores. Y'all have been awesome and showed a lot of appreciation :). It's now in a decent place to let some people give it a try and tell me what works, what doesn't, etc.

Currently, I have made 2 scenes. Scene one is "game day morning", which automatically turns your lights to the color of your team. This happens around 3 a.m. EST on game day. Scene two is the classic scoring scene. Any time your team scores, a custom DIY scene (that you created within the Govee app) plays. It lasts 10 seconds and then reverts to your team color.

I have many more "scenes" in the works and plan to release 1-2 a week. I'm looking for beta testers to help get the timing down. Right now, it seems like the "scene" sometimes runs before you see the score (especially if you're streaming the game), so I'm looking to tweak the timing. This only works with Wi-Fi-controlled devices.

If this sounds up your alley, please register at https://bit.ly/41Brv7O

Note: after registering, you'll be brought to the dashboard, where you can add your API key. That page has instructions on how to do it. Please don't hesitate to reach out on here or email hello@stadium-weather.com if you have any questions, feedback, etc.

March 9, 2025 at 01:04AM

Show HN: I built an app to get daily wisdom from Mr. Worldwide https://bit.ly/4idJAjl

Pitbull is coming to Stockholm. As part of the prep, I built a glassmorphism-style app that counts down to the big day. https://bit.ly/3Fh3k7h March 9, 2025 at 01:04AM

Show HN: Can I run this LLM? (locally) https://bit.ly/4bF1zNd

One of the most frequent questions you face when running LLMs locally is: "I have xx RAM and a yy GPU. Can I run the zz model?" I have vibe coded a simple application to help you with just that. https://bit.ly/3QRJkun March 9, 2025 at 12:08AM
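A back-of-the-envelope version of the "can I run it?" question can be sketched as follows. The linked app's actual method is not described in the post, so this is not it: the sketch only uses the basic rule that a model's weights take roughly (parameter count × bits per weight ÷ 8) bytes. The 20% headroom for the KV cache and activations is an assumed rule of thumb, and both function names are hypothetical.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billion * bits_per_weight / 8


def can_run(params_billion: float, bits_per_weight: int,
            available_gb: float, overhead: float = 0.2) -> bool:
    """Rough fit check. `overhead` is an ASSUMED 20% margin, not a measured value."""
    needed = estimate_weights_gb(params_billion, bits_per_weight) * (1 + overhead)
    return needed <= available_gb


if __name__ == "__main__":
    # A 7B-parameter model at 16-bit vs 4-bit quantisation.
    print(estimate_weights_gb(7, 16))  # weights only, in GB
    print(can_run(7, 16, available_gb=16))
    print(can_run(7, 4, available_gb=8))
```

Real requirements also depend on context length, batch size and the runtime, so treat the result as a first filter rather than a guarantee.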

Friday, 7 March 2025

Show HN: Mermaiditor – a free mermaid diagram editor https://bit.ly/43xdXNh

Hey HN, this is a Mermaid editor with support for projects and multiple diagrams, export/import, and export as PNG or copy to clipboard as an image. I built it because I like doing sequence diagrams in Mermaid but couldn't find a free solution that was flexible enough. Everything is stored in localStorage. https://bit.ly/3FdeKJ9 March 7, 2025 at 10:47PM

Show HN: Ming-wm: A 100% keyboard-operated desktop environment in Rust https://bit.ly/4i9s2F5

https://bit.ly/4hgZOqE March 7, 2025 at 07:54PM

Thursday, 6 March 2025

Show HN: OpenManus – open-source alternative of Manus AI https://bit.ly/3DoGM3U

https://bit.ly/3FmPosc March 7, 2025 at 02:53AM

Show HN: Ariana – A time travel debugger for PY/JS right in VSCode https://bit.ly/43nIzAH

Hello HN! I've recently released and open-sourced a time travel debugging VSCode extension for Python, JavaScript & TypeScript. https://bit.ly/43nIiOb

It's born from the pain of spending hours reproducing bugs, struggling to read parallel streams of logging across client/server, and managing print/console.log statements. You can see a short video here: https://www.youtube.com/watch?v=M2gZv7IOo7s

Basically it's two parts:

One part is a CLI called `ariana` that you install with npm/pip and run alongside your code's run command, for instance `ariana python main.py` or `ariana npm run dev`. It then instruments your code using our specialized parsers & small language models (a self-hosted version of the server that does that is coming soon).

The other part is a VSCode extension (1). It picks up the traces left from running the code with the CLI. Then it lets you highlight the parts of the code that ran and, just by hovering over any expression (or subpart of a complex expression), see which values it took.

Our goals with this are:
1. Make time-travel debugging easy to use for new coders/vibe coders who would never use a normal debugger, let alone some advanced logging.
2. Allow debugging across the stack, across components, across languages, and across parallel data flows super easily (a typical pain point of maintaining AI agent codebases, multiplayer web games or RL training setups). Maybe even in prod some day, when we have a more robust feature set.
3. Experiment with agents using time-travel debugging to fix code accurately in one shot without re-running the code or spending tokens producing print/log statements.
4. Make time-travel debugging applicable to fullstack & frontend development (we plan to sync your frontend's visual state with the traces).

Some may ask why we rewrite code with tracing instead of interfacing with debuggers' APIs. I think it gives us maximal granularity and expressivity in the traces we get from the code, which minimizes performance issues and avoids looking at nonsensical things. It also opens the door to using this in production in the future. Of course I'd be happy to discuss that further with you if you worked on similar projects in the past :)

(1) https://bit.ly/3DbRs5Y...

Thank you very much for your attention!

https://bit.ly/43nIiOb March 7, 2025 at 12:32AM
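To make "rewriting code with tracing" concrete, here is a toy sketch of the general idea using Python's standard `ast` module. It is not how Ariana works internally (the post says it uses its own specialized parsers and small language models). The sketch wraps the right-hand side of every assignment in a recording call, so each value can be looked up afterwards by line number; `_record`, `Instrument` and `run_instrumented` are hypothetical names. Requires Python 3.9+ for `ast.unparse`.

```python
import ast

TRACES = []  # (line number, expression source, value) tuples


def _record(lineno, source, value):
    """Store the observed value and pass it through unchanged."""
    TRACES.append((lineno, source, value))
    return value


class Instrument(ast.NodeTransformer):
    """Rewrite `target = expr` into `target = _record(line, 'expr', expr)`."""

    def visit_Assign(self, node):
        self.generic_visit(node)
        source = ast.unparse(node.value)  # text of the original expression
        node.value = ast.Call(
            func=ast.Name(id="_record", ctx=ast.Load()),
            args=[ast.Constant(node.lineno), ast.Constant(source), node.value],
            keywords=[],
        )
        return node


def run_instrumented(source: str) -> list:
    """Instrument `source`, execute it, and return the collected traces."""
    tree = Instrument().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # give the new nodes line/column info
    TRACES.clear()
    exec(compile(tree, "<instrumented>", "exec"), {"_record": _record})
    return list(TRACES)


if __name__ == "__main__":
    for lineno, expr, value in run_instrumented("x = 1 + 2\ny = x * 10\n"):
        print(f"line {lineno}: {expr} -> {value!r}")
```

An editor extension could then map each trace back to its source line and show the value on hover. A fuller version would also record sub-expressions, function arguments and return values.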

Show HN: Uncloud – Uncomplicated container orchestration without control plane https://bit.ly/41J9e9P

Hey HN, I'm building Uncloud, a lightweight clustering and container orchestration tool that lets you deploy and manage web apps across cloud VMs and bare metal with minimal cluster management overhead.

After several years of managing and extending Kubernetes at a unicorn, I realised that I desperately needed a change. All those abstraction layers, unnecessary complexity, boilerplate… I wanted container orchestration to bring me joy again, the way Ansible did when I first tried it a decade ago, or Docker after that. That’s when I decided to start an experiment that is now called Uncloud.

The core design principles I’ve focused on intentionally differ from the traditional container orchestrators like Kubernetes or Docker Swarm:
- No control plane or master nodes – all machines are equal
- P2P state synchronisation
- Imperative operations over state reconciliation (fast feedback, easier troubleshooting)
- Graceful handling of network partitions at the cost of eventual consistency
- No advanced auto-healing or auto-scaling magic – predictable behavior instead

I want well-designed building blocks that just work together. When a service needs high availability, I should be able to scale it across machines and know that if any machine goes down the remaining ones will continue serving traffic. When I deploy, I want immediate feedback, not wondering whether the reconciliation loop will eventually catch up.

GitHub with more technical details and a demo: https://bit.ly/3DjeZBY

It's not ready for production use yet, and I'd really love your feedback:
1. Am I alone in wanting a middle ground: something more sophisticated than basic Docker/Compose but without the operational complexity of Kubernetes?
2. If you've moved from platforms like EKS/Heroku/Render/Fly to self-hosting: what was the breaking point and what did you lose or gain in the transition?
3. If you're using tools like Kamal, Dokku, Coolify, or Dokploy, what are your biggest pain points?

https://bit.ly/3DjeZBY March 6, 2025 at 11:35PM

Show HN: Testeranto – the AI driven test framework for TypeScript projects https://bit.ly/3Dj8saq

Today I am introducing HN to my side project, 'testeranto'. It is a test framework for TS projects which leverages Aider to automatically fix broken tests. tl;dr: https://www.youtube.com/watch?v=WvU5xMqGi6Q https://bit.ly/43osZVy March 7, 2025 at 12:15AM
