d4rkp4ttern 23 hours ago [-]
A related idea is to have the LLM quiz you, Socratic-style about a topic of interest. It persists in asking questions at deeper levels until you arrive at the answer yourself. This forces you to think hard about a problem, and this effort helps with understanding, learning and retention. Of course I made a Socratic-quiz skill for this, to use with any coding agent or similar:
For example I’ve used this to better understand counter-intuitive things about diabetes/insulin, dopamine and motivation, Claude’s implementations, etc (to combat so-called cognitive debt).
Strong LLMs are surprisingly good at this type of quizzing, they display a semblance of “theory of mind”.
OtherShrezzing 22 hours ago [-]
How do you deal with context length degradation here?
The harder questions will only arrive when the context is getting full.
danieljacksonno 21 hours ago [-]
If you have it question you for 1M tokens (aka the full length of the Wheel of Time series), I think your own context might get full before the LLM's.
Right, even a conservative 200k context length is on the order of 200 pages, which is more than enough context to arrive at an answer.
raincole 2 hours ago [-]
If it's a well-known concept (like pretty much anything you can find in an undergraduate textbook), the LLM doesn't need the whole context to teach you.
If it's something actually novel, no matter how much context you provide it'll still hallucinate.
altmanaltman 13 hours ago [-]
I think this might be useful if you are supplementing your learning from actual sources. Like you casually said you understood counter-intuitive things about diabetes/insulin, dopamine and motivation etc., but these are very complex topics and require a lot of study to fully "understand". It's okay if you just see it as curiosity-driven learning, but I don't see this as a way to learn anything that is actually important or serious.
Traditional methods of looking stuff up, going through guided lessons, etc. are just more streamlined and faster than this method.
tonyhart7 11 hours ago [-]
if your power is LLM, what are you without it ???
UltraSane 2 minutes ago [-]
If your power is literacy, what are you without it?
josmar 11 hours ago [-]
Without it, I was a stack overflow lurker, and before that a forum reader
visha1v 10 hours ago [-]
i suspect a lot of us started out as encyclopedia readers before we discovered that strangers on the internet would answer questions for free
tonyhart7 6 hours ago [-]
say wallahi
nomel 16 hours ago [-]
I think there's going to be the exact and precise range of people there has always been: some people are curious, and want/need to understand what they're doing, some people are not and just want to do. That want/need is a fundamental personality trait that makes an expert.
LLMs are a dream come true for these curious types. They'll only be accelerated by them. I don't think there's any real "loss" out there, just a bunch of people who don't care doing things more easily. Good for them, good for the curious people. Net win.
dtj1123 6 hours ago [-]
The issue is that there are now different expectations surrounding velocity and apparent productivity. Previously an individual new to web development might have genuinely needed to develop an understanding of web protocols, HTML and other stuff in that domain in addition to actually wanting to wrap their head around it. Nowadays spending the time required to understand how a website works carries a heavy opportunity cost, at least in the short term.
suprjami 5 hours ago [-]
An organisation which lets a person release a website without understanding how the web works is just asking for trouble. This was true before Transformer LLMs and is true after them.
LLMs probably just let those organisations make a larger and more load-bearing website faster, so that poor decisions speed the time to catastrophe.
dtj1123 1 minutes ago [-]
What's changed is the minimum amount of knowledge required to release a website.
I've seen things.
BoxFour 4 hours ago [-]
> Previously an individual new to web development might have genuinely needed to develop an understanding of web protocols, HTML, and other stuff in that domain
That hasn't really been my experience, at least not for a long time (decades+). At least since jQuery's been around. There's always been a large group of people whose approach to software development was basically "run these commands, don't worry about why they work".
The tendency to treat tools as black boxes over time isn't new, I don't think.
A silver lining here is that in my experience the non-curious tended to stagnate over time and the curious tended to succeed, but obviously there's more to it than just that.
embedding-shape 15 hours ago [-]
I guess I'm both people, depends on what it is and what other things I'm doing. Sometimes it's fun and interesting to delve deeply (oh no he didn't) into something for long periods of time, just for the sake of exploring and understanding. Sometimes I'm not curious about why something specifically doesn't work, I just need it to work so I'll hack around it any way possible so I can move on with what I really wanted to do.
I find that LLMs are useful for both cases in the end.
FeteCommuniste 14 hours ago [-]
Some people will probably be curious regardless of their environment but I think there are others who could be swayed into developing curiosity (or not) by life experiences. And an ever-present "just gimme the answer now" button could be a powerful force pulling them toward the "incurious" side.
dchuk 1 days ago [-]
I’ve been using this general pattern - a custom cli app for deterministic tasks, skills for the agent harness, run the skills in the agent and it produces artifacts for you by using the cli and its own agentic reasoning - a lot lately for work. Things like “give me an executive brief of the activity in these teams backlogs over the last month” and in 5-10 minutes I have a few page doc I can read that is cited with the tickets it analyzed and I don’t have to go bug people or ask them to do yet another task for me, just make sure your backlog is updated and detailed like normal practice. It’s awesome and really fits a useful spot between pure agent usage (which is hard to get consistent results from on repeat tasks) and not having to build/buy a full blown app for every random thing.
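A minimal sketch of what the deterministic half of this pattern might look like, in Python. The ticket fields (`id`, `team`, `title`, `updated`) and the `recent_activity` name are hypothetical, not taken from any real tracker; the point is only that the CLI does the filtering and sorting the same way every time and hands the agent stable JSON it can cite from.

```python
import json
from datetime import date
from typing import Dict, List


def recent_activity(tickets: List[Dict[str, str]], since: date) -> str:
    """Return tickets updated on or after `since` as sorted, stable JSON.

    The ticket shape is an assumption for illustration. Each ticket is a
    dict with "id", "team", "title" and an ISO-format "updated" date.
    """
    selected = [
        {"id": t["id"], "team": t["team"], "title": t["title"], "updated": t["updated"]}
        for t in tickets
        if date.fromisoformat(t["updated"]) >= since
    ]
    # Sort so repeated runs over the same backlog give identical output.
    selected.sort(key=lambda t: (t["team"], t["id"]))
    return json.dumps(selected, indent=2)
```

The skill would then tell the agent to run this command first and to summarize only what appears in its output, citing ticket ids.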
derefr 22 hours ago [-]
This approach works well, I agree. But I keep wishing that I could invert it. The architecture I feel like I keep yearning for, is a traditional CLI program that encodes most workflow knowledge/decisions as real code; but which does "just a little bit of coding agent invocation" during one specific workflow step.
Not sure how to accomplish this. Anyone have any suggestions? Are there libraries for this yet? (And how would they even work? It feels like, to do this right, there would have to be some background service that CLI software could expect to interact with via a well-known local IPC socket — similar to how e.g. the docker daemon works. But I'm unaware of any coding agent software/frameworks that expose such an IPC capability...)
didgeoridoo 22 hours ago [-]
I’m building this! It was originally designed for human accessibility for interactive CLIs, but it turned out to be really useful for giving agents the ability to follow structured workflows.
It runs as a background terminal that the agent can observe, and then exposes all interaction options as structured commands that can be run from the foreground CLI which then update the state of the background terminal via IPC. My hope is to establish a sort of “ARIA for terminals” standard to improve accessibility for both humans and agents. Email in profile, ping me if you’re interested in giving it a spin (just have plugins for Inquirer + Commander right now, hoping to broaden to other frameworks & TUIs soon).
devenjarvis 22 hours ago [-]
I reverted this due to impending billing changes, but Claude and most LLM providers to my knowledge do offer a way to directly fire a prompt to the LLM in a "headless" or non-interactive mode. Specifically "claude -p <your_prompt_here>" is the way to do it with Claude Code. It allows for using the agent to do a one-off command with a given structured prompt. Originally Lathe would use this from the Go application to allow you to extend a tutorial directly from the UI without directly interacting with the LLM.
You'd have to exec out, so it's a little clunkier than IPC, but I think you could achieve what you want with it.
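A minimal sketch of what "exec out" could look like, in Python rather than the Go that Lathe uses. The only thing taken from the comment above is the `claude -p <prompt>` form; the `run_headless` name, the timeout, and the injectable `base_cmd` are assumptions for illustration, and it presumes the `claude` binary is on PATH.

```python
import subprocess
from typing import Sequence


def run_headless(
    prompt: str,
    base_cmd: Sequence[str] = ("claude", "-p"),  # form quoted in the comment above
    timeout: float = 300.0,  # arbitrary illustrative limit, in seconds
) -> str:
    """Run one non-interactive agent invocation and return its stdout.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    subprocess.TimeoutExpired if the command runs longer than `timeout`.
    """
    result = subprocess.run(
        [*base_cmd, prompt],  # prompt is passed as a single argv entry, no shell
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout.strip()
```

Keeping `base_cmd` as a parameter means the wrapper can be pointed at a different agent CLI, or at a stub command when testing, without touching the calling code.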
derefr 19 hours ago [-]
That's almost it, yes.
But in my experience, to actually get where they're going quickly (as opposed to spending hours and hundreds of dollars stumbling around in the dark), coding agents generally need more interactive hand-holding than that. If you just fire off one non-interactive session and wait for it to come to a stop, the problem usually isn't fully+correctly solved at the point the LLM decides to "finish." And if you then start another non-interactive session to continue the work, the new session will have lost the old session's state/memory/context, and so will stumble through many of the same mistakes / misapprehensions.
What you really want, for a CLI program with a "use coding agent to do X" workflow-step, is for the CLI program to play the role of a human in a temporary durable coding-agent conversation session: prompting the agent; then waiting for it to finish responding (and side-effecting); then either asking the agent itself to evaluate an "am I done yet" predicate with a constrained output syntax; or having the CLI program do its own out-of-band validation of the changes made to the shared state by the agent; where, in either case, if the agent isn't "done yet", then the workflow step must continue poking it — or prompt the human to make a decision on how to proceed (possibly involving providing direct input to the LLM, but this is not ideal; ideally the CLI "abstracts away" the need for the end-user to understand the intricacies of the conversation the program is having with the LLM. Even more ideally, the conversation just whizzes by and the human doesn't have to think about an LLM being involved at all.)
Basically, think of this not as the CLI program saying to an agent "answer me this question" or "edit this file for me", but rather, the CLI program popping open a mini "guided + 99%-of-the-time automated" TUI coding-agent micro-IDE "inside" the workflow, in about the same way that git pops open your EDITOR inside `git commit`.
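The loop described above could be sketched roughly like this in Python. Everything here is hypothetical: `send` stands in for whatever keeps one durable agent session open and returns its reply, and `check` is the CLI's own out-of-band validation, returning None when the work is done or a short description of what is still wrong. No real agent API is assumed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StepResult:
    done: bool                # True if validation passed within the budget
    attempts: int             # how many prompts were sent
    transcript: List[str] = field(default_factory=list)  # agent replies, in order


def run_agent_step(
    task: str,
    send: Callable[[str], str],          # hypothetical: prompt in, reply out, same session
    check: Callable[[], Optional[str]],  # hypothetical: None if done, else the problem
    max_attempts: int = 5,
) -> StepResult:
    """Play the human's role: prompt, validate, and re-prompt until done."""
    transcript: List[str] = []
    prompt = task
    for attempt in range(1, max_attempts + 1):
        transcript.append(send(prompt))
        problem = check()
        if problem is None:
            return StepResult(True, attempt, transcript)
        # Feed the validation failure back instead of starting a fresh session.
        prompt = f"Not done yet: {problem}. Please continue."
    # Budget exhausted: the caller decides whether to ask the human.
    return StepResult(False, max_attempts, transcript)
```

When `done` comes back False, the CLI would fall back to prompting the human for a decision, which keeps the LLM conversation out of sight in the common case.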
frumiousirc 5 hours ago [-]
> Basically, think of this not as the CLI program saying to an agent "answer me this question" or "edit this file for me", but rather, the CLI program popping open a mini "guided + 99%-of-the-time automated" TUI coding-agent micro-IDE "inside" the workflow, in about the same way that git pops open your EDITOR inside `git commit`.
Isn't this simply having your mechanistic script call `claude "Prompt that is well honed to provide a mini, guided, 99%-of-the-time automated LLM action to $THE_THING"`? And, possibly including some `--allowed-tools`?
devenjarvis 24 hours ago [-]
I agree! I want to say I first saw this pattern in some work Simon Willison did (Rodney and Showboat). For certain workflows the pair of Skills + CLI give me a nice balance between the flexibility of LLMs and the consistency of a CLI.
edot 23 hours ago [-]
Can you give some examples of the deterministic tasks? So in your example, was the deterministic task “fetch this team’s backlog”? And then the LLM parts are “process each backlog” and “combine a summary”?
zuzululu 10 hours ago [-]
I think this is quite a refreshing idea
LLM's big point is that it is an excellent learning tool
Lot of people want to generate stuff from it
but perhaps overlooked is the knowledge you can gain from it
it is the best tutor you will ever have!
btw it sucks that you have to disclose if you are trying to make a buck from a project or not.
Making money shouldn't be vilified or frowned upon.
rdksu 22 hours ago [-]
I have updated the popular /grill-me skill for this exact purpose! I had a very insightful grilling session yesterday on what exactly happens when you try to load an extremely large dataset in pandas, covering everything down to the last detail!
smallerize 21 hours ago [-]
Do you have that version published anywhere?
andai 20 hours ago [-]
Hey this is neat!
I was telling my friend the other day. The way you learn programming is by typing code out by hand. And I suggested using LLMs to generate minimal educational examples aligned with his interests and needs.
I've tried the Zed Shaw method of learning programming (just typing out code examples by hand -- doing "studies", the same way you would with music or art). I tested it on a programming language I had been learning for a while and was struggling with. After just a few hours of typing my fluency had skyrocketed.
I realized that in several hours of typing I had written more code than in weeks of study. Because when you don't know a language yet, producing code is extremely slow and error prone. But typing out correct code is relatively straightforward.
So due to changing my approach to "just blindly typing", I got more practice (at least as far as reading and muscle memory goes) in a few hours than the previous few weeks.
Now of course understanding is important too, but it's a separate dimension, and largely comes after memory and fluency in my experience. (Understanding something theoretically and being able to use it are two very different things!)
The general principle here is Stephen Krashen's Input Hypothesis of language acquisition (https://en.wikipedia.org/wiki/Input_hypothesis) which says a baby learns language by just hearing stuff -- just being exposed to inputs -- and that adults can learn the same way too.
And I heard it on the excellent website (now defunct?) All Japanese All The Time, where the author tested the hypothesis on himself by mostly listening to a lot of Japanese and gained fluency in a year.
This is a very cool idea, feels like a sane way to use LLMs in this crazy time! Could be a very good way to break the ice when starting a new project and everything is friction.
devenjarvis 1 days ago [-]
Yea that’s definitely been a primary usecase for me! Easing the barrier to entry into a new project, and giving me the foundation to take it further on my own once I’m comfortable.
helixfelix 3 hours ago [-]
Can I use this to learn about a complex private repo or how it does things?
mmarian 21 hours ago [-]
I think you're tackling an interesting area. I was thinking of something similar for system design prep. I experimented with a couple of series of blog posts - one for designing Twitter, another for WhatsApp: https://prepcommons.com/.
Still, it took a lot more effort than just delivering the initial request. AI makes everyone produce something average but you still need taste to produce something good - I guess this applies to courses too.
schmorptron 1 days ago [-]
Cool project! I'll be trying it out. I've been a big fan of throwing whatever sources I have on a new topic I'm trying to get into into an LLM "project" and then asking it to teach me, grounded on the actual content to speed things up.
But at the same time, I'm afraid getting everything laid out for you in exactly the way you want will erode some of the understanding you build by going through a primary source directly and figuring things out the hard way. So this having more focus on actually doing stuff by yourself seems right up my alley (while still tending to the LLM-induced intellectual laziness...).
ramon156 1 days ago [-]
What I'm more looking at is your own experience with a vibed tool. I cannot really tell from this introduction whether you actually use and like it (you mentioned you use it and sometimes push back, which is a learning strategy of its own?)
Also, I wouldn't say "have another model test the tutorial compiles" a feature, but also I do not expect a fool-proof tutorial from a one-shot, I guess.
Not sure why I would try this over a hand-written prompt. Also wondering why ChatGPT Study mode failed, it seemed interesting.
devenjarvis 1 days ago [-]
I've been using it quite a bit and I like it a lot! You certainly could roll your own prompt for this. The value I'm seeing is in the reusable skill/prompt to structure tutorials in a way that helps me think and learn a new concept (rather than Claude just giving me code to copy/paste), and the local UI that makes working through the tutorial much more pleasant than scrolling through Claude's markdown output. Plus tutorial series are persistent so I can easily come back around later with a `/lathe-extend` to explore an extension to a topic/tutorial I'm interested in.
That said, it's been a tool that's been helpful for me personally, but doesn't have to be for everyone! I've never used ChatGPT Study, I'll look into it more. Thanks for sharing!
tam159 5 hours ago [-]
Do you have any evaluations/comparisons between small open-source LLMs and frontier LLMs in Lathe?
Galanwe 19 hours ago [-]
It's very cool, and I can really see myself use that, but not in that form of deliverable.
See the best place I learn and read through materials is when I'm commuting. Far away from a console.
Could you envision a way to deliver this as a web app linked to e.g. an OpenRouter/Anthropic/OpenAI API key?
te_chris 1 hours ago [-]
Nice. I've got Claude code teaching me maths at the moment. Some skills, file system learning log, oss text books and a full curriculum based on Math Academy (which I was doing before but got bored of) and UK high school and university. Teaching me concept-first though. So we start with something complex and go up and down until I get it all. It's not necessarily thorough for everything, but the depth of my intuition is much better and each time I use it I find myself unlocking another sector of the map. I love LLMs for this.
zjy71055 15 hours ago [-]
I've had claude write my rust tools for a year without learning any rust. gonna try this to finally learn it for real.
mobiuscog 19 hours ago [-]
I have been using a similar skill (built over a few iterations) that builds whatever I ask, through a series of milestones, and then creates a full tutorial to follow in markdown and uses zola to turn it into a full static site.
90% of my Claude usage is getting it to write me guides, that I can then spend most of my time following to build the end results.
Keeps the brain healthy and also provides bespoke learning, rather than a generic course off the internet. Definitely a great use of AI.
Arubis 20 hours ago [-]
For a somewhat hybrid approach here, have a look at https://github.com/DrCatHicks/learning-opportunities — the idea is to be used during “productive work” (so it’s not purely learning-oriented as with your repo here), and to interject as you work to ensure that you learn related concepts as you go.
visarga 21 hours ago [-]
I just use md files to plan questions, track my answers, and implement rehearsals for concepts that need more repetition from claude code. And I start from a good book or documentation as source material, first the agent reads the learning material and structures it for learning.
threecheese 23 hours ago [-]
Did you write the skill.md files yourself? I often wonder this; there’s so much text in most skills, and I can’t imagine it’s human generated.
I don’t write my own - I can’t optimize for the models understanding, and so I just give the skill-creator skill an outline and then have it refine until the output is what I want.
f311a 22 hours ago [-]
> If you can find resources to learn something that was written by a human, read that first. But Lathe is here to fill in the gaps when that isn't the case
Well, but it will still serve you content from humans, but without any attribution.
4b11b4 1 days ago [-]
I like the idea and I know you explicitly address this but wonder if still it could search for human made works for you to learn from first
If it does find some, maybe it could supplement them instead of just from scratch
james_marks 1 days ago [-]
Love this idea, can’t wait to try it. Thank you for sharing!
devenjarvis 1 days ago [-]
Thanks for checking it out!
TonyAlicea10 22 hours ago [-]
There is of course a degree of true usefulness to this. However I’ve been a technical educator for years and I’ve tried to do lots with LLMs.
Even now, LLMs are terrible educators. They do not make coherent progressive curriculums. They hallucinate details which the student will not have the knowledge to challenge.
If you use an LLM to make a tutorial you will get some benefit for sure, especially if you use it for Socratic sessions based on a corpus of data you provide (like a blog post or documentation).
Don’t expect it to teach you reliably though. It feels good to ask the LLM whatever you want, but if you’re learning a topic you don’t have the instinct to realize when it’s giving you a poorly chosen progression of information or teaching you something flat out untrue.
devenjarvis 22 hours ago [-]
Really appreciate your perspective here! I do _not_ have a background in technical education, and am certain you've used and seen the failure modes of LLMs in this space far more than I have so far.
A few thoughts based on my limited experience building and using lathe:
- Part of the lathe skills are to first find source materials to base curriculum on. It's not foolproof by any means, nor is it a novel approach, but it's helped ground the content in reality more than an open ended prompt usually does (in my experience)
- We're scoped to tutorials, over full blown curriculum. I found having lathe write one part of a tutorial series at a time, over the whole thing at once, usually gave me better results (and is why `/lathe-extend` is a thing)
- To your point about not having the instinct to realize when it's giving me a poor progression or untrue content, my experience is that by actually writing the program the tutorial walks me through, I get definitive proof of if the results are true or not. One of the most impactful (and all too frequent) answers I got as a young programmer was "write a program and find out" and it's still good advice today. Not at all proposing this makes lathe tutorials infallible, but in the context and scope of the project it seems to take the bite out of the worst failure modes here. That said, maybe that implies lathe is most useful and least dangerous in the hands of an experienced developer looking to learn a new domain, over someone looking to build foundational technical (and technical learning) skills? I'll think that over!
I'm really curious what your experience would lead you to think about the above though? Are there critical failure modes for LLMs writing hands on technical education I just haven't tripped over yet?
TonyAlicea10 13 hours ago [-]
Source materials is great. Having the LLM write one part of a tutorial prevents you from asking it for a progressive curriculum which helps. If I give an LLM a sequence I want to learn, or an outline, it does much better.
Context and scope limitations are also helpful, as you mention. And yes, having experience in a domain makes learning with an LLM a dramatically different experience than from-scratch, since the LLM is nudged in different directions by our responses. When a novice uses an LLM to learn, the questions they ask the LLM can drive it in directions and hallucinations that would look obvious to an experienced person.
The worst failure mode is what I mentioned: the novice asking the wrong questions or driving the LLM in the wrong direction. Inference is strongly influenced by input tokens, and that's fairly unavoidable.
I don't mean to say your project doesn't have value though! I hope people use LLMs to help them learn (by directing them to good source materials from humans) rather than just asking it to do things for them and blindly trusting the results.
protocolture 17 hours ago [-]
Part of why old tech articles are so instructive is that they tend to have one command or feature that's out of date. You then need to go off and figure out how to resolve the missing piece of the tutorial yourself. I see this as a likely common feature of LLMs.
tmountain 21 hours ago [-]
I have been working on a language learning app for myself, and I am using a textbook that I like as the basis for an Anki inspired “learning tree”. This is working pretty well because I can build progressions from the original table of contents.
TonyAlicea10 13 hours ago [-]
Yes, generally if you get a decent progression from an LLM, I find it's copying directly from a source, like a book or course. Giving it a progression helps a lot.
28304283409234 1 days ago [-]
Nice! I do this now locally with LLMs and ollama and my own hacky prompts. I could not find if this also supports ollama?
devenjarvis 24 hours ago [-]
Thanks for checking it out! ollama wasn't top of my list for support, just because I don't have a machine powerful enough to run decent local LLMs (I wish I did!). I'll look into it though, nothing here should be locked in to any one LLM, as long as it has the concept of a skill/slash command/reusable prompt.
Someone else asked about Gemini, so I think broader LLM support will be my focus for v0.4.0
Sathwickp 21 hours ago [-]
Maybe add voice to it so that it reads the tutorial out loud and you can listen to lessons on the move?
kaeluka 1 days ago [-]
great, i'll try this. something like this has been on my list and i'm super curious :)
mixtureoftakes 1 days ago [-]
We have notebooklm at home? Is there any comparison between these two, looks nice
devenjarvis 24 hours ago [-]
Thanks for sharing NotebookLM, I hadn't seen that! I'll take a look and add a comparison to the README if it's compelling.
troymc 22 hours ago [-]
In my opinion, the coolest thing in NotebookLM is the podcast-episode-generator. Each one sounds like two people having a conversation. It's fun to listen to a podcast episode about some niche topic (e.g. nuclear isomers, or the Weyl curvature tensor) while I'm cooking or driving.
I like this framing a lot: using LLMs to stay in contact with the material, not to skip past it.
In coding-agent work I see a similar pattern. The best outcomes usually happen when the agent is forced to study concrete source material first: real repos, real docs, real examples, and the constraints behind them. The worse outcomes happen when it generates a plausible path from a vague prompt and never has to reconcile that with existing practice.
For learning, I imagine the same thing matters: the LLM should help structure the path and explain the friction, but the learner still needs to touch the code and compare against sources.
The source-backed part feels more important than the generated tutorial part.
gnabgib 15 hours ago [-]
(Please) Don't post generated text or AI-edited text. HN is for conversation between humans.
https://pchalasani.github.io/claude-code-tools/plugins-detai...
For example I’ve used this to better understand counter-intuitive things about diabetes/insulin, dopamine and motivation, Claude’s implementations, etc (to combat so-called cognitive debt).
Strong LLMs are surprisingly good at this type of quizzing, they display a semblance of “theory of mind”.
The harder questions will only arrive when the context is getting full.
If it's something actually novel, no matter how much context you provide it'll still hallucinate.
Traditional method of looking up stuff, going through guided lessons etc are just more streamlined and faster than this method.
LLM are a dream come true for these curious types. They'll only be accelerated by them. I don't think there's any real "loss" out there, just a bunch of people that don't care boing things easier. Good for them, good for the curious people. Net win.
LLMs probably just let those organisations make a larger and more load-bearing website faster, so that poor decisions speed the time to catastrophe.
I've seen things.
That hasn't really been my experience, at least not for a long time (decades+). At least since jQuery's been around. There's always been a large group of people whose approach to software development was basically "run these commands, don't worry about why they work".
The tendency to treat tools as black boxes over time isn't new, I don't think.
A silver lining here is that in my experience the non-curious tended to stagnate over time and the curious tended to succeed, but obviously there's more to it than just that.
I find that LLMs are useful for both cases in the end.
Not sure how to accomplish this. Anyone have any suggestions? Are there libraries for this yet? (And how would they even work? It feels like, to do this right, there would have to be some background service that CLI software could expect to interact with via a well-known local IPC socket — similar to how e.g. the docker daemon works. But I'm unaware of any coding agent software/frameworks that expose such an IPC capability...)
It runs as a background terminal that the agent can observe, and then exposes all interaction options as structured commands that can be run from the foreground CLI which then update the state of the background terminal via IPC. My hope is to establish a sort of “ARIA for terminals” standard to improve accessibility for both humans and agents. Email in profile, ping me if you’re interested in giving it a spin (just have plugins for Inquirer + Commander right now, hoping to broaden to other frameworks & TUIs soon).
You'd have to exec out, so it's alittle clunkier than an IPC, but I think you could achieve what you want with it.
But in my experience, to actually get where they're going quickly (as opposed to spending hours and hundreds of dollars stumbling around in the dark), coding agents generally need more interactive hand-holding than that. If you just fire off one non-interactive session and wait for it to come to a stop, the problem usually isn't fully+correctly solved at the point at the LLM decides to "finish." And if you then start another non-interactive session to continue the work, the new session will have lost the old session's state/memory/context, and so will stumble through many of the same mistakes / misapprehensions.
What you really want, for a CLI program with a "use coding agent to do X" workflow-step, is for the CLI program to play the role of a human in a temporary durable coding-agent conversation session: prompting the agent; then waiting for it to finish responding (and side-effecting); then either asking the agent itself to evaluate an "am I done yet" predicate with a constrained output syntax; or having the CLI program do its own out-of-band validation of the changes made to the shared state by the agent; where, in either case, if the agent isn't "done yet", then the workflow step must continue poking it — or prompt the human to make a decision on how to proceed (possibly involving providing direct input to the LLM, but this is not ideal; ideally the CLI "abstracts away" the need for the end-user to understand the intricacies of the conversation the program is having with the LLM. Even more ideally, the conversation just whizzes by and the human doesn't have to think about an LLM being involved at all.)
Basically, think of this not as the CLI program saying to an agent "answer me this question" or "edit this file for me", but rather, the CLI program popping open a mini "guided + 99%-of-the-time automated" TUI coding-agent micro-IDE "inside" the workflow, in about the same way that git pops open your EDITOR inside `git commit`.
Isn't this simply having your mechanistic script call `claude "Prompt that is well honed to provide a mini, guided, 99%-of-the-time automated LLM action to $THE_THING"`? And, possibly including some `--allowed-tools`?
The big point of LLMs is that they are an excellent learning tool
A lot of people want to generate stuff from them
but perhaps overlooked is the knowledge you can gain from it
it is the best tutor you will ever have!
btw it sucks that you have to disclose if you are trying to make a buck from a project or not.
Making money shouldn't be vilified or frowned upon.
I was telling my friend the other day. The way you learn programming is by typing code out by hand. And I suggested using LLMs to generate minimal educational examples aligned with his interests and needs.
I've tried the Zed Shaw method of learning programming (just typing out code examples by hand -- doing "studies", the same way you would with music or art). I tested it on a programming language I had been learning for a while and was struggling with. After just a few hours of typing my fluency had skyrocketed.
I realized that in several hours of typing I had written more code than in weeks of study. Because when you don't know a language yet, producing code is extremely slow and error prone. But typing out correct code is relatively straightforward.
So due to changing my approach to "just blindly typing", I got more practice (at least as far as reading and muscle memory goes) in a few hours than the previous few weeks.
Now of course understanding is important too, but it's a separate dimension, and largely comes after memory and fluency in my experience. (Understanding something theoretically and being able to use it are two very different things!)
The general principle here is Stephen Krashen's Input Hypothesis of language acquisition (https://en.wikipedia.org/wiki/Input_hypothesis) which says a baby learns language by just hearing stuff -- just being exposed to inputs -- and that adults can learn the same way too.
And I heard it on the excellent website (now defunct?) All Japanese All The Time, where the author tested the hypothesis on himself by mostly listening to a lot of Japanese and gained fluency in a year.
https://web.archive.org/web/20080705194055/http://www.alljap...
Still, it took a lot more effort than just delivering the initial request. AI makes everyone produce something average but you still need taste to produce something good - I guess this applies to courses too.
But at the same time, I'm afraid getting everything laid out for you in exactly the way you want will erode some of the understanding you build by going through a primary source directly and figuring things out the hard way. So having more focus on actually doing stuff by yourself seems right up my alley (while still tending to the LLM-induced intellectual laziness...).
Also, I wouldn't call "have another model test the tutorial compiles" a feature, but I also do not expect a fool-proof tutorial from a one-shot, I guess.
Not sure why I would try this over a hand-written prompt. Also wondering why ChatGPT Study mode failed, it seemed interesting.
That said, it's been a tool that's been helpful for me personally, but doesn't have to be for everyone! I've never used ChatGPT Study, I'll look into it more. Thanks for sharing!
See the best place I learn and read through materials is when I'm commuting. Far away from a console.
Could you envision a way to deliver this as a web app linked to e.g. an OpenRouter/Anthropic/OpenAI API key?
90% of my Claude usage is getting it to write me guides, that I can then spend most of my time following to build the end results.
Keeps the brain healthy and also provides bespoke learning, rather than a generic course off the internet. Definitely a great use of AI.
I don’t write my own - I can’t optimize for the model's understanding, and so I just give the skill-creator skill an outline and then have it refine until the output is what I want.
Well, but it will still serve you content from humans, but without any attribution.
If it does find some, maybe it could supplement them instead of just generating everything from scratch
Even now, LLMs are terrible educators. They do not make coherent progressive curriculums. They hallucinate details which the student will not have the knowledge to challenge.
If you use an LLM to make a tutorial you will get some benefit for sure, especially if you use it for Socratic sessions based on a corpus of data you provide (like a blog post or documentation).
Don’t expect it to teach you reliably though. It feels good to ask the LLM whatever you want, but if you’re learning a topic you don’t have the instinct to realize when it’s giving you a poorly chosen progression of information or teaching you something flat out untrue.
A few thoughts based on my limited experience building and using lathe:
- Part of the lathe skills is to first find source materials to base the curriculum on. It's not foolproof by any means, nor is it a novel approach, but it's helped ground the content in reality more than an open-ended prompt usually does (in my experience)
- We're scoped to tutorials, rather than a full-blown curriculum. I found having lathe write one part of a tutorial series at a time, rather than the whole thing at once, usually gave me better results (and is why `/lathe-extend` is a thing)
- To your point about not having the instinct to realize when it's giving me a poor progression or untrue content, my experience is that by actually writing the program the tutorial walks me through, I get definitive proof of whether the results are true or not. One of the most impactful (and all too frequent) answers I got as a young programmer was "write a program and find out" and it's still good advice today. Not at all proposing this makes lathe tutorials infallible, but in the context and scope of the project it seems to take the bite out of the worst failure modes here. That said, maybe that implies lathe is most useful and least dangerous in the hands of an experienced developer looking to learn a new domain, rather than someone looking to build foundational technical (and technical learning) skills? I'll think that over!
I'm really curious what your experience would lead you to think about the above though? Are there critical failure modes for LLMs writing hands on technical education I just haven't tripped over yet?
Context and scope limitations are also helpful, as you mention. And yes, having experience in a domain makes learning with an LLM a dramatically different experience than from-scratch, since the LLM is nudged in different directions by our responses. When a novice uses an LLM to learn, the questions they ask the LLM can drive it in directions and hallucinations that would look obvious to an experienced person.
The worst failure mode is what I mentioned: the novice asking the wrong questions or driving the LLM in the wrong direction. Inference is strongly influenced by input tokens, and that's fairly unavoidable.
I don't mean to say your project doesn't have value though! I hope people use LLMs to help them learn (by directing them to good source materials from humans) rather than just asking it to do things for them and blindly trusting the results.
Someone else asked about Gemini, so I think broader LLM support will be my focus for v0.4.0
In coding-agent work I see a similar pattern. The best outcomes usually happen when the agent is forced to study concrete source material first: real repos, real docs, real examples, and the constraints behind them. The worse outcomes happen when it generates a plausible path from a vague prompt and never has to reconcile that with existing practice.
For learning, I imagine the same thing matters: the LLM should help structure the path and explain the friction, but the learner still needs to touch the code and compare against sources.
The source-backed part feels more important than the generated tutorial part.
> https://news.ycombinator.com/newsguidelines.html