Monday 30 October 2023

Show HN: EnfinBref- {GPT3-5|Mistral-7B} YouTube summaries, segment by segment https://bit.ly/3FEGxz6

A neat (in my opinion) little side project I've been working on, both to pick up some basic React skills and to work with LLMs on more cool projects. It should work for most major languages and outputs English summaries (or French summaries, if you use the main https://bit.ly/3FAzw2t page instead of the /en/ subpage), no matter the input language.

I'm currently planning to expand in various directions, including new features like choosing a summary type, better video type identification and LLM routing, and bullet-point exec summaries. It's pretty basic on functionality at the moment, and relies on a few tricks.

The key stack:
- FastAPI + Python backend, with some extra libs for type validation (Pydantic), translation and YouTube transcript fetching.
- Chained LLM calls with logic: identify the video type with a light model, break the video down into segments and sections, parallelise as much as possible, and produce general high-level summaries.
- Models are a mix of a Mistral fine-tune and GPT-3.5, with prompts tailored to the identified type of content and the current context.
- The front-end is my first foray into React + Tailwind; my last front-end experience before that was jQuery.

Inspired by a post a while back about Summary Cat, but with a more in-depth approach: all summaries are segment-by-segment, to give a more detailed view of potentially complex videos. Segments are 3 minutes long for short videos and 5 minutes for longer ones. Anything above 45 minutes is broken down into 45-minute sections, both for ease of context length handling (solidly into gpt-3.5-16k territory, which is already more annoying to run than Mistral-7B, and any further would require GPT-4) and because clarity gets murkier to handle above that limit.

(The name is from a common French idiom for "anyway".)

https://bit.ly/46T0luB
October 31, 2023 at 12:56AM
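For readers curious how the segmentation rule from the post (3-minute segments for short videos, 5-minute segments for longer ones, 45-minute sections) might look in code, here is a minimal Python sketch. It is not the project's actual implementation: the names `Segment` and `plan_segments` are hypothetical, and the 15-minute cut-off between "short" and "longer" videos is purely an assumption, since the post does not say where that line is drawn.

```python
from __future__ import annotations

from dataclasses import dataclass

SECTION_SECONDS = 45 * 60        # from the post: long videos are split into 45-minute sections
SHORT_SEGMENT_SECONDS = 3 * 60   # from the post: 3-minute segments for short videos
LONG_SEGMENT_SECONDS = 5 * 60    # from the post: 5-minute segments for longer videos
SHORT_VIDEO_MAX_SECONDS = 15 * 60  # ASSUMPTION: the post does not define "short"


@dataclass(frozen=True)
class Segment:
    """Hypothetical container for one time window of a video."""
    section: int  # index of the 45-minute section this segment belongs to
    start: int    # seconds from the start of the video (inclusive)
    end: int      # seconds from the start of the video (exclusive)


def plan_segments(
    duration: int, short_video_max: int = SHORT_VIDEO_MAX_SECONDS
) -> list[Segment]:
    """Split a video of `duration` seconds into sections, then into segments."""
    # Pick the segment length once, based on the total video duration.
    step = SHORT_SEGMENT_SECONDS if duration <= short_video_max else LONG_SEGMENT_SECONDS
    segments: list[Segment] = []
    # Walk the video in 45-minute sections; segments never cross a section boundary.
    for section_index, section_start in enumerate(range(0, duration, SECTION_SECONDS)):
        section_end = min(section_start + SECTION_SECONDS, duration)
        for start in range(section_start, section_end, step):
            segments.append(Segment(section_index, start, min(start + step, section_end)))
    return segments
```

Each `Segment` could then be mapped to the matching slice of the transcript and summarised independently, which is what makes the per-segment LLM calls easy to run in parallel.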
