Open-source Document Parser to Markdown with OCR/LLMs
Megaparse is a file parser optimized for LLM ingestion. It parses PDF, DOCX, and PPTX files into a format that is ideal for LLMs, and it is accessible from a Python package, an API, or a queue.
Hi everyone,
Today I’d like to introduce you to the new Quivr project. It is a simple Python package and API that takes in documents such as PDF, DOCX, and PPTX files and turns them into Markdown.
It has several new abilities:
* OCR
* Vision Models
* Table Optimization in the extraction
* Open-source
You can use it in any product where you need to parse files and then send them to an LLM, or simply store them.
Here is how to get started:
* Go to https://github.com/QuivrHQ/MegaP...
* pip install megaparse
* Have fun
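Once a document has been turned into Markdown, a common next step before sending it to an LLM is to split it into smaller chunks. The sketch below shows one way to do that by splitting at headings. It does not call Megaparse itself: `parse_to_markdown` is a hypothetical stand-in for whatever parser you use, and `fake_parser`, `ingest`, and `split_markdown_by_heading` are illustrative names, not part of the package.

```python
import re
from typing import Callable, List


def split_markdown_by_heading(markdown: str) -> List[str]:
    """Split Markdown into chunks, starting a new chunk at each heading line.

    Simplification: lines starting with '#' inside fenced code blocks
    are also treated as headings.
    """
    chunks: List[str] = []
    current: List[str] = []
    for line in markdown.splitlines():
        # A heading is 1-6 '#' characters followed by a space.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that are empty after stripping whitespace.
    return [c for c in chunks if c]


def ingest(path: str, parse_to_markdown: Callable[[str], str]) -> List[str]:
    """Parse a file to Markdown with the given callable, then chunk it."""
    return split_markdown_by_heading(parse_to_markdown(path))


def fake_parser(path: str) -> str:
    # Hypothetical parser output, used here only so the sketch runs on its own.
    return "# Title\nIntro text.\n\n## Table\n| a | b |\n|---|---|\n| 1 | 2 |\n"


chunks = ingest("report.pdf", fake_parser)
for chunk in chunks:
    print(chunk)
    print("---")
```

Splitting at headings keeps each table or section together with its title, which tends to give an LLM more useful context than fixed-size character windows. Swap `fake_parser` for a function that wraps your real parser.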
Give it a try! We’d love to hear your feedback and ideas in the comments.
This is part of Supabase mega Launch Week -> https://launchweek.dev/HOME
About Megaparse [LW24] on Product Hunt
Megaparse [LW24] launched on Product Hunt on December 3rd, 2024 and earned 306 upvotes and 18 comments, placing #10 on the daily leaderboard.
Megaparse [LW24] is listed under the Developer Tools and GitHub topics, which collectively have 552.2k followers on Product Hunt.
Who hunted Megaparse [LW24]?
Megaparse [LW24] was hunted by Stan Girard. A “hunter” on Product Hunt is the community member who submits a product to the platform: they upload the images, add the link, and tag the makers behind it. Hunters typically write the first comment explaining why a product is worth attention, and their followers are notified the moment they post. Around 79% of featured launches on Product Hunt are self-hunted by their makers, but a well-known hunter still acts as a signal of quality to the rest of the community.